Commit c3dd676

AlisonSchofield authored and davejiang committed
cxl/region: Add inject and clear poison by region offset
Add CXL region debugfs attributes to inject and clear poison based on an offset into the region. These new interfaces allow users to operate on poison at the region level without needing to resolve Device Physical Addresses (DPA) or target individual memdevs.

The implementation uses a new helper, region_offset_to_dpa_result() that applies decoder interleave logic, including XOR-based address decoding when applicable. Note that XOR decodes rely on driver internal xormaps which are not exposed to userspace. So, this support is not only a simplification of poison operations that could be done using existing per memdev operations, but also it enables this functionality for XOR interleaved regions for the first time.

New debugfs attributes are added in /sys/kernel/debug/cxl/regionX/: inject_poison and clear_poison. These are only exposed if all memdevs participating in the region support both inject and clear commands, ensuring consistent and reliable behavior across multi-device regions.

If tracing is enabled, these operations are logged as cxl_poison events in /sys/kernel/tracing/trace.

The ABI documentation warns users of the significant risks that come with using these capabilities.

A CXL Maturity Map update shows this user flow is now supported.

Signed-off-by: Alison Schofield <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
Link: https://patch.msgid.link/f3fd8628ab57ea79704fb2d645902cd499c066af.1754290144.git.alison.schofield@intel.com
Signed-off-by: Dave Jiang <[email protected]>
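
As a rough illustration of how a test program might drive the new attributes, the userspace sketch below writes a region offset to one of them. The helper name write_region_offset() and the region0 path in the usage note are illustrative assumptions, not part of this commit. The sketch also assumes debugfs is mounted at /sys/kernel/debug and that the attribute accepts a 0x-prefixed hexadecimal value.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: write a region offset, formatted as 0x-prefixed
 * hex, to a debugfs poison attribute such as inject_poison or
 * clear_poison. Returns 0 on success or a negative errno value.
 */
int write_region_offset(const char *attr_path, uint64_t offset)
{
	FILE *f = fopen(attr_path, "w");
	int rc = 0;

	if (!f)
		return -errno;

	if (fprintf(f, "0x%" PRIx64 "\n", offset) < 0)
		rc = -EIO;

	/* A rejected write typically surfaces when the buffer is flushed. */
	if (fclose(f) != 0 && rc == 0)
		rc = -errno;

	return rc;
}
```

On a disposable test system, a privileged caller might invoke write_region_offset("/sys/kernel/debug/cxl/region0/inject_poison", 0x1000) and later pass the same offset to the clear_poison attribute. Per the warnings in the ABI documentation, this should never be run against production systems or live data.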
1 parent 25a0207 commit c3dd676

5 files changed, +228 -4 lines changed

Documentation/ABI/testing/debugfs-cxl

Lines changed: 87 additions & 0 deletions
@@ -19,6 +19,20 @@ Description:
 		is returned to the user. The inject_poison attribute is only
 		visible for devices supporting the capability.

+		TEST-ONLY INTERFACE: This interface is intended for testing
+		and validation purposes only. It is not a data repair mechanism
+		and should never be used on production systems or live data.
+
+		DATA LOSS RISK: For CXL persistent memory (PMEM) devices,
+		poison injection can result in permanent data loss. Injected
+		poison may render data permanently inaccessible even after
+		clearing, as the clear operation writes zeros and does not
+		recover original data.
+
+		SYSTEM STABILITY RISK: For volatile memory, poison injection
+		can cause kernel crashes, system instability, or unpredictable
+		behavior if the poisoned addresses are accessed by running code
+		or critical kernel structures.

 What:		/sys/kernel/debug/cxl/memX/clear_poison
 Date:		April, 2023
@@ -35,6 +49,79 @@ Description:
 		The clear_poison attribute is only visible for devices
 		supporting the capability.

+		TEST-ONLY INTERFACE: This interface is intended for testing
+		and validation purposes only. It is not a data repair mechanism
+		and should never be used on production systems or live data.
+
+		CLEAR IS NOT DATA RECOVERY: This operation writes zeros to the
+		specified address range and removes the address from the poison
+		list. It does NOT recover or restore original data that may have
+		been present before poison injection. Any original data at the
+		cleared address is permanently lost and replaced with zeros.
+
+		CLEAR IS NOT A REPAIR MECHANISM: This interface is for testing
+		purposes only and should not be used as a data repair tool.
+		Clearing poison is fundamentally different from data recovery
+		or error correction.
+
+What:		/sys/kernel/debug/cxl/regionX/inject_poison
+Date:		August, 2025
+
+Description:
+		(WO) When a Host Physical Address (HPA) is written to this
+		attribute, the region driver translates it to a Device
+		Physical Address (DPA) and identifies the corresponding
+		memdev. It then sends an inject poison command to that memdev
+		at the translated DPA. Refer to the memdev ABI entry at:
+		/sys/kernel/debug/cxl/memX/inject_poison for the detailed
+		behavior. This attribute is only visible if all memdevs
+		participating in the region support both inject and clear
+		poison commands.
+
+		TEST-ONLY INTERFACE: This interface is intended for testing
+		and validation purposes only. It is not a data repair mechanism
+		and should never be used on production systems or live data.
+
+		DATA LOSS RISK: For CXL persistent memory (PMEM) devices,
+		poison injection can result in permanent data loss. Injected
+		poison may render data permanently inaccessible even after
+		clearing, as the clear operation writes zeros and does not
+		recover original data.
+
+		SYSTEM STABILITY RISK: For volatile memory, poison injection
+		can cause kernel crashes, system instability, or unpredictable
+		behavior if the poisoned addresses are accessed by running code
+		or critical kernel structures.
+
+What:		/sys/kernel/debug/cxl/regionX/clear_poison
+Date:		August, 2025
+
+Description:
+		(WO) When a Host Physical Address (HPA) is written to this
+		attribute, the region driver translates it to a Device
+		Physical Address (DPA) and identifies the corresponding
+		memdev. It then sends a clear poison command to that memdev
+		at the translated DPA. Refer to the memdev ABI entry at:
+		/sys/kernel/debug/cxl/memX/clear_poison for the detailed
+		behavior. This attribute is only visible if all memdevs
+		participating in the region support both inject and clear
+		poison commands.
+
+		TEST-ONLY INTERFACE: This interface is intended for testing
+		and validation purposes only. It is not a data repair mechanism
+		and should never be used on production systems or live data.
+
+		CLEAR IS NOT DATA RECOVERY: This operation writes zeros to the
+		specified address range and removes the address from the poison
+		list. It does NOT recover or restore original data that may have
+		been present before poison injection. Any original data at the
+		cleared address is permanently lost and replaced with zeros.
+
+		CLEAR IS NOT A REPAIR MECHANISM: This interface is for testing
+		purposes only and should not be used as a data repair tool.
+		Clearing poison is fundamentally different from data recovery
+		or error correction.
+
 What:		/sys/kernel/debug/cxl/einj_types
 Date:		January, 2024
 KernelVersion:	v6.9
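
The translation step that the regionX entries describe can be pictured with a deliberately simplified model. The sketch below shows plain modulo interleave only: it picks an interleave position and an offset within that target from a region offset. All names here (decode_result, decode_region_offset) are hypothetical. It is not the kernel's region_offset_to_dpa_result(), and it leaves out XOR-based decoding and any device-specific base address.

```c
#include <stdint.h>

/* Hypothetical result type for this illustration only. */
struct decode_result {
	unsigned int position;	/* index of the interleave target */
	uint64_t dpa_offset;	/* offset within that target's share */
};

/*
 * Simplified modulo interleave decode.
 * Assumes ways >= 1 and granularity > 0; no XOR maps are applied.
 */
struct decode_result decode_region_offset(uint64_t offset, unsigned int ways,
					  uint64_t granularity)
{
	uint64_t chunk = offset / granularity;	/* which granule overall */
	struct decode_result r;

	/* Granules are dealt out to the targets round-robin. */
	r.position = (unsigned int)(chunk % ways);

	/* Each target sees every ways-th granule, packed contiguously. */
	r.dpa_offset = (chunk / ways) * granularity + offset % granularity;

	return r;
}
```

With two ways and a 256-byte granularity, region offset 0x240 falls in the third granule, which this model assigns to position 0 at offset 0x140 within that target.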

Documentation/driver-api/cxl/maturity-map.rst

Lines changed: 1 addition & 1 deletion
@@ -173,7 +173,7 @@ Accelerator
 User Flow Support
 -----------------

-* [0] Inject & clear poison by HPA
+* [2] Inject & clear poison by region offset

 Details
 =======

drivers/cxl/core/core.h

Lines changed: 4 additions & 0 deletions
@@ -135,6 +135,10 @@ enum cxl_poison_trace_type {
 	CXL_POISON_TRACE_CLEAR,
 };

+enum poison_cmd_enabled_bits;
+bool cxl_memdev_has_poison_cmd(struct cxl_memdev *cxlmd,
+			       enum poison_cmd_enabled_bits cmd);
+
 long cxl_pci_get_latency(struct pci_dev *pdev);
 int cxl_pci_get_bandwidth(struct pci_dev *pdev, struct access_coordinate *c);
 int cxl_update_hmat_access_coordinates(int nid, struct cxl_region *cxlr,

drivers/cxl/core/memdev.c

Lines changed: 8 additions & 0 deletions
@@ -200,6 +200,14 @@ static ssize_t security_erase_store(struct device *dev,
 static struct device_attribute dev_attr_security_erase =
 	__ATTR(erase, 0200, NULL, security_erase_store);

+bool cxl_memdev_has_poison_cmd(struct cxl_memdev *cxlmd,
+			       enum poison_cmd_enabled_bits cmd)
+{
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
+
+	return test_bit(cmd, mds->poison.enabled_cmds);
+}
+
 static int cxl_get_poison_by_memdev(struct cxl_memdev *cxlmd)
 {
 	struct cxl_dev_state *cxlds = cxlmd->cxlds;
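
The helper above tests one bit in a per-device bitmap, and the region probe path only exposes the attributes when every member device has both the inject and clear bits set. A minimal userspace model of that gating might look like the following. The types, the bit numbering, and the function names are assumptions made for illustration and do not mirror the kernel's data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit numbers; the kernel's enum values are not shown here. */
enum poison_cmd {
	POISON_CMD_INJECT,
	POISON_CMD_CLEAR,
};

/* Hypothetical stand-in for a memdev with a one-word command bitmap. */
struct model_memdev {
	unsigned long enabled_cmds;
};

/* Userspace analogue of a single-bit test on the enabled-commands map. */
bool memdev_has_poison_cmd(const struct model_memdev *md, enum poison_cmd cmd)
{
	return (md->enabled_cmds >> cmd) & 1UL;
}

/* True only if every member supports both inject and clear. */
bool region_poison_supported(const struct model_memdev *mds, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!memdev_has_poison_cmd(&mds[i], POISON_CMD_INJECT) ||
		    !memdev_has_poison_cmd(&mds[i], POISON_CMD_CLEAR))
			return false;
	}

	return true;
}
```

Requiring both commands on every member means a region never ends up in a state where poison can be injected through one device but not cleared, which is the consistency the commit message describes.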

drivers/cxl/core/region.c

Lines changed: 128 additions & 3 deletions
@@ -2,6 +2,7 @@
 /* Copyright(c) 2022 Intel Corporation. All rights reserved. */
 #include <linux/memregion.h>
 #include <linux/genalloc.h>
+#include <linux/debugfs.h>
 #include <linux/device.h>
 #include <linux/module.h>
 #include <linux/memory.h>
@@ -3003,9 +3004,8 @@ struct dpa_result {
 	u64 dpa;
 };

-static int __maybe_unused region_offset_to_dpa_result(struct cxl_region *cxlr,
-						      u64 offset,
-						      struct dpa_result *result)
+static int region_offset_to_dpa_result(struct cxl_region *cxlr, u64 offset,
+				       struct dpa_result *result)
 {
 	struct cxl_region_params *p = &cxlr->params;
 	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
@@ -3648,6 +3648,105 @@ static void shutdown_notifiers(void *_cxlr)
 	unregister_mt_adistance_algorithm(&cxlr->adist_notifier);
 }

+static void remove_debugfs(void *dentry)
+{
+	debugfs_remove_recursive(dentry);
+}
+
+static int validate_region_offset(struct cxl_region *cxlr, u64 offset)
+{
+	struct cxl_region_params *p = &cxlr->params;
+	resource_size_t region_size;
+	u64 hpa;
+
+	if (offset < p->cache_size) {
+		dev_err(&cxlr->dev,
+			"Offset %#llx is within extended linear cache %#llx\n",
+			offset, p->cache_size);
+		return -EINVAL;
+	}
+
+	region_size = resource_size(p->res);
+	if (offset >= region_size) {
+		dev_err(&cxlr->dev, "Offset %#llx exceeds region size %#llx\n",
+			offset, region_size);
+		return -EINVAL;
+	}
+
+	hpa = p->res->start + offset;
+	if (hpa < p->res->start || hpa > p->res->end) {
+		dev_err(&cxlr->dev, "HPA %#llx not in region %pr\n", hpa,
+			p->res);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int cxl_region_debugfs_poison_inject(void *data, u64 offset)
+{
+	struct dpa_result result = { .dpa = ULLONG_MAX, .cxlmd = NULL };
+	struct cxl_region *cxlr = data;
+	int rc;
+
+	ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
+	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
+		return rc;
+
+	ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
+	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
+		return rc;
+
+	if (validate_region_offset(cxlr, offset))
+		return -EINVAL;
+
+	rc = region_offset_to_dpa_result(cxlr, offset, &result);
+	if (rc || !result.cxlmd || result.dpa == ULLONG_MAX) {
+		dev_dbg(&cxlr->dev,
+			"Failed to resolve DPA for region offset %#llx rc %d\n",
+			offset, rc);
+
+		return rc ? rc : -EINVAL;
+	}
+
+	return cxl_inject_poison_locked(result.cxlmd, result.dpa);
+}
+
+DEFINE_DEBUGFS_ATTRIBUTE(cxl_poison_inject_fops, NULL,
+			 cxl_region_debugfs_poison_inject, "%llx\n");
+
+static int cxl_region_debugfs_poison_clear(void *data, u64 offset)
+{
+	struct dpa_result result = { .dpa = ULLONG_MAX, .cxlmd = NULL };
+	struct cxl_region *cxlr = data;
+	int rc;
+
+	ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
+	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
+		return rc;
+
+	ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
+	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
+		return rc;
+
+	if (validate_region_offset(cxlr, offset))
+		return -EINVAL;
+
+	rc = region_offset_to_dpa_result(cxlr, offset, &result);
+	if (rc || !result.cxlmd || result.dpa == ULLONG_MAX) {
+		dev_dbg(&cxlr->dev,
+			"Failed to resolve DPA for region offset %#llx rc %d\n",
+			offset, rc);
+
+		return rc ? rc : -EINVAL;
+	}
+
+	return cxl_clear_poison_locked(result.cxlmd, result.dpa);
+}
+
+DEFINE_DEBUGFS_ATTRIBUTE(cxl_poison_clear_fops, NULL,
+			 cxl_region_debugfs_poison_clear, "%llx\n");
+
 static int cxl_region_can_probe(struct cxl_region *cxlr)
 {
 	struct cxl_region_params *p = &cxlr->params;
@@ -3677,6 +3776,7 @@ static int cxl_region_probe(struct device *dev)
 {
 	struct cxl_region *cxlr = to_cxl_region(dev);
 	struct cxl_region_params *p = &cxlr->params;
+	bool poison_supported = true;
 	int rc;

 	rc = cxl_region_can_probe(cxlr);
@@ -3700,6 +3800,31 @@ static int cxl_region_probe(struct device *dev)
 	if (rc)
 		return rc;

+	/* Create poison attributes if all memdevs support the capabilities */
+	for (int i = 0; i < p->nr_targets; i++) {
+		struct cxl_endpoint_decoder *cxled = p->targets[i];
+		struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
+
+		if (!cxl_memdev_has_poison_cmd(cxlmd, CXL_POISON_ENABLED_INJECT) ||
+		    !cxl_memdev_has_poison_cmd(cxlmd, CXL_POISON_ENABLED_CLEAR)) {
+			poison_supported = false;
+			break;
+		}
+	}
+
+	if (poison_supported) {
+		struct dentry *dentry;
+
+		dentry = cxl_debugfs_create_dir(dev_name(dev));
+		debugfs_create_file("inject_poison", 0200, dentry, cxlr,
+				    &cxl_poison_inject_fops);
+		debugfs_create_file("clear_poison", 0200, dentry, cxlr,
+				    &cxl_poison_clear_fops);
+		rc = devm_add_action_or_reset(dev, remove_debugfs, dentry);
+		if (rc)
+			return rc;
+	}
+
 	switch (cxlr->mode) {
 	case CXL_PARTMODE_PMEM:
 		rc = devm_cxl_region_edac_register(cxlr);
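
The three checks in validate_region_offset() can be exercised outside the kernel with a small model. The sketch below mirrors the order of the checks in the diff (extended linear cache, region size, HPA bounds) but uses a hypothetical model_region struct in place of the kernel's region parameters and resource, and drops the dev_err() logging.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the region resource and cache size. */
struct model_region {
	uint64_t start;		/* first HPA of the region, inclusive */
	uint64_t end;		/* last HPA of the region, inclusive */
	uint64_t cache_size;	/* extended linear cache size, 0 if none */
};

/* Returns 0 if the offset is usable, -EINVAL otherwise. */
int model_validate_region_offset(const struct model_region *r, uint64_t offset)
{
	uint64_t size = r->end - r->start + 1;
	uint64_t hpa;

	/* Offsets below cache_size fall in the extended linear cache. */
	if (offset < r->cache_size)
		return -EINVAL;

	/* The offset must land inside the region. */
	if (offset >= size)
		return -EINVAL;

	/* Guard the resulting HPA, including against wraparound. */
	hpa = r->start + offset;
	if (hpa < r->start || hpa > r->end)
		return -EINVAL;

	return 0;
}
```

For a 4 GiB region starting at 0x100000000 with no cache, this model accepts offsets 0 through 0xffffffff and rejects 0x100000000. With a cache_size of 0x1000, it also rejects every offset below 0x1000.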
