* [RFC PATCH v9 00/11] EDAC: Scrub: Introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers
@ 2024-07-16 15:03 shiju.jose
From: shiju.jose @ 2024-07-16 15:03 UTC
To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
Cc: bp, tony.luck, rafael, lenb, mchehab, dan.j.williams, dave,
jonathan.cameron, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny, david, Vilas.Sridharan, leo.duran, Yazen.Ghannam,
rientjes, jiaqiyan, Jon.Grimm, dave.hansen, naoya.horiguchi,
james.morse, jthoughton, somasundaram.a, erdemaktas, pgonda,
duenwen, mike.malvestuto, gthelen, wschwartz, dferguson, wbs,
nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
kangkang.shen, wanghuiqiang, linuxarm, shiju.jose
From: Shiju Jose <shiju.jose@huawei.com>
Previously known as "ras: scrub: introduce subsystem + CXL/ACPI-RAS2 drivers".
EDAC based Subsystem for controlling RAS Features
=================================================
The proposed EDAC based subsystem controls RAS features and exposes
each feature's control attributes to userspace in sysfs.
Some Examples:
- Scrub control
- Error Check Scrub (ECS) control
- ACPI RAS2 features
- ACPI Address Range Scrubbing (ARS)
- Post Package Repair (PPR) etc.
High level design is illustrated in the following diagram.
_______________________________________________
| Userspace - Rasdaemon |
| ____________ |
| | RAS CXL | _____________ |
| | Err Handler|----->| | |
| |____________| | RAS Dynamic | |
| ____________ | Scrub | |
| | RAS Memory |----->| Controller | |
| | Err Handler| |_____________| |
| |____________| | |
|__________________________|____________________|
|
|
_______________________________|______________________________
| Kernel EDAC based SubSystem | for RAS Features Control |
| ______________________________|____________________________ |
|| EDAC Core Sysfs EDAC| Bus | |
|| __________________________|_______ _____________ | |
|| |/sys/bus/edac/devices/<dev>/scrub/| | EDAC Device | | |
|| |/sys/bus/edac/devices/<dev>/ecs*/ |<->| EDAC MC | | |
|| |/sys/bus/edac/devices/<dev>/ars/ | | EDAC Sysfs | | |
|| |/sys/bus/edac/devices/<dev>/ppr/ | | EDAC Module | | |
|| |__________________________________| |_____________| | |
|| | EDAC Bus | |
|| Get | | |
|| __________ Feature's | __________ | |
|| | |Descs _________|______ | | | |
|| |EDAC Scrub|<-----| EDAC RAS |---->| EDAC ARS | | |
|| |__________| |Control Feature | |__________| | |
|| __________ | Driver | __________ | |
|| | |<-----|________________|---->| | | |
|| |EDAC ECS | Register RAS | Features | EDAC PPR | | |
|| |__________| | |__________| | |
|| ______________________|___________________ | |
||_________|_____________|_____________|____________|________| |
| _______|____ _____|______ ____|______ ___|_____ |
| | | | CXL Mem | | | | | |
| | ACPI RAS2 | | Driver | | ACPI ARS | | PPR | |
| | Driver | | Scrub,ECS | | Driver | | Driver | |
| |____________| |___________| |___________| |_________| |
| | | | | |
|________|______________|______________|___________|___________|
| | | |
_______|______________|______________|___________|___________
| __|______________|_ ____________|___________|_____ |
| | | |
| | Platform HW and Firmware | |
| |__________________________________________________| |
|_____________________________________________________________|
1. EDAC Features components - Create feature specific descriptors.
2. EDAC RAS Feature driver - Gets the feature's attribute descriptors from
the EDAC RAS feature component, registers the device's RAS features with
the EDAC bus and exposes the feature's sysfs attributes under the sysfs
EDAC bus.
3. RAS dynamic scrub controller - Userspace sample module added to
rasdaemon to start scrubbing when an excessive number of related errors
is reported in a short span of time.
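As a rough illustration of item 3, the "excessive errors in a short span" check could be a small sliding window over error timestamps. This is only a sketch in userspace C, not the rasdaemon code: the names err_window, err_window_record and ERR_WINDOW_SLOTS are hypothetical, and the threshold and span values are policy choices left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ERR_WINDOW_SLOTS 8	/* hypothetical: errors remembered per device */

/* Hypothetical per-device state: timestamps of the most recent errors. */
struct err_window {
	uint64_t ts[ERR_WINDOW_SLOTS];	/* ring buffer of timestamps, seconds */
	size_t count;			/* valid entries, <= ERR_WINDOW_SLOTS */
	size_t head;			/* next slot to overwrite */
};

/*
 * Record one error at time 'now' and report whether at least 'threshold'
 * errors (including this one) fall within the last 'span' seconds.
 * A zero-initialised struct err_window is a valid empty window.
 */
static bool err_window_record(struct err_window *w, uint64_t now,
			      uint64_t span, size_t threshold)
{
	size_t i, recent = 0;

	assert(w && threshold > 0 && threshold <= ERR_WINDOW_SLOTS);

	w->ts[w->head] = now;
	w->head = (w->head + 1) % ERR_WINDOW_SLOTS;
	if (w->count < ERR_WINDOW_SLOTS)
		w->count++;

	/* Count remembered errors that are no older than 'span'. */
	for (i = 0; i < w->count; i++) {
		if (now - w->ts[i] <= span)
			recent++;
	}

	return recent >= threshold;
}
```

When the function returns true, a controller along these lines would then request scrubbing through the sysfs scrub controls of the affected device.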
The added EDAC feature specific components (e.g. EDAC scrub, EDAC ECS,
EDAC PPR etc.) call back into the parent driver (e.g. CXL driver,
ACPI RAS driver etc.) for the controls rather than just letting the
caller deal with it, for the following reasons.
1. Enforces a common API across multiple implementations. We could do
that via review, but that has not generally gone well in the long run
for subsystems that have done it (several have later moved to callback
and feature list based approaches).
2. Gives a path for 'intercepting' in the EDAC feature driver.
An example for this is that we could intercept PPR repair calls
and sanity check that the memory in question is offline before
passing back to the underlying code. Sure we could rely on doing
that via some additional calls from the parent driver, but the
ABI will get messier.
3. (Speculative) we may get in kernel users of some features in the
long run.
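To make reason 2 more concrete, here is a minimal userspace sketch of a common entry point that intercepts a PPR repair request and checks that the memory is offline before calling into the parent driver. Everything here is hypothetical (ppr_ops, ppr_feature, ppr_feature_repair and the toy driver); PPR support is not part of this series, and how the offline check would really be done is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ops table a parent driver would hand to the feature layer. */
struct ppr_ops {
	bool (*is_offline)(void *drv_data, uint64_t addr);
	int (*do_repair)(void *drv_data, uint64_t addr);
};

struct ppr_feature {
	const struct ppr_ops *ops;
	void *drv_data;
};

/*
 * Common entry point: because every repair request passes through here,
 * the sanity check lives in one place instead of in each parent driver.
 */
static int ppr_feature_repair(struct ppr_feature *feat, uint64_t addr)
{
	assert(feat);

	if (!feat->ops || !feat->ops->do_repair)
		return -EOPNOTSUPP;

	/* Intercept: refuse to repair memory that is still in use. */
	if (feat->ops->is_offline &&
	    !feat->ops->is_offline(feat->drv_data, addr))
		return -EBUSY;

	return feat->ops->do_repair(feat->drv_data, addr);
}

/* Toy parent driver used only to exercise the sketch. */
struct toy_drv {
	uint64_t offline_addr;	/* the single address treated as offline */
	int repairs;		/* number of repairs actually performed */
};

static bool toy_is_offline(void *drv_data, uint64_t addr)
{
	struct toy_drv *drv = drv_data;

	return addr == drv->offline_addr;
}

static int toy_do_repair(void *drv_data, uint64_t addr)
{
	struct toy_drv *drv = drv_data;

	(void)addr;
	drv->repairs++;
	return 0;
}

static const struct ppr_ops toy_ops = {
	.is_offline = toy_is_offline,
	.do_repair = toy_do_repair,
};
```

With this shape the parent driver only supplies the callbacks, and the check cannot be skipped by a driver that forgets to make an extra call.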
More details of the common RAS features are described in the following
sections.
Memory Scrubbing
================
Increasing DRAM size and cost has made memory subsystem reliability
an important concern. These modules are used where potentially
corrupted data could cause expensive or fatal issues. Memory errors are
one of the top hardware failures that cause server and workload crashes.
Memory scrub is a feature where an ECC engine reads data from
each memory media location, corrects with an ECC if necessary and
writes the corrected data back to the same memory media location.
The memory DIMMs can be scrubbed at a configurable rate to detect
uncorrected memory errors and to attempt recovery from detected memory
errors, providing the following benefits.
- Proactively scrubbing memory DIMMs reduces the chance of a correctable
error becoming uncorrectable.
- Once detected, uncorrected errors caught in unallocated memory pages are
isolated and prevented from being allocated to an application or the OS.
- The probability of software/hardware products encountering memory
errors is reduced.
Some details of background can be found in Reference [5].
There are 2 types of memory scrubbing:
1. Background (patrol) scrubbing of the RAM whilst the RAM is otherwise
idle.
2. On-demand scrubbing for a specific address range/region of memory.
There are several types of interfaces to HW memory scrubbers
identified such as ACPI NVDIMM ARS(Address Range Scrub), CXL memory
device patrol scrub, CXL DDR5 ECS, ACPI RAS2 memory scrubbing.
The scrub control varies between different memory scrubbers. To allow
for standard userspace tooling there is a need to present these controls
with a standard ABI.
Introduce a generic EDAC memory scrub control which allows the user to
control the underlying scrubbers in the system via a generic sysfs scrub
control interface.
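From a tool's point of view, such an interface amounts to writing values into attribute files. The helper below is a minimal userspace sketch under assumptions: scrub_attr_write is a hypothetical name, and the directory and attribute names passed to it are placeholders, since the final sysfs file names are defined by the ABI documentation in the series and not by this example.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write an unsigned value to the attribute file "<dir>/<attr>".
 * Returns 0 on success, -1 on any failure (path too long, open or
 * write error).
 */
static int scrub_attr_write(const char *dir, const char *attr,
			    unsigned int value)
{
	char path[256];
	FILE *f;
	int n;

	assert(dir && attr);

	n = snprintf(path, sizeof(path), "%s/%s", dir, attr);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	if (fprintf(f, "%u\n", value) < 0) {
		fclose(f);
		return -1;
	}

	/* fclose() flushes, so a write error can also show up here. */
	return fclose(f) == 0 ? 0 : -1;
}
```

A udev script or rasdaemon could call such a helper with a directory of the form /sys/bus/edac/devices/<dev>/scrub and whatever attribute name the ABI defines for the setting being changed.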
Use case of common scrub control feature
========================================
1. There are several types of interfaces to HW memory scrubbers identified
such as ACPI NVDIMM ARS(Address Range Scrub), CXL memory device patrol
scrub, CXL DDR5 ECS, ACPI RAS2 memory scrubbing features and software
based memory scrubber(discussed in the community Reference [5]).
Also some scrubbers support controlling (background) patrol scrubbing
(ACPI RAS2, CXL) and/or on-demand scrubbing(ACPI RAS2, ACPI ARS).
However, the scrub controls vary between memory scrubbers. Thus there
is a requirement for standard generic sysfs scrub controls exposed
to userspace for seamless control of the HW/SW scrubbers in
the system by admin/scripts/tools etc.
2. Scrub controls in user space allow the user to disable background
patrol scrubbing or change the scrub rate when that is needed for other
purposes, such as performance-aware operations which require the
background operations to be turned off or reduced.
3. Allows on-demand scrubbing of a specific address range, if
supported by the scrubber.
4. User space tools can scrub the memory DIMMs regularly at a
configurable scrub rate using the sysfs scrub controls discussed, which
- helps to detect uncorrectable memory errors early, before the user
accesses the memory, which in turn helps recovery from the detected errors.
- reduces the chance of a correctable error becoming uncorrectable.
5. Policy control for hotplugged memory. There is not necessarily a system
wide BIOS or similar in the loop to control the scrub settings on a CXL
device that wasn't there at boot. What that setting should be is a policy
decision, as we are trading off reliability vs performance, hence it should
be in the control of userspace. As such, 'an' interface is needed. It seems
more sensible to try and unify it with other similar interfaces than to spin
yet another one.
A draft version of the userspace code for dynamic scrub control, based
on the frequency of memory errors reported to userspace, has been added to
rasdaemon and enabled and tested for the CXL device based patrol scrubbing
feature and the ACPI RAS2 based scrubbing feature.
https://github.com/shijujose4/rasdaemon/tree/scrub_control_6_june_2024
Comparison of scrubbing features
================================
................................................................
. . ACPI . CXL patrol. CXL ECS . ARS .
. Name . RAS2 . scrub . . .
................................................................
. . . . . .
. On-demand . Supported . No . No . Supported .
. Scrubbing . . . . .
. . . . . .
................................................................
. . . . . .
. Background . Supported . Supported . Supported . No .
. scrubbing . . . . .
. . . . . .
................................................................
. . . . . .
. Mode of . Scrub ctrl. per device. per memory. Unknown .
. scrubbing . per NUMA . . media . .
. . domain. . . . .
................................................................
. . . . . .
. Query scrub . Supported . Supported . Supported . Supported .
. capabilities . . . . .
. . . . . .
................................................................
. . . . . .
. Setting . Supported . No . No . Supported .
. address range. . . . .
. . . . . .
................................................................
. . . . . .
. Setting . Supported . Supported . No . No .
. scrub rate . . . . .
. . . . . .
................................................................
. . . . . .
. Unit for . Not . in hours . No . No .
. scrub rate . Defined . . . .
. . . . . .
................................................................
. . Supported . . . .
. Scrub . on-demand . No . No . Supported .
. status/ . scrubbing . . . .
. Completion . only . . . .
................................................................
. UC error . .CXL general.CXL general. ACPI UCE .
. reporting . Exception .media/DRAM .media/DRAM . notify and.
. . .event/media.event/media. query .
. . .scan? .scan? . ARS status.
................................................................
. . . . . .
. Clear UC . No . No . No . Supported .
. error . . . . .
. . . . . .
................................................................
. . . . . .
. Translate . No . No . No . Supported .
. *(1)SPA to . . . . .
. *(2)DPA . . . . .
................................................................
. . . . . .
. Error inject . No . Can inject. No . Supported .
. . . poison for. . .
. . . CXL . . .
................................................................
*(1) - SPA - System Physical Address. See section 9.19.7.8
Function Index 5 - Translate SPA of ACPI spec r6.5.
*(2) - DPA - Device Physical Address. See section 9.19.7.8
Function Index 5 - Translate SPA of ACPI spec r6.5.
CXL Scrubbing features
======================
Add support for controlling the CXL patrol scrubber and the ACPI RAS2 HW
based memory patrol scrubber, and register them with the EDAC scrub to
expose the scrub controls to the userspace tool.
CXL spec r3.1 section 8.2.9.9.11.1 describes the memory device patrol scrub
control feature. The device patrol scrub proactively locates and makes
corrections to errors in a regular cycle. The patrol scrub control allows the
requester to configure the patrol scrubber's input configurations.
The patrol scrub control allows the requester to specify the number of
hours in which the patrol scrub cycles must be completed, provided that
the requested number is not less than the minimum number of hours for the
patrol scrub cycle that the device is capable of. In addition, the patrol
scrub controls allow the host to disable and enable the feature in case
disabling of the feature is needed for other purposes such as
performance-aware operations which require the background operations to be
turned off.
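The constraint on the requested cycle can be sketched as a simple range check. This is an illustrative userspace sketch only: the structure and function names are hypothetical, and only the "not less than the device minimum" rule is taken from the description above.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical view of what the device reports about its patrol scrubber. */
struct patrol_scrub_caps {
	unsigned int min_cycle_hours;	/* shortest cycle the device supports */
};

/*
 * Validate a requested scrub cycle length (in hours) against the device
 * minimum. Returns 0 if the request may be sent to the device, -EINVAL
 * if the device cannot complete a cycle that quickly.
 */
static int patrol_scrub_check_cycle(const struct patrol_scrub_caps *caps,
				    unsigned int requested_hours)
{
	assert(caps);

	if (requested_hours < caps->min_cycle_hours)
		return -EINVAL;

	return 0;
}
```

Rejecting the request up front keeps the error visible to the caller instead of relying on how a given device reacts to an out-of-range value.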
The Error Check Scrub (ECS) is a feature defined in JEDEC DDR5 SDRAM
Specification (JESD79-5) and allows the DRAM to internally read, correct
single-bit errors, and write back corrected data bits to the DRAM array
while providing transparency to error counts.
The DDR5 device contains a number of memory media FRUs per device. The
DDR5 ECS feature, and thus the ECS control driver, supports configuring
the ECS parameters per FRU.
ACPI RAS2 Hardware-based Memory Scrubbing
=========================================
ACPI spec r6.5 section 5.2.21 describes the ACPI RAS2 table, which
provides interfaces for platform RAS features and supports independent
RAS controls and capabilities for a given RAS feature for multiple
instances of the same component in a given system.
Memory RAS features apply to RAS capabilities, controls and operations
that are specific to memory. RAS2 PCC sub-spaces for memory-specific RAS
features have a Feature Type of 0x00 (Memory).
The platform can use the hardware-based memory scrubbing feature to expose
controls and capabilities associated with hardware-based memory scrub
engines. The RAS2 memory scrubbing feature supports the following as per the spec:
- Independent memory scrubbing controls for each NUMA domain, identified
using its proximity domain.
Note: However, AmpereComputing has a single entry repeated as they have
centralized controls.
- Provision for background (patrol) scrubbing of the entire memory system,
as well as on-demand scrubbing for a specific region of memory.
ACPI Address Range Scrubbing (ARS)
==================================
ARS allows the platform to communicate memory errors to system software.
This capability allows system software to prevent accesses to addresses
with uncorrectable errors in memory. ARS functions manage all NVDIMMs
present in the system. Only one scrub can be in progress system wide
at any given time.
The following functions are supported as per the specification.
1. Query ARS Capabilities for a given address range; this also indicates
whether the platform supports the ACPI NVDIMM Root Device Unconsumed
Error Notification.
2. Start ARS triggers an Address Range Scrub for the given memory range.
Address scrubbing can be done for volatile memory, persistent memory,
or both.
3. Query ARS Status command allows software to get the status of ARS,
including the progress of ARS and ARS error record.
4. Clear Uncorrectable Error.
5. Translate SPA
6. ARS Error Inject etc.
Note: Support for ARS is not added in this series, to reduce the lines
of code for review; it could be added after the initial code is merged.
We'd like feedback on whether this is of interest to the ARS community.
Series adds,
1. Generic EDAC RAS feature driver, EDAC scrub driver and EDAC ECS driver,
which support memory scrub control, ECS control and other RAS features
in the system.
2. Support for CXL feature mailbox commands, which are used by the
CXL device scrubbing features.
3. CXL scrub driver supporting patrol scrub control (device and
region based).
4. CXL ECS driver supporting ECS control feature.
5. ACPI RAS2 driver, which adds the OS interface for RAS2 communication
through the PCC mailbox, extracts the ACPI RAS2 feature table (RAS2) and
creates a platform device for the RAS memory features, which binds
to the memory ACPI RAS2 driver.
6. Memory ACPI RAS2 driver, which gets the PCC subspace for communicating
with an ACPI compliant platform that supports ACPI RAS2. It adds callback
functions and registers with EDAC scrub to let the user
control the HW patrol scrubbers exposed to the kernel via the
ACPI RAS2 table.
The QEMU series to support the CXL specific scrub features is
available here,
https://lore.kernel.org/linux-cxl/20240705123039.963781-3-Jonathan.Cameron@huawei.com/T/#mae1e799306d2841eb7e1b637b82046114b926d56
Open questions based on feedback from the community:
1. Leo: Standardize unit for scrub rate, for example ACPI RAS2 does not define
unit for the scrub rate. RAS2 clarification needed.
2. Jonathan: Any need for discoverability of capability to scan different regions,
such as global PA space to the userspace. Left as future extension.
3. Jiaqi:
- STOP_PATROL_SCRUBBER from RAS2 must be blocked and must not be exposed to
OS/userspace. Stopping the patrol scrubber is unacceptable for platforms where
the OEM has enabled the patrol scrubber, because the patrol scrubber is a key
part of logging and is repurposed for other RAS actions.
If the OEM does not want to expose this control, they should lock it down so the
interface is not exposed to the OS. These features are optional after all.
- "Requested Address Range"/"Actual Address Range" (region to scrub) is a
similarly bad thing to expose in RAS2.
If the OEM does not want to expose this, they should lock it down so the
interface is not exposed to the OS. These features are optional after all.
4. Borislav:
- How will the scrub control exposed to userspace be used?
POC added in rasdaemon with dynamic scrub control for CXL memory media
errors and memory errors reported to userspace.
https://github.com/shijujose4/rasdaemon/tree/scrub_control_6_june_2024
- Is the scrub interface sufficient for the use cases?
- Who is going to use the scrub controls: tools/admin/scripts?
1) Rasdaemon for dynamic control
2) Udev script for more static 'defaults' on hotplug etc.
References:
1. ACPI spec r6.5 section 5.2.21 ACPI RAS2.
2. ACPI spec r6.5 section 9.19.7.2 ARS.
3. CXL spec r3.1 8.2.9.9.11.1 Device patrol scrub control feature
4. CXL spec r3.1 8.2.9.9.11.2 DDR5 ECS feature
5. Background information about kernel support for memory scan, memory
error detection and ACPI RASF.
https://lore.kernel.org/all/20221103155029.2451105-1-jiaqiyan@google.com/
6. Discussions on RASF:
https://lore.kernel.org/lkml/20230915172818.761-1-shiju.jose@huawei.com/#r
Changes
=======
v8 -> v9:
1. Feedback from Borislav:
- Add scrub control driver to the EDAC on feedback from Borislav.
- Changed DEVICE_ATTR_..() static.
- Changed the write permissions for scrub control sysfs files as
root-only.
2. Feedback from Fan:
- Optimized cxl_get_feature() function by using min() and removed
feat_out_min_size.
- Removed unreached return from cxl_set_feature() function.
- Changed the term "rate" to "cycle_in_hours" in all the
scrub control code.
- Allow cxl_mem_probe() continue if cxl_mem_patrol_scrub_init() fail,
with just a debug warning.
3. Feedback from Jonathan:
- Removed the patch adding a __free() based cleanup function for
acpi_put_table and added a fix in the ACPI RAS2 driver.
4. Feedback from Dan Williams:
- Allow cxl_mem_probe() continue if cxl_mem_patrol_scrub_init() fail,
with just a debug warning.
- Add support for CXL region based scrub control.
5. Feedback from Daniel Ferguson on RAS2 drivers:
In the ACPI RAS2 driver,
- Incorporated the changes given for clearing the reported error.
- Incorporated the changes given for checking the Set RAS Capability
status and returning an appropriate error.
In the RAS2 memory driver,
- Added more checks for start/stop bg and on-demand scrubbing
so that addr range in cache do not get cleared and restrict
permitted operations during scrubbing.
v7 -> v8:
1. Add more detailed cover letter and add info for basic analysis
of ACPI ARS for comment from Dan Williams.
2. Changed file name etc from ras2 to acpi_ras2 in memory ACPI RAS2
driver for comment from Boris.
3. Add documents for usage for comment from Jonathan.
4. Changed logic in memory/acpi_ras2.c for enable background
scrubbing to allow setting the scrub rate.
5. Merged memory/acpi_ras2_common.c with memory/acpi_ras2.c and
removed obsolete code, suggested by Jonathan.
6. Initial optimizations and cleanup especially in the memory/acpi_ras2.
7. Removed CXL ECS support for the time being.
8. Removed support for region based scrub control from the scrub
subsystem, which was needed for the CXL ECS; it can be added later
if required.
9. Fixed the format of a few comments and a definition in the CXL feature
code for the feedback from Fan.
11. Jonathan did several optimizations, interface changes and
cleanups all over the code.
12. Fixes for feedback from Daniel Ferguson (AmpereComputing)
for RAS2.
13. Workaround for a RAS2 case of only one actual controller as
reported by Daniel Ferguson(AmpereComputing) in their hardware.
14. Feedback from Yazen, move the common scrub and ras2 changes
under /drivers/ras/.
15. Drop patch ACPICA: ACPI 6.5: Add support for RAS2 table because
Rafael queued the patch.
https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/commit/?h=bleeding-edge&id=9726d821f88e284ecd998b76ae5f2174721cd9dc
v6 -> v7:
1. Main changes for comments from Jonathan, Thanks.
1.1. CXL
- Changes to deal with a small mailbox and to support multipart
feature data transfers.
- Provide more specific parameters to mbox supported/get/set features
interface functions.
- kvmalloc -> kmalloc in CXL scrub mem allocation for feature commands.
- Changed the way using __free(kfree)
- Removed readback and verify for setting CXL scrub patrol and ECS
parameters. Could be added later if needed.
- In is_visible() callback functions for scrub control sysfs attrs
changed to writeback the default attribute mode value instead of
setting per attrs.
- Add documentation for sysfs interfaces for CXL ECS scrub control.
1.2. RAS2
- In rasf common code, rename rasf to ras2 because RASF seems obsolete.
- Replace pr_* with dev_* log function calls from ACPI RAS2 and
memory RAS2 drivers.
- Removed including unnecessary .h file from memory RAS2 driver.
- In is_visible() callback functions for scrub control sysfs attrs
changed to writeback the default attribute mode value instead of
setting per attribute.
2. Changes for comments from Fan, Thanks.
- Add debug message if cxl patrol scrub and ecs init function
calls fail.
3. Updated cover letter for feedback from Dan Williams.
v5 -> v6:
1. Changes for comments from Davidlohr, Thanks.
- Update CXL feature code based on spec 3.1.
- attrb -> attr
- Use enums with default counting.
2. Rebased to the latest kernel.
v4 -> v5:
1. Following are the main changes made based on the feedback from Dan Williams on v4.
1.1. In the scrub subsystem the common scrub control attributes are statically defined
instead of dynamically created.
1.2. Add scrub subsystem support externally defined attribute group.
Add CXL ECS driver define ECS specific attribute group and pass to
the scrub subsystem.
1.3. Move cxl_mem_ecs_init() to cxl/core/region.c so that the CXL region_id
is used in the registration with the scrub subsystem.
1.4. Add previously posted RASF common and RAS2 patches to this scrub series.
2. Add support for the 'enable_background_scrub' attribute
for RAS2, on request from Bill Schwartz(wschwartz@amperecomputing.com).
v3 -> v4:
1. Fixes for the warnings/errors reported by kernel test robot.
2. Add support for reading the 'enable' attribute of CXL patrol scrub.
v2 -> v3:
1. Changes for comments from Davidlohr, Thanks.
- Updated cxl scrub kconfig
- removed usage of the flag is_support_feature from
the function cxl_mem_get_supported_feature_entry().
- corrected spelling error.
- removed unnecessary debug message.
- removed export feature commands to the userspace.
2. Possible fix for the warnings/errors reported by kernel
test robot.
3. Add documentation for the common scrub configure attributes.
v1 -> v2:
1. Changes for comments from Dave Jiang, Thanks.
- Split patches.
- reversed xmas tree declarations.
- declared flags as enums.
- removed few unnecessary variable initializations.
- replaced PTR_ERR_OR_ZERO() with IS_ERR() and PTR_ERR().
- add auto clean declarations.
- replaced while loop with for loop.
- Removed allocation from cxl_get_supported_features() and
cxl_get_feature() and make change to take allocated memory
pointer from the caller.
- replaced if/else with switch case.
- replaced sprintf() with sysfs_emit() in 2 places.
- replaced goto label with return in few functions.
2. removed unused code for supported attributes from ecs.
3. Included following common patch for scrub configure driver
to this series.
"memory: scrub: Add scrub driver supports configuring memory scrubbers
in the system"
Jonathan Cameron (1):
platform: Add __free() based cleanup function for platform_device_put
Shiju Jose (10):
EDAC: Add generic EDAC RAS feature driver
EDAC: Add EDAC scrub control driver
EDAC: Add EDAC ECS control driver
cxl/mbox: Add GET_SUPPORTED_FEATURES mailbox command
cxl/mbox: Add GET_FEATURE mailbox command
cxl/mbox: Add SET_FEATURE mailbox command
cxl/memscrub: Add CXL memory device patrol scrub control feature
cxl/memscrub: Add CXL memory device ECS control feature
ACPI:RAS2: Add ACPI RAS2 driver
ras: scrub: ACPI RAS2: Add memory ACPI RAS2 driver
Documentation/ABI/testing/sysfs-edac-scrub | 64 ++
Documentation/scrub/edac-scrub.rst | 107 +++
drivers/acpi/Kconfig | 10 +
drivers/acpi/Makefile | 1 +
drivers/acpi/ras2.c | 391 ++++++++++
drivers/cxl/Kconfig | 19 +
drivers/cxl/core/Makefile | 1 +
drivers/cxl/core/mbox.c | 135 ++++
drivers/cxl/core/memscrub.c | 842 +++++++++++++++++++++
drivers/cxl/core/region.c | 6 +
drivers/cxl/cxlmem.h | 129 ++++
drivers/cxl/mem.c | 4 +
drivers/edac/Makefile | 1 +
drivers/edac/edac_ecs.c | 396 ++++++++++
drivers/edac/edac_ras_feature.c | 161 ++++
drivers/edac/edac_scrub.c | 312 ++++++++
drivers/ras/Kconfig | 10 +
drivers/ras/Makefile | 1 +
drivers/ras/acpi_ras2.c | 401 ++++++++++
include/acpi/ras2_acpi.h | 59 ++
include/linux/edac_ras_feature.h | 130 ++++
include/linux/platform_device.h | 1 +
22 files changed, 3181 insertions(+)
create mode 100644 Documentation/ABI/testing/sysfs-edac-scrub
create mode 100644 Documentation/scrub/edac-scrub.rst
create mode 100755 drivers/acpi/ras2.c
create mode 100644 drivers/cxl/core/memscrub.c
create mode 100755 drivers/edac/edac_ecs.c
create mode 100755 drivers/edac/edac_ras_feature.c
create mode 100755 drivers/edac/edac_scrub.c
create mode 100644 drivers/ras/acpi_ras2.c
create mode 100644 include/acpi/ras2_acpi.h
create mode 100755 include/linux/edac_ras_feature.h
--
2.34.1
* [RFC PATCH v9 01/11] EDAC: Add generic EDAC RAS feature driver
From: shiju.jose @ 2024-07-16 15:03 UTC
To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
Cc: bp, tony.luck, rafael, lenb, mchehab, dan.j.williams, dave,
jonathan.cameron, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny, david, Vilas.Sridharan, leo.duran, Yazen.Ghannam,
rientjes, jiaqiyan, Jon.Grimm, dave.hansen, naoya.horiguchi,
james.morse, jthoughton, somasundaram.a, erdemaktas, pgonda,
duenwen, mike.malvestuto, gthelen, wschwartz, dferguson, wbs,
nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
kangkang.shen, wanghuiqiang, linuxarm, shiju.jose
From: Shiju Jose <shiju.jose@huawei.com>
Add a generic EDAC driver that supports registering the RAS features
supported in the system. The driver exposes each feature's control
attributes to userspace in /sys/bus/edac/devices/<dev-name>/<ras-feature>/
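For reviewers, the sizing of the attribute group array done by the first pass in edac_ras_dev_register() can be pictured with the userspace sketch below: one group for a scrub feature, one group per media FRU for an ECS feature. It only mirrors that counting logic; the type and function names (ras_feat_kind, ras_feat_desc, count_attr_groups) are made up for illustration and are not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Illustrative stand-ins for the feature kinds handled by the driver. */
enum ras_feat_kind {
	FEAT_SCRUB,
	FEAT_ECS,
};

struct ras_feat_desc {
	enum ras_feat_kind kind;
	unsigned int num_media_frus;	/* only meaningful for FEAT_ECS */
};

/*
 * Return the number of sysfs attribute groups needed for 'n' features,
 * or -EINVAL if an unknown feature kind is found. The caller would
 * allocate one extra slot for the terminating NULL entry.
 */
static int count_attr_groups(const struct ras_feat_desc *feats, size_t n)
{
	int groups = 0;
	size_t i;

	assert(feats || n == 0);

	for (i = 0; i < n; i++) {
		switch (feats[i].kind) {
		case FEAT_SCRUB:
			groups += 1;	/* single "scrub" group */
			break;
		case FEAT_ECS:
			/* one group per media FRU */
			groups += (int)feats[i].num_media_frus;
			break;
		default:
			return -EINVAL;
		}
	}

	return groups;
}
```

For example, a device registering a scrub feature plus an ECS feature with four media FRUs would need five groups and a six-entry, NULL-terminated array.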
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
drivers/edac/Makefile | 1 +
drivers/edac/edac_ras_feature.c | 155 +++++++++++++++++++++++++++++++
include/linux/edac_ras_feature.h | 66 +++++++++++++
3 files changed, 222 insertions(+)
create mode 100755 drivers/edac/edac_ras_feature.c
create mode 100755 include/linux/edac_ras_feature.h
diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
index 9c09893695b7..c532b57a6d8a 100644
--- a/drivers/edac/Makefile
+++ b/drivers/edac/Makefile
@@ -10,6 +10,7 @@ obj-$(CONFIG_EDAC) := edac_core.o
edac_core-y := edac_mc.o edac_device.o edac_mc_sysfs.o
edac_core-y += edac_module.o edac_device_sysfs.o wq.o
+edac_core-y += edac_ras_feature.o
edac_core-$(CONFIG_EDAC_DEBUG) += debugfs.o
diff --git a/drivers/edac/edac_ras_feature.c b/drivers/edac/edac_ras_feature.c
new file mode 100755
index 000000000000..24a729fea66f
--- /dev/null
+++ b/drivers/edac/edac_ras_feature.c
@@ -0,0 +1,155 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * EDAC RAS control feature driver supports registering RAS
+ * features with the EDAC and exposes the feature's control
+ * attributes to the userspace in sysfs.
+ *
+ * Copyright (c) 2024 HiSilicon Limited.
+ */
+
+#define pr_fmt(fmt) "EDAC RAS CONTROL FEAT: " fmt
+
+#include <linux/edac_ras_feature.h>
+
+static void edac_ras_dev_release(struct device *dev)
+{
+ struct edac_ras_feat_ctx *ctx =
+ container_of(dev, struct edac_ras_feat_ctx, dev);
+
+ kfree(ctx);
+}
+
+const struct device_type edac_ras_dev_type = {
+ .name = "edac_ras_dev",
+ .release = edac_ras_dev_release,
+};
+
+static void edac_ras_dev_unreg(void *data)
+{
+ device_unregister(data);
+}
+
+static int edac_ras_feat_scrub_init(struct device *parent,
+ struct edac_scrub_data *sdata,
+ const struct edac_ras_feature *sfeat,
+ const struct attribute_group **attr_groups)
+{
+ sdata->ops = sfeat->scrub_ops;
+ sdata->private = sfeat->scrub_ctx;
+
+ return 1;
+}
+
+static int edac_ras_feat_ecs_init(struct device *parent,
+ struct edac_ecs_data *edata,
+ const struct edac_ras_feature *efeat,
+ const struct attribute_group **attr_groups)
+{
+ int num = efeat->ecs_info.num_media_frus;
+
+ edata->ops = efeat->ecs_ops;
+ edata->private = efeat->ecs_ctx;
+
+ return num;
+}
+
+/**
+ * edac_ras_dev_register - register device for ras features with edac
+ * @parent: client device.
+ * @name: client device's name.
+ * @private: parent driver's data to store in the context if any.
+ * @num_features: number of ras features to register.
+ * @ras_features: list of ras features to register.
+ *
+ * Returns 0 on success, error otherwise.
+ * The new edac_ras_feat_ctx would be freed automatically.
+ */
+int edac_ras_dev_register(struct device *parent, char *name,
+ void *private, int num_features,
+ const struct edac_ras_feature *ras_features)
+{
+ const struct attribute_group **ras_attr_groups;
+ struct edac_ras_feat_ctx *ctx;
+ int attr_gcnt = 0;
+ int ret, feat;
+
+ if (!parent || !name || !num_features || !ras_features)
+ return -EINVAL;
+
+ ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+ if (!ctx)
+ return -ENOMEM;
+
+ ctx->dev.parent = parent;
+ ctx->private = private;
+
+ /* Double parse so we can make space for attributes */
+ for (feat = 0; feat < num_features; feat++) {
+ switch (ras_features[feat].feat) {
+ case ras_feat_scrub:
+ attr_gcnt++;
+ break;
+ case ras_feat_ecs:
+ attr_gcnt += ras_features[feat].ecs_info.num_media_frus;
+ break;
+ default:
+ ret = -EINVAL;
+ goto ctx_free;
+ }
+ }
+
+ ras_attr_groups = devm_kzalloc(parent,
+ (attr_gcnt + 1) * sizeof(*ras_attr_groups),
+ GFP_KERNEL);
+ if (!ras_attr_groups) {
+ ret = -ENOMEM;
+ goto ctx_free;
+ }
+
+ attr_gcnt = 0;
+ for (feat = 0; feat < num_features; feat++, ras_features++) {
+ if (ras_features->feat == ras_feat_scrub) {
+ if (!ras_features->scrub_ops)
+ continue;
+ ret = edac_ras_feat_scrub_init(parent, &ctx->scrub,
+ ras_features, &ras_attr_groups[attr_gcnt]);
+ if (ret < 0)
+ goto ctx_free;
+
+ attr_gcnt += ret;
+ } else if (ras_features->feat == ras_feat_ecs) {
+ if (!ras_features->ecs_ops)
+ continue;
+ ret = edac_ras_feat_ecs_init(parent, &ctx->ecs,
+ ras_features, &ras_attr_groups[attr_gcnt]);
+ if (ret < 0)
+ goto ctx_free;
+
+ attr_gcnt += ret;
+ } else {
+ ret = -EINVAL;
+ goto ctx_free;
+ }
+ }
+ ras_attr_groups[attr_gcnt] = NULL;
+ ctx->dev.bus = edac_get_sysfs_subsys();
+ ctx->dev.type = &edac_ras_dev_type;
+ ctx->dev.groups = ras_attr_groups;
+ dev_set_drvdata(&ctx->dev, ctx);
+ ret = dev_set_name(&ctx->dev, name);
+ if (ret)
+ goto ctx_free;
+
+ ret = device_register(&ctx->dev);
+ if (ret) {
+ put_device(&ctx->dev);
+ return ret;
+ }
+
+ return devm_add_action_or_reset(parent, edac_ras_dev_unreg, &ctx->dev);
+
+ctx_free:
+ kfree(ctx);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(edac_ras_dev_register);
diff --git a/include/linux/edac_ras_feature.h b/include/linux/edac_ras_feature.h
new file mode 100755
index 000000000000..000e99141023
--- /dev/null
+++ b/include/linux/edac_ras_feature.h
@@ -0,0 +1,66 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * EDAC RAS control features.
+ *
+ * Copyright (c) 2024 HiSilicon Limited.
+ */
+
+#ifndef __EDAC_RAS_FEAT_H
+#define __EDAC_RAS_FEAT_H
+
+#include <linux/types.h>
+#include <linux/edac.h>
+
+#define EDAC_RAS_NAME_LEN 128
+
+enum edac_ras_feat {
+ ras_feat_scrub,
+ ras_feat_ecs,
+ ras_feat_max
+};
+
+struct edac_ecs_ex_info {
+ u16 num_media_frus;
+};
+
+/*
+ * EDAC RAS feature information structure
+ */
+struct edac_scrub_data {
+ const struct edac_scrub_ops *ops;
+ void *private;
+};
+
+struct edac_ecs_data {
+ const struct edac_ecs_ops *ops;
+ void *private;
+};
+
+struct device;
+
+struct edac_ras_feat_ctx {
+ struct device dev;
+ void *private;
+ struct edac_scrub_data scrub;
+ struct edac_ecs_data ecs;
+};
+
+struct edac_ras_feature {
+ enum edac_ras_feat feat;
+ union {
+ const struct edac_scrub_ops *scrub_ops;
+ const struct edac_ecs_ops *ecs_ops;
+ };
+ union {
+ struct edac_ecs_ex_info ecs_info;
+ };
+ union {
+ void *scrub_ctx;
+ void *ecs_ctx;
+ };
+};
+
+int edac_ras_dev_register(struct device *parent, char *dev_name,
+ void *parent_pvt_data, int num_features,
+ const struct edac_ras_feature *ras_features);
+#endif /* __EDAC_RAS_FEAT_H */
--
2.34.1
* [RFC PATCH v9 02/11] EDAC: Add EDAC scrub control driver
From: shiju.jose @ 2024-07-16 15:03 UTC (permalink / raw)
To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
Cc: bp, tony.luck, rafael, lenb, mchehab, dan.j.williams, dave,
jonathan.cameron, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny, david, Vilas.Sridharan, leo.duran, Yazen.Ghannam,
rientjes, jiaqiyan, Jon.Grimm, dave.hansen, naoya.horiguchi,
james.morse, jthoughton, somasundaram.a, erdemaktas, pgonda,
duenwen, mike.malvestuto, gthelen, wschwartz, dferguson, wbs,
nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
kangkang.shen, wanghuiqiang, linuxarm, shiju.jose
From: Shiju Jose <shiju.jose@huawei.com>
Add a generic EDAC scrub control driver that supports configuring the memory
scrubbers in the system. A device with the scrub feature gets the scrub
descriptor from the EDAC scrub driver and registers with the EDAC RAS feature
driver, which adds the sysfs scrub control interface. The scrub control
attributes are available to userspace in /sys/bus/edac/devices/<dev-name>/scrub/.
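As an illustration of the path layout described above, here is a minimal userspace sketch in C. It is not part of this patch. The helper names scrub_attr_path() and scrub_attr_read() are hypothetical, and the device name used in the usage note is only an assumed example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define EDAC_SYSFS_DEVICES "/sys/bus/edac/devices"

/*
 * Hypothetical helper: build "<EDAC_SYSFS_DEVICES>/<dev_name>/scrub/<attr>".
 * Returns 0 on success, -1 if the path does not fit in @out.
 */
int scrub_attr_path(char *out, size_t len, const char *dev_name, const char *attr)
{
	int n;

	assert(out && dev_name && attr);
	n = snprintf(out, len, "%s/%s/scrub/%s", EDAC_SYSFS_DEVICES, dev_name, attr);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: read one scrub attribute as a string, with the
 * trailing newline stripped. Returns 0 on success, -1 on any failure
 * (path too long, attribute missing or hidden, read error).
 */
int scrub_attr_read(const char *dev_name, const char *attr, char *val, size_t len)
{
	char path[256];
	FILE *f;

	assert(val && len > 0);
	if (scrub_attr_path(path, sizeof(path), dev_name, attr))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(val, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	val[strcspn(val, "\n")] = '\0';
	return 0;
}
```

For example, scrub_attr_read("cxl_mem0", "cycle_in_hours", buf, sizeof(buf)) would read the current scrub cycle, assuming a device registered under the (made-up) name cxl_mem0.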
The generic EDAC scrub driver and the common sysfs scrub interface promote
unambiguous access from userspace irrespective of the underlying scrub
devices.
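Because every scrub callback is optional, the set of visible attributes differs per device. The rule applied by scrub_attr_visible() in this patch can be modelled in plain C roughly as below. This is a standalone sketch, not the kernel code: struct rw_pair and rw_attr_mode() are hypothetical names, and 0644 is assumed as the default mode of a DEVICE_ATTR_RW attribute.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one getter/setter pair of the scrub ops. */
struct rw_pair {
	bool has_get;	/* e.g. get_enabled_bg is provided */
	bool has_set;	/* e.g. set_enabled_bg is provided */
};

/*
 * Mode of a read-write attribute given which callbacks exist:
 * both callbacks -> 0644, getter only -> 0444, otherwise hidden (0).
 * A setter without a getter is not exposed.
 */
unsigned int rw_attr_mode(const struct rw_pair *p)
{
	assert(p != NULL);

	if (p->has_get && p->has_set)
		return 0644;
	if (p->has_get)
		return 0444;
	return 0;
}
```

The design choice is that a driver never has to provide stub callbacks: anything it leaves NULL simply does not show up in sysfs.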
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
Documentation/ABI/testing/sysfs-edac-scrub | 64 +++++
drivers/edac/Makefile | 2 +-
drivers/edac/edac_ras_feature.c | 1 +
drivers/edac/edac_scrub.c | 312 +++++++++++++++++++++
include/linux/edac_ras_feature.h | 28 ++
5 files changed, 406 insertions(+), 1 deletion(-)
create mode 100644 Documentation/ABI/testing/sysfs-edac-scrub
create mode 100755 drivers/edac/edac_scrub.c
diff --git a/Documentation/ABI/testing/sysfs-edac-scrub b/Documentation/ABI/testing/sysfs-edac-scrub
new file mode 100644
index 000000000000..dd19afd5e165
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-edac-scrub
@@ -0,0 +1,64 @@
+What: /sys/bus/edac/devices/<dev-name>/scrub
+Date: Oct 2024
+KernelVersion: 6.12
+Contact: linux-edac@vger.kernel.org
+Description:
+ The /sys/bus/edac/devices/<dev-name>/scrub subdirectory
+ belongs to the memory scrub control feature, where <dev-name>
+ corresponds to a device/memory region registered with the
+ edac scrub driver and thus also with the generic edac ras
+ driver.
+
+What: /sys/bus/edac/devices/<dev-name>/scrub/addr_range_base
+Date: Oct 2024
+KernelVersion: 6.12
+Contact: linux-edac@vger.kernel.org
+Description:
+ (RW) The base of the address range of the memory region
+ to be scrubbed (on-demand scrubbing).
+
+What: /sys/bus/edac/devices/<dev-name>/scrub/addr_range_size
+Date: Oct 2024
+KernelVersion: 6.12
+Contact: linux-edac@vger.kernel.org
+Description:
+ (RW) The size of the address range of the memory region
+ to be scrubbed (on-demand scrubbing).
+
+What: /sys/bus/edac/devices/<dev-name>/scrub/enable_background
+Date: Oct 2024
+KernelVersion: 6.12
+Contact: linux-edac@vger.kernel.org
+Description:
+ (RW) Start/Stop background (patrol) scrubbing, if supported.
+
+What: /sys/bus/edac/devices/<dev-name>/scrub/enable_on_demand
+Date: Oct 2024
+KernelVersion: 6.12
+Contact: linux-edac@vger.kernel.org
+Description:
+ (RW) Start/Stop on-demand scrubbing of the memory region,
+ if supported.
+
+What: /sys/bus/edac/devices/<dev-name>/scrub/name
+Date: Oct 2024
+KernelVersion: 6.12
+Contact: linux-edac@vger.kernel.org
+Description:
+ (RO) Name of the memory scrubber.
+
+What: /sys/bus/edac/devices/<dev-name>/scrub/cycle_in_hours_range
+Date: Oct 2024
+KernelVersion: 6.12
+Contact: linux-edac@vger.kernel.org
+Description:
+ (RO) Supported range for the scrub cycle in hours by the
+ memory scrubber.
+
+What: /sys/bus/edac/devices/<dev-name>/scrub/cycle_in_hours
+Date: Oct 2024
+KernelVersion: 6.12
+Contact: linux-edac@vger.kernel.org
+Description:
+ (RW) The scrub cycle in hours. The value must be within the
+ range supported by the memory scrubber.
diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
index c532b57a6d8a..de56cbd039eb 100644
--- a/drivers/edac/Makefile
+++ b/drivers/edac/Makefile
@@ -10,7 +10,7 @@ obj-$(CONFIG_EDAC) := edac_core.o
edac_core-y := edac_mc.o edac_device.o edac_mc_sysfs.o
edac_core-y += edac_module.o edac_device_sysfs.o wq.o
-edac_core-y += edac_ras_feature.o
+edac_core-y += edac_ras_feature.o edac_scrub.o
edac_core-$(CONFIG_EDAC_DEBUG) += debugfs.o
diff --git a/drivers/edac/edac_ras_feature.c b/drivers/edac/edac_ras_feature.c
index 24a729fea66f..48927f868372 100755
--- a/drivers/edac/edac_ras_feature.c
+++ b/drivers/edac/edac_ras_feature.c
@@ -36,6 +36,7 @@ static int edac_ras_feat_scrub_init(struct device *parent,
{
sdata->ops = sfeat->scrub_ops;
sdata->private = sfeat->scrub_ctx;
+ attr_groups[0] = edac_scrub_get_desc();
return 1;
}
diff --git a/drivers/edac/edac_scrub.c b/drivers/edac/edac_scrub.c
new file mode 100755
index 000000000000..0b07eafd3551
--- /dev/null
+++ b/drivers/edac/edac_scrub.c
@@ -0,0 +1,312 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Generic EDAC scrub driver for controlling the memory scrubbers
+ * in the system. The common sysfs scrub interface promotes
+ * unambiguous access from userspace.
+ *
+ * Copyright (c) 2024 HiSilicon Limited.
+ */
+
+#define pr_fmt(fmt) "EDAC SCRUB: " fmt
+
+#include <linux/edac_ras_feature.h>
+
+static ssize_t addr_range_base_show(struct device *ras_feat_dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+ const struct edac_scrub_ops *ops = ctx->scrub.ops;
+ u64 base, size;
+ int ret;
+
+ ret = ops->read_range(ras_feat_dev->parent, ctx->scrub.private, &base, &size);
+ if (ret)
+ return ret;
+
+ return sysfs_emit(buf, "0x%llx\n", base);
+}
+
+static ssize_t addr_range_size_show(struct device *ras_feat_dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+ const struct edac_scrub_ops *ops = ctx->scrub.ops;
+ u64 base, size;
+ int ret;
+
+ ret = ops->read_range(ras_feat_dev->parent, ctx->scrub.private, &base, &size);
+ if (ret)
+ return ret;
+
+ return sysfs_emit(buf, "0x%llx\n", size);
+}
+
+static ssize_t addr_range_base_store(struct device *ras_feat_dev,
+ struct device_attribute *attr,
+ const char *buf, size_t len)
+{
+ struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+ const struct edac_scrub_ops *ops = ctx->scrub.ops;
+ u64 base, size;
+ int ret;
+
+ ret = ops->read_range(ras_feat_dev->parent, ctx->scrub.private, &base, &size);
+ if (ret)
+ return ret;
+
+ ret = kstrtou64(buf, 16, &base);
+ if (ret < 0)
+ return ret;
+
+ ret = ops->write_range(ras_feat_dev->parent, ctx->scrub.private, base, size);
+ if (ret)
+ return ret;
+
+ return len;
+}
+
+static ssize_t addr_range_size_store(struct device *ras_feat_dev,
+ struct device_attribute *attr,
+ const char *buf,
+ size_t len)
+{
+ struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+ const struct edac_scrub_ops *ops = ctx->scrub.ops;
+ u64 base, size;
+ int ret;
+
+ ret = ops->read_range(ras_feat_dev->parent, ctx->scrub.private, &base, &size);
+ if (ret)
+ return ret;
+
+ ret = kstrtou64(buf, 16, &size);
+ if (ret < 0)
+ return ret;
+
+ ret = ops->write_range(ras_feat_dev->parent, ctx->scrub.private, base, size);
+ if (ret)
+ return ret;
+
+ return len;
+}
+
+static ssize_t enable_background_store(struct device *ras_feat_dev,
+ struct device_attribute *attr,
+ const char *buf, size_t len)
+{
+ struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+ const struct edac_scrub_ops *ops = ctx->scrub.ops;
+ bool enable;
+ int ret;
+
+ ret = kstrtobool(buf, &enable);
+ if (ret < 0)
+ return ret;
+
+ ret = ops->set_enabled_bg(ras_feat_dev->parent, ctx->scrub.private, enable);
+ if (ret)
+ return ret;
+
+ return len;
+}
+
+static ssize_t enable_background_show(struct device *ras_feat_dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+ const struct edac_scrub_ops *ops = ctx->scrub.ops;
+ bool enable;
+ int ret;
+
+ ret = ops->get_enabled_bg(ras_feat_dev->parent, ctx->scrub.private, &enable);
+ if (ret)
+ return ret;
+
+ return sysfs_emit(buf, "%d\n", enable);
+}
+
+static ssize_t enable_on_demand_show(struct device *ras_feat_dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+ const struct edac_scrub_ops *ops = ctx->scrub.ops;
+ bool enable;
+ int ret;
+
+ ret = ops->get_enabled_od(ras_feat_dev->parent, ctx->scrub.private, &enable);
+ if (ret)
+ return ret;
+
+ return sysfs_emit(buf, "%d\n", enable);
+}
+
+static ssize_t enable_on_demand_store(struct device *ras_feat_dev,
+ struct device_attribute *attr,
+ const char *buf, size_t len)
+{
+ struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+ const struct edac_scrub_ops *ops = ctx->scrub.ops;
+ bool enable;
+ int ret;
+
+ ret = kstrtobool(buf, &enable);
+ if (ret < 0)
+ return ret;
+
+ ret = ops->set_enabled_od(ras_feat_dev->parent, ctx->scrub.private, enable);
+ if (ret)
+ return ret;
+
+ return len;
+}
+
+static ssize_t name_show(struct device *ras_feat_dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+ const struct edac_scrub_ops *ops = ctx->scrub.ops;
+ int ret;
+
+ ret = ops->get_name(ras_feat_dev->parent, ctx->scrub.private, buf);
+ if (ret)
+ return ret;
+
+ return strlen(buf);
+}
+
+static ssize_t cycle_in_hours_show(struct device *ras_feat_dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+ const struct edac_scrub_ops *ops = ctx->scrub.ops;
+ u64 val;
+ int ret;
+
+ ret = ops->cycle_in_hours_read(ras_feat_dev->parent, ctx->scrub.private, &val);
+ if (ret)
+ return ret;
+
+ return sysfs_emit(buf, "0x%llx\n", val);
+}
+
+static ssize_t cycle_in_hours_store(struct device *ras_feat_dev, struct device_attribute *attr,
+ const char *buf, size_t len)
+{
+ struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+ const struct edac_scrub_ops *ops = ctx->scrub.ops;
+ u64 val;
+ int ret;
+
+ ret = kstrtou64(buf, 0, &val);
+ if (ret < 0)
+ return ret;
+
+ ret = ops->cycle_in_hours_write(ras_feat_dev->parent, ctx->scrub.private, val);
+ if (ret)
+ return ret;
+
+ return len;
+}
+
+static ssize_t cycle_in_hours_range_show(struct device *ras_feat_dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+ const struct edac_scrub_ops *ops = ctx->scrub.ops;
+ u64 min_schrs, max_schrs;
+ int ret;
+
+ ret = ops->cycle_in_hours_range(ras_feat_dev->parent, ctx->scrub.private,
+ &min_schrs, &max_schrs);
+ if (ret)
+ return ret;
+
+ return sysfs_emit(buf, "0x%llx-0x%llx\n", min_schrs, max_schrs);
+}
+
+static DEVICE_ATTR_RW(addr_range_base);
+static DEVICE_ATTR_RW(addr_range_size);
+static DEVICE_ATTR_RW(enable_background);
+static DEVICE_ATTR_RW(enable_on_demand);
+static DEVICE_ATTR_RO(name);
+static DEVICE_ATTR_RW(cycle_in_hours);
+static DEVICE_ATTR_RO(cycle_in_hours_range);
+
+static struct attribute *scrub_attrs[] = {
+ &dev_attr_addr_range_base.attr,
+ &dev_attr_addr_range_size.attr,
+ &dev_attr_enable_background.attr,
+ &dev_attr_enable_on_demand.attr,
+ &dev_attr_name.attr,
+ &dev_attr_cycle_in_hours.attr,
+ &dev_attr_cycle_in_hours_range.attr,
+ NULL
+};
+
+static umode_t scrub_attr_visible(struct kobject *kobj,
+ struct attribute *a, int attr_id)
+{
+ struct device *ras_feat_dev = kobj_to_dev(kobj);
+ struct edac_ras_feat_ctx *ctx;
+ const struct edac_scrub_ops *ops;
+
+ ctx = dev_get_drvdata(ras_feat_dev);
+ if (!ctx)
+ return 0;
+
+ ops = ctx->scrub.ops;
+ if (a == &dev_attr_addr_range_base.attr ||
+ a == &dev_attr_addr_range_size.attr) {
+ if (ops->read_range && ops->write_range)
+ return a->mode;
+ if (ops->read_range)
+ return 0444;
+ return 0;
+ }
+ if (a == &dev_attr_enable_background.attr) {
+ if (ops->set_enabled_bg && ops->get_enabled_bg)
+ return a->mode;
+ if (ops->get_enabled_bg)
+ return 0444;
+ return 0;
+ }
+ if (a == &dev_attr_enable_on_demand.attr) {
+ if (ops->set_enabled_od && ops->get_enabled_od)
+ return a->mode;
+ if (ops->get_enabled_od)
+ return 0444;
+ return 0;
+ }
+ if (a == &dev_attr_name.attr)
+ return ops->get_name ? a->mode : 0;
+ if (a == &dev_attr_cycle_in_hours_range.attr)
+ return ops->cycle_in_hours_range ? a->mode : 0;
+ if (a == &dev_attr_cycle_in_hours.attr) { /* Write only makes little sense */
+ if (ops->cycle_in_hours_read && ops->cycle_in_hours_write)
+ return a->mode;
+ if (ops->cycle_in_hours_read)
+ return 0444;
+ return 0;
+ }
+
+ return 0;
+}
+
+static const struct attribute_group scrub_attr_group = {
+ .name = "scrub",
+ .attrs = scrub_attrs,
+ .is_visible = scrub_attr_visible,
+};
+
+/**
+ * edac_scrub_get_desc - get edac scrub's attr descriptor
+ *
+ * Returns attribute_group for the scrub feature.
+ */
+const struct attribute_group *edac_scrub_get_desc(void)
+{
+ return &scrub_attr_group;
+}
diff --git a/include/linux/edac_ras_feature.h b/include/linux/edac_ras_feature.h
index 000e99141023..462f9ecbf9d4 100755
--- a/include/linux/edac_ras_feature.h
+++ b/include/linux/edac_ras_feature.h
@@ -19,6 +19,34 @@ enum edac_ras_feat {
ras_feat_max
};
+/**
+ * struct edac_scrub_ops - scrub device operations (all elements optional)
+ * @read_range: read base and size of scrubbing range.
+ * @write_range: set the base and size of the scrubbing range.
+ * @get_enabled_bg: check if currently performing background scrub.
+ * @set_enabled_bg: start or stop a bg-scrub.
+ * @get_enabled_od: check if currently performing on-demand scrub.
+ * @set_enabled_od: start or stop an on-demand scrub.
+ * @cycle_in_hours_range: retrieve limits on supported cycle in hours.
+ * @cycle_in_hours_read: read the scrub cycle in hours.
+ * @cycle_in_hours_write: set the scrub cycle in hours.
+ * @get_name: get the memory scrubber's name.
+ */
+struct edac_scrub_ops {
+ int (*read_range)(struct device *dev, void *drv_data, u64 *base, u64 *size);
+ int (*write_range)(struct device *dev, void *drv_data, u64 base, u64 size);
+ int (*get_enabled_bg)(struct device *dev, void *drv_data, bool *enable);
+ int (*set_enabled_bg)(struct device *dev, void *drv_data, bool enable);
+ int (*get_enabled_od)(struct device *dev, void *drv_data, bool *enable);
+ int (*set_enabled_od)(struct device *dev, void *drv_data, bool enable);
+ int (*cycle_in_hours_range)(struct device *dev, void *drv_data, u64 *min, u64 *max);
+ int (*cycle_in_hours_read)(struct device *dev, void *drv_data, u64 *schrs);
+ int (*cycle_in_hours_write)(struct device *dev, void *drv_data, u64 schrs);
+ int (*get_name)(struct device *dev, void *drv_data, char *buf);
+};
+
+const struct attribute_group *edac_scrub_get_desc(void);
+
struct edac_ecs_ex_info {
u16 num_media_frus;
};
--
2.34.1
* [RFC PATCH v9 03/11] EDAC: Add EDAC ECS control driver
From: shiju.jose @ 2024-07-16 15:03 UTC (permalink / raw)
To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
Cc: bp, tony.luck, rafael, lenb, mchehab, dan.j.williams, dave,
jonathan.cameron, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny, david, Vilas.Sridharan, leo.duran, Yazen.Ghannam,
rientjes, jiaqiyan, Jon.Grimm, dave.hansen, naoya.horiguchi,
james.morse, jthoughton, somasundaram.a, erdemaktas, pgonda,
duenwen, mike.malvestuto, gthelen, wschwartz, dferguson, wbs,
nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
kangkang.shen, wanghuiqiang, linuxarm, shiju.jose
From: Shiju Jose <shiju.jose@huawei.com>
Add an EDAC ECS (Error Check Scrub) control driver that supports configuring
the memory device's ECS feature.
Error Check Scrub (ECS) is a feature defined in the JEDEC DDR5 SDRAM
Specification (JESD79-5). It allows the DRAM to internally read, correct
single-bit errors, and write back corrected data bits to the DRAM array
while providing transparency to error counts.
A DDR5 device contains a number of memory media FRUs. The DDR5 ECS
feature, and thus the ECS control driver, supports configuring the ECS
parameters per FRU.
Memory devices that support the ECS feature register with the EDAC ECS
driver and thus with the generic EDAC RAS feature driver, which adds the
sysfs ECS control interface. The ECS control attributes are exposed to
userspace in /sys/bus/edac/devices/<dev-name>/ecs_fruX/.
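To make the per-FRU layout concrete, the sketch below models in standalone C how many sysfs attribute groups a device ends up with (one for scrub, one per media FRU for ECS) and how each ECS group is named. It mirrors the counting loop in edac_ras_dev_register() and the naming in ecs_create_desc(), but it is only an illustration: struct feat, count_attr_groups() and ecs_group_name() are hypothetical names and not part of the series.

```c
#include <assert.h>
#include <stdio.h>

#define ECS_FRU_NAME "ecs_fru"

enum feat_kind { FEAT_SCRUB, FEAT_ECS };

/* Hypothetical, simplified view of one registered RAS feature. */
struct feat {
	enum feat_kind kind;
	unsigned int num_media_frus;	/* only meaningful for FEAT_ECS */
};

/*
 * Count attribute groups, excluding the NULL terminator:
 * scrub contributes one group, ECS contributes one group per media FRU.
 * Returns -1 for an unknown feature kind.
 */
int count_attr_groups(const struct feat *feats, int n)
{
	int cnt = 0;

	assert(feats != NULL);
	for (int i = 0; i < n; i++) {
		switch (feats[i].kind) {
		case FEAT_SCRUB:
			cnt += 1;
			break;
		case FEAT_ECS:
			cnt += (int)feats[i].num_media_frus;
			break;
		default:
			return -1;
		}
	}
	return cnt;
}

/* Build the group (directory) name for one FRU, e.g. "ecs_fru3". */
int ecs_group_name(char *out, size_t len, unsigned int fru)
{
	int n;

	assert(out != NULL);
	n = snprintf(out, len, "%s%u", ECS_FRU_NAME, fru);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A device that registers scrub plus ECS with four media FRUs would therefore need room for five groups plus the terminating NULL entry.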
The generic EDAC ECS driver and the common sysfs ECS interface promote
unambiguous control from userspace irrespective of the underlying
devices that support the ECS feature.
Support for the ECS feature is added separately because the DDR5 ECS
feature's control attributes are dissimilar from those of the scrub
feature.
Note: Documentation can be added if necessary.
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
drivers/edac/Makefile | 2 +-
drivers/edac/edac_ecs.c | 396 +++++++++++++++++++++++++++++++
drivers/edac/edac_ras_feature.c | 5 +
include/linux/edac_ras_feature.h | 36 +++
4 files changed, 438 insertions(+), 1 deletion(-)
create mode 100755 drivers/edac/edac_ecs.c
diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
index de56cbd039eb..c1412c7d3efb 100644
--- a/drivers/edac/Makefile
+++ b/drivers/edac/Makefile
@@ -10,7 +10,7 @@ obj-$(CONFIG_EDAC) := edac_core.o
edac_core-y := edac_mc.o edac_device.o edac_mc_sysfs.o
edac_core-y += edac_module.o edac_device_sysfs.o wq.o
-edac_core-y += edac_ras_feature.o edac_scrub.o
+edac_core-y += edac_ras_feature.o edac_scrub.o edac_ecs.o
edac_core-$(CONFIG_EDAC_DEBUG) += debugfs.o
diff --git a/drivers/edac/edac_ecs.c b/drivers/edac/edac_ecs.c
new file mode 100755
index 000000000000..37dabd053c36
--- /dev/null
+++ b/drivers/edac/edac_ecs.c
@@ -0,0 +1,396 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ECS driver for controlling on-die error check scrub
+ * (e.g. DDR5 ECS). The common sysfs ECS interface promotes
+ * unambiguous access from userspace.
+ *
+ * Copyright (c) 2024 HiSilicon Limited.
+ */
+
+#define pr_fmt(fmt) "EDAC ECS: " fmt
+
+#include <linux/edac_ras_feature.h>
+
+#define EDAC_ECS_FRU_NAME "ecs_fru"
+
+enum edac_ecs_attributes {
+ ecs_log_entry_type,
+ ecs_log_entry_type_per_dram,
+ ecs_log_entry_type_per_memory_media,
+ ecs_mode,
+ ecs_mode_counts_rows,
+ ecs_mode_counts_codewords,
+ ecs_reset,
+ ecs_name,
+ ecs_threshold,
+ ecs_max_attrs
+};
+
+struct edac_ecs_dev_attr {
+ struct device_attribute dev_attr;
+ int fru_id;
+};
+
+struct edac_ecs_fru_context {
+ char name[EDAC_RAS_NAME_LEN];
+ struct edac_ecs_dev_attr ecs_dev_attr[ecs_max_attrs];
+ struct attribute *ecs_attrs[ecs_max_attrs + 1];
+ struct attribute_group group;
+};
+
+struct edac_ecs_context {
+ u16 num_media_frus;
+ struct edac_ecs_fru_context *fru_ctxs;
+};
+
+#define to_ecs_dev_attr(_dev_attr) \
+ container_of(_dev_attr, struct edac_ecs_dev_attr, dev_attr)
+
+static ssize_t log_entry_type_show(struct device *ras_feat_dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct edac_ecs_dev_attr *ecs_dev_attr = to_ecs_dev_attr(attr);
+ struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+ const struct edac_ecs_ops *ops = ctx->ecs.ops;
+ u64 val;
+ int ret;
+
+ ret = ops->get_log_entry_type(ras_feat_dev->parent, ctx->ecs.private,
+ ecs_dev_attr->fru_id, &val);
+ if (ret)
+ return ret;
+
+ return sysfs_emit(buf, "0x%llx\n", val);
+}
+
+static ssize_t log_entry_type_store(struct device *ras_feat_dev,
+ struct device_attribute *attr,
+ const char *buf, size_t len)
+{
+ struct edac_ecs_dev_attr *ecs_dev_attr = to_ecs_dev_attr(attr);
+ struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+ const struct edac_ecs_ops *ops = ctx->ecs.ops;
+ u64 val;
+ int ret;
+
+ ret = kstrtou64(buf, 0, &val);
+ if (ret < 0)
+ return ret;
+
+ ret = ops->set_log_entry_type(ras_feat_dev->parent, ctx->ecs.private,
+ ecs_dev_attr->fru_id, val);
+ if (ret)
+ return ret;
+
+ return len;
+}
+
+static ssize_t log_entry_type_per_dram_show(struct device *ras_feat_dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct edac_ecs_dev_attr *ecs_dev_attr = to_ecs_dev_attr(attr);
+ struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+ const struct edac_ecs_ops *ops = ctx->ecs.ops;
+ u64 val;
+ int ret;
+
+ ret = ops->get_log_entry_type_per_dram(ras_feat_dev->parent, ctx->ecs.private,
+ ecs_dev_attr->fru_id, &val);
+ if (ret)
+ return ret;
+
+ return sysfs_emit(buf, "0x%llx\n", val);
+}
+
+static ssize_t log_entry_type_per_memory_media_show(struct device *ras_feat_dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct edac_ecs_dev_attr *ecs_dev_attr = to_ecs_dev_attr(attr);
+ struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+ const struct edac_ecs_ops *ops = ctx->ecs.ops;
+ u64 val;
+ int ret;
+
+ ret = ops->get_log_entry_type_per_memory_media(ras_feat_dev->parent,
+ ctx->ecs.private,
+ ecs_dev_attr->fru_id, &val);
+ if (ret)
+ return ret;
+
+ return sysfs_emit(buf, "0x%llx\n", val);
+}
+
+static ssize_t mode_show(struct device *ras_feat_dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct edac_ecs_dev_attr *ecs_dev_attr = to_ecs_dev_attr(attr);
+ struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+ const struct edac_ecs_ops *ops = ctx->ecs.ops;
+ u64 val;
+ int ret;
+
+ ret = ops->get_mode(ras_feat_dev->parent, ctx->ecs.private,
+ ecs_dev_attr->fru_id, &val);
+ if (ret)
+ return ret;
+
+ return sysfs_emit(buf, "0x%llx\n", val);
+}
+
+static ssize_t mode_store(struct device *ras_feat_dev,
+ struct device_attribute *attr,
+ const char *buf, size_t len)
+{
+ struct edac_ecs_dev_attr *ecs_dev_attr = to_ecs_dev_attr(attr);
+ struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+ const struct edac_ecs_ops *ops = ctx->ecs.ops;
+ long val;
+ int ret;
+
+ ret = kstrtol(buf, 10, &val);
+ if (ret < 0)
+ return ret;
+
+ ret = ops->set_mode(ras_feat_dev->parent, ctx->ecs.private,
+ ecs_dev_attr->fru_id, val);
+ if (ret)
+ return ret;
+
+ return len;
+}
+
+static ssize_t mode_counts_rows_show(struct device *ras_feat_dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct edac_ecs_dev_attr *ecs_dev_attr = to_ecs_dev_attr(attr);
+ struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+ const struct edac_ecs_ops *ops = ctx->ecs.ops;
+ u64 val;
+ int ret;
+
+ ret = ops->get_mode_counts_rows(ras_feat_dev->parent, ctx->ecs.private,
+ ecs_dev_attr->fru_id, &val);
+ if (ret)
+ return ret;
+
+ return sysfs_emit(buf, "0x%llx\n", val);
+}
+
+static ssize_t mode_counts_codewords_show(struct device *ras_feat_dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct edac_ecs_dev_attr *ecs_dev_attr = to_ecs_dev_attr(attr);
+ struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+ const struct edac_ecs_ops *ops = ctx->ecs.ops;
+ u64 val;
+ int ret;
+
+ ret = ops->get_mode_counts_codewords(ras_feat_dev->parent, ctx->ecs.private,
+ ecs_dev_attr->fru_id, &val);
+ if (ret)
+ return ret;
+
+ return sysfs_emit(buf, "0x%llx\n", val);
+}
+
+static ssize_t reset_store(struct device *ras_feat_dev,
+ struct device_attribute *attr,
+ const char *buf, size_t len)
+{
+ struct edac_ecs_dev_attr *ecs_dev_attr = to_ecs_dev_attr(attr);
+ struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+ const struct edac_ecs_ops *ops = ctx->ecs.ops;
+ long val;
+ int ret;
+
+ ret = kstrtol(buf, 10, &val);
+ if (ret < 0)
+ return ret;
+
+ ret = ops->reset(ras_feat_dev->parent, ctx->ecs.private,
+ ecs_dev_attr->fru_id, val);
+ if (ret)
+ return ret;
+
+ return len;
+}
+
+static ssize_t name_show(struct device *ras_feat_dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct edac_ecs_dev_attr *ecs_dev_attr = to_ecs_dev_attr(attr);
+ struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+ const struct edac_ecs_ops *ops = ctx->ecs.ops;
+ int ret;
+
+ ret = ops->get_name(ras_feat_dev->parent, ctx->ecs.private,
+ ecs_dev_attr->fru_id, buf);
+ if (ret)
+ return ret;
+
+ return strlen(buf);
+}
+
+static ssize_t threshold_show(struct device *ras_feat_dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct edac_ecs_dev_attr *ecs_dev_attr = to_ecs_dev_attr(attr);
+ struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+ const struct edac_ecs_ops *ops = ctx->ecs.ops;
+ int ret;
+ u64 val;
+
+ ret = ops->get_threshold(ras_feat_dev->parent, ctx->ecs.private,
+ ecs_dev_attr->fru_id, &val);
+ if (ret)
+ return ret;
+
+ return sysfs_emit(buf, "0x%llx\n", val);
+}
+
+static ssize_t threshold_store(struct device *ras_feat_dev,
+ struct device_attribute *attr,
+ const char *buf, size_t len)
+{
+ struct edac_ecs_dev_attr *ecs_dev_attr = to_ecs_dev_attr(attr);
+ struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+ const struct edac_ecs_ops *ops = ctx->ecs.ops;
+ long val;
+ int ret;
+
+ ret = kstrtol(buf, 10, &val);
+ if (ret < 0)
+ return ret;
+
+ ret = ops->set_threshold(ras_feat_dev->parent, ctx->ecs.private,
+ ecs_dev_attr->fru_id, val);
+ if (ret)
+ return ret;
+
+ return len;
+}
+
+static umode_t ecs_attr_visible(struct kobject *kobj,
+ struct attribute *a, int attr_id)
+{
+ struct device *ras_feat_dev = kobj_to_dev(kobj);
+ struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+ const struct edac_ecs_ops *ops = ctx->ecs.ops;
+
+ switch (attr_id) {
+ case ecs_log_entry_type:
+ if (ops->get_log_entry_type && ops->set_log_entry_type)
+ return a->mode;
+ if (ops->get_log_entry_type)
+ return 0444;
+ return 0;
+ case ecs_log_entry_type_per_dram:
+ return ops->get_log_entry_type_per_dram ? a->mode : 0;
+ case ecs_log_entry_type_per_memory_media:
+ return ops->get_log_entry_type_per_memory_media ? a->mode : 0;
+ case ecs_mode:
+ if (ops->get_mode && ops->set_mode)
+ return a->mode;
+ if (ops->get_mode)
+ return 0444;
+ return 0;
+ case ecs_mode_counts_rows:
+ return ops->get_mode_counts_rows ? a->mode : 0;
+ case ecs_mode_counts_codewords:
+ return ops->get_mode_counts_codewords ? a->mode : 0;
+ case ecs_reset:
+ return ops->reset ? a->mode : 0;
+ case ecs_name:
+ return ops->get_name ? a->mode : 0;
+ case ecs_threshold:
+ if (ops->get_threshold && ops->set_threshold)
+ return a->mode;
+ if (ops->get_threshold)
+ return 0444;
+ return 0;
+ default:
+ return 0;
+ }
+}
+
+#define EDAC_ECS_ATTR_RO(_name, _fru_id) \
+ ((struct edac_ecs_dev_attr) { .dev_attr = __ATTR_RO(_name), \
+ .fru_id = _fru_id })
+
+#define EDAC_ECS_ATTR_WO(_name, _fru_id) \
+ ((struct edac_ecs_dev_attr) { .dev_attr = __ATTR_WO(_name), \
+ .fru_id = _fru_id })
+
+#define EDAC_ECS_ATTR_RW(_name, _fru_id) \
+ ((struct edac_ecs_dev_attr) { .dev_attr = __ATTR_RW(_name), \
+ .fru_id = _fru_id })
+
+static int ecs_create_desc(struct device *ecs_dev,
+ const struct attribute_group **attr_groups,
+ u16 num_media_frus)
+{
+ struct edac_ecs_context *ecs_ctx;
+ u32 fru;
+
+ ecs_ctx = devm_kzalloc(ecs_dev, sizeof(*ecs_ctx), GFP_KERNEL);
+ if (!ecs_ctx)
+ return -ENOMEM;
+
+ ecs_ctx->num_media_frus = num_media_frus;
+ ecs_ctx->fru_ctxs = devm_kcalloc(ecs_dev, num_media_frus,
+ sizeof(*ecs_ctx->fru_ctxs),
+ GFP_KERNEL);
+ if (!ecs_ctx->fru_ctxs)
+ return -ENOMEM;
+
+ for (fru = 0; fru < num_media_frus; fru++) {
+ struct edac_ecs_fru_context *fru_ctx = &ecs_ctx->fru_ctxs[fru];
+ struct attribute_group *group = &fru_ctx->group;
+ int i;
+
+ fru_ctx->ecs_dev_attr[0] = EDAC_ECS_ATTR_RW(log_entry_type, fru);
+ fru_ctx->ecs_dev_attr[1] = EDAC_ECS_ATTR_RO(log_entry_type_per_dram, fru);
+ fru_ctx->ecs_dev_attr[2] = EDAC_ECS_ATTR_RO(log_entry_type_per_memory_media, fru);
+ fru_ctx->ecs_dev_attr[3] = EDAC_ECS_ATTR_RW(mode, fru);
+ fru_ctx->ecs_dev_attr[4] = EDAC_ECS_ATTR_RO(mode_counts_rows, fru);
+ fru_ctx->ecs_dev_attr[5] = EDAC_ECS_ATTR_RO(mode_counts_codewords, fru);
+ fru_ctx->ecs_dev_attr[6] = EDAC_ECS_ATTR_WO(reset, fru);
+ fru_ctx->ecs_dev_attr[7] = EDAC_ECS_ATTR_RO(name, fru);
+ fru_ctx->ecs_dev_attr[8] = EDAC_ECS_ATTR_RW(threshold, fru);
+ for (i = 0; i < ecs_max_attrs; i++)
+ fru_ctx->ecs_attrs[i] = &fru_ctx->ecs_dev_attr[i].dev_attr.attr;
+
+ snprintf(fru_ctx->name, sizeof(fru_ctx->name), "%s%u", EDAC_ECS_FRU_NAME, fru);
+ group->name = fru_ctx->name;
+ group->attrs = fru_ctx->ecs_attrs;
+ group->is_visible = ecs_attr_visible;
+
+ attr_groups[fru] = group;
+ }
+
+ return 0;
+}
+
+/**
+ * edac_ecs_get_desc - get edac ecs descriptors
+ * @ecs_dev: client ecs device
+ * @attr_groups: pointer to attribute group container
+ * @num_media_frus: number of media FRUs in the device
+ *
+ * Returns 0 on success, error otherwise.
+ */
+int edac_ecs_get_desc(struct device *ecs_dev,
+ const struct attribute_group **attr_groups,
+ u16 num_media_frus)
+{
+ if (!ecs_dev || !attr_groups || !num_media_frus)
+ return -EINVAL;
+
+ return ecs_create_desc(ecs_dev, attr_groups, num_media_frus);
+}
diff --git a/drivers/edac/edac_ras_feature.c b/drivers/edac/edac_ras_feature.c
index 48927f868372..a02ffbcc1c1e 100755
--- a/drivers/edac/edac_ras_feature.c
+++ b/drivers/edac/edac_ras_feature.c
@@ -47,10 +47,15 @@ static int edac_ras_feat_ecs_init(struct device *parent,
const struct attribute_group **attr_groups)
{
int num = efeat->ecs_info.num_media_frus;
+ int ret;
edata->ops = efeat->ecs_ops;
edata->private = efeat->ecs_ctx;
+ ret = edac_ecs_get_desc(parent, attr_groups, num);
+ if (ret)
+ return ret;
+
return num;
}
diff --git a/include/linux/edac_ras_feature.h b/include/linux/edac_ras_feature.h
index 462f9ecbf9d4..153f8a3557f1 100755
--- a/include/linux/edac_ras_feature.h
+++ b/include/linux/edac_ras_feature.h
@@ -47,10 +47,46 @@ struct edac_scrub_ops {
const struct attribute_group *edac_scrub_get_desc(void);
+/**
+ * struct edac_ecs_ops - ECS device operations (all elements optional)
+ * @get_log_entry_type: read the log entry type value.
+ * @set_log_entry_type: set the log entry type value.
+ * @get_log_entry_type_per_dram: read the log entry type per dram value.
+ * @get_log_entry_type_per_memory_media: read the log entry type per memory media value.
+ * @get_mode: read the mode value.
+ * @set_mode: set the mode value.
+ * @get_mode_counts_rows: read the mode counts rows value.
+ * @get_mode_counts_codewords: read the mode counts codewords value.
+ * @reset: reset the ECS counter.
+ * @get_threshold: read the threshold value.
+ * @set_threshold: set the threshold value.
+ * @get_name: get the ECS's name.
+ */
+struct edac_ecs_ops {
+ int (*get_log_entry_type)(struct device *dev, void *drv_data, int fru_id, u64 *val);
+ int (*set_log_entry_type)(struct device *dev, void *drv_data, int fru_id, u64 val);
+ int (*get_log_entry_type_per_dram)(struct device *dev, void *drv_data,
+ int fru_id, u64 *val);
+ int (*get_log_entry_type_per_memory_media)(struct device *dev, void *drv_data,
+ int fru_id, u64 *val);
+ int (*get_mode)(struct device *dev, void *drv_data, int fru_id, u64 *val);
+ int (*set_mode)(struct device *dev, void *drv_data, int fru_id, u64 val);
+ int (*get_mode_counts_rows)(struct device *dev, void *drv_data, int fru_id, u64 *val);
+ int (*get_mode_counts_codewords)(struct device *dev, void *drv_data, int fru_id, u64 *val);
+ int (*reset)(struct device *dev, void *drv_data, int fru_id, u64 val);
+ int (*get_threshold)(struct device *dev, void *drv_data, int fru_id, u64 *threshold);
+ int (*set_threshold)(struct device *dev, void *drv_data, int fru_id, u64 threshold);
+ int (*get_name)(struct device *dev, void *drv_data, int fru_id, char *buf);
+};
+
struct edac_ecs_ex_info {
u16 num_media_frus;
};
+int edac_ecs_get_desc(struct device *ecs_dev,
+ const struct attribute_group **attr_groups,
+ u16 num_media_frus);
+
/*
* EDAC RAS feature information structure
*/
--
2.34.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [RFC PATCH v9 04/11] cxl/mbox: Add GET_SUPPORTED_FEATURES mailbox command
2024-07-16 15:03 [RFC PATCH v9 00/11] EDAC: Scrub: Introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers shiju.jose
` (2 preceding siblings ...)
2024-07-16 15:03 ` [RFC PATCH v9 03/11] EDAC: Add EDAC ECS " shiju.jose
@ 2024-07-16 15:03 ` shiju.jose
2024-07-17 17:28 ` nifan.cxl
2024-07-16 15:03 ` [RFC PATCH v9 05/11] cxl/mbox: Add GET_FEATURE " shiju.jose
` (6 subsequent siblings)
10 siblings, 1 reply; 30+ messages in thread
From: shiju.jose @ 2024-07-16 15:03 UTC (permalink / raw)
To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
Cc: bp, tony.luck, rafael, lenb, mchehab, dan.j.williams, dave,
jonathan.cameron, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny, david, Vilas.Sridharan, leo.duran, Yazen.Ghannam,
rientjes, jiaqiyan, Jon.Grimm, dave.hansen, naoya.horiguchi,
james.morse, jthoughton, somasundaram.a, erdemaktas, pgonda,
duenwen, mike.malvestuto, gthelen, wschwartz, dferguson, wbs,
nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
kangkang.shen, wanghuiqiang, linuxarm, shiju.jose
From: Shiju Jose <shiju.jose@huawei.com>
Add support for GET_SUPPORTED_FEATURES mailbox command.
CXL spec 3.1 section 8.2.9.6 describes optional device specific features.
CXL devices support features with changeable attributes.
Get Supported Features retrieves the list of supported device specific
features. The settings of a feature can be retrieved using Get Feature
and optionally modified using Set Feature.
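To illustrate how a caller might interpret the attribute flags of a
Supported Feature Entry, here is a minimal userspace sketch. The
struct feat_attrs type, decode_attr_flags() and the FEAT_FLAG_* macros are
hypothetical names used only for illustration and are not part of this
patch; the bit positions mirror the CXL_FEAT_ENTRY_FLAG_* definitions this
patch adds to cxlmem.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel BIT()/GENMASK() based flag macros. */
#define FEAT_FLAG_CHANGEABLE		(1u << 0)
#define FEAT_FLAG_RESET_PERSIST_SHIFT	1
#define FEAT_FLAG_RESET_PERSIST_MASK	(0x7u << FEAT_FLAG_RESET_PERSIST_SHIFT)
#define FEAT_FLAG_PERSIST_FW_UPDATE	(1u << 4)
#define FEAT_FLAG_DEFAULT_SELECTION	(1u << 5)
#define FEAT_FLAG_SAVED_SELECTION	(1u << 6)

/* Hypothetical decoded view of a Supported Feature Entry's attr_flags. */
struct feat_attrs {
	bool changeable;
	unsigned int deepest_reset_persistence;
	bool persists_fw_update;
	bool default_selection;
	bool saved_selection;
};

/* attr_flags is expected in CPU byte order, i.e. after le32_to_cpu(). */
static struct feat_attrs decode_attr_flags(uint32_t attr_flags)
{
	struct feat_attrs a = {
		.changeable = !!(attr_flags & FEAT_FLAG_CHANGEABLE),
		.deepest_reset_persistence =
			(attr_flags & FEAT_FLAG_RESET_PERSIST_MASK) >>
			FEAT_FLAG_RESET_PERSIST_SHIFT,
		.persists_fw_update = !!(attr_flags & FEAT_FLAG_PERSIST_FW_UPDATE),
		.default_selection = !!(attr_flags & FEAT_FLAG_DEFAULT_SELECTION),
		.saved_selection = !!(attr_flags & FEAT_FLAG_SAVED_SELECTION),
	};

	return a;
}
```

A caller would typically check the changeable bit before attempting a
Set Feature on that entry.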
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
drivers/cxl/core/mbox.c | 27 ++++++++++++++++++
drivers/cxl/cxlmem.h | 61 +++++++++++++++++++++++++++++++++++++++++
2 files changed, 88 insertions(+)
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 2626f3fff201..9b9b1d26454e 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -1324,6 +1324,33 @@ int cxl_set_timestamp(struct cxl_memdev_state *mds)
}
EXPORT_SYMBOL_NS_GPL(cxl_set_timestamp, CXL);
+int cxl_get_supported_features(struct cxl_memdev_state *mds,
+ u32 count, u16 start_index,
+ struct cxl_mbox_get_supp_feats_out *feats_out)
+{
+ struct cxl_mbox_get_supp_feats_in pi;
+ struct cxl_mbox_cmd mbox_cmd;
+ int rc;
+
+ pi.count = cpu_to_le32(count);
+ pi.start_index = cpu_to_le16(start_index);
+
+ mbox_cmd = (struct cxl_mbox_cmd) {
+ .opcode = CXL_MBOX_OP_GET_SUPPORTED_FEATURES,
+ .size_in = sizeof(pi),
+ .payload_in = &pi,
+ .size_out = count,
+ .payload_out = feats_out,
+ .min_out = sizeof(*feats_out),
+ };
+ rc = cxl_internal_send_cmd(mds, &mbox_cmd);
+ if (rc < 0)
+ return rc;
+
+ return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_get_supported_features, CXL);
+
int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
struct cxl_region *cxlr)
{
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 19aba81cdf13..b0e1565b9d2e 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -530,6 +530,7 @@ enum cxl_opcode {
CXL_MBOX_OP_GET_LOG_CAPS = 0x0402,
CXL_MBOX_OP_CLEAR_LOG = 0x0403,
CXL_MBOX_OP_GET_SUP_LOG_SUBLIST = 0x0405,
+ CXL_MBOX_OP_GET_SUPPORTED_FEATURES = 0x0500,
CXL_MBOX_OP_IDENTIFY = 0x4000,
CXL_MBOX_OP_GET_PARTITION_INFO = 0x4100,
CXL_MBOX_OP_SET_PARTITION_INFO = 0x4101,
@@ -699,6 +700,63 @@ struct cxl_mbox_set_timestamp_in {
} __packed;
+/*
+ * Get Supported Features CXL 3.1 Spec 8.2.9.6.1
+ */
+
+/*
+ * Get Supported Features input payload
+ * CXL rev 3.1 section 8.2.9.6.1 Table 8-95
+ */
+struct cxl_mbox_get_supp_feats_in {
+ __le32 count;
+ __le16 start_index;
+ u8 rsvd[2];
+} __packed;
+
+/*
+ * Get Supported Features Supported Feature Entry
+ * CXL rev 3.1 section 8.2.9.6.1 Table 8-97
+ */
+/* Supported Feature Entry : Payload out attribute flags */
+#define CXL_FEAT_ENTRY_FLAG_CHANGABLE BIT(0)
+#define CXL_FEAT_ENTRY_FLAG_DEEPEST_RESET_PERSISTENCE_MASK GENMASK(3, 1)
+#define CXL_FEAT_ENTRY_FLAG_PERSIST_ACROSS_FIRMWARE_UPDATE BIT(4)
+#define CXL_FEAT_ENTRY_FLAG_SUPPORT_DEFAULT_SELECTION BIT(5)
+#define CXL_FEAT_ENTRY_FLAG_SUPPORT_SAVED_SELECTION BIT(6)
+
+enum cxl_feat_attr_value_persistence {
+ CXL_FEAT_ATTR_VALUE_PERSISTENCE_NONE,
+ CXL_FEAT_ATTR_VALUE_PERSISTENCE_CXL_RESET,
+ CXL_FEAT_ATTR_VALUE_PERSISTENCE_HOT_RESET,
+ CXL_FEAT_ATTR_VALUE_PERSISTENCE_WARM_RESET,
+ CXL_FEAT_ATTR_VALUE_PERSISTENCE_COLD_RESET,
+ CXL_FEAT_ATTR_VALUE_PERSISTENCE_MAX
+};
+
+struct cxl_mbox_supp_feat_entry {
+ uuid_t uuid;
+ __le16 index;
+ __le16 get_size;
+ __le16 set_size;
+ __le32 attr_flags;
+ u8 get_version;
+ u8 set_version;
+ __le16 set_effects;
+ u8 rsvd[18];
+} __packed;
+
+/*
+ * Get Supported Features output payload
+ * CXL rev 3.1 section 8.2.9.6.1 Table 8-96
+ */
+struct cxl_mbox_get_supp_feats_out {
+ __le16 nr_entries;
+ __le16 nr_supported;
+ u8 rsvd[4];
+ struct cxl_mbox_supp_feat_entry feat_entries[];
+} __packed;
+
/* Get Poison List CXL 3.0 Spec 8.2.9.8.4.1 */
struct cxl_mbox_poison_in {
__le64 offset;
@@ -830,6 +888,9 @@ void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
enum cxl_event_type event_type,
const uuid_t *uuid, union cxl_event *evt);
int cxl_set_timestamp(struct cxl_memdev_state *mds);
+int cxl_get_supported_features(struct cxl_memdev_state *mds,
+ u32 count, u16 start_index,
+ struct cxl_mbox_get_supp_feats_out *feats_out);
int cxl_poison_state_init(struct cxl_memdev_state *mds);
int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
struct cxl_region *cxlr);
--
2.34.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [RFC PATCH v9 05/11] cxl/mbox: Add GET_FEATURE mailbox command
2024-07-16 15:03 [RFC PATCH v9 00/11] EDAC: Scrub: Introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers shiju.jose
` (3 preceding siblings ...)
2024-07-16 15:03 ` [RFC PATCH v9 04/11] cxl/mbox: Add GET_SUPPORTED_FEATURES mailbox command shiju.jose
@ 2024-07-16 15:03 ` shiju.jose
2024-07-17 18:08 ` nifan.cxl
2024-07-16 15:03 ` [RFC PATCH v9 06/11] cxl/mbox: Add SET_FEATURE " shiju.jose
` (5 subsequent siblings)
10 siblings, 1 reply; 30+ messages in thread
From: shiju.jose @ 2024-07-16 15:03 UTC (permalink / raw)
To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
Cc: bp, tony.luck, rafael, lenb, mchehab, dan.j.williams, dave,
jonathan.cameron, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny, david, Vilas.Sridharan, leo.duran, Yazen.Ghannam,
rientjes, jiaqiyan, Jon.Grimm, dave.hansen, naoya.horiguchi,
james.morse, jthoughton, somasundaram.a, erdemaktas, pgonda,
duenwen, mike.malvestuto, gthelen, wschwartz, dferguson, wbs,
nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
kangkang.shen, wanghuiqiang, linuxarm, shiju.jose
From: Shiju Jose <shiju.jose@huawei.com>
Add support for GET_FEATURE mailbox command.
CXL spec 3.1 section 8.2.9.6 describes optional device specific features.
The settings of a feature can be retrieved using the Get Feature command.
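Feature data larger than the mailbox payload has to be read in several
Get Feature calls, each with its own offset and count. The following
userspace sketch shows that chunking loop in isolation. The get_feat_fn
callback, struct mock_dev and mock_xfer() are hypothetical stand-ins for
one mailbox round trip and are not part of this patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one Get Feature mailbox round trip. */
typedef size_t (*get_feat_fn)(void *ctx, uint16_t offset, uint16_t count,
			      void *out);

/* Read out_size bytes in chunks of at most payload_size; 0 on failure. */
static size_t read_feature(get_feat_fn xfer, void *ctx, void *out,
			   size_t out_size, size_t payload_size)
{
	size_t rcvd = 0;

	while (rcvd < out_size) {
		size_t want = out_size - rcvd;
		size_t got;

		if (want > payload_size)
			want = payload_size;
		got = xfer(ctx, (uint16_t)rcvd, (uint16_t)want,
			   (uint8_t *)out + rcvd);
		/* A short read is fine; an empty or oversized one is not. */
		if (got == 0 || got > want)
			return 0;
		rcvd += got;
	}

	return rcvd;
}

/* Mock device that serves feature data from a plain buffer. */
struct mock_dev {
	const uint8_t *data;
	size_t len;
	unsigned int calls;
};

static size_t mock_xfer(void *ctx, uint16_t offset, uint16_t count, void *out)
{
	struct mock_dev *dev = ctx;
	size_t n;

	if (offset >= dev->len)
		return 0;
	n = dev->len - offset;
	if (n > count)
		n = count;
	memcpy(out, dev->data + offset, n);
	dev->calls++;

	return n;
}
```

With a 10-byte feature and a 4-byte payload, the loop issues three
transfers of 4, 4 and 2 bytes.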
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
drivers/cxl/core/mbox.c | 37 +++++++++++++++++++++++++++++++++++++
drivers/cxl/cxlmem.h | 27 +++++++++++++++++++++++++++
2 files changed, 64 insertions(+)
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 9b9b1d26454e..b1eeed508459 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -1351,6 +1351,43 @@ int cxl_get_supported_features(struct cxl_memdev_state *mds,
}
EXPORT_SYMBOL_NS_GPL(cxl_get_supported_features, CXL);
+size_t cxl_get_feature(struct cxl_memdev_state *mds,
+ const uuid_t feat_uuid, void *feat_out,
+ size_t feat_out_size,
+ enum cxl_get_feat_selection selection)
+{
+ size_t data_to_rd_size, size_out;
+ struct cxl_mbox_get_feat_in pi;
+ struct cxl_mbox_cmd mbox_cmd;
+ size_t data_rcvd_size = 0;
+ int rc;
+
+ size_out = min(feat_out_size, mds->payload_size);
+ pi.uuid = feat_uuid;
+ pi.selection = selection;
+ do {
+ data_to_rd_size = min(feat_out_size - data_rcvd_size, mds->payload_size);
+ pi.offset = cpu_to_le16(data_rcvd_size);
+ pi.count = cpu_to_le16(data_to_rd_size);
+
+ mbox_cmd = (struct cxl_mbox_cmd) {
+ .opcode = CXL_MBOX_OP_GET_FEATURE,
+ .size_in = sizeof(pi),
+ .payload_in = &pi,
+ .size_out = size_out,
+ .payload_out = feat_out + data_rcvd_size,
+ .min_out = data_to_rd_size,
+ };
+ rc = cxl_internal_send_cmd(mds, &mbox_cmd);
+ if (rc < 0 || mbox_cmd.size_out == 0)
+ return 0;
+ data_rcvd_size += mbox_cmd.size_out;
+ } while (data_rcvd_size < feat_out_size);
+
+ return data_rcvd_size;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_get_feature, CXL);
+
int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
struct cxl_region *cxlr)
{
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index b0e1565b9d2e..25698a6fbe66 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -531,6 +531,7 @@ enum cxl_opcode {
CXL_MBOX_OP_CLEAR_LOG = 0x0403,
CXL_MBOX_OP_GET_SUP_LOG_SUBLIST = 0x0405,
CXL_MBOX_OP_GET_SUPPORTED_FEATURES = 0x0500,
+ CXL_MBOX_OP_GET_FEATURE = 0x0501,
CXL_MBOX_OP_IDENTIFY = 0x4000,
CXL_MBOX_OP_GET_PARTITION_INFO = 0x4100,
CXL_MBOX_OP_SET_PARTITION_INFO = 0x4101,
@@ -757,6 +758,28 @@ struct cxl_mbox_get_supp_feats_out {
struct cxl_mbox_supp_feat_entry feat_entries[];
} __packed;
+/*
+ * Get Feature CXL 3.1 Spec 8.2.9.6.2
+ */
+
+/*
+ * Get Feature input payload
+ * CXL rev 3.1 section 8.2.9.6.2 Table 8-99
+ */
+enum cxl_get_feat_selection {
+ CXL_GET_FEAT_SEL_CURRENT_VALUE,
+ CXL_GET_FEAT_SEL_DEFAULT_VALUE,
+ CXL_GET_FEAT_SEL_SAVED_VALUE,
+ CXL_GET_FEAT_SEL_MAX
+};
+
+struct cxl_mbox_get_feat_in {
+ uuid_t uuid;
+ __le16 offset;
+ __le16 count;
+ u8 selection;
+} __packed;
+
/* Get Poison List CXL 3.0 Spec 8.2.9.8.4.1 */
struct cxl_mbox_poison_in {
__le64 offset;
@@ -891,6 +914,10 @@ int cxl_set_timestamp(struct cxl_memdev_state *mds);
int cxl_get_supported_features(struct cxl_memdev_state *mds,
u32 count, u16 start_index,
struct cxl_mbox_get_supp_feats_out *feats_out);
+size_t cxl_get_feature(struct cxl_memdev_state *mds,
+ const uuid_t feat_uuid, void *feat_out,
+ size_t feat_out_size,
+ enum cxl_get_feat_selection selection);
int cxl_poison_state_init(struct cxl_memdev_state *mds);
int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
struct cxl_region *cxlr);
--
2.34.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [RFC PATCH v9 06/11] cxl/mbox: Add SET_FEATURE mailbox command
2024-07-16 15:03 [RFC PATCH v9 00/11] EDAC: Scrub: Introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers shiju.jose
` (4 preceding siblings ...)
2024-07-16 15:03 ` [RFC PATCH v9 05/11] cxl/mbox: Add GET_FEATURE " shiju.jose
@ 2024-07-16 15:03 ` shiju.jose
2024-07-17 20:13 ` nifan.cxl
2024-07-16 15:03 ` [RFC PATCH v9 07/11] cxl/memscrub: Add CXL memory device patrol scrub control feature shiju.jose
` (4 subsequent siblings)
10 siblings, 1 reply; 30+ messages in thread
From: shiju.jose @ 2024-07-16 15:03 UTC (permalink / raw)
To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
Cc: bp, tony.luck, rafael, lenb, mchehab, dan.j.williams, dave,
jonathan.cameron, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny, david, Vilas.Sridharan, leo.duran, Yazen.Ghannam,
rientjes, jiaqiyan, Jon.Grimm, dave.hansen, naoya.horiguchi,
james.morse, jthoughton, somasundaram.a, erdemaktas, pgonda,
duenwen, mike.malvestuto, gthelen, wschwartz, dferguson, wbs,
nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
kangkang.shen, wanghuiqiang, linuxarm, shiju.jose
From: Shiju Jose <shiju.jose@huawei.com>
Add support for SET_FEATURE mailbox command.
CXL spec 3.1 section 8.2.9.6 describes optional device specific features.
CXL devices support features with changeable attributes.
The settings of a feature can be optionally modified using the Set Feature
command.
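When the feature data does not fit in one mailbox payload, Set Feature is
issued several times with a data transfer action in the flags: full,
initiate, continue or finish. The sketch below models only that selection
logic in userspace. The names pick_action() and plan_transfer() are
hypothetical; "room" stands for the payload space left after the Set
Feature header, and the enum order is assumed to match
enum cxl_set_feat_flag_data_transfer from this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Values follow the order of enum cxl_set_feat_flag_data_transfer. */
enum xfer_action {
	XFER_FULL,
	XFER_INITIATE,
	XFER_CONTINUE,
	XFER_FINISH,
};

/* Pick the action for the chunk that starts at byte offset 'sent'. */
static enum xfer_action pick_action(size_t total, size_t sent, size_t room)
{
	size_t remaining;

	assert(room > 0 && sent < total);
	remaining = total - sent;
	if (sent == 0)
		return remaining <= room ? XFER_FULL : XFER_INITIATE;

	return remaining <= room ? XFER_FINISH : XFER_CONTINUE;
}

/* Record the action of every chunk; returns the number of chunks. */
static size_t plan_transfer(size_t total, size_t room,
			    enum xfer_action *plan, size_t max_steps)
{
	size_t sent = 0, steps = 0;

	while (sent < total && steps < max_steps) {
		size_t remaining = total - sent;

		plan[steps++] = pick_action(total, sent, room);
		sent += remaining < room ? remaining : room;
	}

	return steps;
}
```

For 10 bytes of data and 4 bytes of room this yields initiate, continue,
finish; with 16 bytes of room it yields a single full transfer.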
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
drivers/cxl/core/mbox.c | 71 +++++++++++++++++++++++++++++++++++++++++
drivers/cxl/cxlmem.h | 33 +++++++++++++++++++
2 files changed, 104 insertions(+)
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index b1eeed508459..50ecd2bd7372 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -1388,6 +1388,77 @@ size_t cxl_get_feature(struct cxl_memdev_state *mds,
}
EXPORT_SYMBOL_NS_GPL(cxl_get_feature, CXL);
+/*
+ * FEAT_DATA_MIN_PAYLOAD_SIZE - minimum number of extra bytes that must be
+ * available in the mailbox payload, beyond the Set Feature header, for
+ * the feature data transfer to work as expected.
+ */
+#define FEAT_DATA_MIN_PAYLOAD_SIZE 10
+int cxl_set_feature(struct cxl_memdev_state *mds,
+ const uuid_t feat_uuid, u8 feat_version,
+ void *feat_data, size_t feat_data_size,
+ u8 feat_flag)
+{
+ struct cxl_memdev_set_feat_pi {
+ struct cxl_mbox_set_feat_hdr hdr;
+ u8 feat_data[];
+ } __packed;
+ size_t data_in_size, data_sent_size = 0;
+ struct cxl_mbox_cmd mbox_cmd;
+ size_t hdr_size;
+ int rc = 0;
+
+ struct cxl_memdev_set_feat_pi *pi __free(kfree) =
+ kzalloc(mds->payload_size, GFP_KERNEL);
+ if (!pi)
+ return -ENOMEM;
+
+ pi->hdr.uuid = feat_uuid;
+ pi->hdr.version = feat_version;
+ feat_flag &= ~CXL_SET_FEAT_FLAG_DATA_TRANSFER_MASK;
+ hdr_size = sizeof(pi->hdr);
+ /* Check the mbox payload has room for the feature data transfer. */
+ if (hdr_size + FEAT_DATA_MIN_PAYLOAD_SIZE > mds->payload_size)
+ return -ENOMEM;
+
+ if ((hdr_size + feat_data_size) <= mds->payload_size) {
+ pi->hdr.flags = cpu_to_le32(feat_flag |
+ CXL_SET_FEAT_FLAG_FULL_DATA_TRANSFER);
+ data_in_size = feat_data_size;
+ } else {
+ pi->hdr.flags = cpu_to_le32(feat_flag |
+ CXL_SET_FEAT_FLAG_INITIATE_DATA_TRANSFER);
+ data_in_size = mds->payload_size - hdr_size;
+ }
+
+ do {
+ pi->hdr.offset = cpu_to_le16(data_sent_size);
+ memcpy(pi->feat_data, feat_data + data_sent_size, data_in_size);
+ mbox_cmd = (struct cxl_mbox_cmd) {
+ .opcode = CXL_MBOX_OP_SET_FEATURE,
+ .size_in = hdr_size + data_in_size,
+ .payload_in = pi,
+ };
+ rc = cxl_internal_send_cmd(mds, &mbox_cmd);
+ if (rc < 0)
+ return rc;
+
+ data_sent_size += data_in_size;
+ if (data_sent_size >= feat_data_size)
+ return 0;
+
+ if ((feat_data_size - data_sent_size) <= (mds->payload_size - hdr_size)) {
+ data_in_size = feat_data_size - data_sent_size;
+ pi->hdr.flags = cpu_to_le32(feat_flag |
+ CXL_SET_FEAT_FLAG_FINISH_DATA_TRANSFER);
+ } else {
+ pi->hdr.flags = cpu_to_le32(feat_flag |
+ CXL_SET_FEAT_FLAG_CONTINUE_DATA_TRANSFER);
+ }
+ } while (true);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_set_feature, CXL);
+
int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
struct cxl_region *cxlr)
{
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 25698a6fbe66..c3cb8e2736b5 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -532,6 +532,7 @@ enum cxl_opcode {
CXL_MBOX_OP_GET_SUP_LOG_SUBLIST = 0x0405,
CXL_MBOX_OP_GET_SUPPORTED_FEATURES = 0x0500,
CXL_MBOX_OP_GET_FEATURE = 0x0501,
+ CXL_MBOX_OP_SET_FEATURE = 0x0502,
CXL_MBOX_OP_IDENTIFY = 0x4000,
CXL_MBOX_OP_GET_PARTITION_INFO = 0x4100,
CXL_MBOX_OP_SET_PARTITION_INFO = 0x4101,
@@ -780,6 +781,34 @@ struct cxl_mbox_get_feat_in {
u8 selection;
} __packed;
+/*
+ * Set Feature CXL 3.1 Spec 8.2.9.6.3
+ */
+
+/*
+ * Set Feature input payload
+ * CXL rev 3.1 section 8.2.9.6.3 Table 8-101
+ */
+/* Set Feature : Payload in flags */
+#define CXL_SET_FEAT_FLAG_DATA_TRANSFER_MASK GENMASK(2, 0)
+enum cxl_set_feat_flag_data_transfer {
+ CXL_SET_FEAT_FLAG_FULL_DATA_TRANSFER,
+ CXL_SET_FEAT_FLAG_INITIATE_DATA_TRANSFER,
+ CXL_SET_FEAT_FLAG_CONTINUE_DATA_TRANSFER,
+ CXL_SET_FEAT_FLAG_FINISH_DATA_TRANSFER,
+ CXL_SET_FEAT_FLAG_ABORT_DATA_TRANSFER,
+ CXL_SET_FEAT_FLAG_DATA_TRANSFER_MAX
+};
+#define CXL_SET_FEAT_FLAG_DATA_SAVED_ACROSS_RESET BIT(3)
+
+struct cxl_mbox_set_feat_hdr {
+ uuid_t uuid;
+ __le32 flags;
+ __le16 offset;
+ u8 version;
+ u8 rsvd[9];
+} __packed;
+
/* Get Poison List CXL 3.0 Spec 8.2.9.8.4.1 */
struct cxl_mbox_poison_in {
__le64 offset;
@@ -918,6 +947,10 @@ size_t cxl_get_feature(struct cxl_memdev_state *mds,
const uuid_t feat_uuid, void *feat_out,
size_t feat_out_size,
enum cxl_get_feat_selection selection);
+int cxl_set_feature(struct cxl_memdev_state *mds,
+ const uuid_t feat_uuid, u8 feat_version,
+ void *feat_data, size_t feat_data_size,
+ u8 feat_flag);
int cxl_poison_state_init(struct cxl_memdev_state *mds);
int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
struct cxl_region *cxlr);
--
2.34.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [RFC PATCH v9 07/11] cxl/memscrub: Add CXL memory device patrol scrub control feature
2024-07-16 15:03 [RFC PATCH v9 00/11] EDAC: Scrub: Introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers shiju.jose
` (5 preceding siblings ...)
2024-07-16 15:03 ` [RFC PATCH v9 06/11] cxl/mbox: Add SET_FEATURE " shiju.jose
@ 2024-07-16 15:03 ` shiju.jose
2024-07-18 22:02 ` fan
2024-07-16 15:03 ` [RFC PATCH v9 08/11] cxl/memscrub: Add CXL memory device ECS " shiju.jose
` (3 subsequent siblings)
10 siblings, 1 reply; 30+ messages in thread
From: shiju.jose @ 2024-07-16 15:03 UTC (permalink / raw)
To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
Cc: bp, tony.luck, rafael, lenb, mchehab, dan.j.williams, dave,
jonathan.cameron, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny, david, Vilas.Sridharan, leo.duran, Yazen.Ghannam,
rientjes, jiaqiyan, Jon.Grimm, dave.hansen, naoya.horiguchi,
james.morse, jthoughton, somasundaram.a, erdemaktas, pgonda,
duenwen, mike.malvestuto, gthelen, wschwartz, dferguson, wbs,
nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
kangkang.shen, wanghuiqiang, linuxarm, shiju.jose
From: Shiju Jose <shiju.jose@huawei.com>
CXL spec 3.1 section 8.2.9.9.11.1 describes the device patrol scrub control
feature. The device patrol scrub proactively locates and corrects errors
on a regular cycle.
Allow specifying the number of hours within which the patrol scrub must be
completed, subject to minimum and maximum limits reported by the device.
Also allow disabling scrub, so that error rates can be traded off against
performance.
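As a rough illustration of the attributes involved, the sketch below
decodes the 4-byte patrol scrub read attributes in userspace, following the
layout of struct cxl_memdev_ps_rd_attrs in this patch (capability byte,
little-endian 16-bit cycle field, flags byte). The struct ps_params type,
decode_ps_attrs() and ps_cycle_valid() are hypothetical names, and using
the 8-bit field width (0xff) as the upper limit is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded view of the patrol scrub read attributes. */
struct ps_params {
	bool enable;
	bool cycle_changeable;
	uint8_t cycle_hrs;
	uint8_t min_cycle_hrs;
};

static struct ps_params decode_ps_attrs(const uint8_t raw[4])
{
	/* Bytes 1-2 hold a little-endian 16-bit field; assemble it portably. */
	uint16_t cycle = (uint16_t)(raw[1] | (raw[2] << 8));
	struct ps_params p = {
		.cycle_changeable = !!(raw[0] & 0x01),	/* capability bit 0 */
		.cycle_hrs = cycle & 0xff,		/* bits 7:0 */
		.min_cycle_hrs = cycle >> 8,		/* bits 15:8 */
		.enable = !!(raw[3] & 0x01),		/* flags bit 0 */
	};

	return p;
}

/* A requested cycle must lie between the device minimum and 0xff hours. */
static bool ps_cycle_valid(const struct ps_params *p, unsigned int hrs)
{
	return hrs >= p->min_cycle_hrs && hrs <= 0xff;
}
```

Decoding the bytes one at a time keeps the sketch independent of host
endianness.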
Add support for CXL memory device based patrol scrub control.
Register with the EDAC RAS control feature driver, which gets the scrub
attribute descriptors from EDAC scrub and exposes the sysfs scrub control
attributes to userspace.
For example, CXL device based scrub control for the CXL mem0 device is
exposed in /sys/bus/edac/devices/cxl_mem0/scrub/
Also add support for region based CXL memory patrol scrub control.
A CXL memory region may be interleaved across one or more CXL memory devices.
For example, region based scrub control for CXL region1 is exposed in
/sys/bus/edac/devices/cxl_region1/scrub/
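Because a region spans several devices, the shortest scrub cycle the region
can accept is bounded by the slowest device, that is, the largest of the
per-device minimums. A minimal userspace sketch of that reduction follows;
region_min_cycle_hrs() is a hypothetical name and the per-device minimums
are assumed to have been read already.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the minimum scrub cycle (in hours) usable for a whole region:
 * the largest of the per-device minimums, or 0 if there are no devices.
 */
static uint8_t region_min_cycle_hrs(const uint8_t *dev_min_hrs, size_t ndevs)
{
	uint8_t region_min = 0;

	for (size_t i = 0; i < ndevs; i++)
		if (dev_min_hrs[i] > region_min)
			region_min = dev_min_hrs[i];

	return region_min;
}
```

For devices with minimums of 1, 4 and 2 hours, the region minimum is
4 hours.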
Open Questions:
Q1: The CXL 3.1 spec defines the patrol scrub control feature at the CXL memory
device level, with support for setting the scrub cycle and enabling/disabling
scrub, but not based on an HPA range. Thus scrub control for a region is
presently implemented across all the associated CXL memory devices.
What is the exact use case for CXL region based scrub control?
How would the HPA range, which Dan asked for in region based scrubbing, be used?
Is a spec change required for the patrol scrub control feature to support
setting the HPA range?
Q2: Would both CXL device based and CXL region based scrub control be enabled
at the same time in a system?
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
Documentation/scrub/edac-scrub.rst | 70 +++++
drivers/cxl/Kconfig | 19 ++
drivers/cxl/core/Makefile | 1 +
drivers/cxl/core/memscrub.c | 413 +++++++++++++++++++++++++++++
drivers/cxl/core/region.c | 6 +
drivers/cxl/cxlmem.h | 8 +
drivers/cxl/mem.c | 4 +
7 files changed, 521 insertions(+)
create mode 100644 Documentation/scrub/edac-scrub.rst
create mode 100644 drivers/cxl/core/memscrub.c
diff --git a/Documentation/scrub/edac-scrub.rst b/Documentation/scrub/edac-scrub.rst
new file mode 100644
index 000000000000..cf7d8b130204
--- /dev/null
+++ b/Documentation/scrub/edac-scrub.rst
@@ -0,0 +1,70 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================
+EDAC Scrub control
+===================
+
+Copyright (c) 2024 HiSilicon Limited.
+
+:Author: Shiju Jose <shiju.jose@huawei.com>
+:License: The GNU Free Documentation License, Version 1.2
+ (dual licensed under the GPL v2)
+:Original Reviewers:
+
+- Written for: 6.12
+- Updated for:
+
+Introduction
+------------
+The EDAC scrub driver provides an interface for controlling the
+memory scrubbers in the system. Scrub device drivers in the
+system register with the EDAC scrub driver, which exposes the
+scrub controls to the user in sysfs.
+
+The File System
+---------------
+
+The control attributes of a registered scrubber can be
+accessed under /sys/bus/edac/devices/<dev-name>/scrub/
+
+sysfs
+-----
+
+Sysfs files are documented in
+`Documentation/ABI/testing/sysfs-edac-scrub-control`.
+
+Example
+-------
+
+The usage takes the form shown in this example::
+
+1. CXL memory device patrol scrubber
+1.1. device based
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub/cycle_in_hours_range
+0x1-0xff
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub/cycle_in_hours
+0xc
+root@localhost:~# echo 30 > /sys/bus/edac/devices/cxl_mem0/scrub/cycle_in_hours
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub/cycle_in_hours
+0x1e
+root@localhost:~# echo 1 > /sys/bus/edac/devices/cxl_mem0/scrub/enable_background
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub/enable_background
+1
+root@localhost:~# echo 0 > /sys/bus/edac/devices/cxl_mem0/scrub/enable_background
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub/enable_background
+0
+
+1.2. region based
+root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub/cycle_in_hours_range
+0x1-0xff
+root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub/cycle_in_hours
+0xc
+root@localhost:~# echo 30 > /sys/bus/edac/devices/cxl_region0/scrub/cycle_in_hours
+root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub/cycle_in_hours
+0x1e
+root@localhost:~# echo 1 > /sys/bus/edac/devices/cxl_region0/scrub/enable_background
+root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub/enable_background
+1
+root@localhost:~# echo 0 > /sys/bus/edac/devices/cxl_region0/scrub/enable_background
+root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub/enable_background
+0
diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index 99b5c25be079..7da70685a2db 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -145,4 +145,23 @@ config CXL_REGION_INVALIDATION_TEST
If unsure, or if this kernel is meant for production environments,
say N.
+config CXL_SCRUB
+ bool "CXL: Memory scrub feature"
+ depends on CXL_PCI
+ depends on CXL_MEM
+ depends on EDAC
+ help
+ The CXL memory scrub control is an optional feature that allows the
+ host to control the scrub configuration of CXL Type 3 devices that
+ support patrol scrubbing.
+
+ Registers with the scrub subsystem to provide control attributes
+ of the CXL memory device scrubber to the user.
+ Provides interface functions to support configuring the CXL memory
+ device patrol scrubber.
+
+ Say 'y/n' to enable/disable control of memory scrub parameters for
+ CXL.mem devices. See section 8.2.9.9.11.1 of the CXL 3.1 specification
+ for details of the CXL memory patrol scrub control feature.
+
endif
diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
index 9259bcc6773c..e0fc814c3983 100644
--- a/drivers/cxl/core/Makefile
+++ b/drivers/cxl/core/Makefile
@@ -16,3 +16,4 @@ cxl_core-y += pmu.o
cxl_core-y += cdat.o
cxl_core-$(CONFIG_TRACING) += trace.o
cxl_core-$(CONFIG_CXL_REGION) += region.o
+cxl_core-$(CONFIG_CXL_SCRUB) += memscrub.o
diff --git a/drivers/cxl/core/memscrub.c b/drivers/cxl/core/memscrub.c
new file mode 100644
index 000000000000..430f85b01f6c
--- /dev/null
+++ b/drivers/cxl/core/memscrub.c
@@ -0,0 +1,413 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * CXL memory scrub driver.
+ *
+ * Copyright (c) 2024 HiSilicon Limited.
+ *
+ * - Provides functions to configure patrol scrub feature of the
+ * CXL memory devices.
+ * - Registers with the scrub subsystem driver to expose the sysfs attributes
+ * to the user for configuring the CXL memory patrol scrub feature.
+ */
+
+#define pr_fmt(fmt) "CXL_MEM_SCRUB: " fmt
+
+#include <cxlmem.h>
+#include <linux/cleanup.h>
+#include <linux/limits.h>
+#include <cxl.h>
+#include <linux/edac_ras_feature.h>
+
+#define CXL_DEV_NUM_RAS_FEATURES 2
+
+/* TODO: This reusable function will be moved to a common file */
+static int cxl_mem_get_supported_feature_entry(struct cxl_memdev *cxlmd, const uuid_t *feat_uuid,
+ struct cxl_mbox_supp_feat_entry *feat_entry_out)
+{
+ struct cxl_mbox_supp_feat_entry *feat_entry;
+ struct cxl_dev_state *cxlds = cxlmd->cxlds;
+ struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
+ int feat_index, feats_out_size;
+ int nentries, count;
+ int ret;
+
+ feat_index = 0;
+ feats_out_size = sizeof(struct cxl_mbox_get_supp_feats_out) +
+ sizeof(struct cxl_mbox_supp_feat_entry);
+ struct cxl_mbox_get_supp_feats_out *feats_out __free(kfree) =
+ kmalloc(feats_out_size, GFP_KERNEL);
+ if (!feats_out)
+ return -ENOMEM;
+
+ while (true) {
+ memset(feats_out, 0, feats_out_size);
+ ret = cxl_get_supported_features(mds, feats_out_size,
+ feat_index, feats_out);
+ if (ret)
+ return ret;
+
+ nentries = le16_to_cpu(feats_out->nr_entries);
+ if (!nentries)
+ return -EOPNOTSUPP;
+
+ /* Check CXL memdev supports the feature */
+ feat_entry = feats_out->feat_entries;
+ for (count = 0; count < nentries; count++, feat_entry++) {
+ if (uuid_equal(&feat_entry->uuid, feat_uuid)) {
+ memcpy(feat_entry_out, feat_entry,
+ sizeof(*feat_entry_out));
+ return 0;
+ }
+ }
+ feat_index += nentries;
+ }
+}
+
+#define CXL_SCRUB_NAME_LEN 128
+
+/* CXL memory patrol scrub control definitions */
+#define CXL_MEMDEV_PS_GET_FEAT_VERSION 0x01
+#define CXL_MEMDEV_PS_SET_FEAT_VERSION 0x01
+
+static const uuid_t cxl_patrol_scrub_uuid =
+ UUID_INIT(0x96dad7d6, 0xfde8, 0x482b, 0xa7, 0x33, 0x75, 0x77, 0x4e, \
+ 0x06, 0xdb, 0x8a);
+
+/* CXL memory patrol scrub control functions */
+struct cxl_patrol_scrub_context {
+ u16 get_feat_size;
+ u16 set_feat_size;
+ struct cxl_memdev *cxlmd;
+ struct cxl_region *cxlr;
+};
+
+/**
+ * struct cxl_memdev_ps_params - CXL memory patrol scrub parameter data structure.
+ * @enable: [IN & OUT] enable(1)/disable(0) patrol scrub.
+ * @scrub_cycle_changeable: [OUT] scrub cycle attribute of patrol scrub is changeable.
+ * @scrub_cycle_hrs: [IN] Requested patrol scrub cycle in hours.
+ * [OUT] Current patrol scrub cycle in hours.
+ * @min_scrub_cycle_hrs:[OUT] minimum patrol scrub cycle in hours supported.
+ */
+struct cxl_memdev_ps_params {
+ bool enable;
+ bool scrub_cycle_changeable;
+ u16 scrub_cycle_hrs;
+ u16 min_scrub_cycle_hrs;
+};
+
+enum cxl_scrub_param {
+ cxl_ps_param_enable,
+ cxl_ps_param_scrub_cycle,
+};
+
+#define CXL_MEMDEV_PS_SCRUB_CYCLE_CHANGE_CAP_MASK BIT(0)
+#define CXL_MEMDEV_PS_SCRUB_CYCLE_REALTIME_REPORT_CAP_MASK BIT(1)
+#define CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK GENMASK(7, 0)
+#define CXL_MEMDEV_PS_MIN_SCRUB_CYCLE_MASK GENMASK(15, 8)
+#define CXL_MEMDEV_PS_FLAG_ENABLED_MASK BIT(0)
+
+struct cxl_memdev_ps_rd_attrs {
+ u8 scrub_cycle_cap;
+ __le16 scrub_cycle_hrs;
+ u8 scrub_flags;
+} __packed;
+
+struct cxl_memdev_ps_wr_attrs {
+ u8 scrub_cycle_hrs;
+ u8 scrub_flags;
+} __packed;
+
+static int cxl_mem_ps_get_attrs(struct cxl_memdev_state *mds,
+ struct cxl_memdev_ps_params *params)
+{
+ size_t rd_data_size = sizeof(struct cxl_memdev_ps_rd_attrs);
+ size_t data_size;
+ struct cxl_memdev_ps_rd_attrs *rd_attrs __free(kfree) =
+ kmalloc(rd_data_size, GFP_KERNEL);
+ if (!rd_attrs)
+ return -ENOMEM;
+
+ data_size = cxl_get_feature(mds, cxl_patrol_scrub_uuid, rd_attrs,
+ rd_data_size, CXL_GET_FEAT_SEL_CURRENT_VALUE);
+ if (!data_size)
+ return -EIO;
+
+ params->scrub_cycle_changeable = FIELD_GET(CXL_MEMDEV_PS_SCRUB_CYCLE_CHANGE_CAP_MASK,
+ rd_attrs->scrub_cycle_cap);
+ params->enable = FIELD_GET(CXL_MEMDEV_PS_FLAG_ENABLED_MASK,
+ rd_attrs->scrub_flags);
+ params->scrub_cycle_hrs = FIELD_GET(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK,
+ le16_to_cpu(rd_attrs->scrub_cycle_hrs));
+ params->min_scrub_cycle_hrs = FIELD_GET(CXL_MEMDEV_PS_MIN_SCRUB_CYCLE_MASK,
+ le16_to_cpu(rd_attrs->scrub_cycle_hrs));
+
+ return 0;
+}
+
+static int cxl_ps_get_attrs(struct device *dev, void *drv_data,
+ struct cxl_memdev_ps_params *params)
+{
+ struct cxl_patrol_scrub_context *cxl_ps_ctx = drv_data;
+ struct cxl_memdev *cxlmd;
+ struct cxl_dev_state *cxlds;
+ struct cxl_memdev_state *mds;
+ u16 min_scrub_cycle = 0;
+ int i, ret;
+
+ if (cxl_ps_ctx->cxlr) {
+ struct cxl_region *cxlr = cxl_ps_ctx->cxlr;
+ struct cxl_region_params *p = &cxlr->params;
+
+ for (i = p->interleave_ways - 1; i >= 0; i--) {
+ struct cxl_endpoint_decoder *cxled = p->targets[i];
+
+ cxlmd = cxled_to_memdev(cxled);
+ cxlds = cxlmd->cxlds;
+ mds = to_cxl_memdev_state(cxlds);
+ ret = cxl_mem_ps_get_attrs(mds, params);
+ if (ret)
+ return ret;
+
+ if (params->min_scrub_cycle_hrs > min_scrub_cycle)
+ min_scrub_cycle = params->min_scrub_cycle_hrs;
+ }
+ params->min_scrub_cycle_hrs = min_scrub_cycle;
+ return 0;
+ }
+ cxlmd = cxl_ps_ctx->cxlmd;
+ cxlds = cxlmd->cxlds;
+ mds = to_cxl_memdev_state(cxlds);
+
+ return cxl_mem_ps_get_attrs(mds, params);
+}
+
+static int cxl_mem_ps_set_attrs(struct device *dev, struct cxl_memdev_state *mds,
+ struct cxl_memdev_ps_params *params,
+ enum cxl_scrub_param param_type)
+{
+ struct cxl_memdev_ps_wr_attrs wr_attrs;
+ struct cxl_memdev_ps_params rd_params;
+ int ret;
+
+ ret = cxl_mem_ps_get_attrs(mds, &rd_params);
+ if (ret) {
+ dev_err(dev, "Get cxlmemdev patrol scrub params failed ret=%d\n",
+ ret);
+ return ret;
+ }
+
+ switch (param_type) {
+ case cxl_ps_param_enable:
+ wr_attrs.scrub_flags = FIELD_PREP(CXL_MEMDEV_PS_FLAG_ENABLED_MASK,
+ params->enable);
+ wr_attrs.scrub_cycle_hrs = FIELD_PREP(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK,
+ rd_params.scrub_cycle_hrs);
+ break;
+ case cxl_ps_param_scrub_cycle:
+ if (params->scrub_cycle_hrs < rd_params.min_scrub_cycle_hrs) {
+ dev_err(dev, "Invalid CXL patrol scrub cycle(%d) to set\n",
+ params->scrub_cycle_hrs);
+ dev_err(dev, "Minimum supported CXL patrol scrub cycle in hour %d\n",
+ rd_params.min_scrub_cycle_hrs);
+ return -EINVAL;
+ }
+ wr_attrs.scrub_cycle_hrs = FIELD_PREP(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK,
+ params->scrub_cycle_hrs);
+ wr_attrs.scrub_flags = FIELD_PREP(CXL_MEMDEV_PS_FLAG_ENABLED_MASK,
+ rd_params.enable);
+ break;
+ }
+
+ ret = cxl_set_feature(mds, cxl_patrol_scrub_uuid, CXL_MEMDEV_PS_SET_FEAT_VERSION,
+ &wr_attrs, sizeof(wr_attrs),
+ CXL_SET_FEAT_FLAG_DATA_SAVED_ACROSS_RESET);
+ if (ret) {
+ dev_err(dev, "CXL patrol scrub set feature failed ret=%d\n", ret);
+ return ret;
+ }
+
+ return 0;
+}
+
+static int cxl_ps_set_attrs(struct device *dev, void *drv_data,
+ struct cxl_memdev_ps_params *params,
+ enum cxl_scrub_param param_type)
+{
+ struct cxl_patrol_scrub_context *cxl_ps_ctx = drv_data;
+ struct cxl_memdev *cxlmd;
+ struct cxl_dev_state *cxlds;
+ struct cxl_memdev_state *mds;
+ int ret, i;
+
+ if (cxl_ps_ctx->cxlr) {
+ struct cxl_region *cxlr = cxl_ps_ctx->cxlr;
+ struct cxl_region_params *p = &cxlr->params;
+
+ for (i = p->interleave_ways - 1; i >= 0; i--) {
+ struct cxl_endpoint_decoder *cxled = p->targets[i];
+
+ cxlmd = cxled_to_memdev(cxled);
+ cxlds = cxlmd->cxlds;
+ mds = to_cxl_memdev_state(cxlds);
+ ret = cxl_mem_ps_set_attrs(dev, mds, params, param_type);
+ if (ret)
+ return ret;
+ }
+ } else {
+ cxlmd = cxl_ps_ctx->cxlmd;
+ cxlds = cxlmd->cxlds;
+ mds = to_cxl_memdev_state(cxlds);
+
+ return cxl_mem_ps_set_attrs(dev, mds, params, param_type);
+ }
+
+ return 0;
+}
+
+static int cxl_patrol_scrub_get_enabled_bg(struct device *dev, void *drv_data, bool *enabled)
+{
+ struct cxl_memdev_ps_params params;
+ int ret;
+
+ ret = cxl_ps_get_attrs(dev, drv_data, &params);
+ if (ret)
+ return ret;
+
+ *enabled = params.enable;
+
+ return 0;
+}
+
+static int cxl_patrol_scrub_set_enabled_bg(struct device *dev, void *drv_data, bool enable)
+{
+ struct cxl_memdev_ps_params params = {
+ .enable = enable,
+ };
+
+ return cxl_ps_set_attrs(dev, drv_data, &params, cxl_ps_param_enable);
+}
+
+static int cxl_patrol_scrub_get_name(struct device *dev, void *drv_data, char *name)
+{
+ struct cxl_patrol_scrub_context *cxl_ps_ctx = drv_data;
+ struct cxl_memdev *cxlmd = cxl_ps_ctx->cxlmd;
+
+ if (cxl_ps_ctx->cxlr) {
+ struct cxl_region *cxlr = cxl_ps_ctx->cxlr;
+
+ return sysfs_emit(name, "cxl_region%d_patrol_scrub\n", cxlr->id);
+ }
+
+ return sysfs_emit(name, "cxl_%s_patrol_scrub\n", dev_name(&cxlmd->dev));
+}
+
+static int cxl_patrol_scrub_write_scrub_cycle_hrs(struct device *dev, void *drv_data,
+ u64 scrub_cycle_hrs)
+{
+ struct cxl_memdev_ps_params params = {
+ .scrub_cycle_hrs = scrub_cycle_hrs,
+ };
+
+ return cxl_ps_set_attrs(dev, drv_data, &params, cxl_ps_param_scrub_cycle);
+}
+
+static int cxl_patrol_scrub_read_scrub_cycle_hrs(struct device *dev, void *drv_data,
+ u64 *scrub_cycle_hrs)
+{
+ struct cxl_memdev_ps_params params;
+ int ret;
+
+ ret = cxl_ps_get_attrs(dev, drv_data, &params);
+ if (ret)
+ return ret;
+
+ *scrub_cycle_hrs = params.scrub_cycle_hrs;
+
+ return 0;
+}
+
+static int cxl_patrol_scrub_read_scrub_cycle_hrs_range(struct device *dev, void *drv_data,
+ u64 *min, u64 *max)
+{
+ struct cxl_memdev_ps_params params;
+ int ret;
+
+ ret = cxl_ps_get_attrs(dev, drv_data, &params);
+ if (ret)
+ return ret;
+ *min = params.min_scrub_cycle_hrs;
+ *max = U8_MAX; /* Max set by register size */
+
+ return 0;
+}
+
+static const struct edac_scrub_ops cxl_ps_scrub_ops = {
+ .get_enabled_bg = cxl_patrol_scrub_get_enabled_bg,
+ .set_enabled_bg = cxl_patrol_scrub_set_enabled_bg,
+ .get_name = cxl_patrol_scrub_get_name,
+ .cycle_in_hours_read = cxl_patrol_scrub_read_scrub_cycle_hrs,
+ .cycle_in_hours_write = cxl_patrol_scrub_write_scrub_cycle_hrs,
+ .cycle_in_hours_range = cxl_patrol_scrub_read_scrub_cycle_hrs_range,
+};
+
+int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct cxl_region *cxlr)
+{
+ struct edac_ras_feature ras_features[CXL_DEV_NUM_RAS_FEATURES];
+ struct cxl_patrol_scrub_context *cxl_ps_ctx;
+ struct cxl_mbox_supp_feat_entry feat_entry;
+ char cxl_dev_name[CXL_SCRUB_NAME_LEN];
+ int rc, i, num_ras_features = 0;
+
+ if (cxlr) {
+ struct cxl_region_params *p = &cxlr->params;
+
+ for (i = p->interleave_ways - 1; i >= 0; i--) {
+ struct cxl_endpoint_decoder *cxled = p->targets[i];
+
+ cxlmd = cxled_to_memdev(cxled);
+ memset(&feat_entry, 0, sizeof(feat_entry));
+ rc = cxl_mem_get_supported_feature_entry(cxlmd, &cxl_patrol_scrub_uuid,
+ &feat_entry);
+ if (rc < 0)
+ return rc;
+ if (!(feat_entry.attr_flags & CXL_FEAT_ENTRY_FLAG_CHANGABLE))
+ return -EOPNOTSUPP;
+ }
+ } else {
+ rc = cxl_mem_get_supported_feature_entry(cxlmd, &cxl_patrol_scrub_uuid,
+ &feat_entry);
+ if (rc < 0)
+ return rc;
+
+ if (!(feat_entry.attr_flags & CXL_FEAT_ENTRY_FLAG_CHANGABLE))
+ return -EOPNOTSUPP;
+ }
+
+ cxl_ps_ctx = devm_kzalloc(&cxlmd->dev, sizeof(*cxl_ps_ctx), GFP_KERNEL);
+ if (!cxl_ps_ctx)
+ return -ENOMEM;
+
+ *cxl_ps_ctx = (struct cxl_patrol_scrub_context) {
+ .get_feat_size = feat_entry.get_size,
+ .set_feat_size = feat_entry.set_size,
+ };
+ if (cxlr) {
+ snprintf(cxl_dev_name, sizeof(cxl_dev_name),
+ "cxl_region%d", cxlr->id);
+ cxl_ps_ctx->cxlr = cxlr;
+ } else {
+ snprintf(cxl_dev_name, sizeof(cxl_dev_name),
+ "%s_%s", "cxl", dev_name(&cxlmd->dev));
+ cxl_ps_ctx->cxlmd = cxlmd;
+ }
+
+ ras_features[num_ras_features].feat = ras_feat_scrub;
+ ras_features[num_ras_features].scrub_ops = &cxl_ps_scrub_ops;
+ ras_features[num_ras_features].scrub_ctx = cxl_ps_ctx;
+ num_ras_features++;
+
+ return edac_ras_dev_register(&cxlmd->dev, cxl_dev_name, NULL,
+ num_ras_features, ras_features);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_mem_ras_features_init, CXL);
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 3c2b6144be23..14db9d301747 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3304,6 +3304,12 @@ static int cxl_region_probe(struct device *dev)
p->res->start, p->res->end, cxlr,
is_system_ram) > 0)
return 0;
+
+ rc = cxl_mem_ras_features_init(NULL, cxlr);
+ if (rc)
+ dev_warn(&cxlr->dev, "CXL ras features init for region_id=%d failed\n",
+ cxlr->id);
+
return devm_cxl_add_dax_region(cxlr);
default:
dev_dbg(&cxlr->dev, "unsupported region mode: %d\n",
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index c3cb8e2736b5..9a0eb41e5997 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -958,6 +958,14 @@ int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);
int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa);
int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa);
+/* cxl memory scrub functions */
+#ifdef CONFIG_CXL_SCRUB
+int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct cxl_region *cxlr);
+#else
+static inline int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct cxl_region *cxlr)
+{ return 0; }
+#endif
+
#ifdef CONFIG_CXL_SUSPEND
void cxl_mem_active_inc(void);
void cxl_mem_active_dec(void);
diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index 0c79d9ce877c..7c8360e2e09b 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -117,6 +117,10 @@ static int cxl_mem_probe(struct device *dev)
if (!cxlds->media_ready)
return -EBUSY;
+ rc = cxl_mem_ras_features_init(cxlmd, NULL);
+ if (rc)
+ dev_warn(&cxlmd->dev, "CXL ras features init failed\n");
+
/*
* Someone is trying to reattach this device after it lost its port
* connection (an endpoint port previously registered by this memdev was
--
2.34.1
* [RFC PATCH v9 08/11] cxl/memscrub: Add CXL memory device ECS control feature
From: shiju.jose @ 2024-07-16 15:03 UTC (permalink / raw)
To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
Cc: bp, tony.luck, rafael, lenb, mchehab, dan.j.williams, dave,
jonathan.cameron, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny, david, Vilas.Sridharan, leo.duran, Yazen.Ghannam,
rientjes, jiaqiyan, Jon.Grimm, dave.hansen, naoya.horiguchi,
james.morse, jthoughton, somasundaram.a, erdemaktas, pgonda,
duenwen, mike.malvestuto, gthelen, wschwartz, dferguson, wbs,
nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
kangkang.shen, wanghuiqiang, linuxarm, shiju.jose
From: Shiju Jose <shiju.jose@huawei.com>
CXL spec 3.1 section 8.2.9.9.11.2 describes the DDR5 Error Check
Scrub (ECS) control feature.
The Error Check Scrub (ECS) is a feature defined in JEDEC DDR5 SDRAM
Specification (JESD79-5) and allows the DRAM to internally read, correct
single-bit errors, and write back corrected data bits to the DRAM array
while providing transparency to error counts.
The ECS control feature allows the requester to change the log entry type,
set the ECS threshold count (provided the request is within the limits
specified in the DDR5 mode registers), switch between codeword mode and
row count mode, and reset the ECS counter.
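The field packing described above can be sketched in plain userspace C.
This is only an illustration under stated assumptions: the mask values
mirror the CXL_ECS_* masks defined in this patch, while the helper names
(ecs_threshold_to_field, ecs_field_to_threshold, ecs_config_set_threshold)
are hypothetical and do not exist in the driver. The decode helper also
bounds-checks the index, since a 3-bit field can hold values beyond the
six-entry threshold table.

```c
#include <assert.h>
#include <stdint.h>

/* Bit layout mirroring the CXL_ECS_* masks in this patch. */
#define ECS_THRESHOLD_COUNT_MASK 0x0007u /* bits 2:0 */
#define ECS_MODE_MASK            0x0008u /* bit 3 */
#define ECS_RESET_COUNTER_MASK   0x0010u /* bit 4 */

/* Hypothetical helper: map a threshold count to its field encoding, or -1. */
static int ecs_threshold_to_field(unsigned int count)
{
	switch (count) {
	case 256:
		return 3;
	case 1024:
		return 4;
	case 4096:
		return 5;
	default:
		return -1;
	}
}

/* Hypothetical helper: decode the threshold; reserved encodings yield 0. */
static unsigned int ecs_field_to_threshold(uint16_t config)
{
	static const unsigned int supp[] = { 0, 0, 0, 256, 1024, 4096 };
	unsigned int idx = config & ECS_THRESHOLD_COUNT_MASK;

	return idx < sizeof(supp) / sizeof(supp[0]) ? supp[idx] : 0;
}

/*
 * Hypothetical helper: update only the threshold bits, leaving the mode
 * and reset bits untouched. Returns 0 on success, -1 if the count is not
 * one of the supported values; *config is unchanged on failure.
 */
static int ecs_config_set_threshold(uint16_t *config, unsigned int count)
{
	int field = ecs_threshold_to_field(count);

	if (field < 0)
		return -1;

	*config = (uint16_t)((*config & ~ECS_THRESHOLD_COUNT_MASK) |
			     (unsigned int)field);
	return 0;
}
```

Rejecting an unsupported count before touching the config word keeps the
read-modify-write atomic from the caller's point of view: either the whole
field changes or nothing does.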
Register with the EDAC RAS control feature driver, which gets the ECS
attribute descriptors from the EDAC ECS and exposes the sysfs ECS control
attributes to userspace.
For example, the ECS control for memory media FRU 0 of the CXL mem0 device
is in /sys/bus/edac/devices/cxl_mem0/ecs_fru0/
Note: The documentation can be added if necessary.
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
drivers/cxl/core/memscrub.c | 429 ++++++++++++++++++++++++++++++++++++
1 file changed, 429 insertions(+)
diff --git a/drivers/cxl/core/memscrub.c b/drivers/cxl/core/memscrub.c
index 430f85b01f6c..9be230ea989a 100644
--- a/drivers/cxl/core/memscrub.c
+++ b/drivers/cxl/core/memscrub.c
@@ -351,13 +351,411 @@ static const struct edac_scrub_ops cxl_ps_scrub_ops = {
.cycle_in_hours_range = cxl_patrol_scrub_read_scrub_cycle_hrs_range,
};
+/* CXL DDR5 ECS control definitions */
+#define CXL_MEMDEV_ECS_GET_FEAT_VERSION 0x01
+#define CXL_MEMDEV_ECS_SET_FEAT_VERSION 0x01
+
+static const uuid_t cxl_ecs_uuid =
+ UUID_INIT(0xe5b13f22, 0x2328, 0x4a14, 0xb8, 0xba, 0xb9, 0x69, 0x1e, \
+ 0x89, 0x33, 0x86);
+
+struct cxl_ecs_context {
+ u16 num_media_frus;
+ u16 get_feat_size;
+ u16 set_feat_size;
+ struct cxl_memdev *cxlmd;
+};
+
+/**
+ * struct cxl_ecs_params - CXL memory DDR5 ECS parameter data structure.
+ * @log_entry_type: ECS log entry type, per DRAM or per memory media FRU.
+ * @threshold: ECS threshold count per GB of memory cells.
+ * @mode: codeword/row count mode
+ * 0 : ECS counts rows with errors
+ * 1 : ECS counts codeword with errors
+ * @reset_counter: [IN] reset ECC counter to default value.
+ */
+struct cxl_ecs_params {
+ u8 log_entry_type;
+ u16 threshold;
+ u8 mode;
+ bool reset_counter;
+};
+
+enum {
+ CXL_ECS_PARAM_LOG_ENTRY_TYPE,
+ CXL_ECS_PARAM_THRESHOLD,
+ CXL_ECS_PARAM_MODE,
+ CXL_ECS_PARAM_RESET_COUNTER,
+};
+
+#define CXL_ECS_LOG_ENTRY_TYPE_MASK GENMASK(1, 0)
+#define CXL_ECS_REALTIME_REPORT_CAP_MASK BIT(0)
+#define CXL_ECS_THRESHOLD_COUNT_MASK GENMASK(2, 0)
+#define CXL_ECS_MODE_MASK BIT(3)
+#define CXL_ECS_RESET_COUNTER_MASK BIT(4)
+
+static const u16 ecs_supp_threshold[] = { 0, 0, 0, 256, 1024, 4096 };
+
+enum {
+ ECS_LOG_ENTRY_TYPE_DRAM = 0x0,
+ ECS_LOG_ENTRY_TYPE_MEM_MEDIA_FRU = 0x1,
+};
+
+enum {
+ ECS_THRESHOLD_256 = 3,
+ ECS_THRESHOLD_1024 = 4,
+ ECS_THRESHOLD_4096 = 5,
+};
+
+enum {
+ ECS_MODE_COUNTS_ROWS = 0,
+ ECS_MODE_COUNTS_CODEWORDS = 1,
+};
+
+struct cxl_ecs_rd_attrs {
+ u8 ecs_log_cap;
+ u8 ecs_cap;
+ __le16 ecs_config;
+ u8 ecs_flags;
+} __packed;
+
+struct cxl_ecs_wr_attrs {
+ u8 ecs_log_cap;
+ __le16 ecs_config;
+} __packed;
+
+/* CXL DDR5 ECS control functions */
+static int cxl_mem_ecs_get_attrs(struct device *dev, void *drv_data, int fru_id,
+ struct cxl_ecs_params *params)
+{
+ struct cxl_ecs_context *cxl_ecs_ctx = drv_data;
+ struct cxl_memdev *cxlmd = cxl_ecs_ctx->cxlmd;
+ struct cxl_dev_state *cxlds = cxlmd->cxlds;
+ struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
+ size_t rd_data_size;
+ u8 threshold_index;
+ size_t data_size;
+
+ rd_data_size = cxl_ecs_ctx->get_feat_size;
+
+ struct cxl_ecs_rd_attrs *rd_attrs __free(kfree) =
+ kmalloc(rd_data_size, GFP_KERNEL);
+ if (!rd_attrs)
+ return -ENOMEM;
+
+ params->log_entry_type = 0;
+ params->threshold = 0;
+ params->mode = 0;
+ data_size = cxl_get_feature(mds, cxl_ecs_uuid, rd_attrs,
+ rd_data_size, CXL_GET_FEAT_SEL_CURRENT_VALUE);
+ if (!data_size)
+ return -EIO;
+
+ params->log_entry_type = FIELD_GET(CXL_ECS_LOG_ENTRY_TYPE_MASK,
+ rd_attrs[fru_id].ecs_log_cap);
+ threshold_index = FIELD_GET(CXL_ECS_THRESHOLD_COUNT_MASK,
+ rd_attrs[fru_id].ecs_config);
+ params->threshold = ecs_supp_threshold[threshold_index];
+ params->mode = FIELD_GET(CXL_ECS_MODE_MASK,
+ rd_attrs[fru_id].ecs_config);
+ return 0;
+}
+
+static int cxl_mem_ecs_set_attrs(struct device *dev, void *drv_data, int fru_id,
+ struct cxl_ecs_params *params, u8 param_type)
+{
+ struct cxl_ecs_context *cxl_ecs_ctx = drv_data;
+ struct cxl_memdev *cxlmd = cxl_ecs_ctx->cxlmd;
+ struct cxl_dev_state *cxlds = cxlmd->cxlds;
+ struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
+ size_t rd_data_size, wr_data_size;
+ u16 num_media_frus, count;
+ size_t data_size;
+ int ret;
+
+ num_media_frus = cxl_ecs_ctx->num_media_frus;
+ rd_data_size = cxl_ecs_ctx->get_feat_size;
+ wr_data_size = cxl_ecs_ctx->set_feat_size;
+ struct cxl_ecs_rd_attrs *rd_attrs __free(kfree) =
+ kmalloc(rd_data_size, GFP_KERNEL);
+ if (!rd_attrs)
+ return -ENOMEM;
+
+ data_size = cxl_get_feature(mds, cxl_ecs_uuid, rd_attrs,
+ rd_data_size, CXL_GET_FEAT_SEL_CURRENT_VALUE);
+ if (!data_size)
+ return -EIO;
+ struct cxl_ecs_wr_attrs *wr_attrs __free(kfree) =
+ kmalloc(wr_data_size, GFP_KERNEL);
+ if (!wr_attrs)
+ return -ENOMEM;
+
+ /* Fill writable attributes from the current attributes read for all the media FRUs */
+ for (count = 0; count < num_media_frus; count++) {
+ wr_attrs[count].ecs_log_cap = rd_attrs[count].ecs_log_cap;
+ wr_attrs[count].ecs_config = rd_attrs[count].ecs_config;
+ }
+
+ /* Fill attribute to be set for the media FRU */
+ switch (param_type) {
+ case CXL_ECS_PARAM_LOG_ENTRY_TYPE:
+ if (params->log_entry_type != ECS_LOG_ENTRY_TYPE_DRAM &&
+ params->log_entry_type != ECS_LOG_ENTRY_TYPE_MEM_MEDIA_FRU) {
+ dev_err(dev,
+ "Invalid CXL ECS scrub log entry type(%d) to set\n",
+ params->log_entry_type);
+ dev_err(dev,
+ "Log Entry Type 0: per DRAM 1: per Memory Media FRU\n");
+ return -EINVAL;
+ }
+ wr_attrs[fru_id].ecs_log_cap = FIELD_PREP(CXL_ECS_LOG_ENTRY_TYPE_MASK,
+ params->log_entry_type);
+ break;
+ case CXL_ECS_PARAM_THRESHOLD:
+ wr_attrs[fru_id].ecs_config &= ~CXL_ECS_THRESHOLD_COUNT_MASK;
+ switch (params->threshold) {
+ case 256:
+ wr_attrs[fru_id].ecs_config |= FIELD_PREP(
+ CXL_ECS_THRESHOLD_COUNT_MASK,
+ ECS_THRESHOLD_256);
+ break;
+ case 1024:
+ wr_attrs[fru_id].ecs_config |= FIELD_PREP(
+ CXL_ECS_THRESHOLD_COUNT_MASK,
+ ECS_THRESHOLD_1024);
+ break;
+ case 4096:
+ wr_attrs[fru_id].ecs_config |= FIELD_PREP(
+ CXL_ECS_THRESHOLD_COUNT_MASK,
+ ECS_THRESHOLD_4096);
+ break;
+ default:
+ dev_err(dev,
+ "Invalid CXL ECS scrub threshold count(%d) to set\n",
+ params->threshold);
+ dev_err(dev,
+ "Supported scrub threshold count: 256,1024,4096\n");
+ return -EINVAL;
+ }
+ break;
+ case CXL_ECS_PARAM_MODE:
+ if (params->mode != ECS_MODE_COUNTS_ROWS &&
+ params->mode != ECS_MODE_COUNTS_CODEWORDS) {
+ dev_err(dev,
+ "Invalid CXL ECS scrub mode(%d) to set\n",
+ params->mode);
+ dev_err(dev,
+ "Mode 0: ECS counts rows with errors"
+ " 1: ECS counts codewords with errors\n");
+ return -EINVAL;
+ }
+ wr_attrs[fru_id].ecs_config &= ~CXL_ECS_MODE_MASK;
+ wr_attrs[fru_id].ecs_config |= FIELD_PREP(CXL_ECS_MODE_MASK,
+ params->mode);
+ break;
+ case CXL_ECS_PARAM_RESET_COUNTER:
+ wr_attrs[fru_id].ecs_config &= ~CXL_ECS_RESET_COUNTER_MASK;
+ wr_attrs[fru_id].ecs_config |= FIELD_PREP(CXL_ECS_RESET_COUNTER_MASK,
+ params->reset_counter);
+ break;
+ default:
+ dev_err(dev, "Invalid CXL ECS parameter to set\n");
+ return -EINVAL;
+ }
+
+ ret = cxl_set_feature(mds, cxl_ecs_uuid, CXL_MEMDEV_ECS_SET_FEAT_VERSION,
+ wr_attrs, wr_data_size,
+ CXL_SET_FEAT_FLAG_DATA_SAVED_ACROSS_RESET);
+ if (ret) {
+ dev_err(dev, "CXL ECS set feature failed ret=%d\n", ret);
+ return ret;
+ }
+
+ return 0;
+}
+
+static int cxl_ecs_get_log_entry_type(struct device *dev, void *drv_data, int fru_id, u64 *val)
+{
+ struct cxl_ecs_params params;
+ int ret;
+
+ ret = cxl_mem_ecs_get_attrs(dev, drv_data, fru_id, &params);
+ if (ret)
+ return ret;
+
+ *val = params.log_entry_type;
+
+ return 0;
+}
+
+static int cxl_ecs_set_log_entry_type(struct device *dev, void *drv_data, int fru_id, u64 val)
+{
+ struct cxl_ecs_params params = {
+ .log_entry_type = val,
+ };
+
+ return cxl_mem_ecs_set_attrs(dev, drv_data, fru_id, &params, CXL_ECS_PARAM_LOG_ENTRY_TYPE);
+}
+
+static int cxl_ecs_get_log_entry_type_per_dram(struct device *dev, void *drv_data,
+ int fru_id, u64 *val)
+{
+ struct cxl_ecs_params params;
+ int ret;
+
+ ret = cxl_mem_ecs_get_attrs(dev, drv_data, fru_id, &params);
+ if (ret)
+ return ret;
+
+ if (params.log_entry_type == ECS_LOG_ENTRY_TYPE_DRAM)
+ *val = 1;
+ else
+ *val = 0;
+
+ return 0;
+}
+
+static int cxl_ecs_get_log_entry_type_per_memory_media(struct device *dev, void *drv_data,
+ int fru_id, u64 *val)
+{
+ struct cxl_ecs_params params;
+ int ret;
+
+ ret = cxl_mem_ecs_get_attrs(dev, drv_data, fru_id, &params);
+ if (ret)
+ return ret;
+
+ if (params.log_entry_type == ECS_LOG_ENTRY_TYPE_MEM_MEDIA_FRU)
+ *val = 1;
+ else
+ *val = 0;
+
+ return 0;
+}
+
+static int cxl_ecs_get_mode(struct device *dev, void *drv_data, int fru_id, u64 *val)
+{
+ struct cxl_ecs_params params;
+ int ret;
+
+ ret = cxl_mem_ecs_get_attrs(dev, drv_data, fru_id, &params);
+ if (ret)
+ return ret;
+
+ *val = params.mode;
+
+ return 0;
+}
+
+static int cxl_ecs_set_mode(struct device *dev, void *drv_data, int fru_id, u64 val)
+{
+ struct cxl_ecs_params params = {
+ .mode = val,
+ };
+
+ return cxl_mem_ecs_set_attrs(dev, drv_data, fru_id, &params, CXL_ECS_PARAM_MODE);
+}
+
+static int cxl_ecs_get_mode_counts_rows(struct device *dev, void *drv_data, int fru_id, u64 *val)
+{
+ struct cxl_ecs_params params;
+ int ret;
+
+ ret = cxl_mem_ecs_get_attrs(dev, drv_data, fru_id, &params);
+ if (ret)
+ return ret;
+
+ if (params.mode == ECS_MODE_COUNTS_ROWS)
+ *val = 1;
+ else
+ *val = 0;
+
+ return 0;
+}
+
+static int cxl_ecs_get_mode_counts_codewords(struct device *dev, void *drv_data,
+ int fru_id, u64 *val)
+{
+ struct cxl_ecs_params params;
+ int ret;
+
+ ret = cxl_mem_ecs_get_attrs(dev, drv_data, fru_id, &params);
+ if (ret)
+ return ret;
+
+ if (params.mode == ECS_MODE_COUNTS_CODEWORDS)
+ *val = 1;
+ else
+ *val = 0;
+
+ return 0;
+}
+
+static int cxl_ecs_reset(struct device *dev, void *drv_data, int fru_id, u64 val)
+{
+ struct cxl_ecs_params params = {
+ .reset_counter = val,
+ };
+
+ return cxl_mem_ecs_set_attrs(dev, drv_data, fru_id, &params, CXL_ECS_PARAM_RESET_COUNTER);
+}
+
+static int cxl_ecs_get_threshold(struct device *dev, void *drv_data, int fru_id, u64 *val)
+{
+ struct cxl_ecs_params params;
+ int ret;
+
+ ret = cxl_mem_ecs_get_attrs(dev, drv_data, fru_id, &params);
+ if (ret)
+ return ret;
+
+ *val = params.threshold;
+
+ return 0;
+}
+
+static int cxl_ecs_set_threshold(struct device *dev, void *drv_data, int fru_id, u64 val)
+{
+ struct cxl_ecs_params params = {
+ .threshold = val,
+ };
+
+ return cxl_mem_ecs_set_attrs(dev, drv_data, fru_id, &params, CXL_ECS_PARAM_THRESHOLD);
+}
+
+static int cxl_ecs_get_name(struct device *dev, void *drv_data, int fru_id, char *name)
+{
+ struct cxl_ecs_context *cxl_ecs_ctx = drv_data;
+ struct cxl_memdev *cxlmd = cxl_ecs_ctx->cxlmd;
+
+ return sysfs_emit(name, "cxl_%s_ecs_fru%d\n", dev_name(&cxlmd->dev), fru_id);
+}
+
+static const struct edac_ecs_ops cxl_ecs_ops = {
+ .get_log_entry_type = cxl_ecs_get_log_entry_type,
+ .set_log_entry_type = cxl_ecs_set_log_entry_type,
+ .get_log_entry_type_per_dram = cxl_ecs_get_log_entry_type_per_dram,
+ .get_log_entry_type_per_memory_media = cxl_ecs_get_log_entry_type_per_memory_media,
+ .get_mode = cxl_ecs_get_mode,
+ .set_mode = cxl_ecs_set_mode,
+ .get_mode_counts_codewords = cxl_ecs_get_mode_counts_codewords,
+ .get_mode_counts_rows = cxl_ecs_get_mode_counts_rows,
+ .reset = cxl_ecs_reset,
+ .get_threshold = cxl_ecs_get_threshold,
+ .set_threshold = cxl_ecs_set_threshold,
+ .get_name = cxl_ecs_get_name,
+};
+
int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct cxl_region *cxlr)
{
struct edac_ras_feature ras_features[CXL_DEV_NUM_RAS_FEATURES];
struct cxl_patrol_scrub_context *cxl_ps_ctx;
struct cxl_mbox_supp_feat_entry feat_entry;
char cxl_dev_name[CXL_SCRUB_NAME_LEN];
+ struct cxl_ecs_context *cxl_ecs_ctx;
int rc, i, num_ras_features = 0;
+ int num_media_frus;
if (cxlr) {
struct cxl_region_params *p = &cxlr->params;
@@ -407,6 +805,37 @@ int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct cxl_region *cxlr)
ras_features[num_ras_features].scrub_ctx = cxl_ps_ctx;
num_ras_features++;
+ if (!cxlr) {
+ rc = cxl_mem_get_supported_feature_entry(cxlmd, &cxl_ecs_uuid, &feat_entry);
+ if (rc < 0)
+ goto feat_register;
+
+ if (!(feat_entry.attr_flags & CXL_FEAT_ENTRY_FLAG_CHANGABLE))
+ goto feat_register;
+ num_media_frus = feat_entry.get_size/
+ sizeof(struct cxl_ecs_rd_attrs);
+ if (!num_media_frus)
+ goto feat_register;
+
+ cxl_ecs_ctx = devm_kzalloc(&cxlmd->dev, sizeof(*cxl_ecs_ctx), GFP_KERNEL);
+ if (!cxl_ecs_ctx)
+ goto feat_register;
+ *cxl_ecs_ctx = (struct cxl_ecs_context) {
+ .get_feat_size = feat_entry.get_size,
+ .set_feat_size = feat_entry.set_size,
+ .num_media_frus = num_media_frus,
+ .cxlmd = cxlmd,
+ };
+
+ ras_features[num_ras_features].feat = ras_feat_ecs;
+ ras_features[num_ras_features].ecs_ops = &cxl_ecs_ops;
+ ras_features[num_ras_features].ecs_ctx = cxl_ecs_ctx;
+ ras_features[num_ras_features].ecs_info.num_media_frus = num_media_frus;
+ num_ras_features++;
+ }
+
+feat_register:
+
return edac_ras_dev_register(&cxlmd->dev, cxl_dev_name, NULL,
num_ras_features, ras_features);
}
--
2.34.1
* [RFC PATCH v9 09/11] platform: Add __free() based cleanup function for platform_device_put
From: shiju.jose @ 2024-07-16 15:03 UTC (permalink / raw)
To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
Cc: bp, tony.luck, rafael, lenb, mchehab, dan.j.williams, dave,
jonathan.cameron, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny, david, Vilas.Sridharan, leo.duran, Yazen.Ghannam,
rientjes, jiaqiyan, Jon.Grimm, dave.hansen, naoya.horiguchi,
james.morse, jthoughton, somasundaram.a, erdemaktas, pgonda,
duenwen, mike.malvestuto, gthelen, wschwartz, dferguson, wbs,
nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
kangkang.shen, wanghuiqiang, linuxarm, shiju.jose
From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Add __free() based cleanup function for platform_device_put().
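As a rough userspace analogy of what this helper enables: the kernel's
__free() is built on the compiler's cleanup attribute, so a pointer
declared with it is released automatically on every return path unless
ownership is handed off first. The sketch below assumes GCC or Clang;
struct fake_pdev, fake_pdev_put(), AUTO_PUT_FAKE_PDEV and demo_probe() are
hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct platform_device. */
struct fake_pdev {
	int id;
};

/* Counts how many times the "put" ran, for demonstration only. */
static int put_calls;

/* Hypothetical stand-in for platform_device_put(). */
static void fake_pdev_put(struct fake_pdev *pdev)
{
	put_calls++;
	free(pdev);
}

/* The cleanup handler receives a pointer to the annotated variable. */
static void fake_pdev_put_cleanup(struct fake_pdev **p)
{
	if (*p)
		fake_pdev_put(*p);
}

/* Plays the role of __free(platform_device_put) in this sketch. */
#define AUTO_PUT_FAKE_PDEV __attribute__((cleanup(fake_pdev_put_cleanup)))

static int demo_probe(int fail_early, struct fake_pdev **out)
{
	struct fake_pdev *pdev AUTO_PUT_FAKE_PDEV = calloc(1, sizeof(*pdev));

	if (!pdev)
		return -1;
	if (fail_early)
		return -2;	/* pdev is put automatically on this path */

	*out = pdev;
	pdev = NULL;		/* hand off ownership, like no_free_ptr() */
	return 0;
}
```

The NULL check in the cleanup handler corresponds to the "if (_T)" guard
in the DEFINE_FREE() line added by this patch, which is what makes the
ownership hand-off safe.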
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
include/linux/platform_device.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
index 7a41c72c1959..1ddc35623b4c 100644
--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -232,6 +232,7 @@ extern int platform_device_add_data(struct platform_device *pdev,
extern int platform_device_add(struct platform_device *pdev);
extern void platform_device_del(struct platform_device *pdev);
extern void platform_device_put(struct platform_device *pdev);
+DEFINE_FREE(platform_device_put, struct platform_device *, if (_T) platform_device_put(_T))
struct platform_driver {
int (*probe)(struct platform_device *);
--
2.34.1
* [RFC PATCH v9 10/11] ACPI:RAS2: Add ACPI RAS2 driver
From: shiju.jose @ 2024-07-16 15:03 UTC (permalink / raw)
To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
Cc: bp, tony.luck, rafael, lenb, mchehab, dan.j.williams, dave,
jonathan.cameron, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny, david, Vilas.Sridharan, leo.duran, Yazen.Ghannam,
rientjes, jiaqiyan, Jon.Grimm, dave.hansen, naoya.horiguchi,
james.morse, jthoughton, somasundaram.a, erdemaktas, pgonda,
duenwen, mike.malvestuto, gthelen, wschwartz, dferguson, wbs,
nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
kangkang.shen, wanghuiqiang, linuxarm, shiju.jose
From: Shiju Jose <shiju.jose@huawei.com>
Add support for the ACPI RAS2 feature table (RAS2) defined in the
ACPI 6.5 Specification, section 5.2.21.
The driver contains the RAS2 init, which extracts the RAS2 table and adds
a platform device for each memory feature; each platform device binds to
the RAS2 memory driver.
The driver uses the PCC mailbox to communicate with the ACPI HW and adds
OSPM interfaces to send RAS2 commands.
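The command path in this patch throttles requests according to the PCC
subspace's Maximum Periodic Access Rate (MPAR, commands per minute, with
0 meaning no limit). A minimal userspace sketch of that budgeting logic
follows; struct mpar_state and mpar_try_send() are hypothetical names
used only for illustration, and time is passed in as plain milliseconds
rather than read from ktime.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-channel throttle state. */
struct mpar_state {
	uint32_t mpar;			/* commands per minute; 0 = unlimited */
	uint32_t remaining;		/* commands left in the current window */
	uint64_t window_start_ms;	/* when the current window was opened */
};

/*
 * Returns 0 if a command may be sent at time now_ms, -1 if the MPAR
 * budget for the current 60 s window is exhausted. A new window is only
 * opened once the previous one is at least 60 s old.
 */
static int mpar_try_send(struct mpar_state *s, uint64_t now_ms)
{
	if (!s->mpar)
		return 0;

	if (s->remaining == 0) {
		if (now_ms - s->window_start_ms < 60000)
			return -1;
		s->window_start_ms = now_ms;
		s->remaining = s->mpar;
	}

	s->remaining--;
	return 0;
}
```

Refusing the command instead of sleeping keeps the caller from blocking
for up to a minute while holding a lock; the caller can decide whether to
retry later.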
Co-developed-by: A Somasundaram <somasundaram.a@hpe.com>
Signed-off-by: A Somasundaram <somasundaram.a@hpe.com>
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
drivers/acpi/Kconfig | 10 +
drivers/acpi/Makefile | 1 +
drivers/acpi/ras2.c | 391 +++++++++++++++++++++++++++++++++++++++
include/acpi/ras2_acpi.h | 59 ++++++
4 files changed, 461 insertions(+)
create mode 100755 drivers/acpi/ras2.c
create mode 100644 include/acpi/ras2_acpi.h
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index e3a7c2aedd5f..482080f1f0c5 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -284,6 +284,16 @@ config ACPI_CPPC_LIB
If your platform does not support CPPC in firmware,
leave this option disabled.
+config ACPI_RAS2
+ bool "ACPI RAS2 driver"
+ select MAILBOX
+ select PCC
+ help
+ The driver adds support for the ACPI RAS2 feature table (extracts the
+ RAS2 table from the OS system table) and OSPM interfaces to send RAS2
+ commands via the PCC mailbox subspace. The driver adds a platform device
+ for each RAS2 memory feature, which binds to the RAS2 memory driver.
+
config ACPI_PROCESSOR
tristate "Processor"
depends on X86 || ARM64 || LOONGARCH || RISCV
diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
index 39ea5cfa8326..4d62633ee4c2 100644
--- a/drivers/acpi/Makefile
+++ b/drivers/acpi/Makefile
@@ -99,6 +99,7 @@ obj-$(CONFIG_ACPI_EC_DEBUGFS) += ec_sys.o
obj-$(CONFIG_ACPI_BGRT) += bgrt.o
obj-$(CONFIG_ACPI_CPPC_LIB) += cppc_acpi.o
obj-$(CONFIG_ACPI_SPCR_TABLE) += spcr.o
+obj-$(CONFIG_ACPI_RAS2) += ras2.o
obj-$(CONFIG_ACPI_DEBUGGER_USER) += acpi_dbg.o
obj-$(CONFIG_ACPI_PPTT) += pptt.o
obj-$(CONFIG_ACPI_PFRUT) += pfr_update.o pfr_telemetry.o
diff --git a/drivers/acpi/ras2.c b/drivers/acpi/ras2.c
new file mode 100755
index 000000000000..07fe8ac02d25
--- /dev/null
+++ b/drivers/acpi/ras2.c
@@ -0,0 +1,391 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Implementation of ACPI RAS2 driver.
+ *
+ * Copyright (c) 2024 HiSilicon Limited.
+ *
+ * Support for RAS2 - ACPI 6.5 Specification, section 5.2.21
+ *
+ * Driver contains ACPI RAS2 init, which extracts the ACPI RAS2 table and
+ * gets the PCC channel subspace for communicating with the ACPI compliant
+ * HW platform which supports ACPI RAS2. Driver adds platform devices
+ * for each RAS2 memory feature which binds to the memory ACPI RAS2 driver.
+ */
+
+#define pr_fmt(fmt) "ACPI RAS2: " fmt
+
+#include <linux/delay.h>
+#include <linux/export.h>
+#include <linux/ktime.h>
+#include <linux/platform_device.h>
+#include <acpi/pcc.h>
+#include <acpi/ras2_acpi.h>
+
+/*
+ * Arbitrary Retries for PCC commands because the
+ * remote processor could be much slower to reply.
+ */
+#define RAS2_NUM_RETRIES 600
+
+#define RAS2_FEATURE_TYPE_MEMORY 0x00
+
+/* global variables for the RAS2 PCC subspaces */
+static DEFINE_MUTEX(ras2_pcc_subspace_lock);
+static LIST_HEAD(ras2_pcc_subspaces);
+
+static int ras2_report_cap_error(u32 cap_status)
+{
+ switch (cap_status) {
+ case ACPI_RAS2_NOT_VALID:
+ case ACPI_RAS2_NOT_SUPPORTED:
+ return -EPERM;
+ case ACPI_RAS2_BUSY:
+ return -EBUSY;
+ case ACPI_RAS2_FAILED:
+ case ACPI_RAS2_ABORTED:
+ case ACPI_RAS2_INVALID_DATA:
+ return -EINVAL;
+ default: /* 0 or other, Success */
+ return 0;
+ }
+}
+
+static int ras2_check_pcc_chan(struct ras2_pcc_subspace *pcc_subspace)
+{
+ struct acpi_ras2_shared_memory __iomem *generic_comm_base = pcc_subspace->pcc_comm_addr;
+ ktime_t next_deadline = ktime_add(ktime_get(), pcc_subspace->deadline);
+ u32 cap_status;
+ u16 status;
+ int ret;
+
+ while (!ktime_after(ktime_get(), next_deadline)) {
+ /*
+ * As per ACPI spec, the PCC space will be initialized by
+ * platform and should have set the command completion bit when
+ * PCC can be used by OSPM
+ */
+ status = readw_relaxed(&generic_comm_base->status);
+ if (status & RAS2_PCC_CMD_ERROR) {
+ cap_status = readw_relaxed(&generic_comm_base->set_capabilities_status);
+ ret = ras2_report_cap_error(cap_status);
+
+ status &= ~RAS2_PCC_CMD_ERROR;
+ writew_relaxed(status, &generic_comm_base->status);
+ return ret;
+ }
+ if (status & RAS2_PCC_CMD_COMPLETE)
+ return 0;
+ /*
+ * Reducing the bus traffic in case this loop takes longer than
+ * a few retries.
+ */
+ msleep(10);
+ }
+
+ return -EIO;
+}
+
+/**
+ * ras2_send_pcc_cmd() - Send RAS2 command via PCC channel
+ * @ras2_ctx: pointer to the ras2 context structure
+ * @cmd: command to send
+ *
+ * Returns: 0 on success, an error otherwise
+ */
+int ras2_send_pcc_cmd(struct ras2_scrub_ctx *ras2_ctx, u16 cmd)
+{
+ struct ras2_pcc_subspace *pcc_subspace = ras2_ctx->pcc_subspace;
+ struct acpi_ras2_shared_memory __iomem *generic_comm_base = pcc_subspace->pcc_comm_addr;
+ static ktime_t last_cmd_cmpl_time, last_mpar_reset;
+ struct mbox_chan *pcc_channel;
+ unsigned int time_delta;
+ static int mpar_count;
+ int ret;
+
+ guard(mutex)(&ras2_pcc_subspace_lock);
+ ret = ras2_check_pcc_chan(pcc_subspace);
+ if (ret < 0)
+ return ret;
+ pcc_channel = pcc_subspace->pcc_chan->mchan;
+
+ /*
+ * Handle the Minimum Request Turnaround Time(MRTT)
+ * "The minimum amount of time that OSPM must wait after the completion
+ * of a command before issuing the next command, in microseconds"
+ */
+ if (pcc_subspace->pcc_mrtt) {
+ time_delta = ktime_us_delta(ktime_get(), last_cmd_cmpl_time);
+ if (pcc_subspace->pcc_mrtt > time_delta)
+ udelay(pcc_subspace->pcc_mrtt - time_delta);
+ }
+
+ /*
+ * Handle the non-zero Maximum Periodic Access Rate(MPAR)
+ * "The maximum number of periodic requests that the subspace channel can
+ * support, reported in commands per minute. 0 indicates no limitation."
+ *
+ * This parameter should be ideally zero or large enough so that it can
+ * handle maximum number of requests that all the cores in the system can
+ * collectively generate. If it is not, we will follow the spec and just
+ * not send the request to the platform after hitting the MPAR limit in
+ * any 60s window
+ */
+ if (pcc_subspace->pcc_mpar) {
+ if (mpar_count == 0) {
+ time_delta = ktime_ms_delta(ktime_get(), last_mpar_reset);
+ if (time_delta < 60 * MSEC_PER_SEC) {
+ dev_dbg(ras2_ctx->dev,
+ "PCC cmd not sent due to MPAR limit\n");
+ return -EIO;
+ }
+ last_mpar_reset = ktime_get();
+ mpar_count = pcc_subspace->pcc_mpar;
+ }
+ mpar_count--;
+ }
+
+ /* Write to the shared comm region. */
+ writew_relaxed(cmd, &generic_comm_base->command);
+
+ /* Flip CMD COMPLETE bit */
+ writew_relaxed(0, &generic_comm_base->status);
+
+ /* Ring doorbell */
+ ret = mbox_send_message(pcc_channel, &cmd);
+ if (ret < 0) {
+ dev_err(ras2_ctx->dev,
+ "Err sending PCC mbox message. cmd:%d, ret:%d\n",
+ cmd, ret);
+ return ret;
+ }
+
+ /*
+ * If Minimum Request Turnaround Time is non-zero, we need
+ * to record the completion time of every command for proper
+ * handling of MRTT, so we need to check for pcc_mrtt in
+ * addition to RAS2_PCC_CMD_EXEC
+ */
+ if (cmd == RAS2_PCC_CMD_EXEC || pcc_subspace->pcc_mrtt) {
+ ret = ras2_check_pcc_chan(pcc_subspace);
+ if (pcc_subspace->pcc_mrtt)
+ last_cmd_cmpl_time = ktime_get();
+ }
+
+ if (pcc_channel->mbox->txdone_irq)
+ mbox_chan_txdone(pcc_channel, ret);
+ else
+ mbox_client_txdone(pcc_channel, ret);
+
+ return ret >= 0 ? 0 : ret;
+}
+EXPORT_SYMBOL_GPL(ras2_send_pcc_cmd);
+
+static int ras2_register_pcc_channel(struct device *dev, struct ras2_scrub_ctx *ras2_ctx,
+ int pcc_subspace_id)
+{
+ struct acpi_pcct_hw_reduced *ras2_ss;
+ struct mbox_client *ras2_mbox_cl;
+ struct pcc_mbox_chan *pcc_chan;
+ struct ras2_pcc_subspace *pcc_subspace;
+
+ if (pcc_subspace_id < 0)
+ return -EINVAL;
+
+ mutex_lock(&ras2_pcc_subspace_lock);
+ list_for_each_entry(pcc_subspace, &ras2_pcc_subspaces, elem) {
+ if (pcc_subspace->pcc_subspace_id == pcc_subspace_id) {
+ ras2_ctx->pcc_subspace = pcc_subspace;
+ pcc_subspace->ref_count++;
+ mutex_unlock(&ras2_pcc_subspace_lock);
+ return 0;
+ }
+ }
+ mutex_unlock(&ras2_pcc_subspace_lock);
+
+ pcc_subspace = kzalloc(sizeof(*pcc_subspace), GFP_KERNEL);
+ if (!pcc_subspace)
+ return -ENOMEM;
+ pcc_subspace->pcc_subspace_id = pcc_subspace_id;
+ ras2_mbox_cl = &pcc_subspace->mbox_client;
+ ras2_mbox_cl->dev = dev;
+ ras2_mbox_cl->knows_txdone = true;
+
+ pcc_chan = pcc_mbox_request_channel(ras2_mbox_cl, pcc_subspace_id);
+ if (IS_ERR(pcc_chan)) {
+ kfree(pcc_subspace);
+ return PTR_ERR(pcc_chan);
+ }
+ pcc_subspace->pcc_chan = pcc_chan;
+ ras2_ss = pcc_chan->mchan->con_priv;
+ pcc_subspace->comm_base_addr = ras2_ss->base_address;
+
+ /*
+ * ras2_ss->latency is just a Nominal value. In reality
+ * the remote processor could be much slower to reply.
+ * So add an arbitrary amount of wait on top of Nominal.
+ */
+ pcc_subspace->deadline = ns_to_ktime(RAS2_NUM_RETRIES * ras2_ss->latency *
+ NSEC_PER_USEC);
+ pcc_subspace->pcc_mrtt = ras2_ss->min_turnaround_time;
+ pcc_subspace->pcc_mpar = ras2_ss->max_access_rate;
+ pcc_subspace->pcc_comm_addr = acpi_os_ioremap(pcc_subspace->comm_base_addr,
+ ras2_ss->length);
+ /* Set flag so that we don't come here for each CPU. */
+ pcc_subspace->pcc_channel_acquired = true;
+
+ mutex_lock(&ras2_pcc_subspace_lock);
+ list_add(&pcc_subspace->elem, &ras2_pcc_subspaces);
+ pcc_subspace->ref_count++;
+ mutex_unlock(&ras2_pcc_subspace_lock);
+ ras2_ctx->pcc_subspace = pcc_subspace;
+
+ return 0;
+}
+
+static void ras2_unregister_pcc_channel(void *ctx)
+{
+ struct ras2_scrub_ctx *ras2_ctx = ctx;
+ struct ras2_pcc_subspace *pcc_subspace = ras2_ctx->pcc_subspace;
+
+ if (!pcc_subspace || !pcc_subspace->pcc_chan)
+ return;
+
+ guard(mutex)(&ras2_pcc_subspace_lock);
+ if (pcc_subspace->ref_count > 0)
+ pcc_subspace->ref_count--;
+ if (!pcc_subspace->ref_count) {
+ list_del(&pcc_subspace->elem);
+ pcc_mbox_free_channel(pcc_subspace->pcc_chan);
+ kfree(pcc_subspace);
+ }
+}
+
+/**
+ * devm_ras2_register_pcc_channel() - Register RAS2 PCC channel
+ * @dev: pointer to the ras2 device
+ * @ras2_ctx: pointer to the ras2 context structure
+ * @pcc_subspace_id: identifier of the RAS2 PCC channel.
+ *
+ * Returns: 0 on success, an error otherwise
+ */
+int devm_ras2_register_pcc_channel(struct device *dev, struct ras2_scrub_ctx *ras2_ctx,
+ int pcc_subspace_id)
+{
+ int ret;
+
+ ret = ras2_register_pcc_channel(dev, ras2_ctx, pcc_subspace_id);
+ if (ret)
+ return ret;
+
+ return devm_add_action_or_reset(dev, ras2_unregister_pcc_channel, ras2_ctx);
+}
+EXPORT_SYMBOL_NS_GPL(devm_ras2_register_pcc_channel, ACPI_RAS2);
+
+static struct platform_device *ras2_add_platform_device(char *name, int channel)
+{
+ int ret;
+ struct platform_device *pdev __free(platform_device_put) =
+ platform_device_alloc(name, PLATFORM_DEVID_AUTO);
+ if (!pdev)
+ return ERR_PTR(-ENOMEM);
+
+ ret = platform_device_add_data(pdev, &channel, sizeof(channel));
+ if (ret)
+ return ERR_PTR(ret);
+
+ ret = platform_device_add(pdev);
+ if (ret)
+ return ERR_PTR(ret);
+
+ return_ptr(pdev);
+}
+
+static int __init ras2_acpi_init(void)
+{
+ struct acpi_table_header *pAcpiTable = NULL;
+ struct acpi_ras2_pcc_desc *pcc_desc_list;
+ struct acpi_table_ras2 *pRas2Table;
+ struct platform_device *pdev;
+ int pcc_subspace_id;
+ acpi_size ras2_size;
+ acpi_status status;
+ u8 count = 0, i;
+ int ret;
+
+ status = acpi_get_table("RAS2", 0, &pAcpiTable);
+ if (ACPI_FAILURE(status) || !pAcpiTable) {
+ pr_err("ACPI RAS2 driver failed to initialize, get table failed\n");
+ return -EINVAL;
+ }
+
+ ras2_size = pAcpiTable->length;
+ if (ras2_size < sizeof(struct acpi_table_ras2)) {
+ pr_err("ACPI RAS2 table present but broken (too short #1)\n");
+ ret = -EINVAL;
+ goto free_ras2_table;
+ }
+
+ pRas2Table = (struct acpi_table_ras2 *)pAcpiTable;
+ if (pRas2Table->num_pcc_descs <= 0) {
+ pr_err("ACPI RAS2 table does not contain PCC descriptors\n");
+ ret = -EINVAL;
+ goto free_ras2_table;
+ }
+
+ struct platform_device **pdev_list __free(kfree) =
+ kcalloc(pRas2Table->num_pcc_descs, sizeof(*pdev_list),
+ GFP_KERNEL);
+ if (!pdev_list) {
+ ret = -ENOMEM;
+ goto free_ras2_table;
+ }
+
+ pcc_desc_list = (struct acpi_ras2_pcc_desc *)(pRas2Table + 1);
+ /* Double scan for the case of only one actual controller */
+ pcc_subspace_id = -1;
+ count = 0;
+ for (i = 0; i < pRas2Table->num_pcc_descs; i++, pcc_desc_list++) {
+ if (pcc_desc_list->feature_type != RAS2_FEATURE_TYPE_MEMORY)
+ continue;
+ if (pcc_subspace_id == -1) {
+ pcc_subspace_id = pcc_desc_list->channel_id;
+ count++;
+ }
+ if (pcc_desc_list->channel_id != pcc_subspace_id)
+ count++;
+ }
+ if (count == 1) {
+ pdev = ras2_add_platform_device("acpi_ras2", pcc_subspace_id);
+ if (IS_ERR(pdev)) {
+ ret = PTR_ERR(pdev);
+ goto free_ras2_table;
+ }
+ pdev_list[0] = pdev;
+ acpi_put_table(pAcpiTable);
+ return 0;
+ }
+
+ count = 0;
+ /* Rewind to the first descriptor; the scan above advanced the pointer */
+ pcc_desc_list = (struct acpi_ras2_pcc_desc *)(pRas2Table + 1);
+ for (i = 0; i < pRas2Table->num_pcc_descs; i++, pcc_desc_list++) {
+ if (pcc_desc_list->feature_type != RAS2_FEATURE_TYPE_MEMORY)
+ continue;
+ pcc_subspace_id = pcc_desc_list->channel_id;
+ /* Add the platform device and bind ACPI RAS2 memory driver */
+ pdev = ras2_add_platform_device("acpi_ras2", pcc_subspace_id);
+ if (IS_ERR(pdev)) {
+ ret = PTR_ERR(pdev);
+ goto free_ras2_pdev;
+ }
+ pdev_list[count++] = pdev;
+ }
+
+ acpi_put_table(pAcpiTable);
+ return 0;
+
+free_ras2_pdev:
+ while (count--)
+ platform_device_unregister(pdev_list[count]);
+
+free_ras2_table:
+ acpi_put_table(pAcpiTable);
+
+ return ret;
+}
+late_initcall(ras2_acpi_init);
diff --git a/include/acpi/ras2_acpi.h b/include/acpi/ras2_acpi.h
new file mode 100644
index 000000000000..cb99201f12d2
--- /dev/null
+++ b/include/acpi/ras2_acpi.h
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * RAS2 ACPI driver header file
+ *
+ * (C) Copyright 2014, 2015 Hewlett-Packard Enterprises
+ *
+ * Copyright (c) 2024 HiSilicon Limited
+ */
+
+#ifndef _RAS2_ACPI_H
+#define _RAS2_ACPI_H
+
+#include <linux/acpi.h>
+#include <linux/mailbox_client.h>
+#include <linux/mutex.h>
+#include <linux/types.h>
+
+#define RAS2_PCC_CMD_COMPLETE BIT(0)
+#define RAS2_PCC_CMD_ERROR BIT(2)
+
+/* RAS2 specific PCC commands */
+#define RAS2_PCC_CMD_EXEC 0x01
+
+struct device;
+
+/* Data structures for PCC communication and RAS2 table */
+struct pcc_mbox_chan;
+
+struct ras2_pcc_subspace {
+ int pcc_subspace_id;
+ struct mbox_client mbox_client;
+ struct pcc_mbox_chan *pcc_chan;
+ struct acpi_ras2_shared_memory __iomem *pcc_comm_addr;
+ u64 comm_base_addr;
+ bool pcc_channel_acquired;
+ ktime_t deadline;
+ unsigned int pcc_mpar;
+ unsigned int pcc_mrtt;
+ struct list_head elem;
+ u16 ref_count;
+};
+
+struct ras2_scrub_ctx {
+ struct device *dev;
+ struct ras2_pcc_subspace *pcc_subspace;
+ int id;
+ struct device *scrub_dev;
+ bool bg;
+ u64 base, size;
+ u8 schrs, schrs_min, schrs_max; /* schrs - scrub cycle in hours */
+ /* Lock to provide mutually exclusive access to PCC channel */
+ struct mutex lock;
+};
+
+int ras2_send_pcc_cmd(struct ras2_scrub_ctx *ras2_ctx, u16 cmd);
+int devm_ras2_register_pcc_channel(struct device *dev, struct ras2_scrub_ctx *ras2_ctx,
+ int pcc_subspace_id);
+
+#endif /* _RAS2_ACPI_H */
--
2.34.1
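
The MRTT and MPAR handling in ras2_send_pcc_cmd() above can be modelled
outside the kernel. The following is a minimal userspace sketch, not part
of the patch. struct pcc_rate_state, mrtt_remaining_us() and
mpar_try_consume() are hypothetical names used only for illustration, and
time is passed in by the caller instead of being read with ktime_get().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MPAR_WINDOW_MS 60000u	/* 60 s window, as in the patch */

/* Hypothetical per-channel state; the patch uses function-local statics. */
struct pcc_rate_state {
	uint32_t mpar;		/* max commands per minute, 0 = no limit */
	uint32_t budget;	/* commands left in the current window */
	uint64_t last_reset_ms;	/* when the budget was last refilled */
};

/* Microseconds still to wait before the next command may be issued. */
static uint64_t mrtt_remaining_us(uint64_t mrtt_us, uint64_t since_cmpl_us)
{
	return mrtt_us > since_cmpl_us ? mrtt_us - since_cmpl_us : 0;
}

/* Consume one command from the MPAR budget; false means "do not send". */
static bool mpar_try_consume(struct pcc_rate_state *s, uint64_t now_ms)
{
	if (!s->mpar)
		return true;
	if (!s->budget) {
		/* Budget exhausted: refill only once the window has passed */
		if (now_ms - s->last_reset_ms < MPAR_WINDOW_MS)
			return false;
		s->last_reset_ms = now_ms;
		s->budget = s->mpar;
	}
	assert(s->budget > 0);
	s->budget--;
	return true;
}
```

As in the patch, the budget starts at zero, so the first command refills
it only if at least 60 s have elapsed since the (zero) reset timestamp.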
* [RFC PATCH v9 11/11] ras: scrub: ACPI RAS2: Add memory ACPI RAS2 driver
2024-07-16 15:03 [RFC PATCH v9 00/11] EDAC: Scrub: Introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers shiju.jose
` (9 preceding siblings ...)
2024-07-16 15:03 ` [RFC PATCH v9 10/11] ACPI:RAS2: Add ACPI RAS2 driver shiju.jose
@ 2024-07-16 15:03 ` shiju.jose
10 siblings, 0 replies; 30+ messages in thread
From: shiju.jose @ 2024-07-16 15:03 UTC (permalink / raw)
To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
Cc: bp, tony.luck, rafael, lenb, mchehab, dan.j.williams, dave,
jonathan.cameron, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny, david, Vilas.Sridharan, leo.duran, Yazen.Ghannam,
rientjes, jiaqiyan, Jon.Grimm, dave.hansen, naoya.horiguchi,
james.morse, jthoughton, somasundaram.a, erdemaktas, pgonda,
duenwen, mike.malvestuto, gthelen, wschwartz, dferguson, wbs,
nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
kangkang.shen, wanghuiqiang, linuxarm, shiju.jose
From: Shiju Jose <shiju.jose@huawei.com>
The memory ACPI RAS2 driver binds to the platform device added by the
ACPI RAS2 table parser.
The driver uses a PCC subspace to communicate with the ACPI-compliant
platform and gives userspace control of the memory scrub parameters
via the EDAC scrub feature.
It gets the scrub attribute descriptors from EDAC scrub and registers with
the EDAC RAS feature driver to expose the sysfs scrub control attributes to
userspace. For example, scrub control for the RAS2 memory device is exposed in
/sys/bus/edac/devices/acpi_ras2_mem0/scrub/
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
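
For userspace tooling that consumes these scrub attributes, the text read
from sysfs has to be parsed. The following is a minimal sketch, assuming
values are formatted as in the edac-scrub.rst examples in this patch (a
single number such as "0xa", or a range such as "0x1-0x18").
parse_scrub_attr() and parse_scrub_range() are hypothetical helpers and
are not part of this series.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse one value such as "0xa\n" or "15"; returns 0 or -EINVAL. */
static int parse_scrub_attr(const char *text, uint64_t *val)
{
	char *end;
	unsigned long long v;

	assert(text && val);
	errno = 0;
	v = strtoull(text, &end, 0);	/* base 0 accepts hex and decimal */
	if (errno || end == text)
		return -EINVAL;
	*val = v;
	return 0;
}

/* Parse a "min-max" range such as "0x1-0x18\n"; returns 0 or -EINVAL. */
static int parse_scrub_range(const char *text, uint64_t *min, uint64_t *max)
{
	char *end;
	unsigned long long lo;

	assert(text && min && max);
	errno = 0;
	lo = strtoull(text, &end, 0);
	if (errno || end == text || *end != '-')
		return -EINVAL;
	if (parse_scrub_attr(end + 1, max))
		return -EINVAL;
	*min = lo;
	return 0;
}
```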
Documentation/scrub/edac-scrub.rst | 37 +++
drivers/ras/Kconfig | 10 +
drivers/ras/Makefile | 1 +
drivers/ras/acpi_ras2.c | 401 +++++++++++++++++++++++++++++
4 files changed, 449 insertions(+)
create mode 100644 drivers/ras/acpi_ras2.c
diff --git a/Documentation/scrub/edac-scrub.rst b/Documentation/scrub/edac-scrub.rst
index cf7d8b130204..da9cd2e73687 100644
--- a/Documentation/scrub/edac-scrub.rst
+++ b/Documentation/scrub/edac-scrub.rst
@@ -68,3 +68,40 @@ root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub/enable_background
root@localhost:~# echo 0 > /sys/bus/edac/devices/cxl_region0/scrub/enable_background
root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub/enable_background
0
+
+2. RAS2
+2.1 On demand scrubbing for a specific memory region.
+root@localhost:~# echo 0x120000 > /sys/bus/edac/devices/acpi_ras2_mem0/scrub/addr_range_base
+root@localhost:~# echo 0x150000 > /sys/bus/edac/devices/acpi_ras2_mem0/scrub/addr_range_size
+root@localhost:~# cat /sys/bus/edac/devices/acpi_ras2_mem0/scrub/cycle_in_hours_range
+0x1-0x18
+root@localhost:~# cat /sys/bus/edac/devices/acpi_ras2_mem0/scrub/cycle_in_hours
+0xa
+root@localhost:~# echo 15 > /sys/bus/edac/devices/acpi_ras2_mem0/scrub/cycle_in_hours
+root@localhost:~# echo 1 > /sys/bus/edac/devices/acpi_ras2_mem0/scrub/enable_on_demand
+root@localhost:~# cat /sys/bus/edac/devices/acpi_ras2_mem0/scrub/enable_on_demand
+1
+root@localhost:~# cat /sys/bus/edac/devices/acpi_ras2_mem0/scrub/cycle_in_hours
+0xf
+root@localhost:~# cat /sys/bus/edac/devices/acpi_ras2_mem0/scrub/addr_range_base
+0x120000
+root@localhost:~# cat /sys/bus/edac/devices/acpi_ras2_mem0/scrub/addr_range_size
+0x150000
+root@localhost:~# echo 0 > /sys/bus/edac/devices/acpi_ras2_mem0/scrub/enable_on_demand
+root@localhost:~# cat /sys/bus/edac/devices/acpi_ras2_mem0/scrub/enable_on_demand
+0
+
+2.2 Background scrubbing the entire memory
+root@localhost:~# cat /sys/bus/edac/devices/acpi_ras2_mem0/scrub/cycle_in_hours_range
+0x1-0x18
+root@localhost:~# cat /sys/bus/edac/devices/acpi_ras2_mem0/scrub/cycle_in_hours
+0xa
+root@localhost:~# cat /sys/bus/edac/devices/acpi_ras2_mem0/scrub/enable_background
+0
+root@localhost:~# echo 3 > /sys/bus/edac/devices/acpi_ras2_mem0/scrub/cycle_in_hours
+root@localhost:~# echo 1 > /sys/bus/edac/devices/acpi_ras2_mem0/scrub/enable_background
+root@localhost:~# cat /sys/bus/edac/devices/acpi_ras2_mem0/scrub/enable_background
+1
+root@localhost:~# cat /sys/bus/edac/devices/acpi_ras2_mem0/scrub/cycle_in_hours
+0x3
+root@localhost:~# echo 0 > /sys/bus/edac/devices/acpi_ras2_mem0/scrub/enable_background
diff --git a/drivers/ras/Kconfig b/drivers/ras/Kconfig
index fc4f4bb94a4c..a2635017d80d 100644
--- a/drivers/ras/Kconfig
+++ b/drivers/ras/Kconfig
@@ -46,4 +46,14 @@ config RAS_FMPM
Memory will be retired during boot time and run time depending on
platform-specific policies.
+config MEM_ACPI_RAS2
+ tristate "Memory ACPI RAS2 driver"
+ depends on ACPI_RAS2
+ depends on EDAC
+ help
+ The driver binds to the platform device added by the ACPI RAS2
+ table parser. It uses a PCC channel subspace to communicate with
+ the ACPI compliant platform and provides control of memory scrub
+ parameters to the user via the EDAC scrub feature.
+
endif
diff --git a/drivers/ras/Makefile b/drivers/ras/Makefile
index 11f95d59d397..a0e6e903d6b0 100644
--- a/drivers/ras/Makefile
+++ b/drivers/ras/Makefile
@@ -2,6 +2,7 @@
obj-$(CONFIG_RAS) += ras.o
obj-$(CONFIG_DEBUG_FS) += debugfs.o
obj-$(CONFIG_RAS_CEC) += cec.o
+obj-$(CONFIG_MEM_ACPI_RAS2) += acpi_ras2.o
obj-$(CONFIG_RAS_FMPM) += amd/fmpm.o
obj-y += amd/atl/
diff --git a/drivers/ras/acpi_ras2.c b/drivers/ras/acpi_ras2.c
new file mode 100644
index 000000000000..49703d8bc4fa
--- /dev/null
+++ b/drivers/ras/acpi_ras2.c
@@ -0,0 +1,401 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * ACPI RAS2 memory driver
+ *
+ * Copyright (c) 2024 HiSilicon Limited.
+ *
+ */
+
+#define pr_fmt(fmt) "MEMORY ACPI RAS2: " fmt
+
+#include <linux/edac_ras_feature.h>
+#include <linux/platform_device.h>
+#include <acpi/ras2_acpi.h>
+
+#define RAS2_DEV_NUM_RAS_FEATURES 1
+
+#define RAS2_SUPPORT_HW_PARTOL_SCRUB BIT(0)
+#define RAS2_TYPE_PATROL_SCRUB 0x0000
+
+#define RAS2_GET_PATROL_PARAMETERS 0x01
+#define RAS2_START_PATROL_SCRUBBER 0x02
+#define RAS2_STOP_PATROL_SCRUBBER 0x03
+
+#define RAS2_PATROL_SCRUB_SCHRS_IN_MASK GENMASK(15, 8)
+#define RAS2_PATROL_SCRUB_EN_BACKGROUND BIT(0)
+#define RAS2_PATROL_SCRUB_SCHRS_OUT_MASK GENMASK(7, 0)
+#define RAS2_PATROL_SCRUB_MIN_SCHRS_OUT_MASK GENMASK(15, 8)
+#define RAS2_PATROL_SCRUB_MAX_SCHRS_OUT_MASK GENMASK(23, 16)
+#define RAS2_PATROL_SCRUB_FLAG_SCRUBBER_RUNNING BIT(0)
+
+#define RAS2_SCRUB_NAME_LEN 128
+
+struct acpi_ras2_ps_shared_mem {
+ struct acpi_ras2_shared_memory common;
+ struct acpi_ras2_patrol_scrub_parameter params;
+};
+
+static int ras2_is_patrol_scrub_support(struct ras2_scrub_ctx *ras2_ctx)
+{
+ struct acpi_ras2_shared_memory __iomem *common = (void *)
+ ras2_ctx->pcc_subspace->pcc_comm_addr;
+
+ guard(mutex)(&ras2_ctx->lock);
+ common->set_capabilities[0] = 0;
+
+ return common->features[0] & RAS2_SUPPORT_HW_PARTOL_SCRUB;
+}
+
+static int ras2_update_patrol_scrub_params_cache(struct ras2_scrub_ctx *ras2_ctx)
+{
+ struct acpi_ras2_ps_shared_mem __iomem *ps_sm = (void *)
+ ras2_ctx->pcc_subspace->pcc_comm_addr;
+ int ret;
+
+ ps_sm->common.set_capabilities[0] = RAS2_SUPPORT_HW_PARTOL_SCRUB;
+ ps_sm->params.patrol_scrub_command = RAS2_GET_PATROL_PARAMETERS;
+
+ ret = ras2_send_pcc_cmd(ras2_ctx, RAS2_PCC_CMD_EXEC);
+ if (ret) {
+ dev_err(ras2_ctx->dev, "failed to read parameters\n");
+ return ret;
+ }
+
+ ras2_ctx->schrs_min = FIELD_GET(RAS2_PATROL_SCRUB_MIN_SCHRS_OUT_MASK,
+ ps_sm->params.scrub_params_out);
+ ras2_ctx->schrs_max = FIELD_GET(RAS2_PATROL_SCRUB_MAX_SCHRS_OUT_MASK,
+ ps_sm->params.scrub_params_out);
+ if (!ras2_ctx->bg) {
+ ras2_ctx->base = ps_sm->params.actual_address_range[0];
+ ras2_ctx->size = ps_sm->params.actual_address_range[1];
+ }
+ ras2_ctx->schrs = FIELD_GET(RAS2_PATROL_SCRUB_SCHRS_OUT_MASK,
+ ps_sm->params.scrub_params_out);
+
+ return 0;
+}
+
+/* Context - lock must be held */
+static int ras2_get_patrol_scrub_running(struct ras2_scrub_ctx *ras2_ctx,
+ bool *running)
+{
+ struct acpi_ras2_ps_shared_mem __iomem *ps_sm = (void *)
+ ras2_ctx->pcc_subspace->pcc_comm_addr;
+ int ret;
+
+ ps_sm->common.set_capabilities[0] = RAS2_SUPPORT_HW_PARTOL_SCRUB;
+ ps_sm->params.patrol_scrub_command = RAS2_GET_PATROL_PARAMETERS;
+
+ ret = ras2_send_pcc_cmd(ras2_ctx, RAS2_PCC_CMD_EXEC);
+ if (ret) {
+ dev_err(ras2_ctx->dev, "failed to read parameters\n");
+ return ret;
+ }
+
+ *running = ps_sm->params.flags & RAS2_PATROL_SCRUB_FLAG_SCRUBBER_RUNNING;
+
+ return 0;
+}
+
+static int ras2_hw_scrub_write_schrs(struct device *dev, void *drv_data, u64 schrs)
+{
+ struct ras2_scrub_ctx *ras2_ctx = drv_data;
+ bool running;
+ int ret;
+
+ guard(mutex)(&ras2_ctx->lock);
+ ret = ras2_get_patrol_scrub_running(ras2_ctx, &running);
+ if (ret)
+ return ret;
+
+ if (running)
+ return -EBUSY;
+
+ if (schrs < ras2_ctx->schrs_min || schrs > ras2_ctx->schrs_max)
+ return -EINVAL;
+
+ ras2_ctx->schrs = schrs;
+
+ return 0;
+}
+
+static int ras2_hw_scrub_read_schrs(struct device *dev, void *drv_data, u64 *schrs)
+{
+ struct ras2_scrub_ctx *ras2_ctx = drv_data;
+
+ *schrs = ras2_ctx->schrs;
+
+ return 0;
+}
+
+static int ras2_hw_scrub_read_schrs_range(struct device *dev, void *drv_data, u64 *min, u64 *max)
+{
+ struct ras2_scrub_ctx *ras2_ctx = drv_data;
+
+ *min = ras2_ctx->schrs_min;
+ *max = ras2_ctx->schrs_max;
+
+ return 0;
+}
+
+static int ras2_hw_scrub_read_range(struct device *dev, void *drv_data, u64 *base, u64 *size)
+{
+ struct ras2_scrub_ctx *ras2_ctx = drv_data;
+
+ /*
+ * When BG scrubbing is enabled the actual address range is not valid.
+ * Return -EBUSY until there is a method to retrieve the actual full PA range.
+ */
+ if (ras2_ctx->bg)
+ return -EBUSY;
+
+ *base = ras2_ctx->base;
+ *size = ras2_ctx->size;
+
+ return 0;
+}
+
+static int ras2_hw_scrub_write_range(struct device *dev, void *drv_data, u64 base, u64 size)
+{
+ struct ras2_scrub_ctx *ras2_ctx = drv_data;
+ bool running;
+ int ret;
+
+ guard(mutex)(&ras2_ctx->lock);
+ ret = ras2_get_patrol_scrub_running(ras2_ctx, &running);
+ if (ret)
+ return ret;
+
+ if (running)
+ return -EBUSY;
+
+ if (!base || !size) {
+ dev_warn(dev, "%s: Invalid address range, base=0x%llx size=0x%llx\n",
+ __func__, base, size);
+ return -EINVAL;
+ }
+
+ ras2_ctx->base = base;
+ ras2_ctx->size = size;
+
+ return 0;
+}
+
+static int ras2_hw_scrub_set_enabled_bg(struct device *dev, void *drv_data, bool enable)
+{
+ struct ras2_scrub_ctx *ras2_ctx = drv_data;
+ struct acpi_ras2_ps_shared_mem __iomem *ps_sm = (void *)
+ ras2_ctx->pcc_subspace->pcc_comm_addr;
+ bool enabled;
+ int ret;
+
+ guard(mutex)(&ras2_ctx->lock);
+ ps_sm->common.set_capabilities[0] = RAS2_SUPPORT_HW_PARTOL_SCRUB;
+ ret = ras2_get_patrol_scrub_running(ras2_ctx, &enabled);
+ if (ret)
+ return ret;
+ if (enable) {
+ if (ras2_ctx->bg || enabled)
+ return -EBUSY;
+ ps_sm->params.requested_address_range[0] = 0;
+ ps_sm->params.requested_address_range[1] = 0;
+ ps_sm->params.scrub_params_in &= ~RAS2_PATROL_SCRUB_SCHRS_IN_MASK;
+ ps_sm->params.scrub_params_in |= FIELD_PREP(RAS2_PATROL_SCRUB_SCHRS_IN_MASK,
+ ras2_ctx->schrs);
+ ps_sm->params.patrol_scrub_command = RAS2_START_PATROL_SCRUBBER;
+ } else {
+ if (!ras2_ctx->bg)
+ return -EPERM;
+ if (!ras2_ctx->bg && enabled)
+ return -EBUSY;
+ ps_sm->params.patrol_scrub_command = RAS2_STOP_PATROL_SCRUBBER;
+ }
+ ps_sm->params.scrub_params_in &= ~RAS2_PATROL_SCRUB_EN_BACKGROUND;
+ ps_sm->params.scrub_params_in |= FIELD_PREP(RAS2_PATROL_SCRUB_EN_BACKGROUND,
+ enable);
+ ret = ras2_send_pcc_cmd(ras2_ctx, RAS2_PCC_CMD_EXEC);
+ if (ret) {
+ dev_err(ras2_ctx->dev, "%s: failed to enable(%d) background scrubbing\n",
+ __func__, enable);
+ return ret;
+ }
+ if (enable) {
+ ras2_ctx->bg = true;
+ /* Update the cache to account for rounding of supplied parameters and similar */
+ ret = ras2_update_patrol_scrub_params_cache(ras2_ctx);
+ } else {
+ ret = ras2_update_patrol_scrub_params_cache(ras2_ctx);
+ ras2_ctx->bg = false;
+ }
+
+ return ret;
+}
+
+static int ras2_hw_scrub_get_enabled_bg(struct device *dev, void *drv_data, bool *enabled)
+{
+ struct ras2_scrub_ctx *ras2_ctx = drv_data;
+
+ *enabled = ras2_ctx->bg;
+
+ return 0;
+}
+
+static int ras2_hw_scrub_set_enabled_od(struct device *dev, void *drv_data, bool enable)
+{
+ struct ras2_scrub_ctx *ras2_ctx = drv_data;
+ struct acpi_ras2_ps_shared_mem __iomem *ps_sm = (void *)
+ ras2_ctx->pcc_subspace->pcc_comm_addr;
+ bool enabled;
+ int ret;
+
+ guard(mutex)(&ras2_ctx->lock);
+ ps_sm->common.set_capabilities[0] = RAS2_SUPPORT_HW_PARTOL_SCRUB;
+ if (ras2_ctx->bg)
+ return -EBUSY;
+ ret = ras2_get_patrol_scrub_running(ras2_ctx, &enabled);
+ if (ret)
+ return ret;
+ if (enable) {
+ if (!ras2_ctx->base || !ras2_ctx->size) {
+ dev_warn(ras2_ctx->dev,
+ "%s: Invalid address range, base=0x%llx "
+ "size=0x%llx\n", __func__,
+ ras2_ctx->base, ras2_ctx->size);
+ return -ERANGE;
+ }
+ if (enabled)
+ return -EBUSY;
+ ps_sm->params.scrub_params_in &= ~RAS2_PATROL_SCRUB_SCHRS_IN_MASK;
+ ps_sm->params.scrub_params_in |= FIELD_PREP(RAS2_PATROL_SCRUB_SCHRS_IN_MASK,
+ ras2_ctx->schrs);
+ ps_sm->params.requested_address_range[0] = ras2_ctx->base;
+ ps_sm->params.requested_address_range[1] = ras2_ctx->size;
+ ps_sm->params.scrub_params_in &= ~RAS2_PATROL_SCRUB_EN_BACKGROUND;
+ ps_sm->params.patrol_scrub_command = RAS2_START_PATROL_SCRUBBER;
+ } else {
+ if (!enabled)
+ return 0;
+ ps_sm->params.patrol_scrub_command = RAS2_STOP_PATROL_SCRUBBER;
+ }
+
+ ret = ras2_send_pcc_cmd(ras2_ctx, RAS2_PCC_CMD_EXEC);
+ if (ret) {
+ dev_err(ras2_ctx->dev, "failed to enable(%d) on-demand scrubbing\n", enable);
+ return ret;
+ }
+
+ return ras2_update_patrol_scrub_params_cache(ras2_ctx);
+}
+
+static int ras2_hw_scrub_get_enabled_od(struct device *dev, void *drv_data, bool *enabled)
+{
+ struct ras2_scrub_ctx *ras2_ctx = drv_data;
+
+ guard(mutex)(&ras2_ctx->lock);
+ if (ras2_ctx->bg) {
+ *enabled = false;
+ return 0;
+ }
+
+ return ras2_get_patrol_scrub_running(ras2_ctx, enabled);
+}
+
+static int ras2_hw_scrub_get_name(struct device *dev, void *drv_data, char *name)
+{
+ struct ras2_scrub_ctx *ras2_ctx = drv_data;
+
+ return sysfs_emit(name, "acpi_ras2_mem%d_scrub\n", ras2_ctx->id);
+}
+
+static const struct edac_scrub_ops ras2_scrub_ops = {
+ .read_range = ras2_hw_scrub_read_range,
+ .write_range = ras2_hw_scrub_write_range,
+ .get_enabled_bg = ras2_hw_scrub_get_enabled_bg,
+ .set_enabled_bg = ras2_hw_scrub_set_enabled_bg,
+ .get_enabled_od = ras2_hw_scrub_get_enabled_od,
+ .set_enabled_od = ras2_hw_scrub_set_enabled_od,
+ .get_name = ras2_hw_scrub_get_name,
+ .cycle_in_hours_range = ras2_hw_scrub_read_schrs_range,
+ .cycle_in_hours_read = ras2_hw_scrub_read_schrs,
+ .cycle_in_hours_write = ras2_hw_scrub_write_schrs,
+};
+
+static DEFINE_IDA(ras2_ida);
+
+static void ida_release(void *ctx)
+{
+ struct ras2_scrub_ctx *ras2_ctx = ctx;
+
+ ida_free(&ras2_ida, ras2_ctx->id);
+}
+
+static int ras2_probe(struct platform_device *pdev)
+{
+ struct edac_ras_feature ras_features[RAS2_DEV_NUM_RAS_FEATURES];
+ char scrub_name[RAS2_SCRUB_NAME_LEN];
+ struct ras2_scrub_ctx *ras2_ctx;
+ int num_ras_features = 0;
+ int ret, id;
+
+ /* RAS2 PCC Channel and Scrub specific context */
+ ras2_ctx = devm_kzalloc(&pdev->dev, sizeof(*ras2_ctx), GFP_KERNEL);
+ if (!ras2_ctx)
+ return -ENOMEM;
+
+ ras2_ctx->dev = &pdev->dev;
+ mutex_init(&ras2_ctx->lock);
+
+ ret = devm_ras2_register_pcc_channel(&pdev->dev, ras2_ctx,
+ *((int *)dev_get_platdata(&pdev->dev)));
+ if (ret < 0) {
+ dev_dbg(ras2_ctx->dev,
+ "failed to register pcc channel ret=%d\n", ret);
+ return ret;
+ }
+ if (!ras2_is_patrol_scrub_support(ras2_ctx))
+ return -EOPNOTSUPP;
+
+ ret = ras2_update_patrol_scrub_params_cache(ras2_ctx);
+ if (ret)
+ return ret;
+
+ id = ida_alloc(&ras2_ida, GFP_KERNEL);
+ if (id < 0)
+ return id;
+
+ ras2_ctx->id = id;
+
+ ret = devm_add_action_or_reset(&pdev->dev, ida_release, ras2_ctx);
+ if (ret < 0)
+ return ret;
+
+ snprintf(scrub_name, sizeof(scrub_name), "acpi_ras2_mem%d",
+ ras2_ctx->id);
+
+ ras_features[num_ras_features].feat = ras_feat_scrub;
+ ras_features[num_ras_features].scrub_ops = &ras2_scrub_ops;
+ ras_features[num_ras_features].scrub_ctx = ras2_ctx;
+ num_ras_features++;
+
+ return edac_ras_dev_register(&pdev->dev, scrub_name, NULL,
+ num_ras_features, ras_features);
+}
+
+static const struct platform_device_id ras2_id_table[] = {
+ { .name = "acpi_ras2", },
+ { }
+};
+MODULE_DEVICE_TABLE(platform, ras2_id_table);
+
+static struct platform_driver ras2_driver = {
+ .probe = ras2_probe,
+ .driver = {
+ .name = "acpi_ras2",
+ },
+ .id_table = ras2_id_table,
+};
+module_platform_driver(ras2_driver);
+
+MODULE_IMPORT_NS(ACPI_RAS2);
+MODULE_DESCRIPTION("ACPI RAS2 memory driver");
+MODULE_LICENSE("GPL");
--
2.34.1
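
The scrub parameter words used by the driver above pack several 8-bit
fields. The following is a minimal userspace sketch of that packing, not
part of the patch. The bit positions are copied from the GENMASK()/BIT()
definitions in acpi_ras2.c; the 32-bit width of the words and the names
decode_params_out(), encode_params_in() and struct scrub_cycle are
assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions taken from the mask definitions in acpi_ras2.c */
#define SCHRS_OUT_SHIFT		0		/* GENMASK(7, 0)   */
#define MIN_SCHRS_OUT_SHIFT	8		/* GENMASK(15, 8)  */
#define MAX_SCHRS_OUT_SHIFT	16		/* GENMASK(23, 16) */
#define SCHRS_IN_SHIFT		8		/* GENMASK(15, 8)  */
#define EN_BACKGROUND		(1u << 0)	/* BIT(0)          */
#define FIELD_MASK		0xffu

struct scrub_cycle { uint8_t cur, min, max; };	/* all in hours */

/* Split scrub_params_out into current, minimum and maximum cycle. */
static struct scrub_cycle decode_params_out(uint32_t out)
{
	struct scrub_cycle c = {
		.cur = (out >> SCHRS_OUT_SHIFT) & FIELD_MASK,
		.min = (out >> MIN_SCHRS_OUT_SHIFT) & FIELD_MASK,
		.max = (out >> MAX_SCHRS_OUT_SHIFT) & FIELD_MASK,
	};
	return c;
}

/* Update the cycle and background fields, keeping all other bits. */
static uint32_t encode_params_in(uint32_t in, uint8_t schrs, bool background)
{
	in &= ~(FIELD_MASK << SCHRS_IN_SHIFT);
	in |= (uint32_t)schrs << SCHRS_IN_SHIFT;
	in &= ~EN_BACKGROUND;
	if (background)
		in |= EN_BACKGROUND;
	return in;
}

/* Usage: values match the 0x1-0x18 range and 0xa cycle in the docs. */
static void scrub_params_example(void)
{
	struct scrub_cycle c = decode_params_out(0x18010a);

	assert(c.cur == 0x0a && c.min == 0x01 && c.max == 0x18);
	assert(encode_params_in(0, 3, true) == 0x301);
}
```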
* Re: [RFC PATCH v9 01/11] EDAC: Add generic EDAC RAS feature driver
2024-07-16 15:03 ` [RFC PATCH v9 01/11] EDAC: Add generic EDAC RAS feature driver shiju.jose
@ 2024-07-16 18:00 ` fan
2024-07-17 11:06 ` Shiju Jose
2024-07-17 10:00 ` Mauro Carvalho Chehab
1 sibling, 1 reply; 30+ messages in thread
From: fan @ 2024-07-16 18:00 UTC (permalink / raw)
To: shiju.jose
Cc: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel, bp,
tony.luck, rafael, lenb, mchehab, dan.j.williams, dave,
jonathan.cameron, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny, david, Vilas.Sridharan, leo.duran, Yazen.Ghannam,
rientjes, jiaqiyan, Jon.Grimm, dave.hansen, naoya.horiguchi,
james.morse, jthoughton, somasundaram.a, erdemaktas, pgonda,
duenwen, mike.malvestuto, gthelen, wschwartz, dferguson, wbs,
nifan.cxl, tanxiaofei, prime.zeng, roberto.sassu, kangkang.shen,
wanghuiqiang, linuxarm
On Tue, Jul 16, 2024 at 04:03:25PM +0100, shiju.jose@huawei.com wrote:
> From: Shiju Jose <shiju.jose@huawei.com>
>
> Add generic EDAC driver supports registering RAS features supported
> in the system. The driver exposes feature's control attributes to the
> userspace in /sys/bus/edac/devices/<dev-name>/<ras-feature>/
>
> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
> ---
> drivers/edac/Makefile | 1 +
> drivers/edac/edac_ras_feature.c | 155 +++++++++++++++++++++++++++++++
> include/linux/edac_ras_feature.h | 66 +++++++++++++
> 3 files changed, 222 insertions(+)
> create mode 100755 drivers/edac/edac_ras_feature.c
> create mode 100755 include/linux/edac_ras_feature.h
>
> diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
> index 9c09893695b7..c532b57a6d8a 100644
> --- a/drivers/edac/Makefile
> +++ b/drivers/edac/Makefile
> @@ -10,6 +10,7 @@ obj-$(CONFIG_EDAC) := edac_core.o
>
> edac_core-y := edac_mc.o edac_device.o edac_mc_sysfs.o
> edac_core-y += edac_module.o edac_device_sysfs.o wq.o
> +edac_core-y += edac_ras_feature.o
>
> edac_core-$(CONFIG_EDAC_DEBUG) += debugfs.o
>
> diff --git a/drivers/edac/edac_ras_feature.c b/drivers/edac/edac_ras_feature.c
> new file mode 100755
> index 000000000000..24a729fea66f
> --- /dev/null
> +++ b/drivers/edac/edac_ras_feature.c
> @@ -0,0 +1,155 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * EDAC RAS control feature driver supports registering RAS
> + * features with the EDAC and exposes the feature's control
> + * attributes to the userspace in sysfs.
> + *
> + * Copyright (c) 2024 HiSilicon Limited.
> + */
> +
> +#define pr_fmt(fmt) "EDAC RAS CONTROL FEAT: " fmt
> +
> +#include <linux/edac_ras_feature.h>
> +
> +static void edac_ras_dev_release(struct device *dev)
> +{
> + struct edac_ras_feat_ctx *ctx =
> + container_of(dev, struct edac_ras_feat_ctx, dev);
> +
> + kfree(ctx);
> +}
> +
> +const struct device_type edac_ras_dev_type = {
> + .name = "edac_ras_dev",
> + .release = edac_ras_dev_release,
> +};
> +
> +static void edac_ras_dev_unreg(void *data)
> +{
> + device_unregister(data);
> +}
> +
> +static int edac_ras_feat_scrub_init(struct device *parent,
> + struct edac_scrub_data *sdata,
> + const struct edac_ras_feature *sfeat,
> + const struct attribute_group **attr_groups)
> +{
> + sdata->ops = sfeat->scrub_ops;
> + sdata->private = sfeat->scrub_ctx;
> +
> + return 1;
> +}
> +
> +static int edac_ras_feat_ecs_init(struct device *parent,
> + struct edac_ecs_data *edata,
> + const struct edac_ras_feature *efeat,
> + const struct attribute_group **attr_groups)
> +{
> + int num = efeat->ecs_info.num_media_frus;
> +
> + edata->ops = efeat->ecs_ops;
> + edata->private = efeat->ecs_ctx;
> +
> + return num;
> +}
> +
> +/**
> + * edac_ras_dev_register - register device for ras features with edac
> + * @parent: client device.
> + * @name: client device's name.
> + * @private: parent driver's data to store in the context if any.
> + * @num_features: number of ras features to register.
> + * @ras_features: list of ras features to register.
> + *
> + * Returns 0 on success, error otherwise.
> + * The new edac_ras_feat_ctx would be freed automatically.
> + */
> +int edac_ras_dev_register(struct device *parent, char *name,
> + void *private, int num_features,
> + const struct edac_ras_feature *ras_features)
> +{
> + const struct attribute_group **ras_attr_groups;
> + struct edac_ras_feat_ctx *ctx;
> + int attr_gcnt = 0;
> + int ret, feat;
> +
> + if (!parent || !name || !num_features || !ras_features)
> + return -EINVAL;
> +
> + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
> + if (!ctx)
> + return -ENOMEM;
> +
> + ctx->dev.parent = parent;
> + ctx->private = private;
> +
> + /* Double parse so we can make space for attributes */
> + for (feat = 0; feat < num_features; feat++) {
> + switch (ras_features[feat].feat) {
> + case ras_feat_scrub:
> + attr_gcnt++;
> + break;
> + case ras_feat_ecs:
> + attr_gcnt += ras_features[feat].ecs_info.num_media_frus;
> + break;
> + default:
> + ret = -EINVAL;
> + goto ctx_free;
> + }
> + }
> +
> + ras_attr_groups = devm_kzalloc(parent,
> + (attr_gcnt + 1) * sizeof(*ras_attr_groups),
> + GFP_KERNEL);
> + if (!ras_attr_groups) {
> + ret = -ENOMEM;
> + goto ctx_free;
> + }
> +
> + attr_gcnt = 0;
> + for (feat = 0; feat < num_features; feat++, ras_features++) {
> + if (ras_features->feat == ras_feat_scrub) {
> + if (!ras_features->scrub_ops)
> + continue;
> + ret = edac_ras_feat_scrub_init(parent, &ctx->scrub,
> + ras_features, &ras_attr_groups[attr_gcnt]);
> + if (ret < 0)
> + goto ctx_free;
> +
> + attr_gcnt += ret;
> + } else if (ras_features->feat == ras_feat_ecs) {
> + if (!ras_features->ecs_ops)
> + continue;
> + ret = edac_ras_feat_ecs_init(parent, &ctx->ecs,
> + ras_features, &ras_attr_groups[attr_gcnt]);
> + if (ret < 0)
> + goto ctx_free;
> +
> + attr_gcnt += ret;
> + } else {
> + ret = -EINVAL;
> + goto ctx_free;
We already check this in the first pass, so this branch cannot be reached in
the second pass.
> + }
Why use if/else instead of using switch/case as above?
> + }
> + ras_attr_groups[attr_gcnt] = NULL;
> + ctx->dev.bus = edac_get_sysfs_subsys();
> + ctx->dev.type = &edac_ras_dev_type;
> + ctx->dev.groups = ras_attr_groups;
> + dev_set_drvdata(&ctx->dev, ctx);
> + ret = dev_set_name(&ctx->dev, name);
> + if (ret)
> + goto ctx_free;
> +
> + ret = device_register(&ctx->dev);
> + if (ret) {
> + put_device(&ctx->dev);
need to free ctx?
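On the question above: as far as the driver model documents it, once device_register() has been called the structure must not be freed directly, even on failure. put_device() drops the last reference and the release callback (edac_ras_dev_release() in this patch) then frees ctx, so an extra kfree() on that path would free it twice. The following is only a userspace model of that ownership rule; fake_device, fake_ctx, fake_put_device and fake_register_failing are hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct device: a refcount plus a release hook. */
struct fake_device {
	int refcount;
	void (*release)(struct fake_device *dev);
};

/* Same shape as edac_ras_feat_ctx: the device is embedded in the context. */
struct fake_ctx {
	struct fake_device dev;
	void *private;
};

static int released_count;	/* how many times the release hook ran */

static void fake_ctx_release(struct fake_device *dev)
{
	/* Recover the enclosing context from the embedded device. */
	struct fake_ctx *ctx =
		(struct fake_ctx *)((char *)dev - offsetof(struct fake_ctx, dev));

	released_count++;
	free(ctx);
}

static void fake_put_device(struct fake_device *dev)
{
	if (--dev->refcount == 0)
		dev->release(dev);
}

/* Models a registration that fails after the device was initialized. */
static int fake_register_failing(void)
{
	struct fake_ctx *ctx = calloc(1, sizeof(*ctx));

	if (!ctx)
		return -ENOMEM;

	ctx->dev.refcount = 1;	/* the initial reference */
	ctx->dev.release = fake_ctx_release;

	/* Registration "fails": drop the reference, release() frees ctx. */
	fake_put_device(&ctx->dev);

	/* No free(ctx) here: in this model it would be a double free. */
	return -EINVAL;
}
```

In this model the context is freed exactly once, by the release hook, which is why the error path only drops the reference.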
> + return ret;
> + }
> +
> + return devm_add_action_or_reset(parent, edac_ras_dev_unreg, &ctx->dev);
> +
> +ctx_free:
> + kfree(ctx);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(edac_ras_dev_register);
> diff --git a/include/linux/edac_ras_feature.h b/include/linux/edac_ras_feature.h
> new file mode 100755
> index 000000000000..000e99141023
> --- /dev/null
> +++ b/include/linux/edac_ras_feature.h
> @@ -0,0 +1,66 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * EDAC RAS control features.
> + *
> + * Copyright (c) 2024 HiSilicon Limited.
> + */
> +
> +#ifndef __EDAC_RAS_FEAT_H
> +#define __EDAC_RAS_FEAT_H
> +
> +#include <linux/types.h>
> +#include <linux/edac.h>
> +
> +#define EDAC_RAS_NAME_LEN 128
> +
> +enum edac_ras_feat {
> + ras_feat_scrub,
> + ras_feat_ecs,
> + ras_feat_max
> +};
Use uppercase for the enum values.
Fan
> +
> +struct edac_ecs_ex_info {
> + u16 num_media_frus;
> +};
> +
> +/*
> + * EDAC RAS feature information structure
> + */
> +struct edac_scrub_data {
> + const struct edac_scrub_ops *ops;
> + void *private;
> +};
> +
> +struct edac_ecs_data {
> + const struct edac_ecs_ops *ops;
> + void *private;
> +};
> +
> +struct device;
> +
> +struct edac_ras_feat_ctx {
> + struct device dev;
> + void *private;
> + struct edac_scrub_data scrub;
> + struct edac_ecs_data ecs;
> +};
> +
> +struct edac_ras_feature {
> + enum edac_ras_feat feat;
> + union {
> + const struct edac_scrub_ops *scrub_ops;
> + const struct edac_ecs_ops *ecs_ops;
> + };
> + union {
> + struct edac_ecs_ex_info ecs_info;
> + };
> + union {
> + void *scrub_ctx;
> + void *ecs_ctx;
> + };
> +};
> +
> +int edac_ras_dev_register(struct device *parent, char *dev_name,
> + void *parent_pvt_data, int num_features,
> + const struct edac_ras_feature *ras_features);
> +#endif /* __EDAC_RAS_FEAT_H */
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v9 01/11] EDAC: Add generic EDAC RAS feature driver
2024-07-16 15:03 ` [RFC PATCH v9 01/11] EDAC: Add generic EDAC RAS feature driver shiju.jose
2024-07-16 18:00 ` fan
@ 2024-07-17 10:00 ` Mauro Carvalho Chehab
2024-07-17 11:01 ` Shiju Jose
1 sibling, 1 reply; 30+ messages in thread
From: Mauro Carvalho Chehab @ 2024-07-17 10:00 UTC (permalink / raw)
To: shiju.jose
Cc: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel, bp,
tony.luck, rafael, lenb, mchehab, dan.j.williams, dave,
jonathan.cameron, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny, david, Vilas.Sridharan, leo.duran, Yazen.Ghannam,
rientjes, jiaqiyan, Jon.Grimm, dave.hansen, naoya.horiguchi,
james.morse, jthoughton, somasundaram.a, erdemaktas, pgonda,
duenwen, mike.malvestuto, gthelen, wschwartz, dferguson, wbs,
nifan.cxl, tanxiaofei, prime.zeng, roberto.sassu, kangkang.shen,
wanghuiqiang, linuxarm
On Tue, 16 Jul 2024 16:03:25 +0100
<shiju.jose@huawei.com> wrote:
> From: Shiju Jose <shiju.jose@huawei.com>
>
> Add generic EDAC driver supports registering RAS features supported
> in the system. The driver exposes feature's control attributes to the
> userspace in /sys/bus/edac/devices/<dev-name>/<ras-feature>/
>
> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
> ---
> drivers/edac/Makefile | 1 +
> drivers/edac/edac_ras_feature.c | 155 +++++++++++++++++++++++++++++++
> include/linux/edac_ras_feature.h | 66 +++++++++++++
> 3 files changed, 222 insertions(+)
> create mode 100755 drivers/edac/edac_ras_feature.c
> create mode 100755 include/linux/edac_ras_feature.h
>
> diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
> index 9c09893695b7..c532b57a6d8a 100644
> --- a/drivers/edac/Makefile
> +++ b/drivers/edac/Makefile
> @@ -10,6 +10,7 @@ obj-$(CONFIG_EDAC) := edac_core.o
>
> edac_core-y := edac_mc.o edac_device.o edac_mc_sysfs.o
> edac_core-y += edac_module.o edac_device_sysfs.o wq.o
> +edac_core-y += edac_ras_feature.o
>
> edac_core-$(CONFIG_EDAC_DEBUG) += debugfs.o
>
> diff --git a/drivers/edac/edac_ras_feature.c b/drivers/edac/edac_ras_feature.c
> new file mode 100755
> index 000000000000..24a729fea66f
> --- /dev/null
> +++ b/drivers/edac/edac_ras_feature.c
> @@ -0,0 +1,155 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * EDAC RAS control feature driver supports registering RAS
> + * features with the EDAC and exposes the feature's control
> + * attributes to the userspace in sysfs.
> + *
> + * Copyright (c) 2024 HiSilicon Limited.
> + */
> +
> +#define pr_fmt(fmt) "EDAC RAS CONTROL FEAT: " fmt
This prefix sounds too long for my taste.
> +
> +#include <linux/edac_ras_feature.h>
> +
> +static void edac_ras_dev_release(struct device *dev)
> +{
> + struct edac_ras_feat_ctx *ctx =
> + container_of(dev, struct edac_ras_feat_ctx, dev);
> +
> + kfree(ctx);
> +}
> +
> +const struct device_type edac_ras_dev_type = {
> + .name = "edac_ras_dev",
> + .release = edac_ras_dev_release,
> +};
> +
> +static void edac_ras_dev_unreg(void *data)
> +{
> + device_unregister(data);
> +}
> +
> +static int edac_ras_feat_scrub_init(struct device *parent,
> + struct edac_scrub_data *sdata,
> + const struct edac_ras_feature *sfeat,
> + const struct attribute_group **attr_groups)
> +{
> + sdata->ops = sfeat->scrub_ops;
> + sdata->private = sfeat->scrub_ctx;
> +
> + return 1;
> +}
> +
> +static int edac_ras_feat_ecs_init(struct device *parent,
> + struct edac_ecs_data *edata,
> + const struct edac_ras_feature *efeat,
> + const struct attribute_group **attr_groups)
> +{
> + int num = efeat->ecs_info.num_media_frus;
> +
> + edata->ops = efeat->ecs_ops;
> + edata->private = efeat->ecs_ctx;
> +
> + return num;
> +}
I would place this function earlier and/or add some documentation
for the above two functions.
I got confused when I reviewed the first function and saw an
unconditional:
return 1;
Now, I guess the goal is to return the number of initialized
features, right?
> +
> +/**
> + * edac_ras_dev_register - register device for ras features with edac
> + * @parent: client device.
> + * @name: client device's name.
> + * @private: parent driver's data to store in the context if any.
> + * @num_features: number of ras features to register.
> + * @ras_features: list of ras features to register.
> + *
> + * Returns 0 on success, error otherwise.
> + * The new edac_ras_feat_ctx would be freed automatically.
> + */
> +int edac_ras_dev_register(struct device *parent, char *name,
> + void *private, int num_features,
> + const struct edac_ras_feature *ras_features)
> +{
> + const struct attribute_group **ras_attr_groups;
> + struct edac_ras_feat_ctx *ctx;
> + int attr_gcnt = 0;
> + int ret, feat;
> +
> + if (!parent || !name || !num_features || !ras_features)
> + return -EINVAL;
> +
> + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
> + if (!ctx)
> + return -ENOMEM;
> +
> + ctx->dev.parent = parent;
> + ctx->private = private;
> +
> + /* Double parse so we can make space for attributes */
> + for (feat = 0; feat < num_features; feat++) {
> + switch (ras_features[feat].feat) {
> + case ras_feat_scrub:
> + attr_gcnt++;
> + break;
> + case ras_feat_ecs:
> + attr_gcnt += ras_features[feat].ecs_info.num_media_frus;
> + break;
As already suggested, the enum names shall be in uppercase.
Having a lowercase one here looks really weird.
> + default:
> + ret = -EINVAL;
> + goto ctx_free;
> + }
> + }
I would place this logic earlier, before allocating ctx, as, in case of
errors, the function can just call "return -EINVAL".
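A minimal userspace sketch of the suggestion above: do the validating first pass in a helper that runs before any allocation, so that its error path is a plain return. The uppercase enum names, struct ras_feature and count_attr_groups() are hypothetical and only mirror the shape of the posted code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical uppercase spelling of the enum, as the review requests. */
enum ras_feat { RAS_FEAT_SCRUB, RAS_FEAT_ECS, RAS_FEAT_MAX };

struct ras_feature {
	enum ras_feat feat;
	unsigned short num_media_frus;	/* only meaningful for RAS_FEAT_ECS */
};

/*
 * First pass: validate every entry and count the attribute groups needed.
 * Returns the count, or -EINVAL for bad input. Nothing has been allocated
 * yet, so the error path needs no cleanup label.
 */
static int count_attr_groups(const struct ras_feature *feats, int num)
{
	int count = 0;
	int i;

	if (!feats || num <= 0)
		return -EINVAL;

	for (i = 0; i < num; i++) {
		switch (feats[i].feat) {
		case RAS_FEAT_SCRUB:
			count++;
			break;
		case RAS_FEAT_ECS:
			count += feats[i].num_media_frus;
			break;
		default:
			return -EINVAL;
		}
	}

	return count;
}
```

A caller would allocate ctx and the group array only after this helper returns a non-negative count.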
> +
> + ras_attr_groups = devm_kzalloc(parent,
> + (attr_gcnt + 1) * sizeof(*ras_attr_groups),
> + GFP_KERNEL);
Hmm... why are you using the devm variant here, and the non-devm one for ctx?
My personal preference is to avoid devm variants, as memory is
only freed when the device refcount becomes zero (which, depending
on the driver, may never happen in practice, as driver core may keep
a refcount, depending on how the device was probed).
> + if (!ras_attr_groups) {
> + ret = -ENOMEM;
> + goto ctx_free;
> + }
> +
> + attr_gcnt = 0;
> + for (feat = 0; feat < num_features; feat++, ras_features++) {
> + if (ras_features->feat == ras_feat_scrub) {
I would use a switch here as well, just like the previous feature type
check.
> + if (!ras_features->scrub_ops)
> + continue;
> + ret = edac_ras_feat_scrub_init(parent, &ctx->scrub,
> + ras_features, &ras_attr_groups[attr_gcnt]);
I don't think it is worth having those ancillary functions here...
> + if (ret < 0)
> + goto ctx_free;
> +
> + attr_gcnt += ret;
> + } else if (ras_features->feat == ras_feat_ecs) {
> + if (!ras_features->ecs_ops)
> + continue;
> + ret = edac_ras_feat_ecs_init(parent, &ctx->ecs,
> + ras_features, &ras_attr_groups[attr_gcnt]);
and here, as most of the current functions are very simple:
both just set two fields:
edata->ops
edata->private
and the returned values are always a positive counter...
> + if (ret < 0)
> + goto ctx_free;
So this check, for instance, doesn't make sense.
> +
> + attr_gcnt += ret;
> + } else {
> + ret = -EINVAL;
> + goto ctx_free;
> + }
> + }
> + ras_attr_groups[attr_gcnt] = NULL;
> + ctx->dev.bus = edac_get_sysfs_subsys();
> + ctx->dev.type = &edac_ras_dev_type;
> + ctx->dev.groups = ras_attr_groups;
> + dev_set_drvdata(&ctx->dev, ctx);
> + ret = dev_set_name(&ctx->dev, name);
> + if (ret)
> + goto ctx_free;
> +
> + ret = device_register(&ctx->dev);
> + if (ret) {
> + put_device(&ctx->dev);
> + return ret;
> + }
> +
> + return devm_add_action_or_reset(parent, edac_ras_dev_unreg, &ctx->dev);
> +
> +ctx_free:
> + kfree(ctx);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(edac_ras_dev_register);
> diff --git a/include/linux/edac_ras_feature.h b/include/linux/edac_ras_feature.h
> new file mode 100755
> index 000000000000..000e99141023
> --- /dev/null
> +++ b/include/linux/edac_ras_feature.h
> @@ -0,0 +1,66 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * EDAC RAS control features.
> + *
> + * Copyright (c) 2024 HiSilicon Limited.
> + */
> +
> +#ifndef __EDAC_RAS_FEAT_H
> +#define __EDAC_RAS_FEAT_H
> +
> +#include <linux/types.h>
> +#include <linux/edac.h>
> +
> +#define EDAC_RAS_NAME_LEN 128
> +
> +enum edac_ras_feat {
> + ras_feat_scrub,
> + ras_feat_ecs,
> + ras_feat_max
> +};
Enum values in uppercase, please.
> +
> +struct edac_ecs_ex_info {
> + u16 num_media_frus;
> +};
> +
> +/*
> + * EDAC RAS feature information structure
> + */
> +struct edac_scrub_data {
> + const struct edac_scrub_ops *ops;
> + void *private;
> +};
> +
> +struct edac_ecs_data {
> + const struct edac_ecs_ops *ops;
> + void *private;
> +};
> +
> +struct device;
> +
> +struct edac_ras_feat_ctx {
> + struct device dev;
> + void *private;
> + struct edac_scrub_data scrub;
> + struct edac_ecs_data ecs;
> +};
> +
> +struct edac_ras_feature {
> + enum edac_ras_feat feat;
> + union {
> + const struct edac_scrub_ops *scrub_ops;
> + const struct edac_ecs_ops *ecs_ops;
> + };
> + union {
> + struct edac_ecs_ex_info ecs_info;
> + };
I would place the variable structs union at the end. This may help with
alignments, if you place the pointers earlier.
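To make the alignment remark concrete, here is a small standalone sketch comparing member orders with sizeof and offsetof. The struct names are hypothetical and the members are simplified stand-ins for the unions in struct edac_ras_feature. The third layout (pointers before every small member) is an extra variant for comparison, not something proposed in the thread. Actual sizes depend on the ABI, so the code reports them rather than hard-coding numbers.

```c
#include <assert.h>
#include <stddef.h>

enum feat_kind { FEAT_SCRUB, FEAT_ECS };

struct ecs_info { unsigned short num_media_frus; };

/* Member order as posted: the small ecs_info sits between two pointers. */
struct feature_as_posted {
	enum feat_kind feat;
	const void *ops;
	struct ecs_info info;
	void *ctx;
};

/* Order suggested in the review: pointers earlier, small struct last. */
struct feature_info_last {
	enum feat_kind feat;
	const void *ops;
	void *ctx;
	struct ecs_info info;
};

/* Extra variant for comparison: pointers before all small members. */
struct feature_pointers_first {
	const void *ops;
	void *ctx;
	enum feat_kind feat;
	struct ecs_info info;
};

/* Padding the compiler inserts between info and ctx in the posted order. */
static size_t padding_before_ctx(void)
{
	return offsetof(struct feature_as_posted, ctx) -
	       (offsetof(struct feature_as_posted, info) +
		sizeof(struct ecs_info));
}
```

Printing sizeof() for the three structs on the target of interest shows whether the reordering actually saves space there; on common 64-bit ABIs the padding mostly moves to the end of the struct unless the pointers also precede the enum.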
> + union {
> + void *scrub_ctx;
> + void *ecs_ctx;
> + };
> +};
> +
> +int edac_ras_dev_register(struct device *parent, char *dev_name,
> + void *parent_pvt_data, int num_features,
> + const struct edac_ras_feature *ras_features);
> +#endif /* __EDAC_RAS_FEAT_H */
Thanks,
Mauro
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [RFC PATCH v9 01/11] EDAC: Add generic EDAC RAS feature driver
2024-07-17 10:00 ` Mauro Carvalho Chehab
@ 2024-07-17 11:01 ` Shiju Jose
2024-07-18 6:19 ` Mauro Carvalho Chehab
0 siblings, 1 reply; 30+ messages in thread
From: Shiju Jose @ 2024-07-17 11:01 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: linux-edac@vger.kernel.org, linux-cxl@vger.kernel.org,
linux-acpi@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, bp@alien8.de, tony.luck@intel.com,
rafael@kernel.org, lenb@kernel.org, mchehab@kernel.org,
dan.j.williams@intel.com, dave@stgolabs.net, Jonathan Cameron,
dave.jiang@intel.com, alison.schofield@intel.com,
vishal.l.verma@intel.com, ira.weiny@intel.com, david@redhat.com,
Vilas.Sridharan@amd.com, leo.duran@amd.com, Yazen.Ghannam@amd.com,
rientjes@google.com, jiaqiyan@google.com, Jon.Grimm@amd.com,
dave.hansen@linux.intel.com, naoya.horiguchi@nec.com,
james.morse@arm.com, jthoughton@google.com,
somasundaram.a@hpe.com, erdemaktas@google.com, pgonda@google.com,
duenwen@google.com, mike.malvestuto@intel.com, gthelen@google.com,
wschwartz@amperecomputing.com, dferguson@amperecomputing.com,
wbs@os.amperecomputing.com, nifan.cxl@gmail.com, tanxiaofei,
Zengtao (B), Roberto Sassu, kangkang.shen@futurewei.com,
wanghuiqiang, Linuxarm
Hi Mauro,
Thanks for the feedback.
>-----Original Message-----
>From: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
>Sent: 17 July 2024 11:00
>To: Shiju Jose <shiju.jose@huawei.com>
>Cc: linux-edac@vger.kernel.org; linux-cxl@vger.kernel.org; linux-
>acpi@vger.kernel.org; linux-mm@kvack.org; linux-kernel@vger.kernel.org;
>bp@alien8.de; tony.luck@intel.com; rafael@kernel.org; lenb@kernel.org;
>mchehab@kernel.org; dan.j.williams@intel.com; dave@stgolabs.net; Jonathan
>Cameron <jonathan.cameron@huawei.com>; dave.jiang@intel.com;
>alison.schofield@intel.com; vishal.l.verma@intel.com; ira.weiny@intel.com;
>david@redhat.com; Vilas.Sridharan@amd.com; leo.duran@amd.com;
>Yazen.Ghannam@amd.com; rientjes@google.com; jiaqiyan@google.com;
>Jon.Grimm@amd.com; dave.hansen@linux.intel.com;
>naoya.horiguchi@nec.com; james.morse@arm.com; jthoughton@google.com;
>somasundaram.a@hpe.com; erdemaktas@google.com; pgonda@google.com;
>duenwen@google.com; mike.malvestuto@intel.com; gthelen@google.com;
>wschwartz@amperecomputing.com; dferguson@amperecomputing.com;
>wbs@os.amperecomputing.com; nifan.cxl@gmail.com; tanxiaofei
><tanxiaofei@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>; Roberto
>Sassu <roberto.sassu@huawei.com>; kangkang.shen@futurewei.com;
>wanghuiqiang <wanghuiqiang@huawei.com>; Linuxarm
><linuxarm@huawei.com>
>Subject: Re: [RFC PATCH v9 01/11] EDAC: Add generic EDAC RAS feature driver
>
>On Tue, 16 Jul 2024 16:03:25 +0100
><shiju.jose@huawei.com> wrote:
>
>> From: Shiju Jose <shiju.jose@huawei.com>
>>
>> Add generic EDAC driver supports registering RAS features supported in
>> the system. The driver exposes feature's control attributes to the
>> userspace in /sys/bus/edac/devices/<dev-name>/<ras-feature>/
>>
>> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
>> ---
>> drivers/edac/Makefile | 1 +
>> drivers/edac/edac_ras_feature.c | 155 +++++++++++++++++++++++++++++++
>> include/linux/edac_ras_feature.h | 66 +++++++++++++
>> 3 files changed, 222 insertions(+)
>> create mode 100755 drivers/edac/edac_ras_feature.c
>> create mode 100755 include/linux/edac_ras_feature.h
>>
>> diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
>> index 9c09893695b7..c532b57a6d8a 100644
>> --- a/drivers/edac/Makefile
>> +++ b/drivers/edac/Makefile
>> @@ -10,6 +10,7 @@ obj-$(CONFIG_EDAC) := edac_core.o
>>
>> edac_core-y := edac_mc.o edac_device.o edac_mc_sysfs.o
>> edac_core-y += edac_module.o edac_device_sysfs.o wq.o
>> +edac_core-y += edac_ras_feature.o
>>
>> edac_core-$(CONFIG_EDAC_DEBUG) += debugfs.o
>>
>> diff --git a/drivers/edac/edac_ras_feature.c b/drivers/edac/edac_ras_feature.c
>> new file mode 100755
>> index 000000000000..24a729fea66f
>> --- /dev/null
>> +++ b/drivers/edac/edac_ras_feature.c
>> @@ -0,0 +1,155 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * EDAC RAS control feature driver supports registering RAS
>> + * features with the EDAC and exposes the feature's control
>> + * attributes to the userspace in sysfs.
>> + *
>> + * Copyright (c) 2024 HiSilicon Limited.
>> + */
>> +
>
>> +#define pr_fmt(fmt) "EDAC RAS CONTROL FEAT: " fmt
>
>This prefix sounds too long for my taste.
Will do. Previously it was "EDAC RAS FEAT"
>
>> +
>> +#include <linux/edac_ras_feature.h>
>> +
>> +static void edac_ras_dev_release(struct device *dev)
>> +{
>> + struct edac_ras_feat_ctx *ctx =
>> + container_of(dev, struct edac_ras_feat_ctx, dev);
>> +
>> + kfree(ctx);
>> +}
>> +
>> +const struct device_type edac_ras_dev_type = {
>> + .name = "edac_ras_dev",
>> + .release = edac_ras_dev_release,
>> +};
>> +
>> +static void edac_ras_dev_unreg(void *data)
>> +{
>> + device_unregister(data);
>> +}
>> +
>> +static int edac_ras_feat_scrub_init(struct device *parent,
>> + struct edac_scrub_data *sdata,
>> + const struct edac_ras_feature *sfeat,
>> + const struct attribute_group **attr_groups)
>> +{
>> + sdata->ops = sfeat->scrub_ops;
>> + sdata->private = sfeat->scrub_ctx;
>> +
>> + return 1;
>> +}
>> +
>> +static int edac_ras_feat_ecs_init(struct device *parent,
>> + struct edac_ecs_data *edata,
>> + const struct edac_ras_feature *efeat,
>> + const struct attribute_group **attr_groups)
>> +{
>> + int num = efeat->ecs_info.num_media_frus;
>> +
>> + edata->ops = efeat->ecs_ops;
>> + edata->private = efeat->ecs_ctx;
>> +
>> + return num;
>> +}
>
>I would place this function earlier and/or add some documentation for the above
>two functions.
Will do. I guess you want to place these functions above edac_ras_dev_release(), right?
>
>I got confused when I reviewed the first function and saw an
>unconditional:
The calls to the feature-specific init functions are added here by the later feature-specific patches
of this series.
>
> return 1;
>
>Now, I guess the goal is to return the number of initialized features, right?
It returns the number of attr groups added for a feature, because the number of instances of a feature is dynamic,
e.g. the number of FRUs for the ECS feature.
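The return-value convention described here can be sketched in userspace as follows: each per-feature init fills its slice of a NULL-terminated group array and returns how many slots it used, and the caller advances its index by that amount. This is only an illustration under assumptions. struct attr_group, scrub_init(), ecs_init() and build_groups() are hypothetical names, and in the posted patch the init functions do not yet fill any slots.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct attribute_group. */
struct attr_group { const char *name; };

static const struct attr_group scrub_group = { "scrub" };
static const struct attr_group ecs_group = { "ecs_fru" };

/* Scrub contributes exactly one group, hence an unconditional "return 1". */
static int scrub_init(const struct attr_group **slot)
{
	slot[0] = &scrub_group;
	return 1;
}

/* ECS contributes one group per media FRU, so its count is dynamic. */
static int ecs_init(const struct attr_group **slot, int num_media_frus)
{
	int i;

	for (i = 0; i < num_media_frus; i++)
		slot[i] = &ecs_group;

	return num_media_frus;
}

/*
 * Builds a NULL-terminated array for one scrub feature plus one ECS
 * feature. The caller owns the result and releases it with free().
 */
static const struct attr_group **build_groups(int num_media_frus, int *out_cnt)
{
	const struct attr_group **groups;
	int cnt = 0;

	if (num_media_frus < 0)
		return NULL;

	/* One scrub group, N ECS groups, one terminator. */
	groups = calloc((size_t)num_media_frus + 2, sizeof(*groups));
	if (!groups)
		return NULL;

	cnt += scrub_init(&groups[cnt]);
	cnt += ecs_init(&groups[cnt], num_media_frus);
	groups[cnt] = NULL;

	*out_cnt = cnt;
	return groups;
}
```

With three FRUs the caller ends up with four groups followed by the terminator, which matches the count the first pass would have reserved.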
>
>> +
>> +/**
>> + * edac_ras_dev_register - register device for ras features with edac
>> + * @parent: client device.
>> + * @name: client device's name.
>> + * @private: parent driver's data to store in the context if any.
>> + * @num_features: number of ras features to register.
>> + * @ras_features: list of ras features to register.
>> + *
>> + * Returns 0 on success, error otherwise.
>> + * The new edac_ras_feat_ctx would be freed automatically.
>> + */
>> +int edac_ras_dev_register(struct device *parent, char *name,
>> + void *private, int num_features,
>> + const struct edac_ras_feature *ras_features)
>> +{
>> + const struct attribute_group **ras_attr_groups;
>> + struct edac_ras_feat_ctx *ctx;
>> + int attr_gcnt = 0;
>> + int ret, feat;
>> +
>> + if (!parent || !name || !num_features || !ras_features)
>> + return -EINVAL;
>> +
>> + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
>> + if (!ctx)
>> + return -ENOMEM;
>> +
>> + ctx->dev.parent = parent;
>> + ctx->private = private;
>> +
>> + /* Double parse so we can make space for attributes */
>> + for (feat = 0; feat < num_features; feat++) {
>> + switch (ras_features[feat].feat) {
>> + case ras_feat_scrub:
>> + attr_gcnt++;
>> + break;
>> + case ras_feat_ecs:
>> + attr_gcnt += ras_features[feat].ecs_info.num_media_frus;
>> + break;
>
>As already suggested, the enum names shall be in uppercase.
>Having a lowercase one here looks really weird.
Agree.
>
>> + default:
>> + ret = -EINVAL;
>> + goto ctx_free;
>> + }
>> + }
>
>I would place this logic earlier, before allocating ctx, as, in case of errors, the
>function can just call "return -EINVAL".
Ok.
>
>> +
>> + ras_attr_groups = devm_kzalloc(parent,
>> + (attr_gcnt + 1) * sizeof(*ras_attr_groups),
>> + GFP_KERNEL);
>
>Hmm... why are you using the devm variant here, and the non-devm one for ctx?
>
>My personal preference is to avoid devm variants, as memory is only freed
>when the device refcount becomes zero (which, depending on the driver, may
>never happen in practice, as driver core may keep a refcount, depending on how
>the device was probed).
I can use kzalloc() instead, and will then need to free ras_attr_groups on the error paths etc.
>
>> + if (!ras_attr_groups) {
>> + ret = -ENOMEM;
>> + goto ctx_free;
>> + }
>> +
>> + attr_gcnt = 0;
>> + for (feat = 0; feat < num_features; feat++, ras_features++) {
>> + if (ras_features->feat == ras_feat_scrub) {
>
>I would use a switch here as well, just like the previous feature type check.
Will do.
>
>> + if (!ras_features->scrub_ops)
>> + continue;
>> + ret = edac_ras_feat_scrub_init(parent, &ctx->scrub,
>> + ras_features, &ras_attr_groups[attr_gcnt]);
>
>I don't think it is worth having those ancillary functions here...
>
>> + if (ret < 0)
>> + goto ctx_free;
>> +
>> + attr_gcnt += ret;
>> + } else if (ras_features->feat == ras_feat_ecs) {
>> + if (!ras_features->ecs_ops)
>> + continue;
>> + ret = edac_ras_feat_ecs_init(parent, &ctx->ecs,
>> + ras_features, &ras_attr_groups[attr_gcnt]);
>
>and here, as most of the current functions are very simple:
>
>both just set two fields:
>
> edata->ops
> edata->private
>
>and the returned values are always a positive counter...
>
>> + if (ret < 0)
>> + goto ctx_free;
>
>So this check, for instance, doesn't make sense.
The calls to the feature-specific init functions are added in the later feature-specific patches
of this series, and those calls could return an error.
>
>> +
>> + attr_gcnt += ret;
>> + } else {
>> + ret = -EINVAL;
>> + goto ctx_free;
>> + }
>> + }
>> + ras_attr_groups[attr_gcnt] = NULL;
>> + ctx->dev.bus = edac_get_sysfs_subsys();
>> + ctx->dev.type = &edac_ras_dev_type;
>> + ctx->dev.groups = ras_attr_groups;
>> + dev_set_drvdata(&ctx->dev, ctx);
>> + ret = dev_set_name(&ctx->dev, name);
>> + if (ret)
>> + goto ctx_free;
>> +
>> + ret = device_register(&ctx->dev);
>> + if (ret) {
>> + put_device(&ctx->dev);
>> + return ret;
>> + }
>> +
>> + return devm_add_action_or_reset(parent, edac_ras_dev_unreg, &ctx->dev);
>> +
>> +ctx_free:
>> + kfree(ctx);
>> + return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(edac_ras_dev_register);
>> diff --git a/include/linux/edac_ras_feature.h b/include/linux/edac_ras_feature.h
>> new file mode 100755
>> index 000000000000..000e99141023
>> --- /dev/null
>> +++ b/include/linux/edac_ras_feature.h
>> @@ -0,0 +1,66 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + * EDAC RAS control features.
>> + *
>> + * Copyright (c) 2024 HiSilicon Limited.
>> + */
>> +
>> +#ifndef __EDAC_RAS_FEAT_H
>> +#define __EDAC_RAS_FEAT_H
>> +
>> +#include <linux/types.h>
>> +#include <linux/edac.h>
>> +
>> +#define EDAC_RAS_NAME_LEN 128
>> +
>> +enum edac_ras_feat {
>> + ras_feat_scrub,
>> + ras_feat_ecs,
>> + ras_feat_max
>> +};
>
>Enum values in uppercase, please.
Will do.
>
>> +
>> +struct edac_ecs_ex_info {
>> + u16 num_media_frus;
>> +};
>> +
>> +/*
>> + * EDAC RAS feature information structure
>> + */
>> +struct edac_scrub_data {
>> + const struct edac_scrub_ops *ops;
>> + void *private;
>> +};
>> +
>> +struct edac_ecs_data {
>> + const struct edac_ecs_ops *ops;
>> + void *private;
>> +};
>> +
>> +struct device;
>> +
>> +struct edac_ras_feat_ctx {
>> + struct device dev;
>> + void *private;
>> + struct edac_scrub_data scrub;
>> + struct edac_ecs_data ecs;
>> +};
>> +
>> +struct edac_ras_feature {
>> + enum edac_ras_feat feat;
>> + union {
>> + const struct edac_scrub_ops *scrub_ops;
>> + const struct edac_ecs_ops *ecs_ops;
>> + };
>> + union {
>> + struct edac_ecs_ex_info ecs_info;
>> + };
>
>I would place the variable structs union at the end. This may help with
>alignments, if you place the pointers earlier.
Will do.
>
>> + union {
>> + void *scrub_ctx;
>> + void *ecs_ctx;
>> + };
>> +};
>> +
>> +int edac_ras_dev_register(struct device *parent, char *dev_name,
>> + void *parent_pvt_data, int num_features,
>> + const struct edac_ras_feature *ras_features);
>> +#endif /* __EDAC_RAS_FEAT_H */
>
>
>
>Thanks,
>Mauro
>
Thanks,
Shiju
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [RFC PATCH v9 01/11] EDAC: Add generic EDAC RAS feature driver
2024-07-16 18:00 ` fan
@ 2024-07-17 11:06 ` Shiju Jose
0 siblings, 0 replies; 30+ messages in thread
From: Shiju Jose @ 2024-07-17 11:06 UTC (permalink / raw)
To: fan
Cc: linux-edac@vger.kernel.org, linux-cxl@vger.kernel.org,
linux-acpi@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, bp@alien8.de, tony.luck@intel.com,
rafael@kernel.org, lenb@kernel.org, mchehab@kernel.org,
dan.j.williams@intel.com, dave@stgolabs.net, Jonathan Cameron,
dave.jiang@intel.com, alison.schofield@intel.com,
vishal.l.verma@intel.com, ira.weiny@intel.com, david@redhat.com,
Vilas.Sridharan@amd.com, leo.duran@amd.com, Yazen.Ghannam@amd.com,
rientjes@google.com, jiaqiyan@google.com, Jon.Grimm@amd.com,
dave.hansen@linux.intel.com, naoya.horiguchi@nec.com,
james.morse@arm.com, jthoughton@google.com,
somasundaram.a@hpe.com, erdemaktas@google.com, pgonda@google.com,
duenwen@google.com, mike.malvestuto@intel.com, gthelen@google.com,
wschwartz@amperecomputing.com, dferguson@amperecomputing.com,
wbs@os.amperecomputing.com, tanxiaofei, Zengtao (B),
Roberto Sassu, kangkang.shen@futurewei.com, wanghuiqiang,
Linuxarm
Hi Fan,
Thanks for the feedback.
>-----Original Message-----
>From: fan <nifan.cxl@gmail.com>
>Sent: 16 July 2024 19:01
>To: Shiju Jose <shiju.jose@huawei.com>
>Cc: linux-edac@vger.kernel.org; linux-cxl@vger.kernel.org; linux-
>acpi@vger.kernel.org; linux-mm@kvack.org; linux-kernel@vger.kernel.org;
>bp@alien8.de; tony.luck@intel.com; rafael@kernel.org; lenb@kernel.org;
>mchehab@kernel.org; dan.j.williams@intel.com; dave@stgolabs.net; Jonathan
>Cameron <jonathan.cameron@huawei.com>; dave.jiang@intel.com;
>alison.schofield@intel.com; vishal.l.verma@intel.com; ira.weiny@intel.com;
>david@redhat.com; Vilas.Sridharan@amd.com; leo.duran@amd.com;
>Yazen.Ghannam@amd.com; rientjes@google.com; jiaqiyan@google.com;
>Jon.Grimm@amd.com; dave.hansen@linux.intel.com;
>naoya.horiguchi@nec.com; james.morse@arm.com; jthoughton@google.com;
>somasundaram.a@hpe.com; erdemaktas@google.com; pgonda@google.com;
>duenwen@google.com; mike.malvestuto@intel.com; gthelen@google.com;
>wschwartz@amperecomputing.com; dferguson@amperecomputing.com;
>wbs@os.amperecomputing.com; nifan.cxl@gmail.com; tanxiaofei
><tanxiaofei@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>; Roberto
>Sassu <roberto.sassu@huawei.com>; kangkang.shen@futurewei.com;
>wanghuiqiang <wanghuiqiang@huawei.com>; Linuxarm
><linuxarm@huawei.com>
>Subject: Re: [RFC PATCH v9 01/11] EDAC: Add generic EDAC RAS feature driver
>
>On Tue, Jul 16, 2024 at 04:03:25PM +0100, shiju.jose@huawei.com wrote:
>> From: Shiju Jose <shiju.jose@huawei.com>
>>
>> Add generic EDAC driver supports registering RAS features supported in
>> the system. The driver exposes feature's control attributes to the
>> userspace in /sys/bus/edac/devices/<dev-name>/<ras-feature>/
>>
>> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
>> ---
>> drivers/edac/Makefile | 1 +
>> drivers/edac/edac_ras_feature.c | 155 +++++++++++++++++++++++++++++++
>> include/linux/edac_ras_feature.h | 66 +++++++++++++
>> 3 files changed, 222 insertions(+)
>> create mode 100755 drivers/edac/edac_ras_feature.c
>> create mode 100755 include/linux/edac_ras_feature.h
>>
>> diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
>> index 9c09893695b7..c532b57a6d8a 100644
>> --- a/drivers/edac/Makefile
>> +++ b/drivers/edac/Makefile
>> @@ -10,6 +10,7 @@ obj-$(CONFIG_EDAC) := edac_core.o
>>
>> edac_core-y := edac_mc.o edac_device.o edac_mc_sysfs.o
>> edac_core-y += edac_module.o edac_device_sysfs.o wq.o
>> +edac_core-y += edac_ras_feature.o
>>
>> edac_core-$(CONFIG_EDAC_DEBUG) += debugfs.o
>>
>> diff --git a/drivers/edac/edac_ras_feature.c b/drivers/edac/edac_ras_feature.c
>> new file mode 100755
>> index 000000000000..24a729fea66f
>> --- /dev/null
>> +++ b/drivers/edac/edac_ras_feature.c
>> @@ -0,0 +1,155 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * EDAC RAS control feature driver supports registering RAS
>> + * features with the EDAC and exposes the feature's control
>> + * attributes to the userspace in sysfs.
>> + *
>> + * Copyright (c) 2024 HiSilicon Limited.
>> + */
>> +
>> +#define pr_fmt(fmt) "EDAC RAS CONTROL FEAT: " fmt
>> +
>> +#include <linux/edac_ras_feature.h>
>> +
>> +static void edac_ras_dev_release(struct device *dev)
>> +{
>> + struct edac_ras_feat_ctx *ctx =
>> + container_of(dev, struct edac_ras_feat_ctx, dev);
>> +
>> + kfree(ctx);
>> +}
>> +
>> +const struct device_type edac_ras_dev_type = {
>> + .name = "edac_ras_dev",
>> + .release = edac_ras_dev_release,
>> +};
>> +
>> +static void edac_ras_dev_unreg(void *data)
>> +{
>> + device_unregister(data);
>> +}
>> +
>> +static int edac_ras_feat_scrub_init(struct device *parent,
>> + struct edac_scrub_data *sdata,
>> + const struct edac_ras_feature *sfeat,
>> + const struct attribute_group **attr_groups)
>> +{
>> + sdata->ops = sfeat->scrub_ops;
>> + sdata->private = sfeat->scrub_ctx;
>> +
>> + return 1;
>> +}
>> +
>> +static int edac_ras_feat_ecs_init(struct device *parent,
>> + struct edac_ecs_data *edata,
>> + const struct edac_ras_feature *efeat,
>> + const struct attribute_group **attr_groups)
>> +{
>> + int num = efeat->ecs_info.num_media_frus;
>> +
>> + edata->ops = efeat->ecs_ops;
>> + edata->private = efeat->ecs_ctx;
>> +
>> + return num;
>> +}
>> +
>> +/**
>> + * edac_ras_dev_register - register device for ras features with edac
>> + * @parent: client device.
>> + * @name: client device's name.
>> + * @private: parent driver's data to store in the context if any.
>> + * @num_features: number of ras features to register.
>> + * @ras_features: list of ras features to register.
>> + *
>> + * Returns 0 on success, error otherwise.
>> + * The new edac_ras_feat_ctx would be freed automatically.
>> + */
>> +int edac_ras_dev_register(struct device *parent, char *name,
>> + void *private, int num_features,
>> + const struct edac_ras_feature *ras_features)
>> +{
>> + const struct attribute_group **ras_attr_groups;
>> + struct edac_ras_feat_ctx *ctx;
>> + int attr_gcnt = 0;
>> + int ret, feat;
>> +
>> + if (!parent || !name || !num_features || !ras_features)
>> + return -EINVAL;
>> +
>> + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
>> + if (!ctx)
>> + return -ENOMEM;
>> +
>> + ctx->dev.parent = parent;
>> + ctx->private = private;
>> +
>> + /* Double parse so we can make space for attributes */
>> + for (feat = 0; feat < num_features; feat++) {
>> + switch (ras_features[feat].feat) {
>> + case ras_feat_scrub:
>> + attr_gcnt++;
>> + break;
>> + case ras_feat_ecs:
>> + attr_gcnt += ras_features[feat].ecs_info.num_media_frus;
>> + break;
>> + default:
>> + ret = -EINVAL;
>> + goto ctx_free;
>> + }
>> + }
>> +
>> + ras_attr_groups = devm_kzalloc(parent,
>> + (attr_gcnt + 1) * sizeof(*ras_attr_groups),
>> + GFP_KERNEL);
>> + if (!ras_attr_groups) {
>> + ret = -ENOMEM;
>> + goto ctx_free;
>> + }
>> +
>> + attr_gcnt = 0;
>> + for (feat = 0; feat < num_features; feat++, ras_features++) {
>> + if (ras_features->feat == ras_feat_scrub) {
>> + if (!ras_features->scrub_ops)
>> + continue;
>> + ret = edac_ras_feat_scrub_init(parent, &ctx->scrub,
>> + ras_features,
>> + &ras_attr_groups[attr_gcnt]);
>> + if (ret < 0)
>> + goto ctx_free;
>> +
>> + attr_gcnt += ret;
>> + } else if (ras_features->feat == ras_feat_ecs) {
>> + if (!ras_features->ecs_ops)
>> + continue;
>> + ret = edac_ras_feat_ecs_init(parent, &ctx->ecs,
>> + ras_features,
>> + &ras_attr_groups[attr_gcnt]);
>> + if (ret < 0)
>> + goto ctx_free;
>> +
>> + attr_gcnt += ret;
>> + } else {
>> + ret = -EINVAL;
>> + goto ctx_free;
>We already check this in the first pass, cannot be reached in the second pass.
Will change.
>> + }
>Why use if/else instead of using switch/case as above?
Will do.
>> + }
>> + ras_attr_groups[attr_gcnt] = NULL;
>> + ctx->dev.bus = edac_get_sysfs_subsys();
>> + ctx->dev.type = &edac_ras_dev_type;
>> + ctx->dev.groups = ras_attr_groups;
>> + dev_set_drvdata(&ctx->dev, ctx);
>> + ret = dev_set_name(&ctx->dev, name);
>> + if (ret)
>> + goto ctx_free;
>> +
>> + ret = device_register(&ctx->dev);
>> + if (ret) {
>> + put_device(&ctx->dev);
>need to free ctx?
Will fix.
>> + return ret;
>> + }
>> +
>> + return devm_add_action_or_reset(parent, edac_ras_dev_unreg, &ctx->dev);
>> +
>> +ctx_free:
>> + kfree(ctx);
>> + return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(edac_ras_dev_register);
>> diff --git a/include/linux/edac_ras_feature.h b/include/linux/edac_ras_feature.h
>> new file mode 100755
>> index 000000000000..000e99141023
>> --- /dev/null
>> +++ b/include/linux/edac_ras_feature.h
>> @@ -0,0 +1,66 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + * EDAC RAS control features.
>> + *
>> + * Copyright (c) 2024 HiSilicon Limited.
>> + */
>> +
>> +#ifndef __EDAC_RAS_FEAT_H
>> +#define __EDAC_RAS_FEAT_H
>> +
>> +#include <linux/types.h>
>> +#include <linux/edac.h>
>> +
>> +#define EDAC_RAS_NAME_LEN 128
>> +
>> +enum edac_ras_feat {
>> + ras_feat_scrub,
>> + ras_feat_ecs,
>> + ras_feat_max
>> +};
>Use uppercase for the strings.
Will do.
>
>Fan
>> +
>> +struct edac_ecs_ex_info {
>> + u16 num_media_frus;
>> +};
>> +
>> +/*
>> + * EDAC RAS feature information structure
>> + */
>> +struct edac_scrub_data {
>> + const struct edac_scrub_ops *ops;
>> + void *private;
>> +};
>> +
>> +struct edac_ecs_data {
>> + const struct edac_ecs_ops *ops;
>> + void *private;
>> +};
>> +
>> +struct device;
>> +
>> +struct edac_ras_feat_ctx {
>> + struct device dev;
>> + void *private;
>> + struct edac_scrub_data scrub;
>> + struct edac_ecs_data ecs;
>> +};
>> +
>> +struct edac_ras_feature {
>> + enum edac_ras_feat feat;
>> + union {
>> + const struct edac_scrub_ops *scrub_ops;
>> + const struct edac_ecs_ops *ecs_ops;
>> + };
>> + union {
>> + struct edac_ecs_ex_info ecs_info;
>> + };
>> + union {
>> + void *scrub_ctx;
>> + void *ecs_ctx;
>> + };
>> +};
>> +
>> +int edac_ras_dev_register(struct device *parent, char *dev_name,
>> + void *parent_pvt_data, int num_features,
>> + const struct edac_ras_feature *ras_features);
>> +#endif /* __EDAC_RAS_FEAT_H */
>> --
>> 2.34.1
>>
Thanks,
Shiju
* Re: [RFC PATCH v9 02/11] EDAC: Add EDAC scrub control driver
2024-07-16 15:03 ` [RFC PATCH v9 02/11] EDAC: Add EDAC scrub control driver shiju.jose
@ 2024-07-17 12:56 ` Mauro Carvalho Chehab
2024-07-17 14:07 ` Shiju Jose
0 siblings, 1 reply; 30+ messages in thread
From: Mauro Carvalho Chehab @ 2024-07-17 12:56 UTC (permalink / raw)
To: shiju.jose
Cc: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel, bp,
tony.luck, rafael, lenb, mchehab, dan.j.williams, dave,
jonathan.cameron, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny, david, Vilas.Sridharan, leo.duran, Yazen.Ghannam,
rientjes, jiaqiyan, Jon.Grimm, dave.hansen, naoya.horiguchi,
james.morse, jthoughton, somasundaram.a, erdemaktas, pgonda,
duenwen, mike.malvestuto, gthelen, wschwartz, dferguson, wbs,
nifan.cxl, tanxiaofei, prime.zeng, roberto.sassu, kangkang.shen,
wanghuiqiang, linuxarm
Em Tue, 16 Jul 2024 16:03:26 +0100
<shiju.jose@huawei.com> escreveu:
> From: Shiju Jose <shiju.jose@huawei.com>
>
> Add a generic EDAC scrub control driver that supports configuring the memory
> scrubbers in the system. A device with the scrub feature gets the scrub
> descriptor from the EDAC scrub driver and registers with the EDAC RAS feature
> driver, which adds the sysfs scrub control interface. The scrub control
> attributes are available to userspace in /sys/bus/edac/devices/<dev-name>/scrub/.
>
> The generic EDAC scrub driver and the common sysfs scrub interface promote
> unambiguous access from userspace irrespective of the underlying scrub
> devices.
>
> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
> ---
> Documentation/ABI/testing/sysfs-edac-scrub | 64 +++++
> drivers/edac/Makefile | 2 +-
> drivers/edac/edac_ras_feature.c | 1 +
> drivers/edac/edac_scrub.c | 312 +++++++++++++++++++++
> include/linux/edac_ras_feature.h | 28 ++
> 5 files changed, 406 insertions(+), 1 deletion(-)
> create mode 100644 Documentation/ABI/testing/sysfs-edac-scrub
> create mode 100755 drivers/edac/edac_scrub.c
>
> diff --git a/Documentation/ABI/testing/sysfs-edac-scrub b/Documentation/ABI/testing/sysfs-edac-scrub
> new file mode 100644
> index 000000000000..dd19afd5e165
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-edac-scrub
> @@ -0,0 +1,64 @@
> +What: /sys/bus/edac/devices/<dev-name>/scrub
> +Date: Oct 2024
> +KernelVersion: 6.12
> +Contact: linux-edac@vger.kernel.org
> +Description:
> + The sysfs edac bus devices /<dev-name>/scrub subdirectory
> + belongs to the memory scrub control feature, where <dev-name>
> + directory corresponds to a device/memory region registered
> + with the edac scrub driver and thus registered with the
> + generic edac ras driver too.
> +
> +What: /sys/bus/edac/devices/<dev-name>/scrub/addr_range_base
> +Date: Oct 2024
> +KernelVersion: 6.12
> +Contact: linux-edac@vger.kernel.org
> +Description:
> + (RW) The base of the address range of the memory region
> + to be scrubbed (on-demand scrubbing).
> +
> +What: /sys/bus/edac/devices/<dev-name>/scrub/addr_range_size
> +Date: Oct 2024
> +KernelVersion: 6.12
> +Contact: linux-edac@vger.kernel.org
> +Description:
> + (RW) The size of the address range of the memory region
> + to be scrubbed (on-demand scrubbing).
> +
> +What: /sys/bus/edac/devices/<dev-name>/scrub/enable_background
> +Date: Oct 2024
> +KernelVersion: 6.12
> +Contact: linux-edac@vger.kernel.org
> +Description:
> + (RW) Start/Stop background(patrol) scrubbing if supported.
> +
> +What: /sys/bus/edac/devices/<dev-name>/scrub/enable_on_demand
> +Date: Oct 2024
> +KernelVersion: 6.12
> +Contact: linux-edac@vger.kernel.org
> +Description:
> + (RW) Start/Stop on-demand scrubbing the memory region
> + if supported.
This is a generic comment for all sysfs calls: what happens if the
feature is not supported?
There are a couple of ways to implement it, like:
1. Don't create the attribute;
2. Return an error code (-ENOENT? -EINVAL?) when trying to read or
write to the devnode - please detail the error code(s) used;
In any case, please define the behavior and document it.
From what I see, you're setting 0444 on RW nodes when write
is not enabled, but it is still possible for RO not to be
supported either. This is especially true as technology evolves, as
memory controllers and different types of memories may have
very different ways to control it[1].
[1] If you're curious enough, one legacy example of memories
implemented in a very different way is Fully Buffered DIMMs,
where each DIMM had its own internal chipset to offload
certain tasks, including scrubbing and ECC implementation.
It did not succeed in the long term, as it required
special DIMMs for the server market, reducing the production
scale, but it is an interesting example of how hardware
designs can be innovative and break existing paradigms.
The FB-DIMM design actually forced a redesign of the EDAC
subsystem, as it was too centered on how one specific type
of memory controller worked.
> +
> +What: /sys/bus/edac/devices/<dev-name>/scrub/name
> +Date: Oct 2024
> +KernelVersion: 6.12
> +Contact: linux-edac@vger.kernel.org
> +Description:
> + (RO) name of the memory scrubber
> +
> +What: /sys/bus/edac/devices/<dev-name>/scrub/cycle_in_hours_available
> +Date: Oct 2024
> +KernelVersion: 6.12
> +Contact: linux-edac@vger.kernel.org
> +Description:
> + (RO) Supported range for the scrub cycle in hours by the
> + memory scrubber.
> +
> +What: /sys/bus/edac/devices/<dev-name>/scrub/cycle_in_hours
> +Date: Oct 2024
> +KernelVersion: 6.12
> +Contact: linux-edac@vger.kernel.org
> +Description:
> + (RW) The scrub cycle in hours specified and it must be with in the
> + supported range by the memory scrubber.
Why specify it in hours? I would use seconds, as it is easy to
represent one hour as 3600 seconds, but you can't specify a cycle of,
let's say, 30min, if the minimum unit is one hour.
I mean, we never know how technology will evolve nor how manufacturers will
implement support for the scrubbing cycle on their chipsets.
> diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
> index c532b57a6d8a..de56cbd039eb 100644
> --- a/drivers/edac/Makefile
> +++ b/drivers/edac/Makefile
> @@ -10,7 +10,7 @@ obj-$(CONFIG_EDAC) := edac_core.o
>
> edac_core-y := edac_mc.o edac_device.o edac_mc_sysfs.o
> edac_core-y += edac_module.o edac_device_sysfs.o wq.o
> -edac_core-y += edac_ras_feature.o
> +edac_core-y += edac_ras_feature.o edac_scrub.o
>
> edac_core-$(CONFIG_EDAC_DEBUG) += debugfs.o
>
> diff --git a/drivers/edac/edac_ras_feature.c b/drivers/edac/edac_ras_feature.c
> index 24a729fea66f..48927f868372 100755
> --- a/drivers/edac/edac_ras_feature.c
> +++ b/drivers/edac/edac_ras_feature.c
> @@ -36,6 +36,7 @@ static int edac_ras_feat_scrub_init(struct device *parent,
> {
> sdata->ops = sfeat->scrub_ops;
> sdata->private = sfeat->scrub_ctx;
> + attr_groups[0] = edac_scrub_get_desc();
>
> return 1;
> }
> diff --git a/drivers/edac/edac_scrub.c b/drivers/edac/edac_scrub.c
> new file mode 100755
> index 000000000000..0b07eafd3551
> --- /dev/null
> +++ b/drivers/edac/edac_scrub.c
> @@ -0,0 +1,312 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Generic EDAC scrub driver supports controlling the memory
> + * scrubbers in the system and the common sysfs scrub interface
> + * promotes unambiguous access from the userspace.
> + *
> + * Copyright (c) 2024 HiSilicon Limited.
> + */
> +
> +#define pr_fmt(fmt) "EDAC SCRUB: " fmt
> +
> +#include <linux/edac_ras_feature.h>
> +
> +static ssize_t addr_range_base_show(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + char *buf)
> +{
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
> + u64 base, size;
> + int ret;
> +
> + ret = ops->read_range(ras_feat_dev->parent, ctx->scrub.private, &base, &size);
> + if (ret)
> + return ret;
Also a generic comment that applies to all devnodes: what if ops->read_range
is NULL? Shouldn't it be checked? Btw, you could use read_range == NULL
to implement error handling for unsupported features.
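A minimal userspace sketch of the NULL-callback check suggested here. The struct and function names are hypothetical stand-ins (not the patch's types), and -EOPNOTSUPP is an assumed error code; the thread does not settle on one.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct edac_scrub_ops. */
struct scrub_ops_sketch {
	int (*read_range)(void *drv_data, uint64_t *base, uint64_t *size);
};

/*
 * Model of a show handler: return a negative errno when the backend
 * provides no read_range callback instead of calling through NULL.
 */
static int scrub_read_base(const struct scrub_ops_sketch *ops, void *drv_data,
			   uint64_t *base)
{
	uint64_t size;

	assert(base != NULL);

	if (!ops || !ops->read_range)
		return -EOPNOTSUPP;	/* assumed code for "unsupported" */

	return ops->read_range(drv_data, base, &size);
}

/* Example backend used only to exercise the sketch. */
static int demo_read_range(void *drv_data, uint64_t *base, uint64_t *size)
{
	(void)drv_data;
	*base = 0x100000;
	*size = 0x2000;
	return 0;
}
```

Whichever code is chosen, it would need to be listed in the sysfs ABI file so userspace can tell "unsupported" apart from a transient failure.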
> +
> + return sysfs_emit(buf, "0x%llx\n", base);
> +}
> +
> +static ssize_t addr_range_size_show(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + char *buf)
> +{
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
> + u64 base, size;
> + int ret;
> +
> + ret = ops->read_range(ras_feat_dev->parent, ctx->scrub.private, &base, &size);
> + if (ret)
> + return ret;
> +
> + return sysfs_emit(buf, "0x%llx\n", size);
> +}
> +
> +static ssize_t addr_range_base_store(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + const char *buf, size_t len)
> +{
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
> + u64 base, size;
> + int ret;
> +
> + ret = ops->read_range(ras_feat_dev->parent, ctx->scrub.private, &base, &size);
> + if (ret)
> + return ret;
> +
> + ret = kstrtou64(buf, 16, &base);
I would use base 0, letting the parser expect "0x" for hexadecimal values.
Same for other *_store methods.
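To illustrate what base 0 changes, here is a userspace approximation built on strtoull(); it is a sketch, not the kernel's kstrtou64(), and parse_u64_auto is a hypothetical name. Note one consequence: with base 16 an unprefixed "1000" means 0x1000, while with base 0 it is read as decimal 1000, so the ABI text should state which form is expected.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Base-0 parse: "0x" selects hex, a leading "0" selects octal, anything
 * else is decimal. One trailing newline is accepted, since writes coming
 * from "echo" carry one.
 */
static int parse_u64_auto(const char *s, unsigned long long *out)
{
	unsigned long long val;
	char *end;

	assert(s != NULL && out != NULL);

	/* Require a leading digit so strtoull cannot skip spaces or wrap "-1". */
	if (!isdigit((unsigned char)*s))
		return -EINVAL;

	errno = 0;
	val = strtoull(s, &end, 0);
	if (errno == ERANGE)
		return -ERANGE;

	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;

	*out = val;
	return 0;
}
```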
> + if (ret < 0)
> + return ret;
> +
> + ret = ops->write_range(ras_feat_dev->parent, ctx->scrub.private, base, size);
> + if (ret)
> + return ret;
> +
> + return len;
> +}
> +
> +static ssize_t addr_range_size_store(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + const char *buf,
> + size_t len)
> +{
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
> + u64 base, size;
> + int ret;
> +
> + ret = ops->read_range(ras_feat_dev->parent, ctx->scrub.private, &base, &size);
> + if (ret)
> + return ret;
> +
> + ret = kstrtou64(buf, 16, &size);
> + if (ret < 0)
> + return ret;
> +
> + ret = ops->write_range(ras_feat_dev->parent, ctx->scrub.private, base, size);
> + if (ret)
> + return ret;
> +
> + return len;
> +}
> +
> +static ssize_t enable_background_store(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + const char *buf, size_t len)
> +{
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
> + bool enable;
> + int ret;
> +
> + ret = kstrtobool(buf, &enable);
> + if (ret < 0)
> + return ret;
> +
> + ret = ops->set_enabled_bg(ras_feat_dev->parent, ctx->scrub.private, enable);
> + if (ret)
> + return ret;
> +
> + return len;
> +}
> +
> +static ssize_t enable_background_show(struct device *ras_feat_dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
> + bool enable;
> + int ret;
> +
> + ret = ops->get_enabled_bg(ras_feat_dev->parent, ctx->scrub.private, &enable);
> + if (ret)
> + return ret;
> +
> + return sysfs_emit(buf, "%d\n", enable);
> +}
> +
> +static ssize_t enable_on_demand_show(struct device *ras_feat_dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
> + bool enable;
> + int ret;
> +
> + ret = ops->get_enabled_od(ras_feat_dev->parent, ctx->scrub.private, &enable);
> + if (ret)
> + return ret;
> +
> + return sysfs_emit(buf, "%d\n", enable);
> +}
> +
> +static ssize_t enable_on_demand_store(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + const char *buf, size_t len)
> +{
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
> + bool enable;
> + int ret;
> +
> + ret = kstrtobool(buf, &enable);
> + if (ret < 0)
> + return ret;
> +
> + ret = ops->set_enabled_od(ras_feat_dev->parent, ctx->scrub.private, enable);
> + if (ret)
> + return ret;
> +
> + return len;
> +}
> +
> +static ssize_t name_show(struct device *ras_feat_dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
> + int ret;
> +
> + ret = ops->get_name(ras_feat_dev->parent, ctx->scrub.private, buf);
> + if (ret)
> + return ret;
> +
> + return strlen(buf);
> +}
> +
> +static ssize_t cycle_in_hours_show(struct device *ras_feat_dev, struct device_attribute *attr,
> + char *buf)
> +{
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
> + u64 val;
> + int ret;
> +
> + ret = ops->cycle_in_hours_read(ras_feat_dev->parent, ctx->scrub.private, &val);
> + if (ret)
> + return ret;
> +
> + return sysfs_emit(buf, "0x%llx\n", val);
> +}
> +
> +static ssize_t cycle_in_hours_store(struct device *ras_feat_dev, struct device_attribute *attr,
> + const char *buf, size_t len)
> +{
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
> + long val;
> + int ret;
> +
> + ret = kstrtol(buf, 10, &val);
Even here, I would use base=0, but if you only want to support base
10, please document it in the sysfs ABI.
> + if (ret < 0)
> + return ret;
> +
> + ret = ops->cycle_in_hours_write(ras_feat_dev->parent, ctx->scrub.private, val);
> + if (ret)
> + return ret;
> +
> + return len;
> +}
> +
> +static ssize_t cycle_in_hours_range_show(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + char *buf)
> +{
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
> + u64 min_schrs, max_schrs;
> + int ret;
> +
> + ret = ops->cycle_in_hours_range(ras_feat_dev->parent, ctx->scrub.private,
> + &min_schrs, &max_schrs);
> + if (ret)
> + return ret;
> +
> + return sysfs_emit(buf, "0x%llx-0x%llx\n", min_schrs, max_schrs);
Hmm... you added the store in decimal, but here you're showing the
values in hexadecimal...
Btw, don't group multiple values on a single sysfs node. Instead,
implement two separate devnodes:
min-scrub-cycle
max-scrub-cycle
(see the note above about "hours")
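A minimal userspace sketch of the one-value-per-node layout, emitting decimal so that show and store agree. The struct and the two show helpers are hypothetical; in the driver these would be two sysfs show callbacks using sysfs_emit() rather than snprintf().

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical limits a scrubber backend might report. */
struct scrub_cycle_limits {
	uint64_t min;
	uint64_t max;
};

/* One value per node, printed in decimal to match the decimal store path. */
static int min_scrub_cycle_show(const struct scrub_cycle_limits *lim,
				char *buf, size_t len)
{
	assert(lim != NULL && buf != NULL);
	return snprintf(buf, len, "%" PRIu64 "\n", lim->min);
}

static int max_scrub_cycle_show(const struct scrub_cycle_limits *lim,
				char *buf, size_t len)
{
	assert(lim != NULL && buf != NULL);
	return snprintf(buf, len, "%" PRIu64 "\n", lim->max);
}
```

Userspace can then read each limit with a plain integer parse instead of splitting a "min-max" string.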
> +}
> +
> +static DEVICE_ATTR_RW(addr_range_base);
> +static DEVICE_ATTR_RW(addr_range_size);
> +static DEVICE_ATTR_RW(enable_background);
> +static DEVICE_ATTR_RW(enable_on_demand);
> +static DEVICE_ATTR_RO(name);
> +static DEVICE_ATTR_RW(cycle_in_hours);
> +static DEVICE_ATTR_RO(cycle_in_hours_range);
> +
> +static struct attribute *scrub_attrs[] = {
> + &dev_attr_addr_range_base.attr,
> + &dev_attr_addr_range_size.attr,
> + &dev_attr_enable_background.attr,
> + &dev_attr_enable_on_demand.attr,
> + &dev_attr_name.attr,
> + &dev_attr_cycle_in_hours.attr,
> + &dev_attr_cycle_in_hours_range.attr,
> + NULL
> +};
> +
> +static umode_t scrub_attr_visible(struct kobject *kobj,
> + struct attribute *a, int attr_id)
> +{
> + struct device *ras_feat_dev = kobj_to_dev(kobj);
> + struct edac_ras_feat_ctx *ctx;
> + const struct edac_scrub_ops *ops;
> +
> + ctx = dev_get_drvdata(ras_feat_dev);
> + if (!ctx)
> + return 0;
> +
> + ops = ctx->scrub.ops;
> + if (a == &dev_attr_addr_range_base.attr ||
> + a == &dev_attr_addr_range_size.attr) {
> + if (ops->read_range && ops->write_range)
> + return a->mode;
> + if (ops->read_range)
> + return 0444;
> + return 0;
> + }
> + if (a == &dev_attr_enable_background.attr) {
> + if (ops->set_enabled_bg && ops->get_enabled_bg)
> + return a->mode;
> + if (ops->get_enabled_bg)
> + return 0444;
> + return 0;
> + }
> + if (a == &dev_attr_enable_on_demand.attr) {
> + if (ops->set_enabled_od && ops->get_enabled_od)
> + return a->mode;
> + if (ops->get_enabled_od)
> + return 0444;
> + return 0;
> + }
> + if (a == &dev_attr_name.attr)
> + return ops->get_name ? a->mode : 0;
> + if (a == &dev_attr_cycle_in_hours_range.attr)
> + return ops->cycle_in_hours_range ? a->mode : 0;
> + if (a == &dev_attr_cycle_in_hours.attr) { /* Write only makes little sense */
> + if (ops->cycle_in_hours_read && ops->cycle_in_hours_write)
> + return a->mode;
> + if (ops->cycle_in_hours_read)
> + return 0444;
> + return 0;
> + }
> +
> + return 0;
> +}
> +
> +static const struct attribute_group scrub_attr_group = {
> + .name = "scrub",
> + .attrs = scrub_attrs,
> + .is_visible = scrub_attr_visible,
> +};
> +
> +/**
> + * edac_scrub_get_desc - get edac scrub's attr descriptor
> + *
> + * Returns attribute_group for the scrub feature.
> + */
> +const struct attribute_group *edac_scrub_get_desc(void)
> +{
> + return &scrub_attr_group;
> +}
> diff --git a/include/linux/edac_ras_feature.h b/include/linux/edac_ras_feature.h
> index 000e99141023..462f9ecbf9d4 100755
> --- a/include/linux/edac_ras_feature.h
> +++ b/include/linux/edac_ras_feature.h
> @@ -19,6 +19,34 @@ enum edac_ras_feat {
> ras_feat_max
> };
>
> +/**
> + * struct scrub_ops - scrub device operations (all elements optional)
> + * @read_range: read base and offset of scrubbing range.
> + * @write_range: set the base and offset of the scrubbing range.
> + * @get_enabled_bg: check if currently performing background scrub.
> + * @set_enabled_bg: start or stop a bg-scrub.
> + * @get_enabled_od: check if currently performing on-demand scrub.
> + * @set_enabled_od: start or stop an on-demand scrub.
> + * @cycle_in_hours_range: retrieve limits on supported cycle in hours.
> + * @cycle_in_hours_read: read the scrub cycle in hours.
> + * @cycle_in_hours_write: set the scrub cycle in hours.
> + * @get_name: get the memory scrubber's name.
> + */
> +struct edac_scrub_ops {
> + int (*read_range)(struct device *dev, void *drv_data, u64 *base, u64 *size);
> + int (*write_range)(struct device *dev, void *drv_data, u64 base, u64 size);
> + int (*get_enabled_bg)(struct device *dev, void *drv_data, bool *enable);
> + int (*set_enabled_bg)(struct device *dev, void *drv_data, bool enable);
> + int (*get_enabled_od)(struct device *dev, void *drv_data, bool *enable);
> + int (*set_enabled_od)(struct device *dev, void *drv_data, bool enable);
> + int (*cycle_in_hours_range)(struct device *dev, void *drv_data, u64 *min, u64 *max);
> + int (*cycle_in_hours_read)(struct device *dev, void *drv_data, u64 *schrs);
> + int (*cycle_in_hours_write)(struct device *dev, void *drv_data, u64 schrs);
> + int (*get_name)(struct device *dev, void *drv_data, char *buf);
> +};
> +
> +const struct attribute_group *edac_scrub_get_desc(void);
> +
> struct edac_ecs_ex_info {
> u16 num_media_frus;
> };
Thanks,
Mauro
* Re: [RFC PATCH v9 03/11] EDAC: Add EDAC ECS control driver
2024-07-16 15:03 ` [RFC PATCH v9 03/11] EDAC: Add EDAC ECS " shiju.jose
@ 2024-07-17 13:08 ` Mauro Carvalho Chehab
2024-07-17 17:13 ` nifan.cxl
1 sibling, 0 replies; 30+ messages in thread
From: Mauro Carvalho Chehab @ 2024-07-17 13:08 UTC (permalink / raw)
To: shiju.jose
Cc: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel, bp,
tony.luck, rafael, lenb, mchehab, dan.j.williams, dave,
jonathan.cameron, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny, david, Vilas.Sridharan, leo.duran, Yazen.Ghannam,
rientjes, jiaqiyan, Jon.Grimm, dave.hansen, naoya.horiguchi,
james.morse, jthoughton, somasundaram.a, erdemaktas, pgonda,
duenwen, mike.malvestuto, gthelen, wschwartz, dferguson, wbs,
nifan.cxl, tanxiaofei, prime.zeng, roberto.sassu, kangkang.shen,
wanghuiqiang, linuxarm
Em Tue, 16 Jul 2024 16:03:27 +0100
<shiju.jose@huawei.com> escreveu:
> From: Shiju Jose <shiju.jose@huawei.com>
>
> Add EDAC ECS (Error Check Scrub) control driver supports configuring
> the memory device's ECS feature.
>
> The Error Check Scrub (ECS) is a feature defined in JEDEC DDR5 SDRAM
> Specification (JESD79-5) and allows the DRAM to internally read, correct
> single-bit errors, and write back corrected data bits to the DRAM array
> while providing transparency to error counts.
>
> The DDR5 device contains number of memory media FRUs per device. The
> DDR5 ECS feature and thus the ECS control driver supports configuring
> the ECS parameters per FRU.
>
> The memory devices supports ECS feature register with the EDAC ECS driver
typo:
supports -> support
> and thus with the generic EDAC RAS feature driver, which adds the sysfs
> ECS control interface. The ECS control attributes are exposed to the
> userspace in /sys/bus/edac/devices/<dev-name>/ecs_fruX/.
>
> Generic EDAC ECS driver and the common sysfs ECS interface promotes
> unambiguous control from the userspace irrespective of the underlying
> devices, support ECS feature.
>
> The support for ECS feature is added separately because the DDR5 ECS
> feature's control attributes are dissimilar from those of the scrub
> feature.
>
> Note: Documentation can be added if necessary.
Please document.
>
> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
> ---
> drivers/edac/Makefile | 2 +-
> drivers/edac/edac_ecs.c | 396 +++++++++++++++++++++++++++++++
> drivers/edac/edac_ras_feature.c | 5 +
> include/linux/edac_ras_feature.h | 36 +++
> 4 files changed, 438 insertions(+), 1 deletion(-)
> create mode 100755 drivers/edac/edac_ecs.c
>
> diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
> index de56cbd039eb..c1412c7d3efb 100644
> --- a/drivers/edac/Makefile
> +++ b/drivers/edac/Makefile
> @@ -10,7 +10,7 @@ obj-$(CONFIG_EDAC) := edac_core.o
>
> edac_core-y := edac_mc.o edac_device.o edac_mc_sysfs.o
> edac_core-y += edac_module.o edac_device_sysfs.o wq.o
> -edac_core-y += edac_ras_feature.o edac_scrub.o
> +edac_core-y += edac_ras_feature.o edac_scrub.o edac_ecs.o
>
> edac_core-$(CONFIG_EDAC_DEBUG) += debugfs.o
>
> diff --git a/drivers/edac/edac_ecs.c b/drivers/edac/edac_ecs.c
> new file mode 100755
> index 000000000000..37dabd053c36
> --- /dev/null
> +++ b/drivers/edac/edac_ecs.c
> @@ -0,0 +1,396 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * ECS driver supporting controlling on die error check scrub
> + * (e.g. DDR5 ECS). The common sysfs ECS interface promotes
> + * unambiguous access from the userspace.
> + *
> + * Copyright (c) 2024 HiSilicon Limited.
> + */
> +
> +#define pr_fmt(fmt) "EDAC ECS: " fmt
> +
> +#include <linux/edac_ras_feature.h>
> +
> +#define EDAC_ECS_FRU_NAME "ecs_fru"
> +
> +enum edac_ecs_attributes {
> + ecs_log_entry_type,
> + ecs_log_entry_type_per_dram,
> + ecs_log_entry_type_per_memory_media,
> + ecs_mode,
> + ecs_mode_counts_rows,
> + ecs_mode_counts_codewords,
> + ecs_reset,
> + ecs_name,
> + ecs_threshold,
> + ecs_max_attrs
> +};
Please use uppercase for enums.
> +
> +struct edac_ecs_dev_attr {
> + struct device_attribute dev_attr;
> + int fru_id;
> +};
> +
> +struct edac_ecs_fru_context {
> + char name[EDAC_RAS_NAME_LEN];
> + struct edac_ecs_dev_attr ecs_dev_attr[ecs_max_attrs];
> + struct attribute *ecs_attrs[ecs_max_attrs + 1];
> + struct attribute_group group;
> +};
> +
> +struct edac_ecs_context {
> + u16 num_media_frus;
> + struct edac_ecs_fru_context *fru_ctxs;
> +};
> +
> +#define to_ecs_dev_attr(_dev_attr) \
> + container_of(_dev_attr, struct edac_ecs_dev_attr, dev_attr)
> +
> +static ssize_t log_entry_type_show(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + char *buf)
> +{
> + struct edac_ecs_dev_attr *ecs_dev_attr = to_ecs_dev_attr(attr);
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_ecs_ops *ops = ctx->ecs.ops;
> + u64 val;
> + int ret;
> +
> + ret = ops->get_log_entry_type(ras_feat_dev->parent, ctx->ecs.private,
> + ecs_dev_attr->fru_id, &val);
> + if (ret)
> + return ret;
Same notes as patch 2/11 with regards to sysfs documentation/store/show.
Also, it is hard to review this patch without the ABI documentation.
Regards,
Mauro
Thanks,
Mauro
* RE: [RFC PATCH v9 02/11] EDAC: Add EDAC scrub control driver
2024-07-17 12:56 ` Mauro Carvalho Chehab
@ 2024-07-17 14:07 ` Shiju Jose
2024-07-18 7:03 ` Mauro Carvalho Chehab
0 siblings, 1 reply; 30+ messages in thread
From: Shiju Jose @ 2024-07-17 14:07 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: linux-edac@vger.kernel.org, linux-cxl@vger.kernel.org,
linux-acpi@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, bp@alien8.de, tony.luck@intel.com,
rafael@kernel.org, lenb@kernel.org, mchehab@kernel.org,
dan.j.williams@intel.com, dave@stgolabs.net, Jonathan Cameron,
dave.jiang@intel.com, alison.schofield@intel.com,
vishal.l.verma@intel.com, ira.weiny@intel.com, david@redhat.com,
Vilas.Sridharan@amd.com, leo.duran@amd.com, Yazen.Ghannam@amd.com,
rientjes@google.com, jiaqiyan@google.com, Jon.Grimm@amd.com,
dave.hansen@linux.intel.com, naoya.horiguchi@nec.com,
james.morse@arm.com, jthoughton@google.com,
somasundaram.a@hpe.com, erdemaktas@google.com, pgonda@google.com,
duenwen@google.com, mike.malvestuto@intel.com, gthelen@google.com,
wschwartz@amperecomputing.com, dferguson@amperecomputing.com,
wbs@os.amperecomputing.com, nifan.cxl@gmail.com, tanxiaofei,
Zengtao (B), Roberto Sassu, kangkang.shen@futurewei.com,
wanghuiqiang, Linuxarm
>-----Original Message-----
>From: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
>Sent: 17 July 2024 13:57
>To: Shiju Jose <shiju.jose@huawei.com>
>Cc: linux-edac@vger.kernel.org; linux-cxl@vger.kernel.org; linux-
>acpi@vger.kernel.org; linux-mm@kvack.org; linux-kernel@vger.kernel.org;
>bp@alien8.de; tony.luck@intel.com; rafael@kernel.org; lenb@kernel.org;
>mchehab@kernel.org; dan.j.williams@intel.com; dave@stgolabs.net; Jonathan
>Cameron <jonathan.cameron@huawei.com>; dave.jiang@intel.com;
>alison.schofield@intel.com; vishal.l.verma@intel.com; ira.weiny@intel.com;
>david@redhat.com; Vilas.Sridharan@amd.com; leo.duran@amd.com;
>Yazen.Ghannam@amd.com; rientjes@google.com; jiaqiyan@google.com;
>Jon.Grimm@amd.com; dave.hansen@linux.intel.com;
>naoya.horiguchi@nec.com; james.morse@arm.com; jthoughton@google.com;
>somasundaram.a@hpe.com; erdemaktas@google.com; pgonda@google.com;
>duenwen@google.com; mike.malvestuto@intel.com; gthelen@google.com;
>wschwartz@amperecomputing.com; dferguson@amperecomputing.com;
>wbs@os.amperecomputing.com; nifan.cxl@gmail.com; tanxiaofei
><tanxiaofei@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>; Roberto
>Sassu <roberto.sassu@huawei.com>; kangkang.shen@futurewei.com;
>wanghuiqiang <wanghuiqiang@huawei.com>; Linuxarm
><linuxarm@huawei.com>
>Subject: Re: [RFC PATCH v9 02/11] EDAC: Add EDAC scrub control driver
>
>Em Tue, 16 Jul 2024 16:03:26 +0100
><shiju.jose@huawei.com> escreveu:
>
>> From: Shiju Jose <shiju.jose@huawei.com>
>>
>> Add generic EDAC scrub control driver supports configuring the memory
>> scrubbers in the system. The device with scrub feature, get the scrub
>> descriptor from the EDAC scrub and registers with the EDAC RAS feature
>> driver, which adds the sysfs scrub control interface. The scrub
>> control attributes are available to the userspace in
>> /sys/bus/edac/devices/<dev-name>/scrub/.
>>
>> Generic EDAC scrub driver and the common sysfs scrub interface
>> promotes unambiguous access from the userspace irrespective of the
>> underlying scrub devices.
>>
>> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
>> ---
>> Documentation/ABI/testing/sysfs-edac-scrub | 64 +++++
>> drivers/edac/Makefile | 2 +-
>> drivers/edac/edac_ras_feature.c | 1 +
>> drivers/edac/edac_scrub.c | 312 +++++++++++++++++++++
>> include/linux/edac_ras_feature.h | 28 ++
>> 5 files changed, 406 insertions(+), 1 deletion(-)
>> create mode 100644 Documentation/ABI/testing/sysfs-edac-scrub
>> create mode 100755 drivers/edac/edac_scrub.c
>>
>> diff --git a/Documentation/ABI/testing/sysfs-edac-scrub b/Documentation/ABI/testing/sysfs-edac-scrub
>> new file mode 100644
>> index 000000000000..dd19afd5e165
>> --- /dev/null
>> +++ b/Documentation/ABI/testing/sysfs-edac-scrub
>> @@ -0,0 +1,64 @@
>> +What: /sys/bus/edac/devices/<dev-name>/scrub
>> +Date: Oct 2024
>> +KernelVersion: 6.12
>> +Contact: linux-edac@vger.kernel.org
>> +Description:
>> + The sysfs edac bus devices /<dev-name>/scrub subdirectory
>> + belongs to the memory scrub control feature, where <dev-name>
>> + directory corresponds to a device/memory region registered
>> + with the edac scrub driver and thus registered with the
>> + generic edac ras driver too.
>> +
>> +What: /sys/bus/edac/devices/<dev-name>/scrub/addr_range_base
>> +Date: Oct 2024
>> +KernelVersion: 6.12
>> +Contact: linux-edac@vger.kernel.org
>> +Description:
>> + (RW) The base of the address range of the memory region
>> + to be scrubbed (on-demand scrubbing).
>> +
>> +What: /sys/bus/edac/devices/<dev-name>/scrub/addr_range_size
>> +Date: Oct 2024
>> +KernelVersion: 6.12
>> +Contact: linux-edac@vger.kernel.org
>> +Description:
>> + (RW) The size of the address range of the memory region
>> + to be scrubbed (on-demand scrubbing).
>> +
>> +What: /sys/bus/edac/devices/<dev-name>/scrub/enable_background
>> +Date: Oct 2024
>> +KernelVersion: 6.12
>> +Contact: linux-edac@vger.kernel.org
>> +Description:
>> + (RW) Start/Stop background(patrol) scrubbing if supported.
>> +
>> +What: /sys/bus/edac/devices/<dev-name>/scrub/enable_on_demand
>> +Date: Oct 2024
>> +KernelVersion: 6.12
>> +Contact: linux-edac@vger.kernel.org
>> +Description:
>> + (RW) Start/Stop on-demand scrubbing the memory region
>> + if supported.
>
>This is a generic comment for all sysfs calls: what happens if not supported?
>
>There are a couple of ways to implement it, like:
>
>1. Don't create the attribute;
>2. return an error code (-ENOENT? -EINVAL?) if trying to read or
> write to the devnode - please detail the used error code(s);
>
>In any case, please define the behavior and document it.
>
>From what I see, you're setting 0444 on RW nodes when write is not enabled,
>but it is still possible that RO is not supported either. This is especially
>true as technology evolves, as memory controllers and different types of
>memories may have very different ways to control it[1].
That is not the case. If the parent device defines neither the read nor the
write callback for an attribute, scrub_attr_visible() returns 0 and the
attribute is not created in sysfs for that device (an attribute with only a
read callback is exposed as 0444).
For example, the attributes addr_range_base and addr_range_size are not
supported by the CXL patrol scrub feature, but are supported by the ACPI RAS2
scrub feature.
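
To make the rule concrete, here is a minimal userspace sketch of the
addr_range_* branch of scrub_attr_visible(). The struct scrub_ops_model type,
the demo_* callbacks and the three example instances are hypothetical names
used only for illustration; they are not part of the patch.

```c
#include <stddef.h>

/* Hypothetical userspace model of the range callbacks; not the kernel struct. */
struct scrub_ops_model {
	int (*read_range)(unsigned long long *base, unsigned long long *size);
	int (*write_range)(unsigned long long base, unsigned long long size);
};

/* Placeholder callbacks so that the example instances below are non-NULL. */
static int demo_read(unsigned long long *base, unsigned long long *size)
{
	*base = 0;
	*size = 0;
	return 0;
}

static int demo_write(unsigned long long base, unsigned long long size)
{
	(void)base;
	(void)size;
	return 0;
}

/*
 * Same decision as the addr_range_* branch in scrub_attr_visible():
 * 0 hides the attribute, 0444 exposes it read-only, and the default
 * mode is kept when both callbacks are present.
 */
static unsigned int range_attr_mode(const struct scrub_ops_model *ops,
				    unsigned int default_mode)
{
	if (ops->read_range && ops->write_range)
		return default_mode;
	if (ops->read_range)
		return 0444;
	return 0;
}

/* Example devices: no range support, full support, read-only support. */
static const struct scrub_ops_model no_range_ops = { NULL, NULL };
static const struct scrub_ops_model rw_range_ops = { demo_read, demo_write };
static const struct scrub_ops_model ro_range_ops = { demo_read, NULL };
```

With a default mode of 0644, no_range_ops yields 0 (attribute absent),
rw_range_ops yields 0644 and ro_range_ops yields 0444.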
>
>[1] If you're curious enough, one legacy example of memories
> implemented in a very different way was Fully Buffered DIMMs,
> where each DIMM had its own internal chipset to offload
> certain tasks, including scrubbing and ECC implementation.
> It ended up not succeeding long term, as it required
> special DIMMs for the server market, reducing the production
> scale, but it is an interesting example of how hardware
> designs can be innovative and break existing paradigms.
> The FB-DIMM design actually forced a redesign of the EDAC
> subsystem, as it was too centered on one specific type
> of memory controller.
>
>> +
>> +What: /sys/bus/edac/devices/<dev-name>/scrub/name
>> +Date: Oct 2024
>> +KernelVersion: 6.12
>> +Contact: linux-edac@vger.kernel.org
>> +Description:
>> + (RO) name of the memory scrubber
>> +
>
>
>> +What: /sys/bus/edac/devices/<dev-name>/scrub/cycle_in_hours_available
>> +Date: Oct 2024
>> +KernelVersion: 6.12
>> +Contact: linux-edac@vger.kernel.org
>> +Description:
>> + (RO) Supported range for the scrub cycle in hours by the
>> + memory scrubber.
>> +
>> +What: /sys/bus/edac/devices/<dev-name>/scrub/cycle_in_hours
>> +Date: Oct 2024
>> +KernelVersion: 6.12
>> +Contact: linux-edac@vger.kernel.org
>> +Description:
>> + (RW) The scrub cycle in hours. The value must be within the
>> + range supported by the memory scrubber.
>
>Why specifying it in hours? I would use seconds, as it is easy to represent one
>hour as 3600 seconds, but you can't specify a cycle of, let's say, 30min, if the
>minimum range value is one hour.
For the CXL patrol scrub, the scrub cycle is defined in hours (CXL spec 3.1,
Table 8-208, Device Patrol Scrub Control Feature Writable Attributes), but
ACPI RAS2 does not define a unit for the scrub cycle.
That is why the common interface proposes to represent the scrub cycle in hours.
I am not sure how convenient it would be for the user to set the scrub cycle
in seconds, and is it really required to finish background scrubbing in such
a short time?
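
For reference, a minimal sketch of what a seconds-based ABI would imply for a
scrubber whose hardware unit is hours. The helper names and the round-up
policy are assumptions made for illustration; nothing here is taken from the
patch or from the CXL spec.

```c
#include <stdint.h>

#define SECS_PER_HOUR 3600ULL

/*
 * Hypothetical helper: convert a cycle requested in seconds to the whole
 * hours an hour-based scrubber can accept, rounding up so that the
 * programmed cycle is never shorter than the requested one.
 */
static uint64_t cycle_secs_to_hours(uint64_t secs)
{
	return secs / SECS_PER_HOUR + (secs % SECS_PER_HOUR != 0);
}

/* Hypothetical helper: report an hour-based cycle in seconds. */
static uint64_t cycle_hours_to_secs(uint64_t hours)
{
	return hours * SECS_PER_HOUR;
}
```

Under these assumptions a request for 1800 seconds (30 minutes) would be
programmed as 1 hour on such a device, so the value read back could differ
from the value written.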
>
>I mean, we never know how technology will evolve nor how manufacturers will
>implement support for scrubbing cycle on their chipsets.
>
>> diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
>> index c532b57a6d8a..de56cbd039eb 100644
>> --- a/drivers/edac/Makefile
>> +++ b/drivers/edac/Makefile
>> @@ -10,7 +10,7 @@ obj-$(CONFIG_EDAC) := edac_core.o
>>
>> edac_core-y := edac_mc.o edac_device.o edac_mc_sysfs.o
>> edac_core-y += edac_module.o edac_device_sysfs.o wq.o
>> -edac_core-y += edac_ras_feature.o
>> +edac_core-y += edac_ras_feature.o edac_scrub.o
>>
>> edac_core-$(CONFIG_EDAC_DEBUG) += debugfs.o
>>
>> diff --git a/drivers/edac/edac_ras_feature.c b/drivers/edac/edac_ras_feature.c
>> index 24a729fea66f..48927f868372 100755
>> --- a/drivers/edac/edac_ras_feature.c
>> +++ b/drivers/edac/edac_ras_feature.c
>> @@ -36,6 +36,7 @@ static int edac_ras_feat_scrub_init(struct device *parent,
>> {
>> sdata->ops = sfeat->scrub_ops;
>> sdata->private = sfeat->scrub_ctx;
>> + attr_groups[0] = edac_scrub_get_desc();
>>
>> return 1;
>> }
>> diff --git a/drivers/edac/edac_scrub.c b/drivers/edac/edac_scrub.c
>> new file mode 100755
>> index 000000000000..0b07eafd3551
>> --- /dev/null
>> +++ b/drivers/edac/edac_scrub.c
>> @@ -0,0 +1,312 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Generic EDAC scrub driver supports controlling the memory
>> + * scrubbers in the system and the common sysfs scrub interface
>> + * promotes unambiguous access from the userspace.
>> + *
>> + * Copyright (c) 2024 HiSilicon Limited.
>> + */
>> +
>> +#define pr_fmt(fmt) "EDAC SCRUB: " fmt
>> +
>> +#include <linux/edac_ras_feature.h>
>> +
>> +static ssize_t addr_range_base_show(struct device *ras_feat_dev,
>> + struct device_attribute *attr,
>> + char *buf)
>> +{
>> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
>> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
>> + u64 base, size;
>> + int ret;
>> +
>> + ret = ops->read_range(ras_feat_dev->parent, ctx->scrub.private, &base, &size);
>> + if (ret)
>> + return ret;
>
>Also a generic comment that applies to all devnodes: what if ops->read_range
>is NULL? Shouldn't it be checked? Btw, you could use read_range == NULL to
>implement error handling for unsupported features.
If ops->read_range is NULL, scrub_attr_visible() returns 0 and the
corresponding attributes, addr_range_base and addr_range_size, are not added
to sysfs. The same applies to the other attributes.
>
>> +
>> + return sysfs_emit(buf, "0x%llx\n", base);
>> +}
>> +
>> +static ssize_t addr_range_size_show(struct device *ras_feat_dev,
>> + struct device_attribute *attr,
>> + char *buf)
>> +{
>> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
>> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
>> + u64 base, size;
>> + int ret;
>> +
>> + ret = ops->read_range(ras_feat_dev->parent, ctx->scrub.private, &base, &size);
>> + if (ret)
>> + return ret;
>> +
>> + return sysfs_emit(buf, "0x%llx\n", size);
>> +}
>> +
>> +static ssize_t addr_range_base_store(struct device *ras_feat_dev,
>> + struct device_attribute *attr,
>> + const char *buf, size_t len)
>> +{
>> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
>> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
>> + u64 base, size;
>> + int ret;
>> +
>> + ret = ops->read_range(ras_feat_dev->parent, ctx->scrub.private, &base, &size);
>> + if (ret)
>> + return ret;
>> +
>> + ret = kstrtou64(buf, 16, &base);
>
>I would use base 0, letting the parser expect "0x" for hexadecimal values.
>Same for other *_store methods.
Will check.
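
As a quick illustration of the base 0 suggestion, here is a userspace sketch
that uses strtoull() as a stand-in for kstrtou64(). The parse_u64() helper is
hypothetical, and strtoull() is only an approximation of the kernel parser
(for example, it skips leading whitespace), so treat this as a model of the
base handling rather than of the exact kernel behavior.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical userspace stand-in for kstrtou64(buf, base, &val).
 * With base 0 the prefix selects the radix: "0x" means hexadecimal,
 * a leading "0" means octal, anything else is decimal.
 */
static int parse_u64(const char *s, int base, uint64_t *out)
{
	char *end;
	unsigned long long v;

	if (s[0] == '-')		/* unsigned input only */
		return -EINVAL;

	errno = 0;
	v = strtoull(s, &end, base);
	if (errno == ERANGE)
		return -ERANGE;
	if (end == s)			/* no digits consumed */
		return -EINVAL;
	if (*end == '\n')		/* tolerate one trailing newline, as sysfs writes often have */
		end++;
	if (*end != '\0')
		return -EINVAL;

	*out = v;
	return 0;
}
```

In this model, "0x1000" and "4096" both parse to the same value with base 0,
while with base 16 the string "1000" is read as 0x1000. That difference is
why the accepted format should be stated in the ABI document.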
>
>> + if (ret < 0)
>> + return ret;
>> +
>> + ret = ops->write_range(ras_feat_dev->parent, ctx->scrub.private, base, size);
>> + if (ret)
>> + return ret;
>> +
>> + return len;
>> +}
>> +
>> +static ssize_t addr_range_size_store(struct device *ras_feat_dev,
>> + struct device_attribute *attr,
>> + const char *buf,
>> + size_t len)
>> +{
>> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
>> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
>> + u64 base, size;
>> + int ret;
>> +
>> + ret = ops->read_range(ras_feat_dev->parent, ctx->scrub.private, &base, &size);
>> + if (ret)
>> + return ret;
>> +
>> + ret = kstrtou64(buf, 16, &size);
>> + if (ret < 0)
>> + return ret;
>> +
>> + ret = ops->write_range(ras_feat_dev->parent, ctx->scrub.private, base, size);
>> + if (ret)
>> + return ret;
>> +
>> + return len;
>> +}
>> +
>> +static ssize_t enable_background_store(struct device *ras_feat_dev,
>> + struct device_attribute *attr,
>> + const char *buf, size_t len)
>> +{
>> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
>> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
>> + bool enable;
>> + int ret;
>> +
>> + ret = kstrtobool(buf, &enable);
>> + if (ret < 0)
>> + return ret;
>> +
>> + ret = ops->set_enabled_bg(ras_feat_dev->parent, ctx->scrub.private, enable);
>> + if (ret)
>> + return ret;
>> +
>> + return len;
>> +}
>> +
>> +static ssize_t enable_background_show(struct device *ras_feat_dev,
>> + struct device_attribute *attr, char *buf)
>> +{
>> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
>> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
>> + bool enable;
>> + int ret;
>> +
>> + ret = ops->get_enabled_bg(ras_feat_dev->parent, ctx->scrub.private, &enable);
>> + if (ret)
>> + return ret;
>> +
>> + return sysfs_emit(buf, "%d\n", enable);
>> +}
>> +
>> +static ssize_t enable_on_demand_show(struct device *ras_feat_dev,
>> + struct device_attribute *attr, char *buf)
>> +{
>> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
>> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
>> + bool enable;
>> + int ret;
>> +
>> + ret = ops->get_enabled_od(ras_feat_dev->parent, ctx->scrub.private, &enable);
>> + if (ret)
>> + return ret;
>> +
>> + return sysfs_emit(buf, "%d\n", enable);
>> +}
>> +
>> +static ssize_t enable_on_demand_store(struct device *ras_feat_dev,
>> + struct device_attribute *attr,
>> + const char *buf, size_t len)
>> +{
>> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
>> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
>> + bool enable;
>> + int ret;
>> +
>> + ret = kstrtobool(buf, &enable);
>> + if (ret < 0)
>> + return ret;
>> +
>> + ret = ops->set_enabled_od(ras_feat_dev->parent, ctx->scrub.private, enable);
>> + if (ret)
>> + return ret;
>> +
>> + return len;
>> +}
>> +
>> +static ssize_t name_show(struct device *ras_feat_dev,
>> + struct device_attribute *attr, char *buf)
>> +{
>> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
>> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
>> + int ret;
>> +
>> + ret = ops->get_name(ras_feat_dev->parent, ctx->scrub.private, buf);
>> + if (ret)
>> + return ret;
>> +
>> + return strlen(buf);
>> +}
>> +
>> +static ssize_t cycle_in_hours_show(struct device *ras_feat_dev, struct device_attribute *attr,
>> + char *buf)
>> +{
>> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
>> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
>> + u64 val;
>> + int ret;
>> +
>> + ret = ops->cycle_in_hours_read(ras_feat_dev->parent, ctx->scrub.private, &val);
>> + if (ret)
>> + return ret;
>> +
>> + return sysfs_emit(buf, "0x%llx\n", val);
>> +}
>> +
>> +static ssize_t cycle_in_hours_store(struct device *ras_feat_dev, struct device_attribute *attr,
>> + const char *buf, size_t len)
>> +{
>> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
>> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
>> + long val;
>> + int ret;
>> +
>> + ret = kstrtol(buf, 10, &val);
>
>Even here, I would be using base=0, but if you only want to support base 10,
>please document it at the sysfs ABI.
Will do.
>
>> + if (ret < 0)
>> + return ret;
>> +
>> + ret = ops->cycle_in_hours_write(ras_feat_dev->parent, ctx->scrub.private, val);
>> + if (ret)
>> + return ret;
>> +
>> + return len;
>> +}
>> +
>> +static ssize_t cycle_in_hours_range_show(struct device *ras_feat_dev,
>> + struct device_attribute *attr,
>> + char *buf)
>> +{
>> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
>> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
>> + u64 min_schrs, max_schrs;
>> + int ret;
>> +
>> + ret = ops->cycle_in_hours_range(ras_feat_dev->parent, ctx->scrub.private,
>> + &min_schrs, &max_schrs);
>> + if (ret)
>> + return ret;
>> +
>> + return sysfs_emit(buf, "0x%llx-0x%llx\n", min_schrs, max_schrs);
>
>Hmm... you added the store in decimal, but here you're showing in hexa...
Will check, and use decimal for both store and show.
>
>Btw, don't group multiple values on a single sysfs node. Instead, implement two
>separate devnodes:
Here we are showing the supported range for the scrub cycle.
Does anyone else have an opinion on this?
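
To help the discussion, here is a small userspace sketch of what a consumer
has to do in each case. parse_range() matches the "0x%llx-0x%llx" format that
cycle_in_hours_range_show() emits today; parse_single() is what would be
enough if the minimum and maximum were separate single-value nodes. Both
function names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse the combined "0x<min>-0x<max>" string emitted by the current patch. */
static int parse_range(const char *s, uint64_t *min, uint64_t *max)
{
	unsigned long long lo, hi;

	if (sscanf(s, "0x%llx-0x%llx", &lo, &hi) != 2)
		return -1;
	*min = lo;
	*max = hi;
	return 0;
}

/* Parse one number, as a single-value node would provide (hypothetical). */
static int parse_single(const char *s, uint64_t *val)
{
	char *end;
	unsigned long long v = strtoull(s, &end, 0);

	if (end == s)
		return -1;
	*val = v;
	return 0;
}
```

The combined form ties every consumer to the exact separator and prefix,
whereas a single-value node can be read with a generic number parser.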
>
> min-scrub-cycle
> max-scrub-cycle
>
>(see the note above about "hours")
>
>
>> +}
>> +
>> +static DEVICE_ATTR_RW(addr_range_base);
>> +static DEVICE_ATTR_RW(addr_range_size);
>> +static DEVICE_ATTR_RW(enable_background);
>> +static DEVICE_ATTR_RW(enable_on_demand);
>> +static DEVICE_ATTR_RO(name);
>> +static DEVICE_ATTR_RW(cycle_in_hours);
>> +static DEVICE_ATTR_RO(cycle_in_hours_range);
>> +
>> +static struct attribute *scrub_attrs[] = {
>> + &dev_attr_addr_range_base.attr,
>> + &dev_attr_addr_range_size.attr,
>> + &dev_attr_enable_background.attr,
>> + &dev_attr_enable_on_demand.attr,
>> + &dev_attr_name.attr,
>> + &dev_attr_cycle_in_hours.attr,
>> + &dev_attr_cycle_in_hours_range.attr,
>> + NULL
>> +};
>> +
>> +static umode_t scrub_attr_visible(struct kobject *kobj,
>> + struct attribute *a, int attr_id)
>> +{
>> + struct device *ras_feat_dev = kobj_to_dev(kobj);
>> + struct edac_ras_feat_ctx *ctx;
>> + const struct edac_scrub_ops *ops;
>> +
>> + ctx = dev_get_drvdata(ras_feat_dev);
>> + if (!ctx)
>> + return 0;
>> +
>> + ops = ctx->scrub.ops;
>> + if (a == &dev_attr_addr_range_base.attr ||
>> + a == &dev_attr_addr_range_size.attr) {
>> + if (ops->read_range && ops->write_range)
>> + return a->mode;
>> + if (ops->read_range)
>> + return 0444;
>> + return 0;
>> + }
>> + if (a == &dev_attr_enable_background.attr) {
>> + if (ops->set_enabled_bg && ops->get_enabled_bg)
>> + return a->mode;
>> + if (ops->get_enabled_bg)
>> + return 0444;
>> + return 0;
>> + }
>> + if (a == &dev_attr_enable_on_demand.attr) {
>> + if (ops->set_enabled_od && ops->get_enabled_od)
>> + return a->mode;
>> + if (ops->get_enabled_od)
>> + return 0444;
>> + return 0;
>> + }
>> + if (a == &dev_attr_name.attr)
>> + return ops->get_name ? a->mode : 0;
>> + if (a == &dev_attr_cycle_in_hours_range.attr)
>> + return ops->cycle_in_hours_range ? a->mode : 0;
>> + if (a == &dev_attr_cycle_in_hours.attr) { /* Write only makes little sense */
>> + if (ops->cycle_in_hours_read && ops->cycle_in_hours_write)
>> + return a->mode;
>> + if (ops->cycle_in_hours_read)
>> + return 0444;
>> + return 0;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static const struct attribute_group scrub_attr_group = {
>> + .name = "scrub",
>> + .attrs = scrub_attrs,
>> + .is_visible = scrub_attr_visible,
>> +};
>> +
>> +/**
>> + * edac_scrub_get_desc - get edac scrub's attr descriptor
>> + *
>> + * Returns attribute_group for the scrub feature.
>> + */
>> +const struct attribute_group *edac_scrub_get_desc(void)
>> +{
>> + return &scrub_attr_group;
>> +}
>> diff --git a/include/linux/edac_ras_feature.h b/include/linux/edac_ras_feature.h
>> index 000e99141023..462f9ecbf9d4 100755
>> --- a/include/linux/edac_ras_feature.h
>> +++ b/include/linux/edac_ras_feature.h
>> @@ -19,6 +19,34 @@ enum edac_ras_feat {
>> ras_feat_max
>> };
>>
>> +/**
>> + * struct edac_scrub_ops - scrub device operations (all elements optional)
>> + * @read_range: read base and size of scrubbing range.
>> + * @write_range: set the base and size of the scrubbing range.
>> + * @get_enabled_bg: check if currently performing background scrub.
>> + * @set_enabled_bg: start or stop a bg-scrub.
>> + * @get_enabled_od: check if currently performing on-demand scrub.
>> + * @set_enabled_od: start or stop an on-demand scrub.
>> + * @cycle_in_hours_range: retrieve limits on supported cycle in hours.
>> + * @cycle_in_hours_read: read the scrub cycle in hours.
>> + * @cycle_in_hours_write: set the scrub cycle in hours.
>> + * @get_name: get the memory scrubber's name.
>> + */
>> +struct edac_scrub_ops {
>> + int (*read_range)(struct device *dev, void *drv_data, u64 *base, u64 *size);
>> + int (*write_range)(struct device *dev, void *drv_data, u64 base, u64 size);
>> + int (*get_enabled_bg)(struct device *dev, void *drv_data, bool *enable);
>> + int (*set_enabled_bg)(struct device *dev, void *drv_data, bool enable);
>> + int (*get_enabled_od)(struct device *dev, void *drv_data, bool *enable);
>> + int (*set_enabled_od)(struct device *dev, void *drv_data, bool enable);
>> + int (*cycle_in_hours_range)(struct device *dev, void *drv_data, u64 *min, u64 *max);
>> + int (*cycle_in_hours_read)(struct device *dev, void *drv_data, u64 *schrs);
>> + int (*cycle_in_hours_write)(struct device *dev, void *drv_data, u64 schrs);
>> + int (*get_name)(struct device *dev, void *drv_data, char *buf);
>> +};
>> +
>> +const struct attribute_group *edac_scrub_get_desc(void);
>> +
>> struct edac_ecs_ex_info {
>> u16 num_media_frus;
>> };
>
>
>
>Thanks,
>Mauro
Thanks,
Shiju
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v9 03/11] EDAC: Add EDAC ECS control driver
2024-07-16 15:03 ` [RFC PATCH v9 03/11] EDAC: Add EDAC ECS " shiju.jose
2024-07-17 13:08 ` Mauro Carvalho Chehab
@ 2024-07-17 17:13 ` nifan.cxl
1 sibling, 0 replies; 30+ messages in thread
From: nifan.cxl @ 2024-07-17 17:13 UTC (permalink / raw)
To: shiju.jose
Cc: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel, bp,
tony.luck, rafael, lenb, mchehab, dan.j.williams, dave,
jonathan.cameron, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny, david, Vilas.Sridharan, leo.duran, Yazen.Ghannam,
rientjes, jiaqiyan, Jon.Grimm, dave.hansen, naoya.horiguchi,
james.morse, jthoughton, somasundaram.a, erdemaktas, pgonda,
duenwen, mike.malvestuto, gthelen, wschwartz, dferguson, wbs,
nifan.cxl, tanxiaofei, prime.zeng, roberto.sassu, kangkang.shen,
wanghuiqiang, linuxarm
On Tue, Jul 16, 2024 at 04:03:27PM +0100, shiju.jose@huawei.com wrote:
> From: Shiju Jose <shiju.jose@huawei.com>
>
> Add an EDAC ECS (Error Check Scrub) control driver that supports configuring
> the memory device's ECS feature.
>
> The Error Check Scrub (ECS) is a feature defined in JEDEC DDR5 SDRAM
> Specification (JESD79-5) and allows the DRAM to internally read, correct
> single-bit errors, and write back corrected data bits to the DRAM array
> while providing transparency to error counts.
>
> The DDR5 device contains a number of memory media FRUs per device. The
> DDR5 ECS feature, and thus the ECS control driver, supports configuring
> the ECS parameters per FRU.
>
> Memory devices that support the ECS feature register with the EDAC ECS
> driver, and thus with the generic EDAC RAS feature driver, which adds the
> sysfs ECS control interface. The ECS control attributes are exposed to
> userspace in /sys/bus/edac/devices/<dev-name>/ecs_fruX/.
>
> The generic EDAC ECS driver and the common sysfs ECS interface promote
> unambiguous control from userspace irrespective of the underlying
> devices that support the ECS feature.
>
> The support for ECS feature is added separately because the DDR5 ECS
> feature's control attributes are dissimilar from those of the scrub
> feature.
>
> Note: Documentation can be added if necessary.
>
> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
> ---
> drivers/edac/Makefile | 2 +-
> drivers/edac/edac_ecs.c | 396 +++++++++++++++++++++++++++++++
> drivers/edac/edac_ras_feature.c | 5 +
> include/linux/edac_ras_feature.h | 36 +++
> 4 files changed, 438 insertions(+), 1 deletion(-)
> create mode 100755 drivers/edac/edac_ecs.c
>
> diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
> index de56cbd039eb..c1412c7d3efb 100644
> --- a/drivers/edac/Makefile
> +++ b/drivers/edac/Makefile
> @@ -10,7 +10,7 @@ obj-$(CONFIG_EDAC) := edac_core.o
>
> edac_core-y := edac_mc.o edac_device.o edac_mc_sysfs.o
> edac_core-y += edac_module.o edac_device_sysfs.o wq.o
> -edac_core-y += edac_ras_feature.o edac_scrub.o
> +edac_core-y += edac_ras_feature.o edac_scrub.o edac_ecs.o
>
> edac_core-$(CONFIG_EDAC_DEBUG) += debugfs.o
>
> diff --git a/drivers/edac/edac_ecs.c b/drivers/edac/edac_ecs.c
> new file mode 100755
> index 000000000000..37dabd053c36
> --- /dev/null
> +++ b/drivers/edac/edac_ecs.c
> @@ -0,0 +1,396 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * ECS driver supporting controlling on die error check scrub
> + * (e.g. DDR5 ECS). The common sysfs ECS interface promotes
> + * unambiguous access from the userspace.
> + *
> + * Copyright (c) 2024 HiSilicon Limited.
> + */
> +
> +#define pr_fmt(fmt) "EDAC ECS: " fmt
> +
> +#include <linux/edac_ras_feature.h>
> +
> +#define EDAC_ECS_FRU_NAME "ecs_fru"
> +
> +enum edac_ecs_attributes {
> + ecs_log_entry_type,
> + ecs_log_entry_type_per_dram,
> + ecs_log_entry_type_per_memory_media,
> + ecs_mode,
> + ecs_mode_counts_rows,
> + ecs_mode_counts_codewords,
> + ecs_reset,
> + ecs_name,
> + ecs_threshold,
> + ecs_max_attrs
> +};
As mentioned in the other review, use uppercase for the enum constants.
Fan
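
For example, something along these lines. The exact constant names are only a
suggestion on my side, not something taken from the patch or from an existing
kernel convention:

```c
/* Hypothetical uppercase spelling of the attribute indexes quoted above. */
enum edac_ecs_attributes {
	ECS_LOG_ENTRY_TYPE,
	ECS_LOG_ENTRY_TYPE_PER_DRAM,
	ECS_LOG_ENTRY_TYPE_PER_MEMORY_MEDIA,
	ECS_MODE,
	ECS_MODE_COUNTS_ROWS,
	ECS_MODE_COUNTS_CODEWORDS,
	ECS_RESET,
	ECS_NAME,
	ECS_THRESHOLD,
	ECS_MAX_ATTRS		/* number of attributes; keep last */
};
```

The order is unchanged, so ECS_MAX_ATTRS is still 9 and the values still line
up with the array indexes 0 to 8 used in ecs_create_desc().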
> +
> +struct edac_ecs_dev_attr {
> + struct device_attribute dev_attr;
> + int fru_id;
> +};
> +
> +struct edac_ecs_fru_context {
> + char name[EDAC_RAS_NAME_LEN];
> + struct edac_ecs_dev_attr ecs_dev_attr[ecs_max_attrs];
> + struct attribute *ecs_attrs[ecs_max_attrs + 1];
> + struct attribute_group group;
> +};
> +
> +struct edac_ecs_context {
> + u16 num_media_frus;
> + struct edac_ecs_fru_context *fru_ctxs;
> +};
> +
> +#define to_ecs_dev_attr(_dev_attr) \
> + container_of(_dev_attr, struct edac_ecs_dev_attr, dev_attr)
> +
> +static ssize_t log_entry_type_show(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + char *buf)
> +{
> + struct edac_ecs_dev_attr *ecs_dev_attr = to_ecs_dev_attr(attr);
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_ecs_ops *ops = ctx->ecs.ops;
> + u64 val;
> + int ret;
> +
> + ret = ops->get_log_entry_type(ras_feat_dev->parent, ctx->ecs.private,
> + ecs_dev_attr->fru_id, &val);
> + if (ret)
> + return ret;
> +
> + return sysfs_emit(buf, "0x%llx\n", val);
> +}
> +
> +static ssize_t log_entry_type_store(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + const char *buf, size_t len)
> +{
> + struct edac_ecs_dev_attr *ecs_dev_attr = to_ecs_dev_attr(attr);
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_ecs_ops *ops = ctx->ecs.ops;
> + long val;
> + int ret;
> +
> + ret = kstrtol(buf, 10, &val);
> + if (ret < 0)
> + return ret;
> +
> + ret = ops->set_log_entry_type(ras_feat_dev->parent, ctx->ecs.private,
> + ecs_dev_attr->fru_id, val);
> + if (ret)
> + return ret;
> +
> + return len;
> +}
> +
> +static ssize_t log_entry_type_per_dram_show(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + char *buf)
> +{
> + struct edac_ecs_dev_attr *ecs_dev_attr = to_ecs_dev_attr(attr);
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_ecs_ops *ops = ctx->ecs.ops;
> + u64 val;
> + int ret;
> +
> + ret = ops->get_log_entry_type_per_dram(ras_feat_dev->parent, ctx->ecs.private,
> + ecs_dev_attr->fru_id, &val);
> + if (ret)
> + return ret;
> +
> + return sysfs_emit(buf, "0x%llx\n", val);
> +}
> +
> +static ssize_t log_entry_type_per_memory_media_show(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + char *buf)
> +{
> + struct edac_ecs_dev_attr *ecs_dev_attr = to_ecs_dev_attr(attr);
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_ecs_ops *ops = ctx->ecs.ops;
> + u64 val;
> + int ret;
> +
> + ret = ops->get_log_entry_type_per_memory_media(ras_feat_dev->parent,
> + ctx->ecs.private,
> + ecs_dev_attr->fru_id, &val);
> + if (ret)
> + return ret;
> +
> + return sysfs_emit(buf, "0x%llx\n", val);
> +}
> +
> +static ssize_t mode_show(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + char *buf)
> +{
> + struct edac_ecs_dev_attr *ecs_dev_attr = to_ecs_dev_attr(attr);
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_ecs_ops *ops = ctx->ecs.ops;
> + u64 val;
> + int ret;
> +
> + ret = ops->get_mode(ras_feat_dev->parent, ctx->ecs.private,
> + ecs_dev_attr->fru_id, &val);
> + if (ret)
> + return ret;
> +
> + return sysfs_emit(buf, "0x%llx\n", val);
> +}
> +
> +static ssize_t mode_store(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + const char *buf, size_t len)
> +{
> + struct edac_ecs_dev_attr *ecs_dev_attr = to_ecs_dev_attr(attr);
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_ecs_ops *ops = ctx->ecs.ops;
> + long val;
> + int ret;
> +
> + ret = kstrtol(buf, 10, &val);
> + if (ret < 0)
> + return ret;
> +
> + ret = ops->set_mode(ras_feat_dev->parent, ctx->ecs.private,
> + ecs_dev_attr->fru_id, val);
> + if (ret)
> + return ret;
> +
> + return len;
> +}
> +
> +static ssize_t mode_counts_rows_show(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + char *buf)
> +{
> + struct edac_ecs_dev_attr *ecs_dev_attr = to_ecs_dev_attr(attr);
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_ecs_ops *ops = ctx->ecs.ops;
> + u64 val;
> + int ret;
> +
> + ret = ops->get_mode_counts_rows(ras_feat_dev->parent, ctx->ecs.private,
> + ecs_dev_attr->fru_id, &val);
> + if (ret)
> + return ret;
> +
> + return sysfs_emit(buf, "0x%llx\n", val);
> +}
> +
> +static ssize_t mode_counts_codewords_show(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + char *buf)
> +{
> + struct edac_ecs_dev_attr *ecs_dev_attr = to_ecs_dev_attr(attr);
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_ecs_ops *ops = ctx->ecs.ops;
> + u64 val;
> + int ret;
> +
> + ret = ops->get_mode_counts_codewords(ras_feat_dev->parent, ctx->ecs.private,
> + ecs_dev_attr->fru_id, &val);
> + if (ret)
> + return ret;
> +
> + return sysfs_emit(buf, "0x%llx\n", val);
> +}
> +
> +static ssize_t reset_store(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + const char *buf, size_t len)
> +{
> + struct edac_ecs_dev_attr *ecs_dev_attr = to_ecs_dev_attr(attr);
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_ecs_ops *ops = ctx->ecs.ops;
> + long val;
> + int ret;
> +
> + ret = kstrtol(buf, 10, &val);
> + if (ret < 0)
> + return ret;
> +
> + ret = ops->reset(ras_feat_dev->parent, ctx->ecs.private,
> + ecs_dev_attr->fru_id, val);
> + if (ret)
> + return ret;
> +
> + return len;
> +}
> +
> +static ssize_t name_show(struct device *ras_feat_dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct edac_ecs_dev_attr *ecs_dev_attr = to_ecs_dev_attr(attr);
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_ecs_ops *ops = ctx->ecs.ops;
> + int ret;
> +
> + ret = ops->get_name(ras_feat_dev->parent, ctx->ecs.private,
> + ecs_dev_attr->fru_id, buf);
> + if (ret)
> + return ret;
> +
> + return strlen(buf);
> +}
> +
> +static ssize_t threshold_show(struct device *ras_feat_dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct edac_ecs_dev_attr *ecs_dev_attr = to_ecs_dev_attr(attr);
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_ecs_ops *ops = ctx->ecs.ops;
> + int ret;
> + u64 val;
> +
> + ret = ops->get_threshold(ras_feat_dev->parent, ctx->ecs.private,
> + ecs_dev_attr->fru_id, &val);
> + if (ret)
> + return ret;
> +
> + return sysfs_emit(buf, "0x%llx\n", val);
> +}
> +
> +static ssize_t threshold_store(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + const char *buf, size_t len)
> +{
> + struct edac_ecs_dev_attr *ecs_dev_attr = to_ecs_dev_attr(attr);
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_ecs_ops *ops = ctx->ecs.ops;
> + long val;
> + int ret;
> +
> + ret = kstrtol(buf, 10, &val);
> + if (ret < 0)
> + return ret;
> +
> + ret = ops->set_threshold(ras_feat_dev->parent, ctx->ecs.private,
> + ecs_dev_attr->fru_id, val);
> + if (ret)
> + return ret;
> +
> + return len;
> +}
> +
> +static umode_t ecs_attr_visible(struct kobject *kobj,
> + struct attribute *a, int attr_id)
> +{
> + struct device *ras_feat_dev = kobj_to_dev(kobj);
> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_ecs_ops *ops = ctx->ecs.ops;
> +
> + switch (attr_id) {
> + case ecs_log_entry_type:
> + if (ops->get_log_entry_type && ops->set_log_entry_type)
> + return a->mode;
> + if (ops->get_log_entry_type)
> + return 0444;
> + return 0;
> + case ecs_log_entry_type_per_dram:
> + return ops->get_log_entry_type_per_dram ? a->mode : 0;
> + case ecs_log_entry_type_per_memory_media:
> + return ops->get_log_entry_type_per_memory_media ? a->mode : 0;
> + case ecs_mode:
> + if (ops->get_mode && ops->set_mode)
> + return a->mode;
> + if (ops->get_mode)
> + return 0444;
> + return 0;
> + case ecs_mode_counts_rows:
> + return ops->get_mode_counts_rows ? a->mode : 0;
> + case ecs_mode_counts_codewords:
> + return ops->get_mode_counts_codewords ? a->mode : 0;
> + case ecs_reset:
> + return ops->reset ? a->mode : 0;
> + case ecs_name:
> + return ops->get_name ? a->mode : 0;
> + case ecs_threshold:
> + if (ops->get_threshold && ops->set_threshold)
> + return a->mode;
> + if (ops->get_threshold)
> + return 0444;
> + return 0;
> + default:
> + return 0;
> + }
> +}
> +
> +#define EDAC_ECS_ATTR_RO(_name, _fru_id) \
> + ((struct edac_ecs_dev_attr) { .dev_attr = __ATTR_RO(_name), \
> + .fru_id = _fru_id })
> +
> +#define EDAC_ECS_ATTR_WO(_name, _fru_id) \
> + ((struct edac_ecs_dev_attr) { .dev_attr = __ATTR_WO(_name), \
> + .fru_id = _fru_id })
> +
> +#define EDAC_ECS_ATTR_RW(_name, _fru_id) \
> + ((struct edac_ecs_dev_attr) { .dev_attr = __ATTR_RW(_name), \
> + .fru_id = _fru_id })
> +
> +static int ecs_create_desc(struct device *ecs_dev,
> + const struct attribute_group **attr_groups,
> + u16 num_media_frus)
> +{
> + struct edac_ecs_context *ecs_ctx;
> + u32 fru;
> +
> + ecs_ctx = devm_kzalloc(ecs_dev, sizeof(*ecs_ctx), GFP_KERNEL);
> + if (!ecs_ctx)
> + return -ENOMEM;
> +
> + ecs_ctx->num_media_frus = num_media_frus;
> + ecs_ctx->fru_ctxs = devm_kcalloc(ecs_dev, num_media_frus,
> + sizeof(*ecs_ctx->fru_ctxs),
> + GFP_KERNEL);
> + if (!ecs_ctx->fru_ctxs)
> + return -ENOMEM;
> +
> + for (fru = 0; fru < num_media_frus; fru++) {
> + struct edac_ecs_fru_context *fru_ctx = &ecs_ctx->fru_ctxs[fru];
> + struct attribute_group *group = &fru_ctx->group;
> + int i;
> +
> + fru_ctx->ecs_dev_attr[0] = EDAC_ECS_ATTR_RW(log_entry_type, fru);
> + fru_ctx->ecs_dev_attr[1] = EDAC_ECS_ATTR_RO(log_entry_type_per_dram, fru);
> + fru_ctx->ecs_dev_attr[2] = EDAC_ECS_ATTR_RO(log_entry_type_per_memory_media, fru);
> + fru_ctx->ecs_dev_attr[3] = EDAC_ECS_ATTR_RW(mode, fru);
> + fru_ctx->ecs_dev_attr[4] = EDAC_ECS_ATTR_RO(mode_counts_rows, fru);
> + fru_ctx->ecs_dev_attr[5] = EDAC_ECS_ATTR_RO(mode_counts_codewords, fru);
> + fru_ctx->ecs_dev_attr[6] = EDAC_ECS_ATTR_WO(reset, fru);
> + fru_ctx->ecs_dev_attr[7] = EDAC_ECS_ATTR_RO(name, fru);
> + fru_ctx->ecs_dev_attr[8] = EDAC_ECS_ATTR_RW(threshold, fru);
> + for (i = 0; i < ecs_max_attrs; i++)
> + fru_ctx->ecs_attrs[i] = &fru_ctx->ecs_dev_attr[i].dev_attr.attr;
> +
> + sprintf(fru_ctx->name, "%s%d", EDAC_ECS_FRU_NAME, fru);
> + group->name = fru_ctx->name;
> + group->attrs = fru_ctx->ecs_attrs;
> + group->is_visible = ecs_attr_visible;
> +
> + attr_groups[fru] = group;
> + }
> +
> + return 0;
> +}
> +
> +/**
> + * edac_ecs_get_desc - get edac ecs descriptors
> + * @ecs_dev: client ecs device
> + * @attr_groups: pointer to attribute group container
> + * @num_media_frus: number of media FRUs in the device
> + *
> + * Returns 0 on success, error otherwise.
> + */
> +int edac_ecs_get_desc(struct device *ecs_dev,
> + const struct attribute_group **attr_groups,
> + u16 num_media_frus)
> +{
> + if (!ecs_dev || !attr_groups || !num_media_frus)
> + return -EINVAL;
> +
> + return ecs_create_desc(ecs_dev, attr_groups, num_media_frus);
> +}
> diff --git a/drivers/edac/edac_ras_feature.c b/drivers/edac/edac_ras_feature.c
> index 48927f868372..a02ffbcc1c1e 100755
> --- a/drivers/edac/edac_ras_feature.c
> +++ b/drivers/edac/edac_ras_feature.c
> @@ -47,10 +47,15 @@ static int edac_ras_feat_ecs_init(struct device *parent,
> const struct attribute_group **attr_groups)
> {
> int num = efeat->ecs_info.num_media_frus;
> + int ret;
>
> edata->ops = efeat->ecs_ops;
> edata->private = efeat->ecs_ctx;
>
> + ret = edac_ecs_get_desc(parent, attr_groups, num);
> + if (ret)
> + return ret;
> +
> return num;
> }
>
> diff --git a/include/linux/edac_ras_feature.h b/include/linux/edac_ras_feature.h
> index 462f9ecbf9d4..153f8a3557f1 100755
> --- a/include/linux/edac_ras_feature.h
> +++ b/include/linux/edac_ras_feature.h
> @@ -47,10 +47,46 @@ struct edac_scrub_ops {
>
> const struct attribute_group *edac_scrub_get_desc(void);
>
> +/**
> + * struct edac_ecs_ops - ECS device operations (all elements optional)
> + * @get_log_entry_type: read the log entry type value.
> + * @set_log_entry_type: set the log entry type value.
> + * @get_log_entry_type_per_dram: read the log entry type per dram value.
> + * @get_log_entry_type_per_memory_media: read the log entry type per memory media value.
> + * @get_mode: read the mode value.
> + * @set_mode: set the mode value.
> + * @get_mode_counts_rows: read the mode counts rows value.
> + * @get_mode_counts_codewords: read the mode counts codewords value.
> + * @reset: reset the ECS counter.
> + * @get_threshold: read the threshold value.
> + * @set_threshold: set the threshold value.
> + * @get_name: get the ECS's name.
> + */
> +struct edac_ecs_ops {
> + int (*get_log_entry_type)(struct device *dev, void *drv_data, int fru_id, u64 *val);
> + int (*set_log_entry_type)(struct device *dev, void *drv_data, int fru_id, u64 val);
> + int (*get_log_entry_type_per_dram)(struct device *dev, void *drv_data,
> + int fru_id, u64 *val);
> + int (*get_log_entry_type_per_memory_media)(struct device *dev, void *drv_data,
> + int fru_id, u64 *val);
> + int (*get_mode)(struct device *dev, void *drv_data, int fru_id, u64 *val);
> + int (*set_mode)(struct device *dev, void *drv_data, int fru_id, u64 val);
> + int (*get_mode_counts_rows)(struct device *dev, void *drv_data, int fru_id, u64 *val);
> + int (*get_mode_counts_codewords)(struct device *dev, void *drv_data, int fru_id, u64 *val);
> + int (*reset)(struct device *dev, void *drv_data, int fru_id, u64 val);
> + int (*get_threshold)(struct device *dev, void *drv_data, int fru_id, u64 *threshold);
> + int (*set_threshold)(struct device *dev, void *drv_data, int fru_id, u64 threshold);
> + int (*get_name)(struct device *dev, void *drv_data, int fru_id, char *buf);
> +};
> +
> struct edac_ecs_ex_info {
> u16 num_media_frus;
> };
>
> +int edac_ecs_get_desc(struct device *ecs_dev,
> + const struct attribute_group **attr_groups,
> + u16 num_media_frus);
> +
> /*
> * EDAC RAS feature information structure
> */
> --
> 2.34.1
>
* Re: [RFC PATCH v9 04/11] cxl/mbox: Add GET_SUPPORTED_FEATURES mailbox command
2024-07-16 15:03 ` [RFC PATCH v9 04/11] cxl/mbox: Add GET_SUPPORTED_FEATURES mailbox command shiju.jose
@ 2024-07-17 17:28 ` nifan.cxl
0 siblings, 0 replies; 30+ messages in thread
From: nifan.cxl @ 2024-07-17 17:28 UTC (permalink / raw)
To: shiju.jose
Cc: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel, bp,
tony.luck, rafael, lenb, mchehab, dan.j.williams, dave,
jonathan.cameron, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny, david, Vilas.Sridharan, leo.duran, Yazen.Ghannam,
rientjes, jiaqiyan, Jon.Grimm, dave.hansen, naoya.horiguchi,
james.morse, jthoughton, somasundaram.a, erdemaktas, pgonda,
duenwen, mike.malvestuto, gthelen, wschwartz, dferguson, wbs,
nifan.cxl, tanxiaofei, prime.zeng, roberto.sassu, kangkang.shen,
wanghuiqiang, linuxarm
On Tue, Jul 16, 2024 at 04:03:28PM +0100, shiju.jose@huawei.com wrote:
> From: Shiju Jose <shiju.jose@huawei.com>
>
> Add support for GET_SUPPORTED_FEATURES mailbox command.
>
> CXL spec 3.1 section 8.2.9.6 describes optional device specific features.
> CXL devices supports features with changeable attributes.
s/supports/support/
Fan
> Get Supported Features retrieves the list of supported device specific
> features. The settings of a feature can be retrieved using Get Feature
> and optionally modified using Set Feature.
>
> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
> ---
> drivers/cxl/core/mbox.c | 27 ++++++++++++++++++
> drivers/cxl/cxlmem.h | 61 +++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 88 insertions(+)
>
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 2626f3fff201..9b9b1d26454e 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -1324,6 +1324,33 @@ int cxl_set_timestamp(struct cxl_memdev_state *mds)
> }
> EXPORT_SYMBOL_NS_GPL(cxl_set_timestamp, CXL);
>
> +int cxl_get_supported_features(struct cxl_memdev_state *mds,
> + u32 count, u16 start_index,
> + struct cxl_mbox_get_supp_feats_out *feats_out)
> +{
> + struct cxl_mbox_get_supp_feats_in pi;
> + struct cxl_mbox_cmd mbox_cmd;
> + int rc;
> +
> + pi.count = cpu_to_le32(count);
> + pi.start_index = cpu_to_le16(start_index);
> +
> + mbox_cmd = (struct cxl_mbox_cmd) {
> + .opcode = CXL_MBOX_OP_GET_SUPPORTED_FEATURES,
> + .size_in = sizeof(pi),
> + .payload_in = &pi,
> + .size_out = count,
> + .payload_out = feats_out,
> + .min_out = sizeof(*feats_out),
> + };
> + rc = cxl_internal_send_cmd(mds, &mbox_cmd);
> + if (rc < 0)
> + return rc;
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_get_supported_features, CXL);
> +
> int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
> struct cxl_region *cxlr)
> {
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 19aba81cdf13..b0e1565b9d2e 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -530,6 +530,7 @@ enum cxl_opcode {
> CXL_MBOX_OP_GET_LOG_CAPS = 0x0402,
> CXL_MBOX_OP_CLEAR_LOG = 0x0403,
> CXL_MBOX_OP_GET_SUP_LOG_SUBLIST = 0x0405,
> + CXL_MBOX_OP_GET_SUPPORTED_FEATURES = 0x0500,
> CXL_MBOX_OP_IDENTIFY = 0x4000,
> CXL_MBOX_OP_GET_PARTITION_INFO = 0x4100,
> CXL_MBOX_OP_SET_PARTITION_INFO = 0x4101,
> @@ -699,6 +700,63 @@ struct cxl_mbox_set_timestamp_in {
>
> } __packed;
>
> +/*
> + * Get Supported Features CXL 3.1 Spec 8.2.9.6.1
> + */
> +
> +/*
> + * Get Supported Features input payload
> + * CXL rev 3.1 section 8.2.9.6.1 Table 8-95
> + */
> +struct cxl_mbox_get_supp_feats_in {
> + __le32 count;
> + __le16 start_index;
> + u8 rsvd[2];
> +} __packed;
> +
> +/*
> + * Get Supported Features Supported Feature Entry
> + * CXL rev 3.1 section 8.2.9.6.1 Table 8-97
> + */
> +/* Supported Feature Entry : Payload out attribute flags */
> +#define CXL_FEAT_ENTRY_FLAG_CHANGABLE BIT(0)
> +#define CXL_FEAT_ENTRY_FLAG_DEEPEST_RESET_PERSISTENCE_MASK GENMASK(3, 1)
> +#define CXL_FEAT_ENTRY_FLAG_PERSIST_ACROSS_FIRMWARE_UPDATE BIT(4)
> +#define CXL_FEAT_ENTRY_FLAG_SUPPORT_DEFAULT_SELECTION BIT(5)
> +#define CXL_FEAT_ENTRY_FLAG_SUPPORT_SAVED_SELECTION BIT(6)
> +
> +enum cxl_feat_attr_value_persistence {
> + CXL_FEAT_ATTR_VALUE_PERSISTENCE_NONE,
> + CXL_FEAT_ATTR_VALUE_PERSISTENCE_CXL_RESET,
> + CXL_FEAT_ATTR_VALUE_PERSISTENCE_HOT_RESET,
> + CXL_FEAT_ATTR_VALUE_PERSISTENCE_WARM_RESET,
> + CXL_FEAT_ATTR_VALUE_PERSISTENCE_COLD_RESET,
> + CXL_FEAT_ATTR_VALUE_PERSISTENCE_MAX
> +};
> +
> +struct cxl_mbox_supp_feat_entry {
> + uuid_t uuid;
> + __le16 index;
> + __le16 get_size;
> + __le16 set_size;
> + __le32 attr_flags;
> + u8 get_version;
> + u8 set_version;
> + __le16 set_effects;
> + u8 rsvd[18];
> +} __packed;
> +
> +/*
> + * Get Supported Features output payload
> + * CXL rev 3.1 section 8.2.9.6.1 Table 8-96
> + */
> +struct cxl_mbox_get_supp_feats_out {
> + __le16 nr_entries;
> + __le16 nr_supported;
> + u8 rsvd[4];
> + struct cxl_mbox_supp_feat_entry feat_entries[];
> +} __packed;
> +
> /* Get Poison List CXL 3.0 Spec 8.2.9.8.4.1 */
> struct cxl_mbox_poison_in {
> __le64 offset;
> @@ -830,6 +888,9 @@ void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> enum cxl_event_type event_type,
> const uuid_t *uuid, union cxl_event *evt);
> int cxl_set_timestamp(struct cxl_memdev_state *mds);
> +int cxl_get_supported_features(struct cxl_memdev_state *mds,
> + u32 count, u16 start_index,
> + struct cxl_mbox_get_supp_feats_out *feats_out);
> int cxl_poison_state_init(struct cxl_memdev_state *mds);
> int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
> struct cxl_region *cxlr);
> --
> 2.34.1
>
* Re: [RFC PATCH v9 05/11] cxl/mbox: Add GET_FEATURE mailbox command
2024-07-16 15:03 ` [RFC PATCH v9 05/11] cxl/mbox: Add GET_FEATURE " shiju.jose
@ 2024-07-17 18:08 ` nifan.cxl
2024-07-18 9:11 ` Shiju Jose
0 siblings, 1 reply; 30+ messages in thread
From: nifan.cxl @ 2024-07-17 18:08 UTC (permalink / raw)
To: shiju.jose
Cc: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel, bp,
tony.luck, rafael, lenb, mchehab, dan.j.williams, dave,
jonathan.cameron, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny, david, Vilas.Sridharan, leo.duran, Yazen.Ghannam,
rientjes, jiaqiyan, Jon.Grimm, dave.hansen, naoya.horiguchi,
james.morse, jthoughton, somasundaram.a, erdemaktas, pgonda,
duenwen, mike.malvestuto, gthelen, wschwartz, dferguson, wbs,
nifan.cxl, tanxiaofei, prime.zeng, roberto.sassu, kangkang.shen,
wanghuiqiang, linuxarm
On Tue, Jul 16, 2024 at 04:03:29PM +0100, shiju.jose@huawei.com wrote:
> From: Shiju Jose <shiju.jose@huawei.com>
>
> Add support for GET_FEATURE mailbox command.
>
> CXL spec 3.1 section 8.2.9.6 describes optional device specific features.
> The settings of a feature can be retrieved using Get Feature command.
>
> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
> ---
Minor comments inline.
> drivers/cxl/core/mbox.c | 37 +++++++++++++++++++++++++++++++++++++
> drivers/cxl/cxlmem.h | 27 +++++++++++++++++++++++++++
> 2 files changed, 64 insertions(+)
>
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 9b9b1d26454e..b1eeed508459 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -1351,6 +1351,43 @@ int cxl_get_supported_features(struct cxl_memdev_state *mds,
> }
> EXPORT_SYMBOL_NS_GPL(cxl_get_supported_features, CXL);
>
> +size_t cxl_get_feature(struct cxl_memdev_state *mds,
> + const uuid_t feat_uuid, void *feat_out,
> + size_t feat_out_size,
> + enum cxl_get_feat_selection selection)
feat_uuid and selection are both input-payload fields, so it may be more
natural to put them together, before feat_out.
> +{
> + size_t data_to_rd_size, size_out;
> + struct cxl_mbox_get_feat_in pi;
> + struct cxl_mbox_cmd mbox_cmd;
> + size_t data_rcvd_size = 0;
> + int rc;
> +
> + size_out = min(feat_out_size, mds->payload_size);
> + pi.uuid = feat_uuid;
> + pi.selection = selection;
> + do {
> + data_to_rd_size = min(feat_out_size - data_rcvd_size, mds->payload_size);
> + pi.offset = cpu_to_le16(data_rcvd_size);
> + pi.count = cpu_to_le16(data_to_rd_size);
> +
> + mbox_cmd = (struct cxl_mbox_cmd) {
> + .opcode = CXL_MBOX_OP_GET_FEATURE,
> + .size_in = sizeof(pi),
> + .payload_in = &pi,
> + .size_out = size_out,
> + .payload_out = feat_out + data_rcvd_size,
> + .min_out = data_to_rd_size,
> + };
> + rc = cxl_internal_send_cmd(mds, &mbox_cmd);
> + if (rc < 0 || mbox_cmd.size_out == 0)
Is there any case where size_out will be 0 other than when feat_out_size
is 0?
If feat_out_size is 0, maybe we can return directly, or use while () {}
instead of do {} while.
Anyway, if no other case returns size_out as 0, we can
drop the check.
Fan
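For illustration only, a minimal userspace sketch of the while () {} form
suggested above. struct mock_dev and both mock_* functions are hypothetical
stand-ins, not CXL core APIs; the mailbox is modelled as a plain memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a device: a feature blob plus a payload limit. */
struct mock_dev {
	const unsigned char *feat;	/* feature data held by the "device" */
	size_t feat_size;		/* total bytes of feature data */
	size_t payload_size;		/* max bytes one command can return */
	unsigned int cmds_sent;		/* commands issued, for inspection */
};

/*
 * Hypothetical stand-in for the mailbox send: copies up to @count bytes
 * starting at @offset and reports how many were actually returned.
 * Returns 0 on success, -1 on a bad offset.
 */
int mock_send_get_feature(struct mock_dev *dev, size_t offset, size_t count,
			  void *out, size_t *size_out)
{
	assert(dev && out && size_out);

	dev->cmds_sent++;
	if (offset > dev->feat_size)
		return -1;
	if (count > dev->feat_size - offset)
		count = dev->feat_size - offset;
	memcpy(out, dev->feat + offset, count);
	*size_out = count;
	return 0;
}

/* Returns the number of bytes read, or 0 on failure or a short read. */
size_t mock_get_feature(struct mock_dev *dev, void *feat_out,
			size_t feat_out_size)
{
	size_t rcvd = 0;

	assert(dev && feat_out);

	/* while () {} form: a zero-sized request never reaches the mailbox. */
	while (rcvd < feat_out_size) {
		size_t want = feat_out_size - rcvd;
		size_t got = 0;

		if (want > dev->payload_size)
			want = dev->payload_size;

		if (mock_send_get_feature(dev, rcvd, want,
					  (unsigned char *)feat_out + rcvd,
					  &got))
			return 0;
		/* Kept so that a device returning no data cannot spin the loop. */
		if (got == 0)
			return 0;
		rcvd += got;
	}

	return rcvd;
}
```

With feat_out_size == 0 the loop body never runs, so no command is sent.
Whether the got == 0 check can really be dropped depends on whether a device
may legally return an empty payload, which this sketch does not answer.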
> + return 0;
> + data_rcvd_size += mbox_cmd.size_out;
> + } while (data_rcvd_size < feat_out_size);
> +
> + return data_rcvd_size;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_get_feature, CXL);
> +
> int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
> struct cxl_region *cxlr)
> {
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index b0e1565b9d2e..25698a6fbe66 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -531,6 +531,7 @@ enum cxl_opcode {
> CXL_MBOX_OP_CLEAR_LOG = 0x0403,
> CXL_MBOX_OP_GET_SUP_LOG_SUBLIST = 0x0405,
> CXL_MBOX_OP_GET_SUPPORTED_FEATURES = 0x0500,
> + CXL_MBOX_OP_GET_FEATURE = 0x0501,
> CXL_MBOX_OP_IDENTIFY = 0x4000,
> CXL_MBOX_OP_GET_PARTITION_INFO = 0x4100,
> CXL_MBOX_OP_SET_PARTITION_INFO = 0x4101,
> @@ -757,6 +758,28 @@ struct cxl_mbox_get_supp_feats_out {
> struct cxl_mbox_supp_feat_entry feat_entries[];
> } __packed;
>
> +/*
> + * Get Feature CXL 3.1 Spec 8.2.9.6.2
> + */
> +
> +/*
> + * Get Feature input payload
> + * CXL rev 3.1 section 8.2.9.6.2 Table 8-99
> + */
> +enum cxl_get_feat_selection {
> + CXL_GET_FEAT_SEL_CURRENT_VALUE,
> + CXL_GET_FEAT_SEL_DEFAULT_VALUE,
> + CXL_GET_FEAT_SEL_SAVED_VALUE,
> + CXL_GET_FEAT_SEL_MAX
> +};
> +
> +struct cxl_mbox_get_feat_in {
> + uuid_t uuid;
> + __le16 offset;
> + __le16 count;
> + u8 selection;
> +} __packed;
> +
> /* Get Poison List CXL 3.0 Spec 8.2.9.8.4.1 */
> struct cxl_mbox_poison_in {
> __le64 offset;
> @@ -891,6 +914,10 @@ int cxl_set_timestamp(struct cxl_memdev_state *mds);
> int cxl_get_supported_features(struct cxl_memdev_state *mds,
> u32 count, u16 start_index,
> struct cxl_mbox_get_supp_feats_out *feats_out);
> +size_t cxl_get_feature(struct cxl_memdev_state *mds,
> + const uuid_t feat_uuid, void *feat_out,
> + size_t feat_out_size,
> + enum cxl_get_feat_selection selection);
> int cxl_poison_state_init(struct cxl_memdev_state *mds);
> int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
> struct cxl_region *cxlr);
> --
> 2.34.1
>
* Re: [RFC PATCH v9 06/11] cxl/mbox: Add SET_FEATURE mailbox command
2024-07-16 15:03 ` [RFC PATCH v9 06/11] cxl/mbox: Add SET_FEATURE " shiju.jose
@ 2024-07-17 20:13 ` nifan.cxl
2024-07-18 9:15 ` Shiju Jose
0 siblings, 1 reply; 30+ messages in thread
From: nifan.cxl @ 2024-07-17 20:13 UTC (permalink / raw)
To: shiju.jose
Cc: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel, bp,
tony.luck, rafael, lenb, mchehab, dan.j.williams, dave,
jonathan.cameron, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny, david, Vilas.Sridharan, leo.duran, Yazen.Ghannam,
rientjes, jiaqiyan, Jon.Grimm, dave.hansen, naoya.horiguchi,
james.morse, jthoughton, somasundaram.a, erdemaktas, pgonda,
duenwen, mike.malvestuto, gthelen, wschwartz, dferguson, wbs,
nifan.cxl, tanxiaofei, prime.zeng, roberto.sassu, kangkang.shen,
wanghuiqiang, linuxarm
On Tue, Jul 16, 2024 at 04:03:30PM +0100, shiju.jose@huawei.com wrote:
> From: Shiju Jose <shiju.jose@huawei.com>
>
> Add support for SET_FEATURE mailbox command.
>
> CXL spec 3.1 section 8.2.9.6 describes optional device specific features.
> CXL devices support features with changeable attributes.
> The settings of a feature can be optionally modified using Set Feature
> command.
Please add a more specific spec reference for the command here: 8.2.9.6.3.
The same suggestion applies to the Get Supported Features and Get Feature
commands.
>
> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
> ---
> drivers/cxl/core/mbox.c | 71 +++++++++++++++++++++++++++++++++++++++++
> drivers/cxl/cxlmem.h | 33 +++++++++++++++++++
> 2 files changed, 104 insertions(+)
>
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index b1eeed508459..50ecd2bd7372 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -1388,6 +1388,77 @@ size_t cxl_get_feature(struct cxl_memdev_state *mds,
> }
> EXPORT_SYMBOL_NS_GPL(cxl_get_feature, CXL);
>
> +/*
> + * FEAT_DATA_MIN_PAYLOAD_SIZE - min extra number of bytes should be
> + * available in the mailbox for storing the actual feature data so that
> + * the feature data transfer would work as expected.
> + */
> +#define FEAT_DATA_MIN_PAYLOAD_SIZE 10
> +int cxl_set_feature(struct cxl_memdev_state *mds,
> + const uuid_t feat_uuid, u8 feat_version,
> + void *feat_data, size_t feat_data_size,
> + u8 feat_flag)
> +{
> + struct cxl_memdev_set_feat_pi {
> + struct cxl_mbox_set_feat_hdr hdr;
> + u8 feat_data[];
> + } __packed;
> + size_t data_in_size, data_sent_size = 0;
> + struct cxl_mbox_cmd mbox_cmd;
> + size_t hdr_size;
> + int rc = 0;
> +
> + struct cxl_memdev_set_feat_pi *pi __free(kfree) =
> + kmalloc(mds->payload_size, GFP_KERNEL);
> + pi->hdr.uuid = feat_uuid;
> + pi->hdr.version = feat_version;
> + feat_flag &= ~CXL_SET_FEAT_FLAG_DATA_TRANSFER_MASK;
Although we may not support it yet, should we set bit[3] (saved across reset)
since we already defined CXL_SET_FEAT_FLAG_DATA_SAVED_ACROSS_RESET and
it is not used?
Fan
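As a rough sketch of what that could look like, the helper below composes the
Set Feature flags word from the data-transfer action in bits [2:0] plus the
optional bit[3]. The enum order and bit positions are copied from the cxlmem.h
hunk in this patch; set_feat_flags() itself and the shortened names are
hypothetical and not part of the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit layout mirrors the definitions quoted in this patch. */
#define SET_FEAT_FLAG_DATA_TRANSFER_MASK	0x7u		/* GENMASK(2, 0) */
#define SET_FEAT_FLAG_DATA_SAVED_ACROSS_RESET	(1u << 3)	/* BIT(3) */

enum set_feat_data_transfer {
	SET_FEAT_FULL_DATA_TRANSFER,
	SET_FEAT_INITIATE_DATA_TRANSFER,
	SET_FEAT_CONTINUE_DATA_TRANSFER,
	SET_FEAT_FINISH_DATA_TRANSFER,
	SET_FEAT_ABORT_DATA_TRANSFER,
};

/*
 * Hypothetical helper: build the flags word for one Set Feature command.
 * The transfer action occupies bits [2:0]; bit 3 asks the device to keep
 * the data across reset.
 */
uint32_t set_feat_flags(enum set_feat_data_transfer xfer,
			bool saved_across_reset)
{
	uint32_t flags;

	assert(xfer <= SET_FEAT_ABORT_DATA_TRANSFER);

	flags = (uint32_t)xfer & SET_FEAT_FLAG_DATA_TRANSFER_MASK;
	if (saved_across_reset)
		flags |= SET_FEAT_FLAG_DATA_SAVED_ACROSS_RESET;

	return flags;
}
```

Note that in the patch as posted, feat_flag only has the transfer bits
cleared, so a caller passing bit 3 in feat_flag would already have it
forwarded to the device.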
> + hdr_size = sizeof(pi->hdr);
> + /*
> + * Check minimum mbox payload size is available for
> + * the feature data transfer.
> + */
> + if (hdr_size + FEAT_DATA_MIN_PAYLOAD_SIZE > mds->payload_size)
> + return -ENOMEM;
> +
> + if ((hdr_size + feat_data_size) <= mds->payload_size) {
> + pi->hdr.flags = cpu_to_le32(feat_flag |
> + CXL_SET_FEAT_FLAG_FULL_DATA_TRANSFER);
> + data_in_size = feat_data_size;
> + } else {
> + pi->hdr.flags = cpu_to_le32(feat_flag |
> + CXL_SET_FEAT_FLAG_INITIATE_DATA_TRANSFER);
> + data_in_size = mds->payload_size - hdr_size;
> + }
> +
> + do {
> + pi->hdr.offset = cpu_to_le16(data_sent_size);
> + memcpy(pi->feat_data, feat_data + data_sent_size, data_in_size);
> + mbox_cmd = (struct cxl_mbox_cmd) {
> + .opcode = CXL_MBOX_OP_SET_FEATURE,
> + .size_in = hdr_size + data_in_size,
> + .payload_in = pi,
> + };
> + rc = cxl_internal_send_cmd(mds, &mbox_cmd);
> + if (rc < 0)
> + return rc;
> +
> + data_sent_size += data_in_size;
> + if (data_sent_size >= feat_data_size)
> + return 0;
> +
> + if ((feat_data_size - data_sent_size) <= (mds->payload_size - hdr_size)) {
> + data_in_size = feat_data_size - data_sent_size;
> + pi->hdr.flags = cpu_to_le32(feat_flag |
> + CXL_SET_FEAT_FLAG_FINISH_DATA_TRANSFER);
> + } else {
> + pi->hdr.flags = cpu_to_le32(feat_flag |
> + CXL_SET_FEAT_FLAG_CONTINUE_DATA_TRANSFER);
> + }
> + } while (true);
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_set_feature, CXL);
> +
> int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
> struct cxl_region *cxlr)
> {
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 25698a6fbe66..c3cb8e2736b5 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -532,6 +532,7 @@ enum cxl_opcode {
> CXL_MBOX_OP_GET_SUP_LOG_SUBLIST = 0x0405,
> CXL_MBOX_OP_GET_SUPPORTED_FEATURES = 0x0500,
> CXL_MBOX_OP_GET_FEATURE = 0x0501,
> + CXL_MBOX_OP_SET_FEATURE = 0x0502,
> CXL_MBOX_OP_IDENTIFY = 0x4000,
> CXL_MBOX_OP_GET_PARTITION_INFO = 0x4100,
> CXL_MBOX_OP_SET_PARTITION_INFO = 0x4101,
> @@ -780,6 +781,34 @@ struct cxl_mbox_get_feat_in {
> u8 selection;
> } __packed;
>
> +/*
> + * Set Feature CXL 3.1 Spec 8.2.9.6.3
> + */
> +
> +/*
> + * Set Feature input payload
> + * CXL rev 3.1 section 8.2.9.6.3 Table 8-101
> + */
> +/* Set Feature : Payload in flags */
> +#define CXL_SET_FEAT_FLAG_DATA_TRANSFER_MASK GENMASK(2, 0)
> +enum cxl_set_feat_flag_data_transfer {
> + CXL_SET_FEAT_FLAG_FULL_DATA_TRANSFER,
> + CXL_SET_FEAT_FLAG_INITIATE_DATA_TRANSFER,
> + CXL_SET_FEAT_FLAG_CONTINUE_DATA_TRANSFER,
> + CXL_SET_FEAT_FLAG_FINISH_DATA_TRANSFER,
> + CXL_SET_FEAT_FLAG_ABORT_DATA_TRANSFER,
> + CXL_SET_FEAT_FLAG_DATA_TRANSFER_MAX
> +};
> +#define CXL_SET_FEAT_FLAG_DATA_SAVED_ACROSS_RESET BIT(3)
> +
> +struct cxl_mbox_set_feat_hdr {
> + uuid_t uuid;
> + __le32 flags;
> + __le16 offset;
> + u8 version;
> + u8 rsvd[9];
> +} __packed;
> +
> /* Get Poison List CXL 3.0 Spec 8.2.9.8.4.1 */
> struct cxl_mbox_poison_in {
> __le64 offset;
> @@ -918,6 +947,10 @@ size_t cxl_get_feature(struct cxl_memdev_state *mds,
> const uuid_t feat_uuid, void *feat_out,
> size_t feat_out_size,
> enum cxl_get_feat_selection selection);
> +int cxl_set_feature(struct cxl_memdev_state *mds,
> + const uuid_t feat_uuid, u8 feat_version,
> + void *feat_data, size_t feat_data_size,
> + u8 feat_flag);
> int cxl_poison_state_init(struct cxl_memdev_state *mds);
> int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
> struct cxl_region *cxlr);
> --
> 2.34.1
>
* Re: [RFC PATCH v9 01/11] EDAC: Add generic EDAC RAS feature driver
2024-07-17 11:01 ` Shiju Jose
@ 2024-07-18 6:19 ` Mauro Carvalho Chehab
0 siblings, 0 replies; 30+ messages in thread
From: Mauro Carvalho Chehab @ 2024-07-18 6:19 UTC (permalink / raw)
To: Shiju Jose
Cc: linux-edac@vger.kernel.org, linux-cxl@vger.kernel.org,
linux-acpi@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, bp@alien8.de, tony.luck@intel.com,
rafael@kernel.org, lenb@kernel.org, mchehab@kernel.org,
dan.j.williams@intel.com, dave@stgolabs.net, Jonathan Cameron,
dave.jiang@intel.com, alison.schofield@intel.com,
vishal.l.verma@intel.com, ira.weiny@intel.com, david@redhat.com,
Vilas.Sridharan@amd.com, leo.duran@amd.com, Yazen.Ghannam@amd.com,
rientjes@google.com, jiaqiyan@google.com, Jon.Grimm@amd.com,
dave.hansen@linux.intel.com, naoya.horiguchi@nec.com,
james.morse@arm.com, jthoughton@google.com,
somasundaram.a@hpe.com, erdemaktas@google.com, pgonda@google.com,
duenwen@google.com, mike.malvestuto@intel.com, gthelen@google.com,
wschwartz@amperecomputing.com, dferguson@amperecomputing.com,
wbs@os.amperecomputing.com, nifan.cxl@gmail.com, tanxiaofei,
Zengtao (B), Roberto Sassu, kangkang.shen@futurewei.com,
wanghuiqiang, Linuxarm
On Wed, 17 Jul 2024 11:01:58 +0000
Shiju Jose <shiju.jose@huawei.com> wrote:
> Hi Mauro,
>
> Thanks for the feedback.
>
> >-----Original Message-----
> >From: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> >Sent: 17 July 2024 11:00
> >To: Shiju Jose <shiju.jose@huawei.com>
> >Cc: linux-edac@vger.kernel.org; linux-cxl@vger.kernel.org; linux-
> >acpi@vger.kernel.org; linux-mm@kvack.org; linux-kernel@vger.kernel.org;
> >bp@alien8.de; tony.luck@intel.com; rafael@kernel.org; lenb@kernel.org;
> >mchehab@kernel.org; dan.j.williams@intel.com; dave@stgolabs.net; Jonathan
> >Cameron <jonathan.cameron@huawei.com>; dave.jiang@intel.com;
> >alison.schofield@intel.com; vishal.l.verma@intel.com; ira.weiny@intel.com;
> >david@redhat.com; Vilas.Sridharan@amd.com; leo.duran@amd.com;
> >Yazen.Ghannam@amd.com; rientjes@google.com; jiaqiyan@google.com;
> >Jon.Grimm@amd.com; dave.hansen@linux.intel.com;
> >naoya.horiguchi@nec.com; james.morse@arm.com; jthoughton@google.com;
> >somasundaram.a@hpe.com; erdemaktas@google.com; pgonda@google.com;
> >duenwen@google.com; mike.malvestuto@intel.com; gthelen@google.com;
> >wschwartz@amperecomputing.com; dferguson@amperecomputing.com;
> >wbs@os.amperecomputing.com; nifan.cxl@gmail.com; tanxiaofei
> ><tanxiaofei@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>; Roberto
> >Sassu <roberto.sassu@huawei.com>; kangkang.shen@futurewei.com;
> >wanghuiqiang <wanghuiqiang@huawei.com>; Linuxarm
> ><linuxarm@huawei.com>
> >Subject: Re: [RFC PATCH v9 01/11] EDAC: Add generic EDAC RAS feature driver
> >
> >On Tue, 16 Jul 2024 16:03:25 +0100
> ><shiju.jose@huawei.com> wrote:
> >
> >> From: Shiju Jose <shiju.jose@huawei.com>
> >>
> >> Add generic EDAC driver supports registering RAS features supported in
> >> the system. The driver exposes feature's control attributes to the
> >> userspace in /sys/bus/edac/devices/<dev-name>/<ras-feature>/
> >>
> >> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> >> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> >> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
> >> ---
> >> drivers/edac/Makefile | 1 +
> >> drivers/edac/edac_ras_feature.c | 155 +++++++++++++++++++++++++++++++
> >> include/linux/edac_ras_feature.h | 66 +++++++++++++
> >> 3 files changed, 222 insertions(+)
> >> create mode 100755 drivers/edac/edac_ras_feature.c
> >> create mode 100755 include/linux/edac_ras_feature.h
> >>
> >> diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
> >> index 9c09893695b7..c532b57a6d8a 100644
> >> --- a/drivers/edac/Makefile
> >> +++ b/drivers/edac/Makefile
> >> @@ -10,6 +10,7 @@ obj-$(CONFIG_EDAC) := edac_core.o
> >>
> >> edac_core-y := edac_mc.o edac_device.o edac_mc_sysfs.o
> >> edac_core-y += edac_module.o edac_device_sysfs.o wq.o
> >> +edac_core-y += edac_ras_feature.o
> >>
> >> edac_core-$(CONFIG_EDAC_DEBUG) += debugfs.o
> >>
> >> diff --git a/drivers/edac/edac_ras_feature.c b/drivers/edac/edac_ras_feature.c
> >> new file mode 100755
> >> index 000000000000..24a729fea66f
> >> --- /dev/null
> >> +++ b/drivers/edac/edac_ras_feature.c
> >> @@ -0,0 +1,155 @@
> >> +// SPDX-License-Identifier: GPL-2.0
> >> +/*
> >> + * EDAC RAS control feature driver supports registering RAS
> >> + * features with the EDAC and exposes the feature's control
> >> + * attributes to the userspace in sysfs.
> >> + *
> >> + * Copyright (c) 2024 HiSilicon Limited.
> >> + */
> >> +
> >
> >> +#define pr_fmt(fmt) "EDAC RAS CONTROL FEAT: " fmt
> >
> >Sounds like too long a prefix for my taste.
> Will do. Previously it was "EDAC RAS FEAT"
>
> >
> >> +
> >> +#include <linux/edac_ras_feature.h>
> >> +
> >> +static void edac_ras_dev_release(struct device *dev)
> >> +{
> >> + struct edac_ras_feat_ctx *ctx =
> >> + container_of(dev, struct edac_ras_feat_ctx, dev);
> >> +
> >> + kfree(ctx);
> >> +}
> >> +
> >> +const struct device_type edac_ras_dev_type = {
> >> + .name = "edac_ras_dev",
> >> + .release = edac_ras_dev_release,
> >> +};
> >> +
> >> +static void edac_ras_dev_unreg(void *data)
> >> +{
> >> + device_unregister(data);
> >> +}
> >> +
> >> +static int edac_ras_feat_scrub_init(struct device *parent,
> >> + struct edac_scrub_data *sdata,
> >> + const struct edac_ras_feature *sfeat,
> >> + const struct attribute_group **attr_groups)
> >> +{
> >> + sdata->ops = sfeat->scrub_ops;
> >> + sdata->private = sfeat->scrub_ctx;
> >> +
> >> + return 1;
> >> +}
> >> +
> >> +static int edac_ras_feat_ecs_init(struct device *parent,
> >> + struct edac_ecs_data *edata,
> >> + const struct edac_ras_feature *efeat,
> >> + const struct attribute_group **attr_groups)
> >> +{
> >> + int num = efeat->ecs_info.num_media_frus;
> >> +
> >> + edata->ops = efeat->ecs_ops;
> >> + edata->private = efeat->ecs_ctx;
> >> +
> >> + return num;
> >> +}
> >
> >I would place this function earlier and/or add some documentation for the above
> >two functions.
> Will do. I guess you want to place these functions above edac_ras_dev_release(), right?
I mean placing edac_ras_feat_ecs_init() before edac_ras_feat_scrub_init(),
as it helps reviewers understand that the return code is the number
of attr groups. Another option would be to document the arguments and
the return value of such functions.
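As a sketch of the second option, a kernel-doc style comment on a simplified,
userspace model of the ECS init helper might read as follows. The mock_*
types and the reduced argument list are hypothetical, and the -EINVAL case is
borrowed from the check that edac_ecs_get_desc() does later in the series.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the EDAC structures. */
struct mock_ecs_ops { int unused; };

struct mock_ecs_data {
	const struct mock_ecs_ops *ops;
	void *private;
};

struct mock_ras_feature {
	const struct mock_ecs_ops *ecs_ops;
	void *ecs_ctx;
	unsigned short num_media_frus;
};

/**
 * mock_ras_feat_ecs_init - bind the ECS ops and context of a RAS feature
 * @edata: per-device ECS data to fill in
 * @efeat: ECS feature description supplied by the client driver
 *
 * Return: the number of attribute groups this feature contributes (one
 * per media FRU) on success, a negative error code otherwise.
 */
int mock_ras_feat_ecs_init(struct mock_ecs_data *edata,
			   const struct mock_ras_feature *efeat)
{
	assert(edata && efeat);

	if (!efeat->num_media_frus)
		return -EINVAL;

	edata->ops = efeat->ecs_ops;
	edata->private = efeat->ecs_ctx;

	return efeat->num_media_frus;
}
```

The caller can then add a non-negative return value straight to its running
attribute-group count, which is what the registration loop does with ret.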
> >
> >I got confused when reviewed the first function and saw there an
> >unconditional:
> The calls to the feature-specific init functions are added here by the later feature-specific patches
> of this series.
> >
> > return 1;
> >
> >Now, I guess the goal is to return the number of initialized features, right?
> It returns the number of attr groups added for a feature, as the number of instances of a feature is dynamic,
> e.g. the number of FRUs in the ECS feature.
>
> >
> >> +
> >> +/**
> >> + * edac_ras_dev_register - register device for ras features with edac
> >> + * @parent: client device.
> >> + * @name: client device's name.
> >> + * @private: parent driver's data to store in the context if any.
> >> + * @num_features: number of ras features to register.
> >> + * @ras_features: list of ras features to register.
> >> + *
> >> + * Returns 0 on success, error otherwise.
> >> + * The new edac_ras_feat_ctx would be freed automatically.
> >> + */
> >> +int edac_ras_dev_register(struct device *parent, char *name,
> >> + void *private, int num_features,
> >> + const struct edac_ras_feature *ras_features)
> >> +{
> >> + const struct attribute_group **ras_attr_groups;
> >> + struct edac_ras_feat_ctx *ctx;
> >> + int attr_gcnt = 0;
> >> + int ret, feat;
> >> +
> >> + if (!parent || !name || !num_features || !ras_features)
> >> + return -EINVAL;
> >> +
> >> + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
> >> + if (!ctx)
> >> + return -ENOMEM;
> >> +
> >> + ctx->dev.parent = parent;
> >> + ctx->private = private;
> >> +
> >> + /* Double parse so we can make space for attributes */
> >> + for (feat = 0; feat < num_features; feat++) {
> >> + switch (ras_features[feat].feat) {
> >> + case ras_feat_scrub:
> >> + attr_gcnt++;
> >> + break;
> >> + case ras_feat_ecs:
> >> + attr_gcnt += ras_features[feat].ecs_info.num_media_frus;
> >> + break;
> >
> >As already suggested, the enum names shall be in uppercase.
> >Having a lowercase one here looks really weird.
> Agree.
> >
> >> + default:
> >> + ret = -EINVAL;
> >> + goto ctx_free;
> >> + }
> >> + }
> >
> >I would place this logic earlier, before allocating ctx, as, in case of errors, the
> >function can just call "return -EINVAL".
> Ok.
>
> >
> >> +
> >> + ras_attr_groups = devm_kzalloc(parent,
> >> + (attr_gcnt + 1) * sizeof(*ras_attr_groups),
> >> + GFP_KERNEL);
> >
> >Hmm... why are you using the devm variant here, and the non-devm one for ctx?
> >
> >My personal preference is to avoid devm variants, as memory is only freed
> >when the device refcount becomes zero (which, depending on the driver, may
> >never happen in practice, as driver core may keep a refcount, depending on how
> >the device was probed).
> Can use kzalloc, and will need to add a free for ras_attr_groups on error etc.
Ok. While here, please also use the kcalloc/kmalloc_array variants, as
doing num * sizeof(foo) should be avoided.
Btw, checkpatch has some checks meant to catch this, like
ALLOC_WITH_MULTIPLY. Not sure why it didn't trigger here.
Hint: although the number of false positives increases, I always run
checkpatch with --strict, as it detects some additional potential problems.
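To make the concern concrete, here is a small userspace sketch of why the
array variants are preferred: they fail cleanly when n * size would overflow
instead of silently allocating a short buffer. mock_kcalloc() and
alloc_attr_groups() are hypothetical stand-ins built on calloc(), not the
kernel implementations.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical userspace stand-in for kcalloc(): a zeroed array
 * allocation that returns NULL if n * size would overflow.
 */
void *mock_kcalloc(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;

	return calloc(n, size);
}

/*
 * Allocate a NULL-terminated array with room for @attr_gcnt group
 * pointers; the extra zeroed slot is the terminator.
 */
const void **alloc_attr_groups(size_t attr_gcnt)
{
	assert(attr_gcnt < SIZE_MAX);

	return mock_kcalloc(attr_gcnt + 1, sizeof(const void *));
}
```

In the driver this would correspond to passing attr_gcnt + 1 and
sizeof(*ras_attr_groups) as separate arguments rather than multiplying them.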
> >> + if (!ras_attr_groups) {
> >> + ret = -ENOMEM;
> >> + goto ctx_free;
> >> + }
> >> +
> >> + attr_gcnt = 0;
> >> + for (feat = 0; feat < num_features; feat++, ras_features++) {
> >> + if (ras_features->feat == ras_feat_scrub) {
> >
> >I would use a switch here as well, just like the previous feature type check.
> Will do.
> >
> >> + if (!ras_features->scrub_ops)
> >> + continue;
> >> + ret = edac_ras_feat_scrub_init(parent, &ctx->scrub,
> >> + ras_features, &ras_attr_groups[attr_gcnt]);
> >
> >I don't think it is worth having those ancillary functions here...
> >
> >> + if (ret < 0)
> >> + goto ctx_free;
> >> +
> >> + attr_gcnt += ret;
> >> + } else if (ras_features->feat == ras_feat_ecs) {
> >> + if (!ras_features->ecs_ops)
> >> + continue;
> >> + ret = edac_ras_feat_ecs_init(parent, &ctx->ecs,
> >> + ras_features, &ras_attr_groups[attr_gcnt]);
> >
> >and here, as most of the current functions are very simple:
> >
> >both just sets two arguments:
> >
> > edata->ops
> > edata->private
> >
> >and returned values are always a positive counter...
> >
> >> + if (ret < 0)
> >> + goto ctx_free;
> >
> >So, this check for instance, doesn't make sense.
> The calls to the feature-specific init functions are added in the later feature-specific patches
> of this series, and those could return an error.
Ok.
> >
> >> +
> >> + attr_gcnt += ret;
> >> + } else {
> >> + ret = -EINVAL;
> >> + goto ctx_free;
> >> + }
> >> + }
> >> + ras_attr_groups[attr_gcnt] = NULL;
> >> + ctx->dev.bus = edac_get_sysfs_subsys();
> >> + ctx->dev.type = &edac_ras_dev_type;
> >> + ctx->dev.groups = ras_attr_groups;
> >> + dev_set_drvdata(&ctx->dev, ctx);
> >> + ret = dev_set_name(&ctx->dev, name);
> >> + if (ret)
> >> + goto ctx_free;
> >> +
> >> + ret = device_register(&ctx->dev);
> >> + if (ret) {
> >> + put_device(&ctx->dev);
> >> + return ret;
> >> + }
> >> +
> >> + return devm_add_action_or_reset(parent, edac_ras_dev_unreg,
> >> + &ctx->dev);
> >> +
> >> +ctx_free:
> >> + kfree(ctx);
> >> + return ret;
> >> +}
> >> +EXPORT_SYMBOL_GPL(edac_ras_dev_register);
> >> diff --git a/include/linux/edac_ras_feature.h
> >> b/include/linux/edac_ras_feature.h
> >> new file mode 100755
> >> index 000000000000..000e99141023
> >> --- /dev/null
> >> +++ b/include/linux/edac_ras_feature.h
> >> @@ -0,0 +1,66 @@
> >> +/* SPDX-License-Identifier: GPL-2.0 */
> >> +/*
> >> + * EDAC RAS control features.
> >> + *
> >> + * Copyright (c) 2024 HiSilicon Limited.
> >> + */
> >> +
> >> +#ifndef __EDAC_RAS_FEAT_H
> >> +#define __EDAC_RAS_FEAT_H
> >> +
> >> +#include <linux/types.h>
> >> +#include <linux/edac.h>
> >> +
> >> +#define EDAC_RAS_NAME_LEN 128
> >> +
> >> +enum edac_ras_feat {
> >> + ras_feat_scrub,
> >> + ras_feat_ecs,
> >> + ras_feat_max
> >> +};
> >
> >Enum values in uppercase, please.
> Will do.
> >
> >> +
> >> +struct edac_ecs_ex_info {
> >> + u16 num_media_frus;
> >> +};
> >> +
> >> +/*
> >> + * EDAC RAS feature information structure */ struct edac_scrub_data
> >> +{
> >> + const struct edac_scrub_ops *ops;
> >> + void *private;
> >> +};
> >> +
> >> +struct edac_ecs_data {
> >> + const struct edac_ecs_ops *ops;
> >> + void *private;
> >> +};
> >> +
> >> +struct device;
> >> +
> >> +struct edac_ras_feat_ctx {
> >> + struct device dev;
> >> + void *private;
> >> + struct edac_scrub_data scrub;
> >> + struct edac_ecs_data ecs;
> >> +};
> >> +
> >> +struct edac_ras_feature {
> >> + enum edac_ras_feat feat;
> >> + union {
> >> + const struct edac_scrub_ops *scrub_ops;
> >> + const struct edac_ecs_ops *ecs_ops;
> >> + };
> >> + union {
> >> + struct edac_ecs_ex_info ecs_info;
> >> + };
> >
> >I would place the variable structs union at the end. This may help with
> >alignments, if you place the pointers earlier.
> Will do.
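A small standalone sketch of the layout point, using placeholder types rather than the kernel ones. The two struct names are hypothetical; exact sizes depend on the ABI, but on common 64-bit targets the second ordering tends to need less padding:

```c
#include <stddef.h>
#include <stdint.h>

struct scrub_ops;	/* opaque placeholders, used for layout only */
struct ecs_ops;

enum ras_feat { RAS_FEAT_SCRUB, RAS_FEAT_ECS };

struct ecs_ex_info {
	uint16_t num_media_frus;
};

/* Member order as posted: the small union sits between two pointers. */
struct feature_as_posted {
	enum ras_feat feat;
	union {
		const struct scrub_ops *scrub_ops;
		const struct ecs_ops *ecs_ops;
	};
	union {
		struct ecs_ex_info ecs_info;
	};
	union {
		void *scrub_ctx;
		void *ecs_ctx;
	};
};

/* Pointers first, the variable-sized union last. */
struct feature_reordered {
	union {
		const struct scrub_ops *scrub_ops;
		const struct ecs_ops *ecs_ops;
	};
	union {
		void *scrub_ctx;
		void *ecs_ctx;
	};
	enum ras_feat feat;
	union {
		struct ecs_ex_info ecs_info;
	};
};
```

Comparing sizeof() of the two structs on the target of interest (or running pahole on the built object) shows whether the reordering actually saves anything there.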
>
> >
> >> + union {
> >> + void *scrub_ctx;
> >> + void *ecs_ctx;
> >> + };
> >> +};
> >> +
> >> +int edac_ras_dev_register(struct device *parent, char *dev_name,
> >> + void *parent_pvt_data, int num_features,
> >> + const struct edac_ras_feature *ras_features); #endif
> >/*
> >> +__EDAC_RAS_FEAT_H */
> >
> >
> >
> >Thanks,
> >Mauro
> >
>
> Thanks,
> Shiju
Thanks,
Mauro
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v9 02/11] EDAC: Add EDAC scrub control driver
2024-07-17 14:07 ` Shiju Jose
@ 2024-07-18 7:03 ` Mauro Carvalho Chehab
0 siblings, 0 replies; 30+ messages in thread
From: Mauro Carvalho Chehab @ 2024-07-18 7:03 UTC (permalink / raw)
To: Shiju Jose
Cc: linux-edac@vger.kernel.org, linux-cxl@vger.kernel.org,
linux-acpi@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, bp@alien8.de, tony.luck@intel.com,
rafael@kernel.org, lenb@kernel.org, mchehab@kernel.org,
dan.j.williams@intel.com, dave@stgolabs.net, Jonathan Cameron,
dave.jiang@intel.com, alison.schofield@intel.com,
vishal.l.verma@intel.com, ira.weiny@intel.com, david@redhat.com,
Vilas.Sridharan@amd.com, leo.duran@amd.com, Yazen.Ghannam@amd.com,
rientjes@google.com, jiaqiyan@google.com, Jon.Grimm@amd.com,
dave.hansen@linux.intel.com, naoya.horiguchi@nec.com,
james.morse@arm.com, jthoughton@google.com,
somasundaram.a@hpe.com, erdemaktas@google.com, pgonda@google.com,
duenwen@google.com, mike.malvestuto@intel.com, gthelen@google.com,
wschwartz@amperecomputing.com, dferguson@amperecomputing.com,
wbs@os.amperecomputing.com, nifan.cxl@gmail.com, tanxiaofei,
Zengtao (B), Roberto Sassu, kangkang.shen@futurewei.com,
wanghuiqiang, Linuxarm
Em Wed, 17 Jul 2024 14:07:05 +0000
Shiju Jose <shiju.jose@huawei.com> escreveu:
> >-----Original Message-----
> >From: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> >Sent: 17 July 2024 13:57
> >To: Shiju Jose <shiju.jose@huawei.com>
> >Cc: linux-edac@vger.kernel.org; linux-cxl@vger.kernel.org; linux-
> >acpi@vger.kernel.org; linux-mm@kvack.org; linux-kernel@vger.kernel.org;
> >bp@alien8.de; tony.luck@intel.com; rafael@kernel.org; lenb@kernel.org;
> >mchehab@kernel.org; dan.j.williams@intel.com; dave@stgolabs.net; Jonathan
> >Cameron <jonathan.cameron@huawei.com>; dave.jiang@intel.com;
> >alison.schofield@intel.com; vishal.l.verma@intel.com; ira.weiny@intel.com;
> >david@redhat.com; Vilas.Sridharan@amd.com; leo.duran@amd.com;
> >Yazen.Ghannam@amd.com; rientjes@google.com; jiaqiyan@google.com;
> >Jon.Grimm@amd.com; dave.hansen@linux.intel.com;
> >naoya.horiguchi@nec.com; james.morse@arm.com; jthoughton@google.com;
> >somasundaram.a@hpe.com; erdemaktas@google.com; pgonda@google.com;
> >duenwen@google.com; mike.malvestuto@intel.com; gthelen@google.com;
> >wschwartz@amperecomputing.com; dferguson@amperecomputing.com;
> >wbs@os.amperecomputing.com; nifan.cxl@gmail.com; tanxiaofei
> ><tanxiaofei@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>; Roberto
> >Sassu <roberto.sassu@huawei.com>; kangkang.shen@futurewei.com;
> >wanghuiqiang <wanghuiqiang@huawei.com>; Linuxarm
> ><linuxarm@huawei.com>
> >Subject: Re: [RFC PATCH v9 02/11] EDAC: Add EDAC scrub control driver
> >
> >Em Tue, 16 Jul 2024 16:03:26 +0100
> ><shiju.jose@huawei.com> escreveu:
> >
> >> From: Shiju Jose <shiju.jose@huawei.com>
> >>
> >> Add generic EDAC scrub control driver supports configuring the memory
> >> scrubbers in the system. The device with scrub feature, get the scrub
> >> descriptor from the EDAC scrub and registers with the EDAC RAS feature
> >> driver, which adds the sysfs scrub control interface. The scrub
> >> control attributes are available to the userspace in
> >/sys/bus/edac/devices/<dev-name>/scrub/.
> >>
> >> Generic EDAC scrub driver and the common sysfs scrub interface
> >> promotes unambiguous access from the userspace irrespective of the
> >> underlying scrub devices.
> >>
> >> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> >> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> >> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
> >> ---
> >> Documentation/ABI/testing/sysfs-edac-scrub | 64 +++++
> >> drivers/edac/Makefile | 2 +-
> >> drivers/edac/edac_ras_feature.c | 1 +
> >> drivers/edac/edac_scrub.c | 312 +++++++++++++++++++++
> >> include/linux/edac_ras_feature.h | 28 ++
> >> 5 files changed, 406 insertions(+), 1 deletion(-) create mode 100644
> >> Documentation/ABI/testing/sysfs-edac-scrub
> >> create mode 100755 drivers/edac/edac_scrub.c
> >>
> >> diff --git a/Documentation/ABI/testing/sysfs-edac-scrub
> >> b/Documentation/ABI/testing/sysfs-edac-scrub
> >> new file mode 100644
> >> index 000000000000..dd19afd5e165
> >> --- /dev/null
> >> +++ b/Documentation/ABI/testing/sysfs-edac-scrub
> >> @@ -0,0 +1,64 @@
> >> +What: /sys/bus/edac/devices/<dev-name>/scrub
> >> +Date: Oct 2024
> >> +KernelVersion: 6.12
> >> +Contact: linux-edac@vger.kernel.org
> >> +Description:
> >> + The sysfs edac bus devices /<dev-name>/scrub subdirectory
> >> + belongs to the memory scrub control feature, where <dev-
> >name>
> >> + directory corresponds to a device/memory region registered
> >> + with the edac scrub driver and thus registered with the
> >> + generic edac ras driver too.
> >> +
> >> +What: /sys/bus/edac/devices/<dev-
> >name>/scrub/addr_range_base
> >> +Date: Oct 2024
> >> +KernelVersion: 6.12
> >> +Contact: linux-edac@vger.kernel.org
> >> +Description:
> >> + (RW) The base of the address range of the memory region
> >> + to be scrubbed (on-demand scrubbing).
> >> +
> >> +What: /sys/bus/edac/devices/<dev-
> >name>/scrub/addr_range_size
> >> +Date: Oct 2024
> >> +KernelVersion: 6.12
> >> +Contact: linux-edac@vger.kernel.org
> >> +Description:
> >> + (RW) The size of the address range of the memory region
> >> + to be scrubbed (on-demand scrubbing).
> >> +
> >> +What: /sys/bus/edac/devices/<dev-
> >name>/scrub/enable_background
> >> +Date: Oct 2024
> >> +KernelVersion: 6.12
> >> +Contact: linux-edac@vger.kernel.org
> >> +Description:
> >> + (RW) Start/Stop background(patrol) scrubbing if supported.
> >> +
> >> +What: /sys/bus/edac/devices/<dev-
> >name>/scrub/enable_on_demand
> >> +Date: Oct 2024
> >> +KernelVersion: 6.12
> >> +Contact: linux-edac@vger.kernel.org
> >> +Description:
> >> + (RW) Start/Stop on-demand scrubbing the memory region
> >> + if supported.
> >
> >This is a generic comment for all sysfs calls: what happens if not supported?
> >
> >There are a couple of ways to implement it, like:
> >
> >1. Don't create the attribute;
> >2. return an error code (-ENOENT? -EINVAL?) if trying to read or
> > write to the devnode - please detail the used error code(s);
> >
> >In any case, please define the behavior and document it.
> >
> >From what I see, you're setting 0444 on RW nodes when write is not enabled,
> >but it is still possible that not even RO is supported. This is especially true as
> >technology evolves, as memory controllers and different types of memories may
> >have very different ways to control it[1].
>
> That is not the case. If the parent device defines neither the read nor the write callback
> for an attribute, scrub_attr_visible() returns 0 and the attribute is not
> present in sysfs for that device.
> For example, the attributes addr_range_base and addr_range_size are not supported by the CXL patrol
> scrub feature, but are supported by the ACPI RAS2 scrub feature.
> >
> >[1] If you're curious enough, one legacy example of memories
> > implemented in a very different way was Fully Buffered DIMMs,
> > where each DIMM had its own internal chipset to offload
> > certain tasks, including scrubbing and ECC implementation.
> > It did not succeed in the long term, as it required
> > special DIMMs for the server market, reducing the production
> > scale, but it is an interesting example of how hardware
> > designs can be innovative and break existing paradigms.
> > The FB-DIMM design actually forced a redesign of the EDAC
> > subsystem, as it was too centered on how a specific type
> > of memory controller worked.
> >
> >> +
> >> +What: /sys/bus/edac/devices/<dev-name>/scrub/name
> >> +Date: Oct 2024
> >> +KernelVersion: 6.12
> >> +Contact: linux-edac@vger.kernel.org
> >> +Description:
> >> + (RO) name of the memory scrubber
> >> +
> >
> >
> >> +What: /sys/bus/edac/devices/<dev-
> >name>/scrub/cycle_in_hours_available
> >> +Date: Oct 2024
> >> +KernelVersion: 6.12
> >> +Contact: linux-edac@vger.kernel.org
> >> +Description:
> >> + (RO) Supported range for the scrub cycle in hours by the
> >> + memory scrubber.
> >> +
> >> +What: /sys/bus/edac/devices/<dev-
> >name>/scrub/cycle_in_hours
> >> +Date: Oct 2024
> >> +KernelVersion: 6.12
> >> +Contact: linux-edac@vger.kernel.org
> >> +Description:
> >> + (RW) The scrub cycle in hours specified and it must be with in
> >the
> >> + supported range by the memory scrubber.
> >
> >Why specifying it in hours? I would use seconds, as it is easy to represent one
> >hour as 3600 seconds, but you can't specify a cycle of, let's say, 30min, if the
> >minimum range value is one hour.
> For the CXL patrol scrub, the scrub cycle is defined in hours (CXL spec 3.1 Table 8-208. Device Patrol Scrub
> Control Feature Writable Attributes), but ACPI RAS2 does not define the unit for the scrub cycle.
> Thus I proposed representing the scrub cycle in hours as the common unit.
I understand that the final goal of this series is to have CXL exported
via sysfs, but this patch is not binding the scrub to CXL. Instead, it
is placing it in a generic location:
/sys/bus/edac/devices/<dev-name>/scrub
So, it doesn't make sense to bind it to the CXL 3.1 spec.
> Not sure how convenient to set the scrub cycle in seconds from the user perspective and
From the user's perspective, it doesn't make much difference.
See, IMO, we should define this as:
/sys/bus/edac/devices/<dev-name>/scrub/min_cycle_duration
/sys/bus/edac/devices/<dev-name>/scrub/max_cycle_duration
/sys/bus/edac/devices/<dev-name>/scrub/current_cycle_duration
See, whatever logic userspace does, it needs to read the contents of
`min_cycle_duration`, choose a value higher than that, and then check
if the value is not bigger than `max_cycle_duration`.
Such value will then be written at current_cycle_duration.
The logic inside the Kernel will then convert it into some register
data, rounding it to the closest value to fit the actual memory
controller parameters.
A read from `current_cycle_duration` will then return what was
actually programmed there.
So, even if the user programs it to, let's say, 4 hours, the actual
content of `current_cycle_duration` could return a number indicating
that the actual cycle is 4 hours, 20 minutes and 30 seconds.
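The flow described above can be sketched as follows. This is only an illustration under stated assumptions: durations are in seconds, the device is a hypothetical one whose register counts whole hours, and round-to-nearest is picked arbitrarily as the rounding policy. None of the function names exist in the series:

```c
#include <stdint.h>

#define SECS_PER_HOUR 3600u

/* Userspace side: keep a requested duration inside [min, max], in seconds. */
static uint64_t clamp_cycle(uint64_t req, uint64_t min, uint64_t max)
{
	if (req < min)
		return min;
	if (req > max)
		return max;
	return req;
}

/*
 * Kernel side, for a hypothetical device whose register counts whole hours:
 * round the request to the nearest hour, but never program zero.
 */
static uint64_t secs_to_hours_reg(uint64_t secs)
{
	uint64_t hours = (secs + SECS_PER_HOUR / 2) / SECS_PER_HOUR;

	return hours ? hours : 1;
}

/* What a read of current_cycle_duration would report afterwards. */
static uint64_t hours_reg_to_secs(uint64_t hours)
{
	return hours * SECS_PER_HOUR;
}
```

A device with finer granularity would simply use a different conversion pair, and userspace would not need to change: it always writes seconds and reads back the value that was really programmed.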
> also is it require to finish the background scrubbing in such short time?
My main concern here is not about the minimal value, but about the smallest
increment (granularity) that can be specified/returned.
See, if you think about it in a generic way, it should be possible for some device
to support a scrub cycle lasting 2 hours and 30 minutes, for instance.
I'm also concerned that scrubbing and memory refresh times are very dependent on
the memory technologies used to store and retain data in DRAM. From time to
time, we see large shifts in such technologies, affecting memory timings,
including refresh and scrub cycles, by orders of magnitude.
> >I mean, we never know how technology will evolve nor how manufacturers will
> >implement support for scrubbing cycle on their chipsets.
> >
> >> diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile index
> >> c532b57a6d8a..de56cbd039eb 100644
> >> --- a/drivers/edac/Makefile
> >> +++ b/drivers/edac/Makefile
> >> @@ -10,7 +10,7 @@ obj-$(CONFIG_EDAC) := edac_core.o
> >>
> >> edac_core-y := edac_mc.o edac_device.o edac_mc_sysfs.o
> >> edac_core-y += edac_module.o edac_device_sysfs.o wq.o
> >> -edac_core-y += edac_ras_feature.o
> >> +edac_core-y += edac_ras_feature.o edac_scrub.o
> >>
> >> edac_core-$(CONFIG_EDAC_DEBUG) += debugfs.o
> >>
> >> diff --git a/drivers/edac/edac_ras_feature.c
> >> b/drivers/edac/edac_ras_feature.c index 24a729fea66f..48927f868372
> >> 100755
> >> --- a/drivers/edac/edac_ras_feature.c
> >> +++ b/drivers/edac/edac_ras_feature.c
> >> @@ -36,6 +36,7 @@ static int edac_ras_feat_scrub_init(struct device
> >> *parent, {
> >> sdata->ops = sfeat->scrub_ops;
> >> sdata->private = sfeat->scrub_ctx;
> >> + attr_groups[0] = edac_scrub_get_desc();
> >>
> >> return 1;
> >> }
> >> diff --git a/drivers/edac/edac_scrub.c b/drivers/edac/edac_scrub.c new
> >> file mode 100755 index 000000000000..0b07eafd3551
> >> --- /dev/null
> >> +++ b/drivers/edac/edac_scrub.c
> >> @@ -0,0 +1,312 @@
> >> +// SPDX-License-Identifier: GPL-2.0
> >> +/*
> >> + * Generic EDAC scrub driver supports controlling the memory
> >> + * scrubbers in the system and the common sysfs scrub interface
> >> + * promotes unambiguous access from the userspace.
> >> + *
> >> + * Copyright (c) 2024 HiSilicon Limited.
> >> + */
> >> +
> >> +#define pr_fmt(fmt) "EDAC SCRUB: " fmt
> >> +
> >> +#include <linux/edac_ras_feature.h>
> >> +
> >> +static ssize_t addr_range_base_show(struct device *ras_feat_dev,
> >> + struct device_attribute *attr,
> >> + char *buf)
> >> +{
> >> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> >> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
> >> + u64 base, size;
> >> + int ret;
> >> +
> >> + ret = ops->read_range(ras_feat_dev->parent, ctx->scrub.private, &base,
> >&size);
> >> + if (ret)
> >> + return ret;
> >
> >Also a generic comment applied to all devnodes: what if ops->read_range is
> >NULL? Shouldn't it be checked? Btw, you could use read_range == NULL to
> >implement error handling for unsupported features.
> If ops->read_range is NULL, scrub_attr_visible() returns 0 and then the corresponding attributes
> addr_range_base and addr_range_size would not be added to sysfs.
> The same applies to the other attributes.
Ok. Please document that in the patch description and/or in the ABI.
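For the documentation, the rule could be summarized with something like the sketch below. It is a userspace model only: scrub_attr_visible() itself is not quoted in this part of the thread, so the struct, the function name and the write-only case are assumptions:

```c
#include <stdbool.h>

/* Hypothetical model: which callbacks the parent device supplied. */
struct attr_ops {
	bool can_read;
	bool can_write;
};

/* Return 0 to hide the attribute, otherwise its permission bits. */
static unsigned int attr_mode(const struct attr_ops *ops)
{
	if (ops->can_read && ops->can_write)
		return 0644;
	if (ops->can_read)
		return 0444;
	if (ops->can_write)
		return 0200;	/* assumption: write-only nodes are allowed */
	return 0;		/* no callbacks: attribute is not created */
}
```

Stating the outcome for each combination in the ABI file would tell userspace that a missing file means "not supported", so it does not have to probe for error codes.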
> >
> >> +
> >> + return sysfs_emit(buf, "0x%llx\n", base); }
> >> +
> >> +static ssize_t addr_range_size_show(struct device *ras_feat_dev,
> >> + struct device_attribute *attr,
> >> + char *buf)
> >> +{
> >> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> >> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
> >> + u64 base, size;
> >> + int ret;
> >> +
> >> + ret = ops->read_range(ras_feat_dev->parent, ctx->scrub.private, &base,
> >&size);
> >> + if (ret)
> >> + return ret;
> >> +
> >> + return sysfs_emit(buf, "0x%llx\n", size); }
> >> +
> >> +static ssize_t addr_range_base_store(struct device *ras_feat_dev,
> >> + struct device_attribute *attr,
> >> + const char *buf, size_t len) {
> >> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> >> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
> >> + u64 base, size;
> >> + int ret;
> >> +
> >> + ret = ops->read_range(ras_feat_dev->parent, ctx->scrub.private, &base,
> >&size);
> >> + if (ret)
> >> + return ret;
> >> +
> >> + ret = kstrtou64(buf, 16, &base);
> >
> >I would use base 0, letting the parser expect "0x" for hexadecimal values.
> >Same for other *_store methods.
> Will check.
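As a userspace analogue of parsing with base 0, the sketch below uses strtoull(), which follows the same prefix convention ("0x" for hex, a leading "0" for octal, decimal otherwise). The helper name is hypothetical, and strtoull() is more permissive than the kernel helpers in some corner cases (for example, it accepts a leading minus sign), so this only illustrates the radix selection:

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Parse an unsigned value, letting the prefix select the radix.
 * Returns 0 on success, -EINVAL on malformed input, -ERANGE on overflow.
 * One trailing newline is tolerated, as sysfs writes usually carry one.
 */
static int parse_u64_auto(const char *s, unsigned long long *out)
{
	char *end;

	errno = 0;
	*out = strtoull(s, &end, 0);
	if (errno == ERANGE)
		return -ERANGE;
	if (end == s)
		return -EINVAL;		/* no digits were consumed */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;		/* trailing garbage */
	return 0;
}
```

With this, "0x1000" and "4096" written to the same node describe the same value, so the show and store formats no longer have to be kept in sync by hand.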
>
> >
> >> + if (ret < 0)
> >> + return ret;
> >> +
> >> + ret = ops->write_range(ras_feat_dev->parent, ctx->scrub.private, base,
> >size);
> >> + if (ret)
> >> + return ret;
> >> +
> >> + return len;
> >> +}
> >> +
> >> +static ssize_t addr_range_size_store(struct device *ras_feat_dev,
> >> + struct device_attribute *attr,
> >> + const char *buf,
> >> + size_t len)
> >> +{
> >> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> >> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
> >> + u64 base, size;
> >> + int ret;
> >> +
> >> + ret = ops->read_range(ras_feat_dev->parent, ctx->scrub.private, &base,
> >&size);
> >> + if (ret)
> >> + return ret;
> >> +
> >> + ret = kstrtou64(buf, 16, &size);
> >> + if (ret < 0)
> >> + return ret;
> >> +
> >> + ret = ops->write_range(ras_feat_dev->parent, ctx->scrub.private, base,
> >size);
> >> + if (ret)
> >> + return ret;
> >> +
> >> + return len;
> >> +}
> >> +
> >> +static ssize_t enable_background_store(struct device *ras_feat_dev,
> >> + struct device_attribute *attr,
> >> + const char *buf, size_t len) {
> >> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> >> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
> >> + bool enable;
> >> + int ret;
> >> +
> >> + ret = kstrtobool(buf, &enable);
> >> + if (ret < 0)
> >> + return ret;
> >> +
> >> + ret = ops->set_enabled_bg(ras_feat_dev->parent, ctx->scrub.private,
> >enable);
> >> + if (ret)
> >> + return ret;
> >> +
> >> + return len;
> >> +}
> >> +
> >> +static ssize_t enable_background_show(struct device *ras_feat_dev,
> >> + struct device_attribute *attr, char *buf) {
> >> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> >> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
> >> + bool enable;
> >> + int ret;
> >> +
> >> + ret = ops->get_enabled_bg(ras_feat_dev->parent, ctx->scrub.private,
> >&enable);
> >> + if (ret)
> >> + return ret;
> >> +
> >> + return sysfs_emit(buf, "%d\n", enable); }
> >> +
> >> +static ssize_t enable_on_demand_show(struct device *ras_feat_dev,
> >> + struct device_attribute *attr, char *buf) {
> >> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> >> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
> >> + bool enable;
> >> + int ret;
> >> +
> >> + ret = ops->get_enabled_od(ras_feat_dev->parent, ctx->scrub.private,
> >&enable);
> >> + if (ret)
> >> + return ret;
> >> +
> >> + return sysfs_emit(buf, "%d\n", enable); }
> >> +
> >> +static ssize_t enable_on_demand_store(struct device *ras_feat_dev,
> >> + struct device_attribute *attr,
> >> + const char *buf, size_t len) {
> >> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> >> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
> >> + bool enable;
> >> + int ret;
> >> +
> >> + ret = kstrtobool(buf, &enable);
> >> + if (ret < 0)
> >> + return ret;
> >> +
> >> + ret = ops->set_enabled_od(ras_feat_dev->parent, ctx->scrub.private,
> >enable);
> >> + if (ret)
> >> + return ret;
> >> +
> >> + return len;
> >> +}
> >> +
> >> +static ssize_t name_show(struct device *ras_feat_dev,
> >> + struct device_attribute *attr, char *buf) {
> >> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> >> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
> >> + int ret;
> >> +
> >> + ret = ops->get_name(ras_feat_dev->parent, ctx->scrub.private, buf);
> >> + if (ret)
> >> + return ret;
> >> +
> >> + return strlen(buf);
> >> +}
> >> +
> >> +static ssize_t cycle_in_hours_show(struct device *ras_feat_dev, struct
> >device_attribute *attr,
> >> + char *buf)
> >> +{
> >> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> >> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
> >> + u64 val;
> >> + int ret;
> >> +
> >> + ret = ops->cycle_in_hours_read(ras_feat_dev->parent, ctx-
> >>scrub.private, &val);
> >> + if (ret)
> >> + return ret;
> >> +
> >> + return sysfs_emit(buf, "0x%llx\n", val); }
> >> +
> >> +static ssize_t cycle_in_hours_store(struct device *ras_feat_dev, struct
> >device_attribute *attr,
> >> + const char *buf, size_t len)
> >> +{
> >> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> >> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
> >> + long val;
> >> + int ret;
> >> +
> >> + ret = kstrtol(buf, 10, &val);
> >
> >Even here, I would be using base=0, but if you only want to support base 10,
> >please document it at the sysfs ABI.
> Will do.
> >
> >> + if (ret < 0)
> >> + return ret;
> >> +
> >> + ret = ops->cycle_in_hours_write(ras_feat_dev->parent, ctx-
> >>scrub.private, val);
> >> + if (ret)
> >> + return ret;
> >> +
> >> + return len;
> >> +}
> >> +
> >> +static ssize_t cycle_in_hours_range_show(struct device *ras_feat_dev,
> >> + struct device_attribute *attr,
> >> + char *buf)
> >> +{
> >> + struct edac_ras_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> >> + const struct edac_scrub_ops *ops = ctx->scrub.ops;
> >> + u64 min_schrs, max_schrs;
> >> + int ret;
> >> +
> >> + ret = ops->cycle_in_hours_range(ras_feat_dev->parent, ctx-
> >>scrub.private,
> >> + &min_schrs, &max_schrs);
> >> + if (ret)
> >> + return ret;
> >> +
> >> + return sysfs_emit(buf, "0x%llx-0x%llx\n", min_schrs, max_schrs);
> >
> >Hmm... you added the store in decimal, but here you're showing in hexa...
> Will check for store and show decimal.
> >
> >Btw, don't group multiple values on a single sysfs node. Instead, implement two
> >separate devnodes:
> Here we are showing the supported range for the scrub cycle.
> I am wondering any opinion on this from others?
That is how ABIs are implemented.
See for instance hwmon class, where all measurements have ranges, mapped
as min/max pairs:
/sys/class/hwmon/hwmonX/currY_max
/sys/class/hwmon/hwmonX/currY_min
/sys/class/hwmon/hwmonX/currY_rated_max
/sys/class/hwmon/hwmonX/currY_rated_min
/sys/class/hwmon/hwmonX/fanY_max
/sys/class/hwmon/hwmonX/fanY_min
/sys/class/hwmon/hwmonX/humidityY_max
/sys/class/hwmon/hwmonX/humidityY_max_alarm
/sys/class/hwmon/hwmonX/humidityY_max_hyst
/sys/class/hwmon/hwmonX/humidityY_min
/sys/class/hwmon/hwmonX/humidityY_min_alarm
/sys/class/hwmon/hwmonX/humidityY_min_hyst
/sys/class/hwmon/hwmonX/humidityY_rated_max
/sys/class/hwmon/hwmonX/humidityY_rated_min
/sys/class/hwmon/hwmonX/inY_max
/sys/class/hwmon/hwmonX/inY_min
/sys/class/hwmon/hwmonX/inY_rated_max
/sys/class/hwmon/hwmonX/inY_rated_min
...
You can also search for "range": there's none defined in the ABI
documentation.
Tip: you can use:
./scripts/get_abi.pl search
to check such things.
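In code terms, the split would look roughly like the sketch below: two show helpers, each emitting a single decimal value. It is a userspace model with hypothetical names, using snprintf() in place of sysfs_emit():

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical container for the limits reported by the scrubber. */
struct cycle_range {
	unsigned long long min;
	unsigned long long max;
};

/* One value per node: the minimum supported cycle, in decimal. */
static int min_cycle_show(const struct cycle_range *r, char *buf, size_t len)
{
	return snprintf(buf, len, "%llu\n", r->min);
}

/* One value per node: the maximum supported cycle, in decimal. */
static int max_cycle_show(const struct cycle_range *r, char *buf, size_t len)
{
	return snprintf(buf, len, "%llu\n", r->max);
}
```

Userspace can then read each node with a plain integer parse, with no need to split a "min-max" string, and the output uses the same decimal format that the store side accepts.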
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [RFC PATCH v9 05/11] cxl/mbox: Add GET_FEATURE mailbox command
2024-07-17 18:08 ` nifan.cxl
@ 2024-07-18 9:11 ` Shiju Jose
0 siblings, 0 replies; 30+ messages in thread
From: Shiju Jose @ 2024-07-18 9:11 UTC (permalink / raw)
To: nifan.cxl@gmail.com
Cc: linux-edac@vger.kernel.org, linux-cxl@vger.kernel.org,
linux-acpi@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, bp@alien8.de, tony.luck@intel.com,
rafael@kernel.org, lenb@kernel.org, mchehab@kernel.org,
dan.j.williams@intel.com, dave@stgolabs.net, Jonathan Cameron,
dave.jiang@intel.com, alison.schofield@intel.com,
vishal.l.verma@intel.com, ira.weiny@intel.com, david@redhat.com,
Vilas.Sridharan@amd.com, leo.duran@amd.com, Yazen.Ghannam@amd.com,
rientjes@google.com, jiaqiyan@google.com, Jon.Grimm@amd.com,
dave.hansen@linux.intel.com, naoya.horiguchi@nec.com,
james.morse@arm.com, jthoughton@google.com,
somasundaram.a@hpe.com, erdemaktas@google.com, pgonda@google.com,
duenwen@google.com, mike.malvestuto@intel.com, gthelen@google.com,
wschwartz@amperecomputing.com, dferguson@amperecomputing.com,
wbs@os.amperecomputing.com, tanxiaofei, Zengtao (B),
Roberto Sassu, kangkang.shen@futurewei.com, wanghuiqiang,
Linuxarm
>-----Original Message-----
>From: nifan.cxl@gmail.com <nifan.cxl@gmail.com>
>Sent: 17 July 2024 19:08
>To: Shiju Jose <shiju.jose@huawei.com>
>Cc: linux-edac@vger.kernel.org; linux-cxl@vger.kernel.org; linux-
>acpi@vger.kernel.org; linux-mm@kvack.org; linux-kernel@vger.kernel.org;
>bp@alien8.de; tony.luck@intel.com; rafael@kernel.org; lenb@kernel.org;
>mchehab@kernel.org; dan.j.williams@intel.com; dave@stgolabs.net; Jonathan
>Cameron <jonathan.cameron@huawei.com>; dave.jiang@intel.com;
>alison.schofield@intel.com; vishal.l.verma@intel.com; ira.weiny@intel.com;
>david@redhat.com; Vilas.Sridharan@amd.com; leo.duran@amd.com;
>Yazen.Ghannam@amd.com; rientjes@google.com; jiaqiyan@google.com;
>Jon.Grimm@amd.com; dave.hansen@linux.intel.com;
>naoya.horiguchi@nec.com; james.morse@arm.com; jthoughton@google.com;
>somasundaram.a@hpe.com; erdemaktas@google.com; pgonda@google.com;
>duenwen@google.com; mike.malvestuto@intel.com; gthelen@google.com;
>wschwartz@amperecomputing.com; dferguson@amperecomputing.com;
>wbs@os.amperecomputing.com; nifan.cxl@gmail.com; tanxiaofei
><tanxiaofei@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>; Roberto
>Sassu <roberto.sassu@huawei.com>; kangkang.shen@futurewei.com;
>wanghuiqiang <wanghuiqiang@huawei.com>; Linuxarm
><linuxarm@huawei.com>
>Subject: Re: [RFC PATCH v9 05/11] cxl/mbox: Add GET_FEATURE mailbox
>command
>
>On Tue, Jul 16, 2024 at 04:03:29PM +0100, shiju.jose@huawei.com wrote:
>> From: Shiju Jose <shiju.jose@huawei.com>
>>
>> Add support for GET_FEATURE mailbox command.
>>
>> CXL spec 3.1 section 8.2.9.6 describes optional device specific features.
>> The settings of a feature can be retrieved using Get Feature command.
>>
>> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
>> ---
>Minor comments inline.
>
>> drivers/cxl/core/mbox.c | 37 +++++++++++++++++++++++++++++++++++++
>> drivers/cxl/cxlmem.h | 27 +++++++++++++++++++++++++++
>> 2 files changed, 64 insertions(+)
>>
>> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c index
>> 9b9b1d26454e..b1eeed508459 100644
>> --- a/drivers/cxl/core/mbox.c
>> +++ b/drivers/cxl/core/mbox.c
>> @@ -1351,6 +1351,43 @@ int cxl_get_supported_features(struct
>> cxl_memdev_state *mds, }
>> EXPORT_SYMBOL_NS_GPL(cxl_get_supported_features, CXL);
>>
>> +size_t cxl_get_feature(struct cxl_memdev_state *mds,
>> + const uuid_t feat_uuid, void *feat_out,
>> + size_t feat_out_size,
>> + enum cxl_get_feat_selection selection)
>feat_uuid and selection are both payload inputs, maybe more natural to put
>them together before feat_out.
Will do.
>> +{
>> + size_t data_to_rd_size, size_out;
>> + struct cxl_mbox_get_feat_in pi;
>> + struct cxl_mbox_cmd mbox_cmd;
>> + size_t data_rcvd_size = 0;
>> + int rc;
>> +
>> + size_out = min(feat_out_size, mds->payload_size);
>> + pi.uuid = feat_uuid;
>> + pi.selection = selection;
>> + do {
>> + data_to_rd_size = min(feat_out_size - data_rcvd_size, mds-
>>payload_size);
>> + pi.offset = cpu_to_le16(data_rcvd_size);
>> + pi.count = cpu_to_le16(data_to_rd_size);
>> +
>> + mbox_cmd = (struct cxl_mbox_cmd) {
>> + .opcode = CXL_MBOX_OP_GET_FEATURE,
>> + .size_in = sizeof(pi),
>> + .payload_in = &pi,
>> + .size_out = size_out,
>> + .payload_out = feat_out + data_rcvd_size,
>> + .min_out = data_to_rd_size,
>> + };
>> + rc = cxl_internal_send_cmd(mds, &mbox_cmd);
>> + if (rc < 0 || mbox_cmd.size_out == 0)
>Is there other case when size_out will be 0 other than the feat_out_size is 0?
I think size_out can be 0 even when feat_out_size is non-zero, depending on the
firmware implementation or in some error situation.
>
>If feat_out_size is 0, maybe we return directly, or we use while () {}, instead of
>do {} while.
I had a check for feat_out_size against the minimum feat out size in the previous version.
Will add a direct return if feat_out_size is 0.
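A standalone sketch of the loop shape being discussed, with the early return in place. The fake_dev struct and fake_get() are hypothetical stand-ins for the mailbox and cxl_internal_send_cmd(); only the control flow is meant to match:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the device: serves at most max_payload bytes. */
struct fake_dev {
	const unsigned char *data;
	size_t data_len;
	size_t max_payload;
};

/* Copy up to count bytes starting at off; returns the bytes copied. */
static size_t fake_get(const struct fake_dev *d, size_t off, size_t count,
		       unsigned char *out)
{
	size_t n;

	if (off >= d->data_len)
		return 0;
	n = d->data_len - off;
	if (n > count)
		n = count;
	if (n > d->max_payload)
		n = d->max_payload;
	memcpy(out, d->data + off, n);
	return n;
}

/* Returns the number of bytes read, or 0 if the read could not complete. */
static size_t read_feature(const struct fake_dev *d, unsigned char *out,
			   size_t out_size)
{
	size_t rcvd = 0;

	if (!out_size)
		return 0;	/* nothing requested: skip the transfer loop */

	while (rcvd < out_size) {
		size_t want = out_size - rcvd;
		size_t got;

		if (want > d->max_payload)
			want = d->max_payload;
		got = fake_get(d, rcvd, want, out + rcvd);
		if (!got)
			return 0;	/* empty reply: give up, as in the patch */
		rcvd += got;
	}

	return rcvd;
}
```

Keeping the zero-length reply check inside the loop still guards against a device that returns nothing for a non-zero request, which would otherwise spin forever.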
>Anyway, if there is no other case that will return size_out as 0, we can avoid the
>check.
>
>Fan
>> + return 0;
>> + data_rcvd_size += mbox_cmd.size_out;
>> + } while (data_rcvd_size < feat_out_size);
>> +
>> + return data_rcvd_size;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_get_feature, CXL);
>> +
>> int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
>> struct cxl_region *cxlr)
>> {
>> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h index
>> b0e1565b9d2e..25698a6fbe66 100644
>> --- a/drivers/cxl/cxlmem.h
>> +++ b/drivers/cxl/cxlmem.h
>> @@ -531,6 +531,7 @@ enum cxl_opcode {
>> CXL_MBOX_OP_CLEAR_LOG = 0x0403,
>> CXL_MBOX_OP_GET_SUP_LOG_SUBLIST = 0x0405,
>> CXL_MBOX_OP_GET_SUPPORTED_FEATURES = 0x0500,
>> + CXL_MBOX_OP_GET_FEATURE = 0x0501,
>> CXL_MBOX_OP_IDENTIFY = 0x4000,
>> CXL_MBOX_OP_GET_PARTITION_INFO = 0x4100,
>> CXL_MBOX_OP_SET_PARTITION_INFO = 0x4101,
>> @@ -757,6 +758,28 @@ struct cxl_mbox_get_supp_feats_out {
>> struct cxl_mbox_supp_feat_entry feat_entries[]; } __packed;
>>
>> +/*
>> + * Get Feature CXL 3.1 Spec 8.2.9.6.2 */
>> +
>> +/*
>> + * Get Feature input payload
>> + * CXL rev 3.1 section 8.2.9.6.2 Table 8-99 */ enum
>> +cxl_get_feat_selection {
>> + CXL_GET_FEAT_SEL_CURRENT_VALUE,
>> + CXL_GET_FEAT_SEL_DEFAULT_VALUE,
>> + CXL_GET_FEAT_SEL_SAVED_VALUE,
>> + CXL_GET_FEAT_SEL_MAX
>> +};
>> +
>> +struct cxl_mbox_get_feat_in {
>> + uuid_t uuid;
>> + __le16 offset;
>> + __le16 count;
>> + u8 selection;
>> +} __packed;
>> +
>> /* Get Poison List CXL 3.0 Spec 8.2.9.8.4.1 */ struct
>> cxl_mbox_poison_in {
>> __le64 offset;
>> @@ -891,6 +914,10 @@ int cxl_set_timestamp(struct cxl_memdev_state
>> *mds); int cxl_get_supported_features(struct cxl_memdev_state *mds,
>> u32 count, u16 start_index,
>> struct cxl_mbox_get_supp_feats_out *feats_out);
>> +size_t cxl_get_feature(struct cxl_memdev_state *mds,
>> + const uuid_t feat_uuid, void *feat_out,
>> + size_t feat_out_size,
>> + enum cxl_get_feat_selection selection);
>> int cxl_poison_state_init(struct cxl_memdev_state *mds); int
>> cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
>> struct cxl_region *cxlr);
>> --
>> 2.34.1
>>
Thanks,
Shiju
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [RFC PATCH v9 06/11] cxl/mbox: Add SET_FEATURE mailbox command
2024-07-17 20:13 ` nifan.cxl
@ 2024-07-18 9:15 ` Shiju Jose
0 siblings, 0 replies; 30+ messages in thread
From: Shiju Jose @ 2024-07-18 9:15 UTC (permalink / raw)
To: nifan.cxl@gmail.com
Cc: linux-edac@vger.kernel.org, linux-cxl@vger.kernel.org,
linux-acpi@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, bp@alien8.de, tony.luck@intel.com,
rafael@kernel.org, lenb@kernel.org, mchehab@kernel.org,
dan.j.williams@intel.com, dave@stgolabs.net, Jonathan Cameron,
dave.jiang@intel.com, alison.schofield@intel.com,
vishal.l.verma@intel.com, ira.weiny@intel.com, david@redhat.com,
Vilas.Sridharan@amd.com, leo.duran@amd.com, Yazen.Ghannam@amd.com,
rientjes@google.com, jiaqiyan@google.com, Jon.Grimm@amd.com,
dave.hansen@linux.intel.com, naoya.horiguchi@nec.com,
james.morse@arm.com, jthoughton@google.com,
somasundaram.a@hpe.com, erdemaktas@google.com, pgonda@google.com,
duenwen@google.com, mike.malvestuto@intel.com, gthelen@google.com,
wschwartz@amperecomputing.com, dferguson@amperecomputing.com,
wbs@os.amperecomputing.com, tanxiaofei, Zengtao (B),
Roberto Sassu, kangkang.shen@futurewei.com, wanghuiqiang,
Linuxarm
>-----Original Message-----
>From: nifan.cxl@gmail.com <nifan.cxl@gmail.com>
>Sent: 17 July 2024 21:14
>To: Shiju Jose <shiju.jose@huawei.com>
>Cc: linux-edac@vger.kernel.org; linux-cxl@vger.kernel.org; linux-
>acpi@vger.kernel.org; linux-mm@kvack.org; linux-kernel@vger.kernel.org;
>bp@alien8.de; tony.luck@intel.com; rafael@kernel.org; lenb@kernel.org;
>mchehab@kernel.org; dan.j.williams@intel.com; dave@stgolabs.net; Jonathan
>Cameron <jonathan.cameron@huawei.com>; dave.jiang@intel.com;
>alison.schofield@intel.com; vishal.l.verma@intel.com; ira.weiny@intel.com;
>david@redhat.com; Vilas.Sridharan@amd.com; leo.duran@amd.com;
>Yazen.Ghannam@amd.com; rientjes@google.com; jiaqiyan@google.com;
>Jon.Grimm@amd.com; dave.hansen@linux.intel.com;
>naoya.horiguchi@nec.com; james.morse@arm.com; jthoughton@google.com;
>somasundaram.a@hpe.com; erdemaktas@google.com; pgonda@google.com;
>duenwen@google.com; mike.malvestuto@intel.com; gthelen@google.com;
>wschwartz@amperecomputing.com; dferguson@amperecomputing.com;
>wbs@os.amperecomputing.com; nifan.cxl@gmail.com; tanxiaofei
><tanxiaofei@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>; Roberto
>Sassu <roberto.sassu@huawei.com>; kangkang.shen@futurewei.com;
>wanghuiqiang <wanghuiqiang@huawei.com>; Linuxarm
><linuxarm@huawei.com>
>Subject: Re: [RFC PATCH v9 06/11] cxl/mbox: Add SET_FEATURE mailbox
>command
>
>On Tue, Jul 16, 2024 at 04:03:30PM +0100, shiju.jose@huawei.com wrote:
>> From: Shiju Jose <shiju.jose@huawei.com>
>>
>> Add support for SET_FEATURE mailbox command.
>>
>> CXL spec 3.1 section 8.2.9.6 describes optional device specific features.
>> CXL devices support features with changeable attributes.
>> The settings of a feature can optionally be modified using the Set Feature
>> command.
>
>Add a more specific spec reference for the command here: 8.2.9.6.3.
>The same suggestion applies to the get supported features and get feature commands.
Will do.
>
>>
>> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
>> ---
>> drivers/cxl/core/mbox.c | 71 +++++++++++++++++++++++++++++++++++++++++
>> drivers/cxl/cxlmem.h | 33 +++++++++++++++++++
>> 2 files changed, 104 insertions(+)
>>
>> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
>> index b1eeed508459..50ecd2bd7372 100644
>> --- a/drivers/cxl/core/mbox.c
>> +++ b/drivers/cxl/core/mbox.c
>> @@ -1388,6 +1388,77 @@ size_t cxl_get_feature(struct cxl_memdev_state *mds,
>> }
>> EXPORT_SYMBOL_NS_GPL(cxl_get_feature, CXL);
>>
>> +/*
>> + * FEAT_DATA_MIN_PAYLOAD_SIZE - min extra number of bytes should be
>> + * available in the mailbox for storing the actual feature data so that
>> + * the feature data transfer would work as expected.
>> + */
>> +#define FEAT_DATA_MIN_PAYLOAD_SIZE 10
>> +int cxl_set_feature(struct cxl_memdev_state *mds,
>> + const uuid_t feat_uuid, u8 feat_version,
>> + void *feat_data, size_t feat_data_size,
>> + u8 feat_flag)
>> +{
>> + struct cxl_memdev_set_feat_pi {
>> + struct cxl_mbox_set_feat_hdr hdr;
>> + u8 feat_data[];
>> + } __packed;
>> + size_t data_in_size, data_sent_size = 0;
>> + struct cxl_mbox_cmd mbox_cmd;
>> + size_t hdr_size;
>> + int rc = 0;
>> +
>> + struct cxl_memdev_set_feat_pi *pi __free(kfree) =
>> + kmalloc(mds->payload_size, GFP_KERNEL);
>> + pi->hdr.uuid = feat_uuid;
>> + pi->hdr.version = feat_version;
>> + feat_flag &= ~CXL_SET_FEAT_FLAG_DATA_TRANSFER_MASK;
>
>Although we may not support it yet, should we set bit[3] (saved across reset)
>since we already defined CXL_SET_FEAT_FLAG_DATA_SAVED_ACROSS_RESET
>and it is not used?
Will do.
>
>Fan
>> + hdr_size = sizeof(pi->hdr);
>> + /*
>> + * Check minimum mbox payload size is available for
>> + * the feature data transfer.
>> + */
>> + if (hdr_size + FEAT_DATA_MIN_PAYLOAD_SIZE > mds->payload_size)
>> + return -ENOMEM;
>> +
>> + if ((hdr_size + feat_data_size) <= mds->payload_size) {
>> + pi->hdr.flags = cpu_to_le32(feat_flag |
>> + CXL_SET_FEAT_FLAG_FULL_DATA_TRANSFER);
>> + data_in_size = feat_data_size;
>> + } else {
>> + pi->hdr.flags = cpu_to_le32(feat_flag |
>> + CXL_SET_FEAT_FLAG_INITIATE_DATA_TRANSFER);
>> + data_in_size = mds->payload_size - hdr_size;
>> + }
>> +
>> + do {
>> + pi->hdr.offset = cpu_to_le16(data_sent_size);
>> + memcpy(pi->feat_data, feat_data + data_sent_size, data_in_size);
>> + mbox_cmd = (struct cxl_mbox_cmd) {
>> + .opcode = CXL_MBOX_OP_SET_FEATURE,
>> + .size_in = hdr_size + data_in_size,
>> + .payload_in = pi,
>> + };
>> + rc = cxl_internal_send_cmd(mds, &mbox_cmd);
>> + if (rc < 0)
>> + return rc;
>> +
>> + data_sent_size += data_in_size;
>> + if (data_sent_size >= feat_data_size)
>> + return 0;
>> +
>> + if ((feat_data_size - data_sent_size) <= (mds->payload_size - hdr_size)) {
>> + data_in_size = feat_data_size - data_sent_size;
>> + pi->hdr.flags = cpu_to_le32(feat_flag |
>> + CXL_SET_FEAT_FLAG_FINISH_DATA_TRANSFER);
>> + } else {
>> + pi->hdr.flags = cpu_to_le32(feat_flag |
>> + CXL_SET_FEAT_FLAG_CONTINUE_DATA_TRANSFER);
>> + }
>> + } while (true);
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_set_feature, CXL);
>> +
>> int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
>> struct cxl_region *cxlr)
>> {
>> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
>> index 25698a6fbe66..c3cb8e2736b5 100644
>> --- a/drivers/cxl/cxlmem.h
>> +++ b/drivers/cxl/cxlmem.h
>> @@ -532,6 +532,7 @@ enum cxl_opcode {
>> CXL_MBOX_OP_GET_SUP_LOG_SUBLIST = 0x0405,
>> CXL_MBOX_OP_GET_SUPPORTED_FEATURES = 0x0500,
>> CXL_MBOX_OP_GET_FEATURE = 0x0501,
>> + CXL_MBOX_OP_SET_FEATURE = 0x0502,
>> CXL_MBOX_OP_IDENTIFY = 0x4000,
>> CXL_MBOX_OP_GET_PARTITION_INFO = 0x4100,
>> CXL_MBOX_OP_SET_PARTITION_INFO = 0x4101,
>> @@ -780,6 +781,34 @@ struct cxl_mbox_get_feat_in {
>> u8 selection;
>> } __packed;
>>
>> +/*
>> + * Set Feature CXL 3.1 Spec 8.2.9.6.3
>> + */
>> +
>> +/*
>> + * Set Feature input payload
>> + * CXL rev 3.1 section 8.2.9.6.3 Table 8-101
>> + */
>> +/* Set Feature : Payload in flags */
>> +#define CXL_SET_FEAT_FLAG_DATA_TRANSFER_MASK GENMASK(2, 0)
>> +enum cxl_set_feat_flag_data_transfer {
>> + CXL_SET_FEAT_FLAG_FULL_DATA_TRANSFER,
>> + CXL_SET_FEAT_FLAG_INITIATE_DATA_TRANSFER,
>> + CXL_SET_FEAT_FLAG_CONTINUE_DATA_TRANSFER,
>> + CXL_SET_FEAT_FLAG_FINISH_DATA_TRANSFER,
>> + CXL_SET_FEAT_FLAG_ABORT_DATA_TRANSFER,
>> + CXL_SET_FEAT_FLAG_DATA_TRANSFER_MAX
>> +};
>> +#define CXL_SET_FEAT_FLAG_DATA_SAVED_ACROSS_RESET BIT(3)
>> +
>> +struct cxl_mbox_set_feat_hdr {
>> + uuid_t uuid;
>> + __le32 flags;
>> + __le16 offset;
>> + u8 version;
>> + u8 rsvd[9];
>> +} __packed;
>> +
>> /* Get Poison List CXL 3.0 Spec 8.2.9.8.4.1 */
>> struct cxl_mbox_poison_in {
>> __le64 offset;
>> @@ -918,6 +947,10 @@ size_t cxl_get_feature(struct cxl_memdev_state *mds,
>> const uuid_t feat_uuid, void *feat_out,
>> size_t feat_out_size,
>> enum cxl_get_feat_selection selection);
>> +int cxl_set_feature(struct cxl_memdev_state *mds,
>> + const uuid_t feat_uuid, u8 feat_version,
>> + void *feat_data, size_t feat_data_size,
>> + u8 feat_flag);
>> int cxl_poison_state_init(struct cxl_memdev_state *mds);
>> int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
>> struct cxl_region *cxlr);
>> --
>> 2.34.1
>>
Thanks,
Shiju
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v9 07/11] cxl/memscrub: Add CXL memory device patrol scrub control feature
2024-07-16 15:03 ` [RFC PATCH v9 07/11] cxl/memscrub: Add CXL memory device patrol scrub control feature shiju.jose
@ 2024-07-18 22:02 ` fan
0 siblings, 0 replies; 30+ messages in thread
From: fan @ 2024-07-18 22:02 UTC (permalink / raw)
To: shiju.jose
Cc: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel, bp,
tony.luck, rafael, lenb, mchehab, dan.j.williams, dave,
jonathan.cameron, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny, david, Vilas.Sridharan, leo.duran, Yazen.Ghannam,
rientjes, jiaqiyan, Jon.Grimm, dave.hansen, naoya.horiguchi,
james.morse, jthoughton, somasundaram.a, erdemaktas, pgonda,
duenwen, mike.malvestuto, gthelen, wschwartz, dferguson, wbs,
nifan.cxl, tanxiaofei, prime.zeng, roberto.sassu, kangkang.shen,
wanghuiqiang, linuxarm
On Tue, Jul 16, 2024 at 04:03:31PM +0100, shiju.jose@huawei.com wrote:
> From: Shiju Jose <shiju.jose@huawei.com>
>
> CXL spec 3.1 section 8.2.9.9.11.1 describes the device patrol scrub control
> feature. The device patrol scrub proactively locates and corrects errors
> on a regular cycle.
>
> Allow specifying the number of hours within which the patrol scrub must be
> completed, subject to minimum and maximum limits reported by the device.
> Also allow disabling scrub, so that error rates can be traded off against
> performance.
>
> Add support for CXL memory device based patrol scrub control.
> Register with the EDAC RAS control feature driver, which gets the scrub attr
> descriptors from the EDAC scrub and exposes sysfs scrub control attributes
> to userspace.
> For example CXL device based scrub control for the CXL mem0 device is exposed
> in /sys/bus/edac/devices/cxl_mem0/scrub/
>
> Also add support for region based CXL memory patrol scrub control.
> CXL memory region may be interleaved across one or more CXL memory devices.
> For example region based scrub control for CXL region1 is exposed in
> /sys/bus/edac/devices/cxl_region1/scrub/
>
> Open Questions:
> Q1: The CXL 3.1 spec defines the patrol scrub control feature at the CXL memory
> device level, with support for setting the scrub cycle and enabling/disabling
> scrub, but not based on HPA range. Thus scrub control for a region is presently
> implemented across all associated CXL memory devices.
> What is the exact use case for CXL region based scrub control?
> How would the HPA range, which Dan asked for with region based scrubbing, be used?
> Is a spec change required for the patrol scrub control feature to support
> setting the HPA range?
>
> Q2: Would both CXL device based and CXL region based scrub control be enabled
> at the same time in a system?
>
> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
> ---
> Documentation/scrub/edac-scrub.rst | 70 +++++
> drivers/cxl/Kconfig | 19 ++
> drivers/cxl/core/Makefile | 1 +
> drivers/cxl/core/memscrub.c | 413 +++++++++++++++++++++++++++++
> drivers/cxl/core/region.c | 6 +
> drivers/cxl/cxlmem.h | 8 +
> drivers/cxl/mem.c | 4 +
> 7 files changed, 521 insertions(+)
> create mode 100644 Documentation/scrub/edac-scrub.rst
> create mode 100644 drivers/cxl/core/memscrub.c
>
> diff --git a/Documentation/scrub/edac-scrub.rst b/Documentation/scrub/edac-scrub.rst
> new file mode 100644
> index 000000000000..cf7d8b130204
> --- /dev/null
> +++ b/Documentation/scrub/edac-scrub.rst
> @@ -0,0 +1,70 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +===================
> +EDAC Scrub control
> +===================
> +
> +Copyright (c) 2024 HiSilicon Limited.
> +
> +:Author: Shiju Jose <shiju.jose@huawei.com>
> +:License: The GNU Free Documentation License, Version 1.2
> + (dual licensed under the GPL v2)
> +:Original Reviewers:
> +
> +- Written for: 6.12
> +- Updated for:
> +
> +Introduction
> +------------
> +The edac scrub driver provides interfaces for controlling the
> +memory scrubbers in the system. The scrub device drivers in the
> +system register with the edac scrub. The driver exposes the
> +scrub controls to the user in the sysfs.
> +
> +The File System
> +---------------
> +
> +The control attributes of the registered scrubbers could be
> +accessed in the /sys/bus/edac/devices/<dev-name>/scrub/
> +
> +sysfs
> +-----
> +
> +Sysfs files are documented in
> +`Documentation/ABI/testing/sysfs-edac-scrub-control`.
> +
> +Example
> +-------
> +
> +The usage takes the form shown in this example::
> +
> +1. CXL memory device patrol scrubber
> +1.1 device based
> +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub/cycle_in_hours_range
> +0x1-0xff
> +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub/cycle_in_hours
> +0xc
> +root@localhost:~# echo 30 > /sys/bus/edac/devices/cxl_mem0/scrub/cycle_in_hours
> +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub/cycle_in_hours
> +0x1e
> +root@localhost:~# echo 1 > /sys/bus/edac/devices/cxl_mem0/scrub/enable_background
> +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub/enable_background
> +1
> +root@localhost:~# echo 0 > /sys/bus/edac/devices/cxl_mem0/scrub/enable_background
> +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub/enable_background
> +0
> +
> +1.2. region based
> +root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub/cycle_in_hours_range
> +0x1-0xff
> +root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub/cycle_in_hours
> +0xc
> +root@localhost:~# echo 30 > /sys/bus/edac/devices/cxl_region0/scrub/cycle_in_hours
> +root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub/cycle_in_hours
> +0x1e
> +root@localhost:~# echo 1 > /sys/bus/edac/devices/cxl_region0/scrub/enable_background
> +root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub/enable_background
> +1
> +root@localhost:~# echo 0 > /sys/bus/edac/devices/cxl_region0/scrub/enable_background
> +root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub/enable_background
> +0
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index 99b5c25be079..7da70685a2db 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -145,4 +145,23 @@ config CXL_REGION_INVALIDATION_TEST
> If unsure, or if this kernel is meant for production environments,
> say N.
>
> +config CXL_SCRUB
> + bool "CXL: Memory scrub feature"
> + depends on CXL_PCI
> + depends on CXL_MEM
> + depends on EDAC
> + help
> + The CXL memory scrub control is an optional feature allows host to
> + control the scrub configurations of CXL Type 3 devices, which
> + supports patrol scrubbing.
s/supports/support/
> +
> + Registers with the scrub subsystem to provide control attributes
> + of CXL memory device scrubber to the user.
> + Provides interface functions to support configuring the CXL memory
> + device patrol scrubber.
> +
> + Say 'y/n' to enable/disable control of memory scrub parameters for
> + CXL.mem devices. See section 8.2.9.9.11.1 of CXL 3.1 specification
> + for detailed description of CXL memory patrol scrub control feature.
> +
> endif
> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> index 9259bcc6773c..e0fc814c3983 100644
> --- a/drivers/cxl/core/Makefile
> +++ b/drivers/cxl/core/Makefile
> @@ -16,3 +16,4 @@ cxl_core-y += pmu.o
> cxl_core-y += cdat.o
> cxl_core-$(CONFIG_TRACING) += trace.o
> cxl_core-$(CONFIG_CXL_REGION) += region.o
> +cxl_core-$(CONFIG_CXL_SCRUB) += memscrub.o
> diff --git a/drivers/cxl/core/memscrub.c b/drivers/cxl/core/memscrub.c
> new file mode 100644
> index 000000000000..430f85b01f6c
> --- /dev/null
> +++ b/drivers/cxl/core/memscrub.c
> @@ -0,0 +1,413 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * CXL memory scrub driver.
> + *
> + * Copyright (c) 2024 HiSilicon Limited.
> + *
> + * - Provides functions to configure patrol scrub feature of the
> + * CXL memory devices.
> + * - Registers with the scrub subsystem driver to expose the sysfs attributes
> + * to the user for configuring the CXL memory patrol scrub feature.
> + */
> +
> +#define pr_fmt(fmt) "CXL_MEM_SCRUB: " fmt
The format is inconsistent with the other definitions in the series;
remove the "_".
> +
> +#include <cxlmem.h>
> +#include <linux/cleanup.h>
> +#include <linux/limits.h>
> +#include <cxl.h>
> +#include <linux/edac_ras_feature.h>
> +
> +#define CXL_DEV_NUM_RAS_FEATURES 2
> +
> +/*ToDo: This reusable function will be moved to a common file */
> +static int cxl_mem_get_supported_feature_entry(struct cxl_memdev *cxlmd, const uuid_t *feat_uuid,
> + struct cxl_mbox_supp_feat_entry *feat_entry_out)
> +{
> + struct cxl_mbox_supp_feat_entry *feat_entry;
> + struct cxl_dev_state *cxlds = cxlmd->cxlds;
> + struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
> + int feat_index, feats_out_size;
> + int nentries, count;
> + int ret;
> +
> + feat_index = 0;
> + feats_out_size = sizeof(struct cxl_mbox_get_supp_feats_out) +
> + sizeof(struct cxl_mbox_supp_feat_entry);
> + struct cxl_mbox_get_supp_feats_out *feats_out __free(kfree) =
> + kmalloc(feats_out_size, GFP_KERNEL);
> + if (!feats_out)
> + return -ENOMEM;
> +
> + while (true) {
> + memset(feats_out, 0, feats_out_size);
> + ret = cxl_get_supported_features(mds, feats_out_size,
> + feat_index, feats_out);
> + if (ret)
> + return ret;
> +
> + nentries = feats_out->nr_entries;
> + if (!nentries)
> + return -EOPNOTSUPP;
> +
> + /* Check CXL memdev supports the feature */
> + feat_entry = feats_out->feat_entries;
> + for (count = 0; count < nentries; count++, feat_entry++) {
> + if (uuid_equal(&feat_entry->uuid, feat_uuid)) {
> + memcpy(feat_entry_out, feat_entry,
> + sizeof(*feat_entry_out));
> + return 0;
> + }
> + }
> + feat_index += nentries;
> + }
> +}
> +
> +#define CXL_SCRUB_NAME_LEN 128
> +
> +/* CXL memory patrol scrub control definitions */
> +#define CXL_MEMDEV_PS_GET_FEAT_VERSION 0x01
> +#define CXL_MEMDEV_PS_SET_FEAT_VERSION 0x01
> +
> +static const uuid_t cxl_patrol_scrub_uuid =
> + UUID_INIT(0x96dad7d6, 0xfde8, 0x482b, 0xa7, 0x33, 0x75, 0x77, 0x4e, \
> + 0x06, 0xdb, 0x8a);
> +
> +/* CXL memory patrol scrub control functions */
> +struct cxl_patrol_scrub_context {
> + u16 get_feat_size;
> + u16 set_feat_size;
> + struct cxl_memdev *cxlmd;
> + struct cxl_region *cxlr;
> +};
> +
> +/**
> + * struct cxl_memdev_ps_params - CXL memory patrol scrub parameter data structure.
> + * @enable: [IN & OUT] enable(1)/disable(0) patrol scrub.
> + * @scrub_cycle_changeable: [OUT] scrub cycle attribute of patrol scrub is changeable.
> + * @scrub_cycle_hrs: [IN] Requested patrol scrub cycle in hours.
> + * [OUT] Current patrol scrub cycle in hours.
> + * @min_scrub_cycle_hrs:[OUT] minimum patrol scrub cycle in hours supported.
> + */
> +struct cxl_memdev_ps_params {
> + bool enable;
> + bool scrub_cycle_changeable;
> + u16 scrub_cycle_hrs;
> + u16 min_scrub_cycle_hrs;
> +};
> +
> +enum cxl_scrub_param {
> + cxl_ps_param_enable,
> + cxl_ps_param_scrub_cycle,
> +};
Use uppercase names for the enumerators.
Fan
> +
> +#define CXL_MEMDEV_PS_SCRUB_CYCLE_CHANGE_CAP_MASK BIT(0)
> +#define CXL_MEMDEV_PS_SCRUB_CYCLE_REALTIME_REPORT_CAP_MASK BIT(1)
> +#define CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK GENMASK(7, 0)
> +#define CXL_MEMDEV_PS_MIN_SCRUB_CYCLE_MASK GENMASK(15, 8)
> +#define CXL_MEMDEV_PS_FLAG_ENABLED_MASK BIT(0)
> +
> +struct cxl_memdev_ps_rd_attrs {
> + u8 scrub_cycle_cap;
> + __le16 scrub_cycle_hrs;
> + u8 scrub_flags;
> +} __packed;
> +
> +struct cxl_memdev_ps_wr_attrs {
> + u8 scrub_cycle_hrs;
> + u8 scrub_flags;
> +} __packed;
> +
> +static int cxl_mem_ps_get_attrs(struct cxl_memdev_state *mds,
> + struct cxl_memdev_ps_params *params)
> +{
> + size_t rd_data_size = sizeof(struct cxl_memdev_ps_rd_attrs);
> + size_t data_size;
> + struct cxl_memdev_ps_rd_attrs *rd_attrs __free(kfree) =
> + kmalloc(rd_data_size, GFP_KERNEL);
> + if (!rd_attrs)
> + return -ENOMEM;
> +
> + data_size = cxl_get_feature(mds, cxl_patrol_scrub_uuid, rd_attrs,
> + rd_data_size, CXL_GET_FEAT_SEL_CURRENT_VALUE);
> + if (!data_size)
> + return -EIO;
> +
> + params->scrub_cycle_changeable = FIELD_GET(CXL_MEMDEV_PS_SCRUB_CYCLE_CHANGE_CAP_MASK,
> + rd_attrs->scrub_cycle_cap);
> + params->enable = FIELD_GET(CXL_MEMDEV_PS_FLAG_ENABLED_MASK,
> + rd_attrs->scrub_flags);
> + params->scrub_cycle_hrs = FIELD_GET(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK,
> + rd_attrs->scrub_cycle_hrs);
> + params->min_scrub_cycle_hrs = FIELD_GET(CXL_MEMDEV_PS_MIN_SCRUB_CYCLE_MASK,
> + rd_attrs->scrub_cycle_hrs);
> +
> + return 0;
> +}
> +
> +static int cxl_ps_get_attrs(struct device *dev, void *drv_data,
> + struct cxl_memdev_ps_params *params)
> +{
> + struct cxl_patrol_scrub_context *cxl_ps_ctx = drv_data;
> + struct cxl_memdev *cxlmd;
> + struct cxl_dev_state *cxlds;
> + struct cxl_memdev_state *mds;
> + u16 min_scrub_cycle = 0;
> + int i, ret;
> +
> + if (cxl_ps_ctx->cxlr) {
> + struct cxl_region *cxlr = cxl_ps_ctx->cxlr;
> + struct cxl_region_params *p = &cxlr->params;
> +
> + for (i = p->interleave_ways - 1; i >= 0; i--) {
> + struct cxl_endpoint_decoder *cxled = p->targets[i];
> +
> + cxlmd = cxled_to_memdev(cxled);
> + cxlds = cxlmd->cxlds;
> + mds = to_cxl_memdev_state(cxlds);
> + ret = cxl_mem_ps_get_attrs(mds, params);
> + if (ret)
> + return ret;
> +
> + if (params->min_scrub_cycle_hrs > min_scrub_cycle)
> + min_scrub_cycle = params->min_scrub_cycle_hrs;
> + }
> + params->min_scrub_cycle_hrs = min_scrub_cycle;
> + return 0;
> + }
> + cxlmd = cxl_ps_ctx->cxlmd;
> + cxlds = cxlmd->cxlds;
> + mds = to_cxl_memdev_state(cxlds);
> +
> + return cxl_mem_ps_get_attrs(mds, params);
> +}
> +
> +static int cxl_mem_ps_set_attrs(struct device *dev, struct cxl_memdev_state *mds,
> + struct cxl_memdev_ps_params *params,
> + enum cxl_scrub_param param_type)
> +{
> + struct cxl_memdev_ps_wr_attrs wr_attrs;
> + struct cxl_memdev_ps_params rd_params;
> + int ret;
> +
> + ret = cxl_mem_ps_get_attrs(mds, &rd_params);
> + if (ret) {
> + dev_err(dev, "Get cxlmemdev patrol scrub params failed ret=%d\n",
> + ret);
> + return ret;
> + }
> +
> + switch (param_type) {
> + case cxl_ps_param_enable:
> + wr_attrs.scrub_flags = FIELD_PREP(CXL_MEMDEV_PS_FLAG_ENABLED_MASK,
> + params->enable);
> + wr_attrs.scrub_cycle_hrs = FIELD_PREP(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK,
> + rd_params.scrub_cycle_hrs);
> + break;
> + case cxl_ps_param_scrub_cycle:
> + if (params->scrub_cycle_hrs < rd_params.min_scrub_cycle_hrs) {
> + dev_err(dev, "Invalid CXL patrol scrub cycle(%d) to set\n",
> + params->scrub_cycle_hrs);
> + dev_err(dev, "Minimum supported CXL patrol scrub cycle in hour %d\n",
> + params->min_scrub_cycle_hrs);
> + return -EINVAL;
> + }
> + wr_attrs.scrub_cycle_hrs = FIELD_PREP(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK,
> + params->scrub_cycle_hrs);
> + wr_attrs.scrub_flags = FIELD_PREP(CXL_MEMDEV_PS_FLAG_ENABLED_MASK,
> + rd_params.enable);
> + break;
> + }
> +
> + ret = cxl_set_feature(mds, cxl_patrol_scrub_uuid, CXL_MEMDEV_PS_SET_FEAT_VERSION,
> + &wr_attrs, sizeof(wr_attrs),
> + CXL_SET_FEAT_FLAG_DATA_SAVED_ACROSS_RESET);
> + if (ret) {
> + dev_err(dev, "CXL patrol scrub set feature failed ret=%d\n", ret);
> + return ret;
> + }
> +
> + return 0;
> +}
> +
> +static int cxl_ps_set_attrs(struct device *dev, void *drv_data,
> + struct cxl_memdev_ps_params *params,
> + enum cxl_scrub_param param_type)
> +{
> + struct cxl_patrol_scrub_context *cxl_ps_ctx = drv_data;
> + struct cxl_memdev *cxlmd;
> + struct cxl_dev_state *cxlds;
> + struct cxl_memdev_state *mds;
> + int ret, i;
> +
> + if (cxl_ps_ctx->cxlr) {
> + struct cxl_region *cxlr = cxl_ps_ctx->cxlr;
> + struct cxl_region_params *p = &cxlr->params;
> +
> + for (i = p->interleave_ways - 1; i >= 0; i--) {
> + struct cxl_endpoint_decoder *cxled = p->targets[i];
> +
> + cxlmd = cxled_to_memdev(cxled);
> + cxlds = cxlmd->cxlds;
> + mds = to_cxl_memdev_state(cxlds);
> + ret = cxl_mem_ps_set_attrs(dev, mds, params, param_type);
> + if (ret)
> + return ret;
> + }
> + } else {
> + cxlmd = cxl_ps_ctx->cxlmd;
> + cxlds = cxlmd->cxlds;
> + mds = to_cxl_memdev_state(cxlds);
> +
> + return cxl_mem_ps_set_attrs(dev, mds, params, param_type);
> + }
> +
> + return 0;
> +}
> +
> +static int cxl_patrol_scrub_get_enabled_bg(struct device *dev, void *drv_data, bool *enabled)
> +{
> + struct cxl_memdev_ps_params params;
> + int ret;
> +
> + ret = cxl_ps_get_attrs(dev, drv_data, &params);
> + if (ret)
> + return ret;
> +
> + *enabled = params.enable;
> +
> + return 0;
> +}
> +
> +static int cxl_patrol_scrub_set_enabled_bg(struct device *dev, void *drv_data, bool enable)
> +{
> + struct cxl_memdev_ps_params params = {
> + .enable = enable,
> + };
> +
> + return cxl_ps_set_attrs(dev, drv_data, &params, cxl_ps_param_enable);
> +}
> +
> +static int cxl_patrol_scrub_get_name(struct device *dev, void *drv_data, char *name)
> +{
> + struct cxl_patrol_scrub_context *cxl_ps_ctx = drv_data;
> + struct cxl_memdev *cxlmd = cxl_ps_ctx->cxlmd;
> +
> + if (cxl_ps_ctx->cxlr) {
> + struct cxl_region *cxlr = cxl_ps_ctx->cxlr;
> +
> + return sysfs_emit(name, "cxl_region%d_patrol_scrub\n", cxlr->id);
> + }
> +
> + return sysfs_emit(name, "cxl_%s_patrol_scrub\n", dev_name(&cxlmd->dev));
> +}
> +
> +static int cxl_patrol_scrub_write_scrub_cycle_hrs(struct device *dev, void *drv_data,
> + u64 scrub_cycle_hrs)
> +{
> + struct cxl_memdev_ps_params params = {
> + .scrub_cycle_hrs = scrub_cycle_hrs,
> + };
> +
> + return cxl_ps_set_attrs(dev, drv_data, &params, cxl_ps_param_scrub_cycle);
> +}
> +
> +static int cxl_patrol_scrub_read_scrub_cycle_hrs(struct device *dev, void *drv_data,
> + u64 *scrub_cycle_hrs)
> +{
> + struct cxl_memdev_ps_params params;
> + int ret;
> +
> + ret = cxl_ps_get_attrs(dev, drv_data, &params);
> + if (ret)
> + return ret;
> +
> + *scrub_cycle_hrs = params.scrub_cycle_hrs;
> +
> + return 0;
> +}
> +
> +static int cxl_patrol_scrub_read_scrub_cycle_hrs_range(struct device *dev, void *drv_data,
> + u64 *min, u64 *max)
> +{
> + struct cxl_memdev_ps_params params;
> + int ret;
> +
> + ret = cxl_ps_get_attrs(dev, drv_data, &params);
> + if (ret)
> + return ret;
> + *min = params.min_scrub_cycle_hrs;
> + *max = U8_MAX; /* Max set by register size */
> +
> + return 0;
> +}
> +
> +static const struct edac_scrub_ops cxl_ps_scrub_ops = {
> + .get_enabled_bg = cxl_patrol_scrub_get_enabled_bg,
> + .set_enabled_bg = cxl_patrol_scrub_set_enabled_bg,
> + .get_name = cxl_patrol_scrub_get_name,
> + .cycle_in_hours_read = cxl_patrol_scrub_read_scrub_cycle_hrs,
> + .cycle_in_hours_write = cxl_patrol_scrub_write_scrub_cycle_hrs,
> + .cycle_in_hours_range = cxl_patrol_scrub_read_scrub_cycle_hrs_range,
> +};
> +
> +int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct cxl_region *cxlr)
> +{
> + struct edac_ras_feature ras_features[CXL_DEV_NUM_RAS_FEATURES];
> + struct cxl_patrol_scrub_context *cxl_ps_ctx;
> + struct cxl_mbox_supp_feat_entry feat_entry;
> + char cxl_dev_name[CXL_SCRUB_NAME_LEN];
> + int rc, i, num_ras_features = 0;
> +
> + if (cxlr) {
> + struct cxl_region_params *p = &cxlr->params;
> +
> + for (i = p->interleave_ways - 1; i >= 0; i--) {
> + struct cxl_endpoint_decoder *cxled = p->targets[i];
> +
> + cxlmd = cxled_to_memdev(cxled);
> + memset(&feat_entry, 0, sizeof(feat_entry));
> + rc = cxl_mem_get_supported_feature_entry(cxlmd, &cxl_patrol_scrub_uuid,
> + &feat_entry);
> + if (rc < 0)
> + return rc;
> + if (!(feat_entry.attr_flags & CXL_FEAT_ENTRY_FLAG_CHANGABLE))
> + return -EOPNOTSUPP;
> + }
> + } else {
> + rc = cxl_mem_get_supported_feature_entry(cxlmd, &cxl_patrol_scrub_uuid,
> + &feat_entry);
> + if (rc < 0)
> + return rc;
> +
> + if (!(feat_entry.attr_flags & CXL_FEAT_ENTRY_FLAG_CHANGABLE))
> + return -EOPNOTSUPP;
> + }
> +
> + cxl_ps_ctx = devm_kzalloc(&cxlmd->dev, sizeof(*cxl_ps_ctx), GFP_KERNEL);
> + if (!cxl_ps_ctx)
> + return -ENOMEM;
> +
> + *cxl_ps_ctx = (struct cxl_patrol_scrub_context) {
> + .get_feat_size = feat_entry.get_size,
> + .set_feat_size = feat_entry.set_size,
> + };
> + if (cxlr) {
> + snprintf(cxl_dev_name, sizeof(cxl_dev_name),
> + "cxl_region%d", cxlr->id);
> + cxl_ps_ctx->cxlr = cxlr;
> + } else {
> + snprintf(cxl_dev_name, sizeof(cxl_dev_name),
> + "%s_%s", "cxl", dev_name(&cxlmd->dev));
> + cxl_ps_ctx->cxlmd = cxlmd;
> + }
> +
> + ras_features[num_ras_features].feat = ras_feat_scrub;
> + ras_features[num_ras_features].scrub_ops = &cxl_ps_scrub_ops;
> + ras_features[num_ras_features].scrub_ctx = cxl_ps_ctx;
> + num_ras_features++;
> +
> + return edac_ras_dev_register(&cxlmd->dev, cxl_dev_name, NULL,
> + num_ras_features, ras_features);
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_mem_ras_features_init, CXL);
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 3c2b6144be23..14db9d301747 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -3304,6 +3304,12 @@ static int cxl_region_probe(struct device *dev)
> p->res->start, p->res->end, cxlr,
> is_system_ram) > 0)
> return 0;
> +
> + rc = cxl_mem_ras_features_init(NULL, cxlr);
> + if (rc)
> + dev_warn(&cxlr->dev, "CXL ras features init for region_id=%d failed\n",
> + cxlr->id);
> +
> return devm_cxl_add_dax_region(cxlr);
> default:
> dev_dbg(&cxlr->dev, "unsupported region mode: %d\n",
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index c3cb8e2736b5..9a0eb41e5997 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -958,6 +958,14 @@ int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);
> int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa);
> int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa);
>
> +/* cxl memory scrub functions */
> +#ifdef CONFIG_CXL_SCRUB
> +int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct cxl_region *cxlr);
> +#else
> +static inline int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct cxl_region *cxlr)
> +{ return 0; }
> +#endif
> +
> #ifdef CONFIG_CXL_SUSPEND
> void cxl_mem_active_inc(void);
> void cxl_mem_active_dec(void);
> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> index 0c79d9ce877c..7c8360e2e09b 100644
> --- a/drivers/cxl/mem.c
> +++ b/drivers/cxl/mem.c
> @@ -117,6 +117,10 @@ static int cxl_mem_probe(struct device *dev)
> if (!cxlds->media_ready)
> return -EBUSY;
>
> + rc = cxl_mem_ras_features_init(cxlmd, NULL);
> + if (rc)
> + dev_warn(&cxlmd->dev, "CXL ras features init failed\n");
> +
> /*
> * Someone is trying to reattach this device after it lost its port
> * connection (an endpoint port previously registered by this memdev was
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC PATCH v9 08/11] cxl/memscrub: Add CXL memory device ECS control feature
2024-07-16 15:03 ` [RFC PATCH v9 08/11] cxl/memscrub: Add CXL memory device ECS " shiju.jose
@ 2024-07-19 18:43 ` fan
2024-07-24 9:10 ` Shiju Jose
0 siblings, 1 reply; 30+ messages in thread
From: fan @ 2024-07-19 18:43 UTC (permalink / raw)
To: shiju.jose
Cc: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel, bp,
tony.luck, rafael, lenb, mchehab, dan.j.williams, dave,
jonathan.cameron, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny, david, Vilas.Sridharan, leo.duran, Yazen.Ghannam,
rientjes, jiaqiyan, Jon.Grimm, dave.hansen, naoya.horiguchi,
james.morse, jthoughton, somasundaram.a, erdemaktas, pgonda,
duenwen, mike.malvestuto, gthelen, wschwartz, dferguson, wbs,
nifan.cxl, tanxiaofei, prime.zeng, roberto.sassu, kangkang.shen,
wanghuiqiang, linuxarm
On Tue, Jul 16, 2024 at 04:03:32PM +0100, shiju.jose@huawei.com wrote:
> From: Shiju Jose <shiju.jose@huawei.com>
>
> CXL spec 3.1 section 8.2.9.9.11.2 describes the DDR5 Error Check
> Scrub (ECS) control feature.
> The Error Check Scrub (ECS) is a feature defined in JEDEC DDR5 SDRAM
> Specification (JESD79-5) and allows the DRAM to internally read, correct
> single-bit errors, and write back corrected data bits to the DRAM array
> while providing transparency to error counts.
>
> The ECS control allows the requester to change the log entry type and the ECS
> threshold count (provided that the request is within the definition
> specified in the DDR5 mode registers), to switch between codeword mode and
> row count mode, and to reset the ECS counter.
>
> Register with the EDAC RAS control feature driver, which gets the ECS attr
> descriptors from the EDAC ECS and exposes sysfs ECS control attributes
> to userspace.
> For example ECS control for the memory media FRU 0 in CXL mem0 device is
> in /sys/bus/edac/devices/cxl_mem0/ecs_fru0/
>
> Note: The documentation can be added if necessary.
>
> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
> ---
Some lines are too long; other comments are inline.
> drivers/cxl/core/memscrub.c | 429 ++++++++++++++++++++++++++++++++++++
> 1 file changed, 429 insertions(+)
>
> diff --git a/drivers/cxl/core/memscrub.c b/drivers/cxl/core/memscrub.c
> index 430f85b01f6c..9be230ea989a 100644
> --- a/drivers/cxl/core/memscrub.c
> +++ b/drivers/cxl/core/memscrub.c
> @@ -351,13 +351,411 @@ static const struct edac_scrub_ops cxl_ps_scrub_ops = {
> .cycle_in_hours_range = cxl_patrol_scrub_read_scrub_cycle_hrs_range,
> };
>
> +/* CXL DDR5 ECS control definitions */
> +#define CXL_MEMDEV_ECS_GET_FEAT_VERSION 0x01
> +#define CXL_MEMDEV_ECS_SET_FEAT_VERSION 0x01
> +
> +static const uuid_t cxl_ecs_uuid =
> + UUID_INIT(0xe5b13f22, 0x2328, 0x4a14, 0xb8, 0xba, 0xb9, 0x69, 0x1e, \
> + 0x89, 0x33, 0x86);
> +
> +struct cxl_ecs_context {
> + u16 num_media_frus;
> + u16 get_feat_size;
> + u16 set_feat_size;
> + struct cxl_memdev *cxlmd;
> +};
> +
> +/**
> + * struct cxl_ecs_params - CXL memory DDR5 ECS parameter data structure.
> + * @log_entry_type: ECS log entry type, per DRAM or per memory media FRU.
> + * @threshold: ECS threshold count per GB of memory cells.
> + * @mode: codeword/row count mode
> + * 0 : ECS counts rows with errors
> + * 1 : ECS counts codeword with errors
> + * @reset_counter: [IN] reset ECC counter to default value.
> + */
> +struct cxl_ecs_params {
> + u8 log_entry_type;
> + u16 threshold;
> + u8 mode;
An enum is defined below, why not directly use enum type here?
> + bool reset_counter;
> +};
> +
> +enum {
> + CXL_ECS_PARAM_LOG_ENTRY_TYPE,
> + CXL_ECS_PARAM_THRESHOLD,
> + CXL_ECS_PARAM_MODE,
> + CXL_ECS_PARAM_RESET_COUNTER,
> +};
> +
> +#define CXL_ECS_LOG_ENTRY_TYPE_MASK GENMASK(1, 0)
> +#define CXL_ECS_REALTIME_REPORT_CAP_MASK BIT(0)
> +#define CXL_ECS_THRESHOLD_COUNT_MASK GENMASK(2, 0)
> +#define CXL_ECS_MODE_MASK BIT(3)
> +#define CXL_ECS_RESET_COUNTER_MASK BIT(4)
> +
> +static const u16 ecs_supp_threshold[] = { 0, 0, 0, 256, 1024, 4096 };
> +
> +enum {
> + ECS_LOG_ENTRY_TYPE_DRAM = 0x0,
> + ECS_LOG_ENTRY_TYPE_MEM_MEDIA_FRU = 0x1,
> +};
> +
> +enum {
> + ECS_THRESHOLD_256 = 3,
> + ECS_THRESHOLD_1024 = 4,
> + ECS_THRESHOLD_4096 = 5,
> +};
> +
> +enum {
> + ECS_MODE_COUNTS_ROWS = 0,
> + ECS_MODE_COUNTS_CODEWORDS = 1,
> +};
> +
> +struct cxl_ecs_rd_attrs {
> + u8 ecs_log_cap;
> + u8 ecs_cap;
> + __le16 ecs_config;
> + u8 ecs_flags;
> +} __packed;
> +
> +struct cxl_ecs_wr_attrs {
> + u8 ecs_log_cap;
> + __le16 ecs_config;
> +} __packed;
> +
> +/* CXL DDR5 ECS control functions */
> +static int cxl_mem_ecs_get_attrs(struct device *dev, void *drv_data, int fru_id,
> + struct cxl_ecs_params *params)
> +{
> + struct cxl_ecs_context *cxl_ecs_ctx = drv_data;
> + struct cxl_memdev *cxlmd = cxl_ecs_ctx->cxlmd;
> + struct cxl_dev_state *cxlds = cxlmd->cxlds;
> + struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
> + size_t rd_data_size;
> + u8 threshold_index;
> + size_t data_size;
> +
> + rd_data_size = cxl_ecs_ctx->get_feat_size;
> +
> + struct cxl_ecs_rd_attrs *rd_attrs __free(kfree) =
> + kmalloc(rd_data_size, GFP_KERNEL);
> + if (!rd_attrs)
> + return -ENOMEM;
> +
> + params->log_entry_type = 0;
> + params->threshold = 0;
> + params->mode = 0;
> + data_size = cxl_get_feature(mds, cxl_ecs_uuid, rd_attrs,
> + rd_data_size, CXL_GET_FEAT_SEL_CURRENT_VALUE);
> + if (!data_size)
> + return -EIO;
> +
> + params->log_entry_type = FIELD_GET(CXL_ECS_LOG_ENTRY_TYPE_MASK,
> + rd_attrs[fru_id].ecs_log_cap);
> + threshold_index = FIELD_GET(CXL_ECS_THRESHOLD_COUNT_MASK,
> + rd_attrs[fru_id].ecs_config);
> + params->threshold = ecs_supp_threshold[threshold_index];
> + params->mode = FIELD_GET(CXL_ECS_MODE_MASK,
> + rd_attrs[fru_id].ecs_config);
> + return 0;
> +}
> +
> +static int cxl_mem_ecs_set_attrs(struct device *dev, void *drv_data, int fru_id,
> + struct cxl_ecs_params *params, u8 param_type)
> +{
> + struct cxl_ecs_context *cxl_ecs_ctx = drv_data;
> + struct cxl_memdev *cxlmd = cxl_ecs_ctx->cxlmd;
> + struct cxl_dev_state *cxlds = cxlmd->cxlds;
> + struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
> + size_t rd_data_size, wr_data_size;
> + u16 num_media_frus, count;
> + size_t data_size;
> + int ret;
> +
> + num_media_frus = cxl_ecs_ctx->num_media_frus;
> + rd_data_size = cxl_ecs_ctx->get_feat_size;
> + wr_data_size = cxl_ecs_ctx->set_feat_size;
> + struct cxl_ecs_rd_attrs *rd_attrs __free(kfree) =
> + kmalloc(rd_data_size, GFP_KERNEL);
> + if (!rd_attrs)
> + return -ENOMEM;
> +
> + data_size = cxl_get_feature(mds, cxl_ecs_uuid, rd_attrs,
> + rd_data_size, CXL_GET_FEAT_SEL_CURRENT_VALUE);
> + if (!data_size)
> + return -EIO;
> + struct cxl_ecs_wr_attrs *wr_attrs __free(kfree) =
> + kmalloc(wr_data_size, GFP_KERNEL);
> + if (!wr_attrs)
> + return -ENOMEM;
> +
> + /* Fill writable attributes from the current attributes read for all the media FRUs */
> + for (count = 0; count < num_media_frus; count++) {
> + wr_attrs[count].ecs_log_cap = rd_attrs[count].ecs_log_cap;
> + wr_attrs[count].ecs_config = rd_attrs[count].ecs_config;
> + }
> +
> + /* Fill attribute to be set for the media FRU */
> + switch (param_type) {
> + case CXL_ECS_PARAM_LOG_ENTRY_TYPE:
> + if (params->log_entry_type != ECS_LOG_ENTRY_TYPE_DRAM &&
> + params->log_entry_type != ECS_LOG_ENTRY_TYPE_MEM_MEDIA_FRU) {
> + dev_err(dev,
> + "Invalid CXL ECS scrub log entry type(%d) to set\n",
> + params->log_entry_type);
> + dev_err(dev,
> + "Log Entry Type 0: per DRAM 1: per Memory Media FRU\n");
> + return -EINVAL;
> + }
> + wr_attrs[fru_id].ecs_log_cap = FIELD_PREP(CXL_ECS_LOG_ENTRY_TYPE_MASK,
> + params->log_entry_type);
> + break;
> + case CXL_ECS_PARAM_THRESHOLD:
> + wr_attrs[fru_id].ecs_config &= ~CXL_ECS_THRESHOLD_COUNT_MASK;
> + switch (params->threshold) {
> + case 256:
> + wr_attrs[fru_id].ecs_config |= FIELD_PREP(
> + CXL_ECS_THRESHOLD_COUNT_MASK,
> + ECS_THRESHOLD_256);
> + break;
> + case 1024:
> + wr_attrs[fru_id].ecs_config |= FIELD_PREP(
> + CXL_ECS_THRESHOLD_COUNT_MASK,
> + ECS_THRESHOLD_1024);
> + break;
> + case 4096:
> + wr_attrs[fru_id].ecs_config |= FIELD_PREP(
> + CXL_ECS_THRESHOLD_COUNT_MASK,
> + ECS_THRESHOLD_4096);
> + break;
> + default:
> + dev_err(dev,
> + "Invalid CXL ECS scrub threshold count(%d) to set\n",
> + params->threshold);
> + dev_err(dev,
> + "Supported scrub threshold count: 256,1024,4096\n");
> + return -EINVAL;
> + }
> + break;
> + case CXL_ECS_PARAM_MODE:
> + if (params->mode != ECS_MODE_COUNTS_ROWS &&
> + params->mode != ECS_MODE_COUNTS_CODEWORDS) {
> + dev_err(dev,
> + "Invalid CXL ECS scrub mode(%d) to set\n",
> + params->mode);
> + dev_err(dev,
> + "Mode 0: ECS counts rows with errors"
> + " 1: ECS counts codewords with errors\n");
> + return -EINVAL;
> + }
> + wr_attrs[fru_id].ecs_config &= ~CXL_ECS_MODE_MASK;
> + wr_attrs[fru_id].ecs_config |= FIELD_PREP(CXL_ECS_MODE_MASK,
> + params->mode);
> + break;
> + case CXL_ECS_PARAM_RESET_COUNTER:
> + wr_attrs[fru_id].ecs_config &= ~CXL_ECS_RESET_COUNTER_MASK;
> + wr_attrs[fru_id].ecs_config |= FIELD_PREP(CXL_ECS_RESET_COUNTER_MASK,
> + params->reset_counter);
> + break;
> + default:
> + dev_err(dev, "Invalid CXL ECS parameter to set\n");
> + return -EINVAL;
> + }
> +
> + ret = cxl_set_feature(mds, cxl_ecs_uuid, CXL_MEMDEV_ECS_SET_FEAT_VERSION,
> + wr_attrs, wr_data_size,
> + CXL_SET_FEAT_FLAG_DATA_SAVED_ACROSS_RESET);
> + if (ret) {
> + dev_err(dev, "CXL ECS set feature failed ret=%d\n", ret);
> + return ret;
> + }
> +
> + return 0;
> +}
> +
> +static int cxl_ecs_get_log_entry_type(struct device *dev, void *drv_data, int fru_id, u64 *val)
> +{
> + struct cxl_ecs_params params;
> + int ret;
> +
> + ret = cxl_mem_ecs_get_attrs(dev, drv_data, fru_id, &params);
> + if (ret)
> + return ret;
> +
> + *val = params.log_entry_type;
> +
> + return 0;
> +}
> +
> +static int cxl_ecs_set_log_entry_type(struct device *dev, void *drv_data, int fru_id, u64 val)
> +{
> + struct cxl_ecs_params params = {
> + .log_entry_type = val,
> + };
> +
> + return cxl_mem_ecs_set_attrs(dev, drv_data, fru_id, &params, CXL_ECS_PARAM_LOG_ENTRY_TYPE);
> +}
> +
> +static int cxl_ecs_get_log_entry_type_per_dram(struct device *dev, void *drv_data,
> + int fru_id, u64 *val)
I may have missed something. We have cxl_ecs_get_log_entry_type, and what is
cxl_ecs_get_log_entry_type_per_memory_media and cxl_ecs_get_log_entry_type_per_dram for?
> +{
> + struct cxl_ecs_params params;
> + int ret;
> +
> + ret = cxl_mem_ecs_get_attrs(dev, drv_data, fru_id, &params);
> + if (ret)
> + return ret;
> +
> + if (params.log_entry_type == ECS_LOG_ENTRY_TYPE_DRAM)
> + *val = 1;
> + else
> + *val = 0;
> +
> + return 0;
> +}
> +
> +static int cxl_ecs_get_log_entry_type_per_memory_media(struct device *dev, void *drv_data,
> + int fru_id, u64 *val)
> +{
> + struct cxl_ecs_params params;
> + int ret;
> +
> + ret = cxl_mem_ecs_get_attrs(dev, drv_data, fru_id, &params);
> + if (ret)
> + return ret;
> +
> + if (params.log_entry_type == ECS_LOG_ENTRY_TYPE_MEM_MEDIA_FRU)
> + *val = 1;
> + else
> + *val = 0;
> +
> + return 0;
> +}
> +
> +static int cxl_ecs_get_mode(struct device *dev, void *drv_data, int fru_id, u64 *val)
> +{
> + struct cxl_ecs_params params;
> + int ret;
> +
> + ret = cxl_mem_ecs_get_attrs(dev, drv_data, fru_id, &params);
> + if (ret)
> + return ret;
> +
> + *val = params.mode;
> +
> + return 0;
> +}
> +
> +static int cxl_ecs_set_mode(struct device *dev, void *drv_data, int fru_id, u64 val)
> +{
> + struct cxl_ecs_params params = {
> + .mode = val,
> + };
> +
> + return cxl_mem_ecs_set_attrs(dev, drv_data, fru_id, &params, CXL_ECS_PARAM_MODE);
> +}
> +
> +static int cxl_ecs_get_mode_counts_rows(struct device *dev, void *drv_data, int fru_id, u64 *val)
As above, what is cxl_ecs_get_mode_counts_codewords and
cxl_ecs_get_mode_counts_rows for?
Fan
> +{
> + struct cxl_ecs_params params;
> + int ret;
> +
> + ret = cxl_mem_ecs_get_attrs(dev, drv_data, fru_id, &params);
> + if (ret)
> + return ret;
> +
> + if (params.mode == ECS_MODE_COUNTS_ROWS)
> + *val = 1;
> + else
> + *val = 0;
> +
> + return 0;
> +}
> +
> +static int cxl_ecs_get_mode_counts_codewords(struct device *dev, void *drv_data,
> + int fru_id, u64 *val)
> +{
> + struct cxl_ecs_params params;
> + int ret;
> +
> + ret = cxl_mem_ecs_get_attrs(dev, drv_data, fru_id, &params);
> + if (ret)
> + return ret;
> +
> + if (params.mode == ECS_MODE_COUNTS_CODEWORDS)
> + *val = 1;
> + else
> + *val = 0;
> +
> + return 0;
> +}
> +
> +static int cxl_ecs_reset(struct device *dev, void *drv_data, int fru_id, u64 val)
> +{
> + struct cxl_ecs_params params = {
> + .reset_counter = val,
> + };
> +
> + return cxl_mem_ecs_set_attrs(dev, drv_data, fru_id, &params, CXL_ECS_PARAM_RESET_COUNTER);
> +}
> +
> +static int cxl_ecs_get_threshold(struct device *dev, void *drv_data, int fru_id, u64 *val)
> +{
> + struct cxl_ecs_params params;
> + int ret;
> +
> + ret = cxl_mem_ecs_get_attrs(dev, drv_data, fru_id, &params);
> + if (ret)
> + return ret;
> +
> + *val = params.threshold;
> +
> + return 0;
> +}
> +
> +static int cxl_ecs_set_threshold(struct device *dev, void *drv_data, int fru_id, u64 val)
> +{
> + struct cxl_ecs_params params = {
> + .threshold = val,
> + };
> +
> + return cxl_mem_ecs_set_attrs(dev, drv_data, fru_id, &params, CXL_ECS_PARAM_THRESHOLD);
> +}
> +
> +static int cxl_ecs_get_name(struct device *dev, void *drv_data, int fru_id, char *name)
> +{
> + struct cxl_ecs_context *cxl_ecs_ctx = drv_data;
> + struct cxl_memdev *cxlmd = cxl_ecs_ctx->cxlmd;
> +
> + return sysfs_emit(name, "cxl_%s_ecs_fru%d\n", dev_name(&cxlmd->dev), fru_id);
> +}
> +
> +static const struct edac_ecs_ops cxl_ecs_ops = {
> + .get_log_entry_type = cxl_ecs_get_log_entry_type,
> + .set_log_entry_type = cxl_ecs_set_log_entry_type,
> + .get_log_entry_type_per_dram = cxl_ecs_get_log_entry_type_per_dram,
> + .get_log_entry_type_per_memory_media = cxl_ecs_get_log_entry_type_per_memory_media,
> + .get_mode = cxl_ecs_get_mode,
> + .set_mode = cxl_ecs_set_mode,
> + .get_mode_counts_codewords = cxl_ecs_get_mode_counts_codewords,
> + .get_mode_counts_rows = cxl_ecs_get_mode_counts_rows,
> + .reset = cxl_ecs_reset,
> + .get_threshold = cxl_ecs_get_threshold,
> + .set_threshold = cxl_ecs_set_threshold,
> + .get_name = cxl_ecs_get_name,
> +};
> +
> int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct cxl_region *cxlr)
> {
> struct edac_ras_feature ras_features[CXL_DEV_NUM_RAS_FEATURES];
> struct cxl_patrol_scrub_context *cxl_ps_ctx;
> struct cxl_mbox_supp_feat_entry feat_entry;
> char cxl_dev_name[CXL_SCRUB_NAME_LEN];
> + struct cxl_ecs_context *cxl_ecs_ctx;
> int rc, i, num_ras_features = 0;
> + int num_media_frus;
>
> if (cxlr) {
> struct cxl_region_params *p = &cxlr->params;
> @@ -407,6 +805,37 @@ int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct cxl_region *cxlr)
> ras_features[num_ras_features].scrub_ctx = cxl_ps_ctx;
> num_ras_features++;
>
> + if (!cxlr) {
> + rc = cxl_mem_get_supported_feature_entry(cxlmd, &cxl_ecs_uuid, &feat_entry);
> + if (rc < 0)
> + goto feat_register;
> +
> + if (!(feat_entry.attr_flags & CXL_FEAT_ENTRY_FLAG_CHANGABLE))
> + goto feat_register;
> + num_media_frus = feat_entry.get_size/
> + sizeof(struct cxl_ecs_rd_attrs);
> + if (!num_media_frus)
> + goto feat_register;
> +
> + cxl_ecs_ctx = devm_kzalloc(&cxlmd->dev, sizeof(*cxl_ecs_ctx), GFP_KERNEL);
> + if (!cxl_ecs_ctx)
> + goto feat_register;
> + *cxl_ecs_ctx = (struct cxl_ecs_context) {
> + .get_feat_size = feat_entry.get_size,
> + .set_feat_size = feat_entry.set_size,
> + .num_media_frus = num_media_frus,
> + .cxlmd = cxlmd,
> + };
> +
> + ras_features[num_ras_features].feat = ras_feat_ecs;
> + ras_features[num_ras_features].ecs_ops = &cxl_ecs_ops;
> + ras_features[num_ras_features].ecs_ctx = cxl_ecs_ctx;
> + ras_features[num_ras_features].ecs_info.num_media_frus = num_media_frus;
> + num_ras_features++;
> + }
> +
> +feat_register:
> +
> return edac_ras_dev_register(&cxlmd->dev, cxl_dev_name, NULL,
> num_ras_features, ras_features);
> }
> --
> 2.34.1
>
* RE: [RFC PATCH v9 08/11] cxl/memscrub: Add CXL memory device ECS control feature
2024-07-19 18:43 ` fan
@ 2024-07-24 9:10 ` Shiju Jose
0 siblings, 0 replies; 30+ messages in thread
From: Shiju Jose @ 2024-07-24 9:10 UTC (permalink / raw)
To: fan
Cc: linux-edac@vger.kernel.org, linux-cxl@vger.kernel.org,
linux-acpi@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, bp@alien8.de, tony.luck@intel.com,
rafael@kernel.org, lenb@kernel.org, mchehab@kernel.org,
dan.j.williams@intel.com, dave@stgolabs.net, Jonathan Cameron,
dave.jiang@intel.com, alison.schofield@intel.com,
vishal.l.verma@intel.com, ira.weiny@intel.com, david@redhat.com,
Vilas.Sridharan@amd.com, leo.duran@amd.com, Yazen.Ghannam@amd.com,
rientjes@google.com, jiaqiyan@google.com, Jon.Grimm@amd.com,
dave.hansen@linux.intel.com, naoya.horiguchi@nec.com,
james.morse@arm.com, jthoughton@google.com,
somasundaram.a@hpe.com, erdemaktas@google.com, pgonda@google.com,
duenwen@google.com, mike.malvestuto@intel.com, gthelen@google.com,
wschwartz@amperecomputing.com, dferguson@amperecomputing.com,
wbs@os.amperecomputing.com, tanxiaofei, Zengtao (B),
Roberto Sassu, kangkang.shen@futurewei.com, wanghuiqiang,
Linuxarm
Hi Fan,
Thanks for the comments.
Sorry for the delay.
>-----Original Message-----
>From: fan <nifan.cxl@gmail.com>
>Sent: 19 July 2024 19:43
>To: Shiju Jose <shiju.jose@huawei.com>
>Cc: linux-edac@vger.kernel.org; linux-cxl@vger.kernel.org; linux-
>acpi@vger.kernel.org; linux-mm@kvack.org; linux-kernel@vger.kernel.org;
>bp@alien8.de; tony.luck@intel.com; rafael@kernel.org; lenb@kernel.org;
>mchehab@kernel.org; dan.j.williams@intel.com; dave@stgolabs.net; Jonathan
>Cameron <jonathan.cameron@huawei.com>; dave.jiang@intel.com;
>alison.schofield@intel.com; vishal.l.verma@intel.com; ira.weiny@intel.com;
>david@redhat.com; Vilas.Sridharan@amd.com; leo.duran@amd.com;
>Yazen.Ghannam@amd.com; rientjes@google.com; jiaqiyan@google.com;
>Jon.Grimm@amd.com; dave.hansen@linux.intel.com;
>naoya.horiguchi@nec.com; james.morse@arm.com; jthoughton@google.com;
>somasundaram.a@hpe.com; erdemaktas@google.com; pgonda@google.com;
>duenwen@google.com; mike.malvestuto@intel.com; gthelen@google.com;
>wschwartz@amperecomputing.com; dferguson@amperecomputing.com;
>wbs@os.amperecomputing.com; nifan.cxl@gmail.com; tanxiaofei
><tanxiaofei@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>; Roberto
>Sassu <roberto.sassu@huawei.com>; kangkang.shen@futurewei.com;
>wanghuiqiang <wanghuiqiang@huawei.com>; Linuxarm
><linuxarm@huawei.com>
>Subject: Re: [RFC PATCH v9 08/11] cxl/memscrub: Add CXL memory device ECS
>control feature
>
>On Tue, Jul 16, 2024 at 04:03:32PM +0100, shiju.jose@huawei.com wrote:
>> From: Shiju Jose <shiju.jose@huawei.com>
>>
>> CXL spec 3.1 section 8.2.9.9.11.2 describes the DDR5 Error Check Scrub
>> (ECS) control feature.
>> The Error Check Scrub (ECS) is a feature defined in JEDEC DDR5 SDRAM
>> Specification (JESD79-5) and allows the DRAM to internally read,
>> correct single-bit errors, and write back corrected data bits to the
>> DRAM array while providing transparency to error counts.
>>
>> The ECS control allows the requester to change the log entry type, the
>> ECS threshold count provided that the request is within the definition
>> specified in DDR5 mode registers, change mode between codeword mode
>> and row count mode, and reset the ECS counter.
>>
>> Register with EDAC RAS control feature driver, which gets the ECS attr
>> descriptors from the EDAC ECS and expose sysfs ECS control attributes
>> to the userspace.
>> For example ECS control for the memory media FRU 0 in CXL mem0 device
>> is in /sys/bus/edac/devices/cxl_mem0/ecs_fru0/
>>
>> Note: The documentation can be added if necessary.
>>
>> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
>> ---
>
>Some lines are too long. And some other comments inline.
Will fix.
>
>> drivers/cxl/core/memscrub.c | 429
>> ++++++++++++++++++++++++++++++++++++
>> 1 file changed, 429 insertions(+)
>>
>> diff --git a/drivers/cxl/core/memscrub.c b/drivers/cxl/core/memscrub.c
>> index 430f85b01f6c..9be230ea989a 100644
>> --- a/drivers/cxl/core/memscrub.c
>> +++ b/drivers/cxl/core/memscrub.c
>> @@ -351,13 +351,411 @@ static const struct edac_scrub_ops
>cxl_ps_scrub_ops = {
>> .cycle_in_hours_range = cxl_patrol_scrub_read_scrub_cycle_hrs_range,
>> };
>>
>> +/* CXL DDR5 ECS control definitions */
>> +#define CXL_MEMDEV_ECS_GET_FEAT_VERSION 0x01
>> +#define CXL_MEMDEV_ECS_SET_FEAT_VERSION 0x01
>> +
>> +static const uuid_t cxl_ecs_uuid =
>> + UUID_INIT(0xe5b13f22, 0x2328, 0x4a14, 0xb8, 0xba, 0xb9, 0x69, 0x1e,
>\
>> + 0x89, 0x33, 0x86);
>> +
>> +struct cxl_ecs_context {
>> + u16 num_media_frus;
>> + u16 get_feat_size;
>> + u16 set_feat_size;
>> + struct cxl_memdev *cxlmd;
>> +};
>> +
>> +/**
>> + * struct cxl_ecs_params - CXL memory DDR5 ECS parameter data structure.
>> + * @log_entry_type: ECS log entry type, per DRAM or per memory media
>FRU.
>> + * @threshold: ECS threshold count per GB of memory cells.
>> + * @mode: codeword/row count mode
>> + * 0 : ECS counts rows with errors
>> + * 1 : ECS counts codeword with errors
>> + * @reset_counter: [IN] reset ECC counter to default value.
>> + */
>> +struct cxl_ecs_params {
>> + u8 log_entry_type;
>> + u16 threshold;
>> + u8 mode;
>
>An enum is defined below, why not directly use enum type here?
Will do.
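For illustration only, the change might look something like the sketch below.
The enum names cxl_ecs_log_entry_type and cxl_ecs_mode are hypothetical (the
enums in the patch are anonymous), the validity helper is not part of the
patch, and uint8_t/uint16_t stand in for the kernel's u8/u16 so that the
snippet builds in userspace.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names: the enums in the patch are anonymous. */
enum cxl_ecs_log_entry_type {
	ECS_LOG_ENTRY_TYPE_DRAM = 0x0,
	ECS_LOG_ENTRY_TYPE_MEM_MEDIA_FRU = 0x1,
};

enum cxl_ecs_mode {
	ECS_MODE_COUNTS_ROWS = 0,
	ECS_MODE_COUNTS_CODEWORDS = 1,
};

/* Userspace stand-in for struct cxl_ecs_params with typed fields. */
struct cxl_ecs_params {
	enum cxl_ecs_log_entry_type log_entry_type;
	uint16_t threshold;
	enum cxl_ecs_mode mode;
	bool reset_counter;
};

/* Hypothetical check a setter could apply before accepting a mode value. */
static bool cxl_ecs_mode_is_valid(uint64_t val)
{
	return val == ECS_MODE_COUNTS_ROWS || val == ECS_MODE_COUNTS_CODEWORDS;
}
```

With typed fields the compiler can flag a switch that misses an enumerator,
while raw values coming from sysfs would still need an explicit range check
like the one above.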
>
>> + bool reset_counter;
>> +};
>> +
>> +enum {
>> + CXL_ECS_PARAM_LOG_ENTRY_TYPE,
>> + CXL_ECS_PARAM_THRESHOLD,
>> + CXL_ECS_PARAM_MODE,
>> + CXL_ECS_PARAM_RESET_COUNTER,
>> +};
>> +
>> +#define CXL_ECS_LOG_ENTRY_TYPE_MASK GENMASK(1, 0)
>> +#define CXL_ECS_REALTIME_REPORT_CAP_MASK BIT(0)
>> +#define CXL_ECS_THRESHOLD_COUNT_MASK GENMASK(2, 0)
>> +#define CXL_ECS_MODE_MASK BIT(3)
>> +#define CXL_ECS_RESET_COUNTER_MASK BIT(4)
>> +
>> +static const u16 ecs_supp_threshold[] = { 0, 0, 0, 256, 1024, 4096 };
>> +
>> +enum {
>> + ECS_LOG_ENTRY_TYPE_DRAM = 0x0,
>> + ECS_LOG_ENTRY_TYPE_MEM_MEDIA_FRU = 0x1, };
>> +
>> +enum {
>> + ECS_THRESHOLD_256 = 3,
>> + ECS_THRESHOLD_1024 = 4,
>> + ECS_THRESHOLD_4096 = 5,
>> +};
>> +
>> +enum {
>> + ECS_MODE_COUNTS_ROWS = 0,
>> + ECS_MODE_COUNTS_CODEWORDS = 1,
>> +};
>> +
>> +struct cxl_ecs_rd_attrs {
>> + u8 ecs_log_cap;
>> + u8 ecs_cap;
>> + __le16 ecs_config;
>> + u8 ecs_flags;
>> +} __packed;
>> +
>> +struct cxl_ecs_wr_attrs {
>> + u8 ecs_log_cap;
>> + __le16 ecs_config;
>> +} __packed;
>> +
>> +/* CXL DDR5 ECS control functions */
>> +static int cxl_mem_ecs_get_attrs(struct device *dev, void *drv_data, int
>fru_id,
>> + struct cxl_ecs_params *params)
>> +{
>> + struct cxl_ecs_context *cxl_ecs_ctx = drv_data;
>> + struct cxl_memdev *cxlmd = cxl_ecs_ctx->cxlmd;
>> + struct cxl_dev_state *cxlds = cxlmd->cxlds;
>> + struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
>> + size_t rd_data_size;
>> + u8 threshold_index;
>> + size_t data_size;
>> +
>> + rd_data_size = cxl_ecs_ctx->get_feat_size;
>> +
>> + struct cxl_ecs_rd_attrs *rd_attrs __free(kfree) =
>> + kmalloc(rd_data_size, GFP_KERNEL);
>> + if (!rd_attrs)
>> + return -ENOMEM;
>> +
>> + params->log_entry_type = 0;
>> + params->threshold = 0;
>> + params->mode = 0;
>> + data_size = cxl_get_feature(mds, cxl_ecs_uuid, rd_attrs,
>> + rd_data_size,
>CXL_GET_FEAT_SEL_CURRENT_VALUE);
>> + if (!data_size)
>> + return -EIO;
>> +
>> + params->log_entry_type =
>FIELD_GET(CXL_ECS_LOG_ENTRY_TYPE_MASK,
>> + rd_attrs[fru_id].ecs_log_cap);
>> + threshold_index = FIELD_GET(CXL_ECS_THRESHOLD_COUNT_MASK,
>> + rd_attrs[fru_id].ecs_config);
>> + params->threshold = ecs_supp_threshold[threshold_index];
>> + params->mode = FIELD_GET(CXL_ECS_MODE_MASK,
>> + rd_attrs[fru_id].ecs_config);
>> + return 0;
>> +}
>> +
>> +static int cxl_mem_ecs_set_attrs(struct device *dev, void *drv_data, int
>fru_id,
>> + struct cxl_ecs_params *params, u8
>param_type) {
>> + struct cxl_ecs_context *cxl_ecs_ctx = drv_data;
>> + struct cxl_memdev *cxlmd = cxl_ecs_ctx->cxlmd;
>> + struct cxl_dev_state *cxlds = cxlmd->cxlds;
>> + struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
>> + size_t rd_data_size, wr_data_size;
>> + u16 num_media_frus, count;
>> + size_t data_size;
>> + int ret;
>> +
>> + num_media_frus = cxl_ecs_ctx->num_media_frus;
>> + rd_data_size = cxl_ecs_ctx->get_feat_size;
>> + wr_data_size = cxl_ecs_ctx->set_feat_size;
>> + struct cxl_ecs_rd_attrs *rd_attrs __free(kfree) =
>> + kmalloc(rd_data_size, GFP_KERNEL);
>> + if (!rd_attrs)
>> + return -ENOMEM;
>> +
>> + data_size = cxl_get_feature(mds, cxl_ecs_uuid, rd_attrs,
>> + rd_data_size,
>CXL_GET_FEAT_SEL_CURRENT_VALUE);
>> + if (!data_size)
>> + return -EIO;
>> + struct cxl_ecs_wr_attrs *wr_attrs __free(kfree) =
>> + kmalloc(wr_data_size, GFP_KERNEL);
>> + if (!wr_attrs)
>> + return -ENOMEM;
>> +
>> + /* Fill writable attributes from the current attributes read for all the
>media FRUs */
>> + for (count = 0; count < num_media_frus; count++) {
>> + wr_attrs[count].ecs_log_cap = rd_attrs[count].ecs_log_cap;
>> + wr_attrs[count].ecs_config = rd_attrs[count].ecs_config;
>> + }
>> +
>> + /* Fill attribute to be set for the media FRU */
>> + switch (param_type) {
>> + case CXL_ECS_PARAM_LOG_ENTRY_TYPE:
>> + if (params->log_entry_type != ECS_LOG_ENTRY_TYPE_DRAM
>&&
>> + params->log_entry_type !=
>ECS_LOG_ENTRY_TYPE_MEM_MEDIA_FRU) {
>> + dev_err(dev,
>> + "Invalid CXL ECS scrub log entry type(%d) to
>set\n",
>> + params->log_entry_type);
>> + dev_err(dev,
>> + "Log Entry Type 0: per DRAM 1: per Memory
>Media FRU\n");
>> + return -EINVAL;
>> + }
>> + wr_attrs[fru_id].ecs_log_cap =
>FIELD_PREP(CXL_ECS_LOG_ENTRY_TYPE_MASK,
>> + params-
>>log_entry_type);
>> + break;
>> + case CXL_ECS_PARAM_THRESHOLD:
>> + wr_attrs[fru_id].ecs_config &=
>~CXL_ECS_THRESHOLD_COUNT_MASK;
>> + switch (params->threshold) {
>> + case 256:
>> + wr_attrs[fru_id].ecs_config |= FIELD_PREP(
>> + CXL_ECS_THRESHOLD_COUNT_MASK,
>> + ECS_THRESHOLD_256);
>> + break;
>> + case 1024:
>> + wr_attrs[fru_id].ecs_config |= FIELD_PREP(
>> +
> CXL_ECS_THRESHOLD_COUNT_MASK,
>> + ECS_THRESHOLD_1024);
>> + break;
>> + case 4096:
>> + wr_attrs[fru_id].ecs_config |= FIELD_PREP(
>> +
> CXL_ECS_THRESHOLD_COUNT_MASK,
>> + ECS_THRESHOLD_4096);
>> + break;
>> + default:
>> + dev_err(dev,
>> + "Invalid CXL ECS scrub threshold count(%d) to
>set\n",
>> + params->threshold);
>> + dev_err(dev,
>> + "Supported scrub threshold count:
>256,1024,4096\n");
>> + return -EINVAL;
>> + }
>> + break;
>> + case CXL_ECS_PARAM_MODE:
>> + if (params->mode != ECS_MODE_COUNTS_ROWS &&
>> + params->mode != ECS_MODE_COUNTS_CODEWORDS) {
>> + dev_err(dev,
>> + "Invalid CXL ECS scrub mode(%d) to set\n",
>> + params->mode);
>> + dev_err(dev,
>> + "Mode 0: ECS counts rows with errors"
>> + " 1: ECS counts codewords with errors\n");
>> + return -EINVAL;
>> + }
>> + wr_attrs[fru_id].ecs_config &= ~CXL_ECS_MODE_MASK;
>> + wr_attrs[fru_id].ecs_config |=
>FIELD_PREP(CXL_ECS_MODE_MASK,
>> + params->mode);
>> + break;
>> + case CXL_ECS_PARAM_RESET_COUNTER:
>> + wr_attrs[fru_id].ecs_config &=
>~CXL_ECS_RESET_COUNTER_MASK;
>> + wr_attrs[fru_id].ecs_config |=
>FIELD_PREP(CXL_ECS_RESET_COUNTER_MASK,
>> + params-
>>reset_counter);
>> + break;
>> + default:
>> + dev_err(dev, "Invalid CXL ECS parameter to set\n");
>> + return -EINVAL;
>> + }
>> +
>> + ret = cxl_set_feature(mds, cxl_ecs_uuid,
>CXL_MEMDEV_ECS_SET_FEAT_VERSION,
>> + wr_attrs, wr_data_size,
>> +
>CXL_SET_FEAT_FLAG_DATA_SAVED_ACROSS_RESET);
>> + if (ret) {
>> + dev_err(dev, "CXL ECS set feature failed ret=%d\n", ret);
>> + return ret;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static int cxl_ecs_get_log_entry_type(struct device *dev, void
>> +*drv_data, int fru_id, u64 *val) {
>> + struct cxl_ecs_params params;
>> + int ret;
>> +
>> + ret = cxl_mem_ecs_get_attrs(dev, drv_data, fru_id, &params);
>> + if (ret)
>> + return ret;
>> +
>> + *val = params.log_entry_type;
>> +
>> + return 0;
>> +}
>> +
>> +static int cxl_ecs_set_log_entry_type(struct device *dev, void
>> +*drv_data, int fru_id, u64 val) {
>> + struct cxl_ecs_params params = {
>> + .log_entry_type = val,
>> + };
>> +
>> + return cxl_mem_ecs_set_attrs(dev, drv_data, fru_id, &params,
>> +CXL_ECS_PARAM_LOG_ENTRY_TYPE); }
>> +
>> +static int cxl_ecs_get_log_entry_type_per_dram(struct device *dev, void
>*drv_data,
>> + int fru_id, u64 *val)
>
>I may have missed something. We have cxl_ecs_get_log_entry_type, and what is
>cxl_ecs_get_log_entry_type_per_memory_media and
>cxl_ecs_get_log_entry_type_per_dram for?
These read-only attributes were added so that the user does not need to check
the spec to interpret the value set, or to find the supported options for the
ECS log entry type.
From the spec:
Common DDR5 ECS Log Capabilities
* Bits[1:0]: Log Entry Type: The log entry type of how the ECS log is
reported. The entry type is defined commonly for all memory media FRUs
within the device.
- 00b = Per DRAM
- 01b = Per Memory Media FRU
- All other encodings are reserved
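As a rough illustration of the decoding a user would otherwise do by hand,
here is a minimal userspace sketch. The helper name ecs_log_entry_type_str and
the label strings are made up for this example; only the bit layout comes from
the spec text above.

```c
#include <stdint.h>

/* Bits[1:0] of the ECS log capabilities byte, GENMASK(1, 0) in the patch. */
#define ECS_LOG_ENTRY_TYPE_MASK 0x3u

/* Hypothetical helper: map the raw byte to a human-readable label. */
static const char *ecs_log_entry_type_str(uint8_t ecs_log_cap)
{
	switch (ecs_log_cap & ECS_LOG_ENTRY_TYPE_MASK) {
	case 0x0:
		return "per DRAM";
	case 0x1:
		return "per memory media FRU";
	default:
		return "reserved";
	}
}
```

The two read-only attributes report the same information as 0/1 flags, so
userspace does not need a table like this.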
>
>> +{
>> + struct cxl_ecs_params params;
>> + int ret;
>> +
>> + ret = cxl_mem_ecs_get_attrs(dev, drv_data, fru_id, &params);
>> + if (ret)
>> + return ret;
>> +
>> + if (params.log_entry_type == ECS_LOG_ENTRY_TYPE_DRAM)
>> + *val = 1;
>> + else
>> + *val = 0;
>> +
>> + return 0;
>> +}
>> +
>> +static int cxl_ecs_get_log_entry_type_per_memory_media(struct device
>*dev, void *drv_data,
>> + int fru_id, u64 *val)
>> +{
>> + struct cxl_ecs_params params;
>> + int ret;
>> +
>> + ret = cxl_mem_ecs_get_attrs(dev, drv_data, fru_id, &params);
>> + if (ret)
>> + return ret;
>> +
>> + if (params.log_entry_type ==
>ECS_LOG_ENTRY_TYPE_MEM_MEDIA_FRU)
>> + *val = 1;
>> + else
>> + *val = 0;
>> +
>> + return 0;
>> +}
>> +
>> +static int cxl_ecs_get_mode(struct device *dev, void *drv_data, int
>> +fru_id, u64 *val) {
>> + struct cxl_ecs_params params;
>> + int ret;
>> +
>> + ret = cxl_mem_ecs_get_attrs(dev, drv_data, fru_id, &params);
>> + if (ret)
>> + return ret;
>> +
>> + *val = params.mode;
>> +
>> + return 0;
>> +}
>> +
>> +static int cxl_ecs_set_mode(struct device *dev, void *drv_data, int
>> +fru_id, u64 val) {
>> + struct cxl_ecs_params params = {
>> + .mode = val,
>> + };
>> +
>> + return cxl_mem_ecs_set_attrs(dev, drv_data, fru_id, &params,
>> +CXL_ECS_PARAM_MODE); }
>> +
>> +static int cxl_ecs_get_mode_counts_rows(struct device *dev, void
>> +*drv_data, int fru_id, u64 *val)
>As above, what is cxl_ecs_get_mode_counts_codewords and
>cxl_ecs_get_mode_counts_rows for?
These read-only attributes were added so that the user does not need to check
the spec to interpret the value set, or to find the supported options for the
mode.
From the spec:
Bit[3]: Codeword/Row Count Mode:
- 0 = ECS counts rows with errors
- 1 = ECS counts codewords with errors
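A minimal userspace sketch of decoding the config word follows, assuming it
has already been converted to host byte order (the patch declares it as
__le16). The masks and the threshold table mirror the quoted patch; the helper
names and the bounds check on the table index are additions in this sketch,
not something the patch does.

```c
#include <stddef.h>
#include <stdint.h>

#define ECS_THRESHOLD_COUNT_MASK 0x7u      /* bits[2:0], GENMASK(2, 0) in the patch */
#define ECS_MODE_MASK            (1u << 3) /* bit[3], BIT(3) in the patch */

/* Same values as ecs_supp_threshold[] in the patch; 0 marks unsupported. */
static const uint16_t ecs_supp_threshold[] = { 0, 0, 0, 256, 1024, 4096 };

/* Returns 1 if ECS counts codewords with errors, 0 if it counts rows. */
static unsigned int ecs_mode_counts_codewords(uint16_t ecs_config)
{
	return (ecs_config & ECS_MODE_MASK) ? 1 : 0;
}

/*
 * Returns the threshold count, or 0 for an index with no table entry.
 * The 3-bit field can hold 6 and 7, which are past the end of the table,
 * so the index is bounds-checked here.
 */
static uint16_t ecs_threshold_count(uint16_t ecs_config)
{
	size_t idx = ecs_config & ECS_THRESHOLD_COUNT_MASK;

	if (idx >= sizeof(ecs_supp_threshold) / sizeof(ecs_supp_threshold[0]))
		return 0;
	return ecs_supp_threshold[idx];
}
```

For example, a config value of 0x0c would decode as codeword mode with a
threshold of 1024.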
>
>Fan
>> +{
>> + struct cxl_ecs_params params;
>> + int ret;
>> +
>> + ret = cxl_mem_ecs_get_attrs(dev, drv_data, fru_id, &params);
>> + if (ret)
>> + return ret;
>> +
>> + if (params.mode == ECS_MODE_COUNTS_ROWS)
>> + *val = 1;
>> + else
>> + *val = 0;
>> +
>> + return 0;
>> +}
>> +
>> +static int cxl_ecs_get_mode_counts_codewords(struct device *dev, void
>*drv_data,
>> + int fru_id, u64 *val)
>> +{
>> + struct cxl_ecs_params params;
>> + int ret;
>> +
>> + ret = cxl_mem_ecs_get_attrs(dev, drv_data, fru_id, &params);
>> + if (ret)
>> + return ret;
>> +
>> + if (params.mode == ECS_MODE_COUNTS_CODEWORDS)
>> + *val = 1;
>> + else
>> + *val = 0;
>> +
>> + return 0;
>> +}
>> +
>> +static int cxl_ecs_reset(struct device *dev, void *drv_data, int
>> +fru_id, u64 val) {
>> + struct cxl_ecs_params params = {
>> + .reset_counter = val,
>> + };
>> +
>> + return cxl_mem_ecs_set_attrs(dev, drv_data, fru_id, &params,
>> +CXL_ECS_PARAM_RESET_COUNTER); }
>> +
>> +static int cxl_ecs_get_threshold(struct device *dev, void *drv_data,
>> +int fru_id, u64 *val) {
>> + struct cxl_ecs_params params;
>> + int ret;
>> +
>> + ret = cxl_mem_ecs_get_attrs(dev, drv_data, fru_id, &params);
>> + if (ret)
>> + return ret;
>> +
>> + *val = params.threshold;
>> +
>> + return 0;
>> +}
>> +
>> +static int cxl_ecs_set_threshold(struct device *dev, void *drv_data,
>> +int fru_id, u64 val) {
>> + struct cxl_ecs_params params = {
>> + .threshold = val,
>> + };
>> +
>> + return cxl_mem_ecs_set_attrs(dev, drv_data, fru_id, &params,
>> +CXL_ECS_PARAM_THRESHOLD); }
>> +
>> +static int cxl_ecs_get_name(struct device *dev, void *drv_data, int
>> +fru_id, char *name) {
>> + struct cxl_ecs_context *cxl_ecs_ctx = drv_data;
>> + struct cxl_memdev *cxlmd = cxl_ecs_ctx->cxlmd;
>> +
>> + return sysfs_emit(name, "cxl_%s_ecs_fru%d\n", dev_name(&cxlmd-
>>dev),
>> +fru_id); }
>> +
>> +static const struct edac_ecs_ops cxl_ecs_ops = {
>> + .get_log_entry_type = cxl_ecs_get_log_entry_type,
>> + .set_log_entry_type = cxl_ecs_set_log_entry_type,
>> + .get_log_entry_type_per_dram =
>cxl_ecs_get_log_entry_type_per_dram,
>> + .get_log_entry_type_per_memory_media =
>cxl_ecs_get_log_entry_type_per_memory_media,
>> + .get_mode = cxl_ecs_get_mode,
>> + .set_mode = cxl_ecs_set_mode,
>> + .get_mode_counts_codewords = cxl_ecs_get_mode_counts_codewords,
>> + .get_mode_counts_rows = cxl_ecs_get_mode_counts_rows,
>> + .reset = cxl_ecs_reset,
>> + .get_threshold = cxl_ecs_get_threshold,
>> + .set_threshold = cxl_ecs_set_threshold,
>> + .get_name = cxl_ecs_get_name,
>> +};
>> +
>> int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct
>> cxl_region *cxlr) {
>> struct edac_ras_feature ras_features[CXL_DEV_NUM_RAS_FEATURES];
>> struct cxl_patrol_scrub_context *cxl_ps_ctx;
>> struct cxl_mbox_supp_feat_entry feat_entry;
>> char cxl_dev_name[CXL_SCRUB_NAME_LEN];
>> + struct cxl_ecs_context *cxl_ecs_ctx;
>> int rc, i, num_ras_features = 0;
>> + int num_media_frus;
>>
>> if (cxlr) {
>> struct cxl_region_params *p = &cxlr->params; @@ -407,6
>+805,37 @@
>> int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct cxl_region
>*cxlr)
>> ras_features[num_ras_features].scrub_ctx = cxl_ps_ctx;
>> num_ras_features++;
>>
>> + if (!cxlr) {
>> + rc = cxl_mem_get_supported_feature_entry(cxlmd,
>&cxl_ecs_uuid, &feat_entry);
>> + if (rc < 0)
>> + goto feat_register;
>> +
>> + if (!(feat_entry.attr_flags &
>CXL_FEAT_ENTRY_FLAG_CHANGABLE))
>> + goto feat_register;
>> + num_media_frus = feat_entry.get_size/
>> + sizeof(struct cxl_ecs_rd_attrs);
>> + if (!num_media_frus)
>> + goto feat_register;
>> +
>> + cxl_ecs_ctx = devm_kzalloc(&cxlmd->dev, sizeof(*cxl_ecs_ctx),
>GFP_KERNEL);
>> + if (!cxl_ecs_ctx)
>> + goto feat_register;
>> + *cxl_ecs_ctx = (struct cxl_ecs_context) {
>> + .get_feat_size = feat_entry.get_size,
>> + .set_feat_size = feat_entry.set_size,
>> + .num_media_frus = num_media_frus,
>> + .cxlmd = cxlmd,
>> + };
>> +
>> + ras_features[num_ras_features].feat = ras_feat_ecs;
>> + ras_features[num_ras_features].ecs_ops = &cxl_ecs_ops;
>> + ras_features[num_ras_features].ecs_ctx = cxl_ecs_ctx;
>> + ras_features[num_ras_features].ecs_info.num_media_frus = num_media_frus;
>> + num_ras_features++;
>> + }
>> +
>> +feat_register:
>> +
>> return edac_ras_dev_register(&cxlmd->dev, cxl_dev_name, NULL,
>> num_ras_features, ras_features);
>> }
>> --
>> 2.34.1
>>
Thanks,
Shiju
Thread overview: 30+ messages
2024-07-16 15:03 [RFC PATCH v9 00/11] EDAC: Scrub: Introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers shiju.jose
2024-07-16 15:03 ` [RFC PATCH v9 01/11] EDAC: Add generic EDAC RAS feature driver shiju.jose
2024-07-16 18:00 ` fan
2024-07-17 11:06 ` Shiju Jose
2024-07-17 10:00 ` Mauro Carvalho Chehab
2024-07-17 11:01 ` Shiju Jose
2024-07-18 6:19 ` Mauro Carvalho Chehab
2024-07-16 15:03 ` [RFC PATCH v9 02/11] EDAC: Add EDAC scrub control driver shiju.jose
2024-07-17 12:56 ` Mauro Carvalho Chehab
2024-07-17 14:07 ` Shiju Jose
2024-07-18 7:03 ` Mauro Carvalho Chehab
2024-07-16 15:03 ` [RFC PATCH v9 03/11] EDAC: Add EDAC ECS " shiju.jose
2024-07-17 13:08 ` Mauro Carvalho Chehab
2024-07-17 17:13 ` nifan.cxl
2024-07-16 15:03 ` [RFC PATCH v9 04/11] cxl/mbox: Add GET_SUPPORTED_FEATURES mailbox command shiju.jose
2024-07-17 17:28 ` nifan.cxl
2024-07-16 15:03 ` [RFC PATCH v9 05/11] cxl/mbox: Add GET_FEATURE " shiju.jose
2024-07-17 18:08 ` nifan.cxl
2024-07-18 9:11 ` Shiju Jose
2024-07-16 15:03 ` [RFC PATCH v9 06/11] cxl/mbox: Add SET_FEATURE " shiju.jose
2024-07-17 20:13 ` nifan.cxl
2024-07-18 9:15 ` Shiju Jose
2024-07-16 15:03 ` [RFC PATCH v9 07/11] cxl/memscrub: Add CXL memory device patrol scrub control feature shiju.jose
2024-07-18 22:02 ` fan
2024-07-16 15:03 ` [RFC PATCH v9 08/11] cxl/memscrub: Add CXL memory device ECS " shiju.jose
2024-07-19 18:43 ` fan
2024-07-24 9:10 ` Shiju Jose
2024-07-16 15:03 ` [RFC PATCH v9 09/11] platform: Add __free() based cleanup function for platform_device_put shiju.jose
2024-07-16 15:03 ` [RFC PATCH v9 10/11] ACPI:RAS2: Add ACPI RAS2 driver shiju.jose
2024-07-16 15:03 ` [RFC PATCH v9 11/11] ras: scrub: ACPI RAS2: Add memory " shiju.jose