From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-pl1-f174.google.com (mail-pl1-f174.google.com [209.85.214.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B969528A2C1; Thu, 13 Feb 2025 21:35:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.174 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1739482508; cv=none; b=LTyqcDkubIT8sn6M9LHeoiz03xKTAxgz1L1vCCMiSnOK4kxfYAb9ICBj44crzIriJhGIC2QgI8hFsPARswbSip544zXIPPZJ5fg0dIHr+oyBfQ5/CwTk1hDp0cHyO7ddcrM+61CkYLNZobxebqwOFX/Dbx2RpuxvoMm6e0siOoA= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1739482508; c=relaxed/simple; bh=EveVALkVYrNjtHz4fNnE2UfX5sEJ3RkVlL9ZpsrqUPY=; h=Message-ID:From:Date:To:Cc:Subject:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=sJeY5btRN6Kd5vn8XN97VKtFATQ2FlsDidliDeqWfTrGDgYm9c4W4pNqHy7xHwm4rZbjvdBjYujjTwlOeKiUH2R1d7BAdeSvChxAS9ST7BouiZcukkikCqFfL5cpbwiKI7hKB+5HdbNakSZokh0yi7+oiC4zcZA0VmADbz3uiYs= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=LDCXX2ZN; arc=none smtp.client-ip=209.85.214.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="LDCXX2ZN" Received: by mail-pl1-f174.google.com with SMTP id d9443c01a7336-21f48ebaadfso27919775ad.2; Thu, 13 Feb 2025 13:35:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1739482505; x=1740087305; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:subject:cc :to:date:from:message-id:from:to:cc:subject:date:message-id:reply-to; bh=mO1FvYLWgJ76EQQgd/J/C8zLqhfJtgjs8p1wp7GyAYY=; b=LDCXX2ZNY1s9nihoFqYQS8bGuJvZQbdULzdPrB+EQqY0dBhWUKNtG8iIeeSomU3Pve U8Z+3Vn4JwQPf8T/8tCVb5lOEB6hCWpw9yBOEwdCtHuQelH1PfxJlAiLboJjybJ/t3sf gVuLHQgqZ5uVZibnBAEgvT0CT37Xgv2Y21jLeK/FHYFp1fTZ7svICQuHvZs+5Btud8AX kyxJu3ZpLSRKuduohXBRjJ+UVWR6YDthhECRv63k5AQKLvaToIWfKpHv96+sCN0eKzeu nRh0pT369oVKYg3sIfQhbmvF8NbPVnpg+lu6lZmZ2G0ZCGv3ZMNeTbElyDlPMkfrl8RZ TkpQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1739482505; x=1740087305; h=in-reply-to:content-disposition:mime-version:references:subject:cc :to:date:from:message-id:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=mO1FvYLWgJ76EQQgd/J/C8zLqhfJtgjs8p1wp7GyAYY=; b=ciM1SYPLdxJvMtn1xTIdTDGhMZ/ElCm7ZuLhxiUZ6192GJLGQaJ93APRrO91hz7NMQ MzNmbwOCwHZ1cZXeX6UNnIEOodrkdlWa8nLuWI5nUghp8KqPVGah2CgZCofgQOxnht6q 43avzAiRcTZxCvA+hvBvUyTps04Pl0ZnAlmzZUN/p14NCx+oJvSsuEWVAmAA8DrWa5la ZzisvjcJ3Q0NgSTwvc3DZHtW0i7ljG1NWy4nyX2E11JsMr8d4Vi7yoqcvCRJF+IGXPFI M4ICCwsn6GX2BgP2KJaSrnC3UiDnds8OWh/Cku67R+/7HjqA1Mjqs0CqwTo9fkM0s7jS WuNQ== X-Forwarded-Encrypted: i=1; AJvYcCV2zvwXm6vJjrSb4gc6B5NS9ZDWXi9rrCPP5J13f6nzl0i8+FHgDduisRJu+5+9cKiyIguoHK0fcVFue2o9@vger.kernel.org, AJvYcCWC3rZIqDMfURV25AbDWsRpdZwslYgH8pCtOGHpZudgQFag9jomzO3QaSLC8dxRTKzgkLHBqCqsMt4Q@vger.kernel.org, AJvYcCXayjIXdLLsGnt/GKQKmHqaLMPIOWRuvMDqPP4JwoeipcQiT+HAjbM0qaX0p8ZDPe+Fhjk/8fpmnmPW@vger.kernel.org, AJvYcCXg1pxLhq+7hYzIxXPevFVALMT3AIy06IDxosBBgehmz+MN+UwE7LmzskyT8a9w1q77hJ2TG+7iz5FT@vger.kernel.org X-Gm-Message-State: AOJu0YzYCysX+7QqsZfZ/292hBuIL9Z6tzWIvzlekDk4XDkITWA+2eyD dudPAq6eqcViJI9ElvKeuSJqFpacynRUXWD81/oiJaV3B2GP+2X6 X-Gm-Gg: ASbGncthYvfT3e5jL08o/KuqXieNrWwV/ylVMmuf63bVrXCn7llxkbQQuMUjCU5SGvc RPln3xCE3iZRNZko2gfzWmsziUX8fA+ZlQyzKYwu/TEkIk5FMRFb12cmDnHYuKy/08w7iNxL0qE M9wwTodByeCBBGcJYE++KAV+HBzko858/pjKtbHxScNY/Nfo5biOlVZ4JTC6OD2NU9lV0VzgWsU 4re2bR3Z/T1I7QwQ/4tohcAI7x5L0lXNFZmfgJz/zXWipAgUmcf+aiXwtPodwvy5egoms/GpuMZ ZIR2yEhGopbOnXe9hcnng4EGHzSSM3R0EOGpUhX5TaA= X-Google-Smtp-Source: AGHT+IHjNa1VbeDC205wGUrmuat+iH4dd4bSxcrot/VgExDFdrrERr0MrDPFSLuVWD4ahG/o8WwJiQ== X-Received: by 2002:a17:903:2f90:b0:216:485f:bf90 with SMTP id d9443c01a7336-220bbb0cab4mr133435095ad.27.1739482504553; Thu, 13 Feb 2025 13:35:04 -0800 (PST) Received: from asus. (c-73-189-148-61.hsd1.ca.comcast.net. [73.189.148.61]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-220d5585220sm16741435ad.215.2025.02.13.13.34.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 13 Feb 2025 13:35:03 -0800 (PST) Message-ID: <67ae6587.170a0220.2d3544.9687@mx.google.com> X-Google-Original-Message-ID: From: Fan Ni X-Google-Original-From: Fan Ni Date: Thu, 13 Feb 2025 13:34:49 -0800 To: shiju.jose@huawei.com Cc: linux-edac@vger.kernel.org, linux-cxl@vger.kernel.org, linux-acpi@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, bp@alien8.de, tony.luck@intel.com, rafael@kernel.org, lenb@kernel.org, mchehab@kernel.org, dan.j.williams@intel.com, dave@stgolabs.net, jonathan.cameron@huawei.com, dave.jiang@intel.com, alison.schofield@intel.com, vishal.l.verma@intel.com, ira.weiny@intel.com, david@redhat.com, Vilas.Sridharan@amd.com, leo.duran@amd.com, Yazen.Ghannam@amd.com, rientjes@google.com, jiaqiyan@google.com, Jon.Grimm@amd.com, dave.hansen@linux.intel.com, naoya.horiguchi@nec.com, james.morse@arm.com, jthoughton@google.com, somasundaram.a@hpe.com, erdemaktas@google.com, pgonda@google.com, duenwen@google.com, gthelen@google.com, wschwartz@amperecomputing.com, dferguson@amperecomputing.com, wbs@os.amperecomputing.com, nifan.cxl@gmail.com, tanxiaofei@huawei.com, prime.zeng@hisilicon.com, roberto.sassu@huawei.com, kangkang.shen@futurewei.com, wanghuiqiang@huawei.com, linuxarm@huawei.com Subject: Re: [PATCH v20 02/15] EDAC: Add scrub control feature References: <20250212143654.1893-1-shiju.jose@huawei.com> <20250212143654.1893-3-shiju.jose@huawei.com> Precedence: bulk X-Mailing-List: linux-acpi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20250212143654.1893-3-shiju.jose@huawei.com> On Wed, Feb 12, 2025 at 02:36:40PM +0000, shiju.jose@huawei.com wrote: > From: Shiju Jose > > Add a generic EDAC scrub control to manage memory scrubbers in the system. > Devices with a scrub feature register with the EDAC device driver, which > retrieves the scrub descriptor from the EDAC scrub driver and exposes the > sysfs scrub control attributes for a scrub instance to userspace at > /sys/bus/edac/devices//scrubX/. > > The common sysfs scrub control interface abstracts the control of > arbitrary scrubbing functionality into a common set of functions. The > sysfs scrub attribute nodes are only present if the client driver has > implemented the corresponding attribute callback function and passed the > operations(ops) to the EDAC device driver during registration. > > Co-developed-by: Jonathan Cameron > Signed-off-by: Jonathan Cameron > Tested-by: Daniel Ferguson > Signed-off-by: Shiju Jose > --- > Documentation/ABI/testing/sysfs-edac-scrub | 69 ++++++ > Documentation/edac/features.rst | 6 + > Documentation/edac/index.rst | 1 + > Documentation/edac/scrub.rst | 259 +++++++++++++++++++++ > drivers/edac/Kconfig | 9 + > drivers/edac/Makefile | 2 + > drivers/edac/edac_device.c | 41 +++- > drivers/edac/scrub.c | 209 +++++++++++++++++ > include/linux/edac.h | 43 ++++ > 9 files changed, 635 insertions(+), 4 deletions(-) > create mode 100644 Documentation/ABI/testing/sysfs-edac-scrub > create mode 100644 Documentation/edac/scrub.rst > create mode 100755 drivers/edac/scrub.c LGTM. Just one question, for min/max/current_cycle_duration attributes, is there a reason why seconds are used instead of hours directly as mentioned in the spec. That confused me a little bit when I tested to modify the current_cycle_duration with some value not multiple of 3600 and found the value read back is not the same as that just written. With that in mind, Tested-by: Fan Ni > > diff --git a/Documentation/ABI/testing/sysfs-edac-scrub b/Documentation/ABI/testing/sysfs-edac-scrub > new file mode 100644 > index 000000000000..a3c0ad40b2b0 > --- /dev/null > +++ b/Documentation/ABI/testing/sysfs-edac-scrub > @@ -0,0 +1,69 @@ > +What: /sys/bus/edac/devices//scrubX > +Date: March 2025 > +KernelVersion: 6.15 > +Contact: linux-edac@vger.kernel.org > +Description: > + The sysfs EDAC bus devices //scrubX subdirectory > + belongs to an instance of memory scrub control feature, > + where directory corresponds to a device/memory > + region registered with the EDAC device driver for the > + scrub control feature. > + > + The sysfs scrub attr nodes are only present if the parent > + driver has implemented the corresponding attr callback > + function and provided the necessary operations to the EDAC > + device driver during registration. > + > +What: /sys/bus/edac/devices//scrubX/addr > +Date: March 2025 > +KernelVersion: 6.15 > +Contact: linux-edac@vger.kernel.org > +Description: > + (RW) The base address of the memory region to be scrubbed > + for on-demand scrubbing. Setting address starts scrubbing. > + The size must be set before that. > + > + The readback addr value is non-zero if the requested > + on-demand scrubbing is in progress, zero otherwise. > + > +What: /sys/bus/edac/devices//scrubX/size > +Date: March 2025 > +KernelVersion: 6.15 > +Contact: linux-edac@vger.kernel.org > +Description: > + (RW) The size of the memory region to be scrubbed > + (on-demand scrubbing). > + > +What: /sys/bus/edac/devices//scrubX/enable_background > +Date: March 2025 > +KernelVersion: 6.15 > +Contact: linux-edac@vger.kernel.org > +Description: > + (RW) Start/Stop background(patrol) scrubbing if supported. > + > +What: /sys/bus/edac/devices//scrubX/min_cycle_duration > +Date: March 2025 > +KernelVersion: 6.15 > +Contact: linux-edac@vger.kernel.org > +Description: > + (RO) Supported minimum scrub cycle duration in seconds > + by the memory scrubber. > + > +What: /sys/bus/edac/devices//scrubX/max_cycle_duration > +Date: March 2025 > +KernelVersion: 6.15 > +Contact: linux-edac@vger.kernel.org > +Description: > + (RO) Supported maximum scrub cycle duration in seconds > + by the memory scrubber. > + > +What: /sys/bus/edac/devices//scrubX/current_cycle_duration > +Date: March 2025 > +KernelVersion: 6.15 > +Contact: linux-edac@vger.kernel.org > +Description: > + (RW) The current scrub cycle duration in seconds and must be > + within the supported range by the memory scrubber. > + > + Scrub has an overhead when running and that may want to be > + reduced by taking longer to do it. > diff --git a/Documentation/edac/features.rst b/Documentation/edac/features.rst > index 6b0fdc6f5d6e..942d7a92b8d7 100644 > --- a/Documentation/edac/features.rst > +++ b/Documentation/edac/features.rst > @@ -92,3 +92,9 @@ High level design is illustrated in the following diagram:: > 3. RAS dynamic feature controller - Userspace sample modules in rasdaemon for > dynamic scrub/repair control to issue scrubbing/repair when excess number > of corrected memory errors are reported in a short span of time. > + > +RAS features > +------------ > +1. Memory Scrub > + > +Memory scrub features are documented in `Documentation/edac/scrub.rst`. > diff --git a/Documentation/edac/index.rst b/Documentation/edac/index.rst > index de4a3aa452cb..0a00c23838b6 100644 > --- a/Documentation/edac/index.rst > +++ b/Documentation/edac/index.rst > @@ -8,3 +8,4 @@ EDAC Subsystem > :maxdepth: 1 > > features > + scrub > diff --git a/Documentation/edac/scrub.rst b/Documentation/edac/scrub.rst > new file mode 100644 > index 000000000000..50bb44b126fa > --- /dev/null > +++ b/Documentation/edac/scrub.rst > @@ -0,0 +1,259 @@ > +.. SPDX-License-Identifier: GPL-2.0 OR GFDL-1.2-no-invariants-or-later > + > +=================== > +EDAC Scrub Control > +=================== > + > +Copyright (c) 2024-2025 HiSilicon Limited. > + > +:Author: Shiju Jose > +:License: The GNU Free Documentation License, Version 1.2 without > + Invariant Sections, Front-Cover Texts nor Back-Cover Texts. > + (dual licensed under the GPL v2) > + > +- Written for: 6.15 > + > +Introduction > +------------ > +Increasing DRAM size and cost have made memory subsystem reliability an > +important concern. These modules are used where potentially corrupted data > +could cause expensive or fatal issues. Memory errors are among the top > +hardware failures that cause server and workload crashes. > + > +Memory scrubbing is a feature where an ECC (Error-Correcting Code) engine > +reads data from each memory media location, corrects with an ECC if > +necessary and writes the corrected data back to the same memory media > +location. > + > +The memory DIMMs can be scrubbed at a configurable rate to detect > +uncorrected memory errors and attempt recovery from detected errors, > +providing the following benefits. > + > +1. Proactively scrubbing memory DIMMs reduces the chance of a correctable > + error becoming uncorrectable. > + > +2. When detected, uncorrected errors caught in unallocated memory pages are > + isolated and prevented from being allocated to an application or the OS. > + > +3. This reduces the likelihood of software or hardware products encountering > + memory errors. > + > +4. The additional data on failures in memory may be used to build up statistics > + that are later used to decide whether to use memory repair technologies > + such as Post Package Repair or Sparing. > + > +There are 2 types of memory scrubbing: > + > +1. Background (patrol) scrubbing of the RAM while the RAM is otherwise > + idle. > + > +2. On-demand scrubbing for a specific address range or region of memory. > + > +Several types of interfaces to hardware memory scrubbers have been > +identified, such as CXL memory device patrol scrub, CXL DDR5 ECS, ACPI > +RAS2 memory scrubbing, and ACPI NVDIMM ARS (Address Range Scrub). > + > +The control mechanisms vary across different memory scrubbers. To enable > +standardized userspace tooling, there is a need to present these controls > +through a standardized ABI. > + > +Introduce a generic memory EDAC scrub control that allows users to manage > +underlying scrubbers in the system through a standardized sysfs scrub > +control interface. This common sysfs scrub control interface abstracts the > +management of various scrubbing functionalities into a unified set of > +functions. > + > +Use cases of common scrub control feature > +----------------------------------------- > +1. Several types of interfaces for hardware (HW) memory scrubbers have > + been identified, including the CXL memory device patrol scrub, CXL DDR5 > + ECS, ACPI RAS2 memory scrubbing features, ACPI NVDIMM ARS (Address Range > + Scrub), and software-based memory scrubbers. Of the identified interfaces > + to hardware memory scrubbers some support control over patrol (background) > + scrubbing (e.g., ACPI RAS2, CXL) and/or on-demand scrubbing (e.g., ACPI RAS2, > + ACPI ARS). However, the scrub control interfaces vary between memory > + scrubbers, highlighting the need for a standardized, generic sysfs scrub > + control interface that is accessible to userspace for administration and use > + by scripts/tools. > + > +2. User-space scrub controls allow users to disable scrubbing if necessary, > + for example, to disable background patrol scrubbing or adjust the scrub > + rate for performance-aware operations where background activities need to > + be minimized or disabled. > + > +3. User-space tools enable on-demand scrubbing for specific address ranges, > + provided that the scrubber supports this functionality. > + > +4. User-space tools can also control memory DIMM scrubbing at a configurable > + scrub rate via sysfs scrub controls. This approach offers several benefits: > + > + 4.1. Detects uncorrectable memory errors early, before user access to affected > + memory, helping facilitate recovery. > + > + 4.2. Reduces the likelihood of correctable errors developing into uncorrectable > + errors. > + > +5. Policy control for hotplugged memory is necessary because there may not > + be a system-wide BIOS or similar control to manage scrub settings for a CXL > + device added after boot. Determining these settings is a policy decision, > + balancing reliability against performance, so userspace should control it. > + Therefore, a unified interface is recommended for handling this function in > + a way that aligns with other similar interfaces, rather than creating a > + separate one. > + > +Scrubbing features > +------------------ > + > +CXL Memory Scrubbing features > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > +CXL spec r3.1 [1]_ section 8.2.9.9.11.1 describes the memory device patrol > +scrub control feature. The device patrol scrub proactively locates and makes > +corrections to errors in regular cycle. The patrol scrub control allows the > +userspace request to change CXL patrol scrubber's configurations. > + > +The patrol scrub control allows the requester to specify the number of > +hours in which the patrol scrub cycles must be completed, provided that > +the requested scrub rate must be within the supported range of the > +scrub rate that the device is capable of. In the CXL driver, the > +number of seconds per scrub cycles, which user requests via sysfs, is > +rescaled to hours per scrub cycles. In addition, the patrol scrub controls > +allow the host to disable and enable the feature in case disabling of the > +feature is needed for other purposes such as performance-aware operations > +which require the background operations to be turned off. > + > +Error Check Scrub (ECS) > +~~~~~~~~~~~~~~~~~~~~~~~ > +CXL spec r3.1 [1]_ section 8.2.9.9.11.2 describes the Error Check Scrub (ECS) > +is a feature defined in JEDEC DDR5 SDRAM Specification (JESD79-5) and > +allows the DRAM to internally read, correct single-bit errors, and write > +back corrected data bits to the DRAM array while providing transparency > +to error counts. > + > +The DDR5 device contains number of memory media FRUs per device. The > +DDR5 ECS feature and thus the ECS control driver supports configuring > +the ECS parameters per FRU. > + > +ACPI RAS2 Hardware-based Memory Scrubbing > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > +ACPI spec 6.5 [2]_ section 5.2.21 ACPI RAS2 describes ACPI RAS2 table > +provides interfaces for platform RAS features and supports independent > +RAS controls and capabilities for a given RAS feature for multiple > +instances of the same component in a given system. > +Memory RAS features apply to RAS capabilities, controls and operations > +that are specific to memory. RAS2 PCC sub-spaces for memory-specific RAS > +features have a Feature Type of 0x00 (Memory). > + > +The platform can use the hardware-based memory scrubbing feature to expose > +controls and capabilities associated with hardware-based memory scrub > +engines. The RAS2 memory scrubbing feature supports following as per spec, > + > +1. Independent memory scrubbing controls for each NUMA domain, identified > + using its proximity domain. > + > +2. Provision for background (patrol) scrubbing of the entire memory system, > + as well as on-demand scrubbing for a specific region of memory. > + > +ACPI Address Range Scrubbing(ARS) > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > +ACPI spec 6.5 [2]_ section 9.19.7.2 describes Address Range Scrubbing(ARS). > +ARS allows the platform to communicate memory errors to system software. > +This capability allows system software to prevent accesses to addresses > +with uncorrectable errors in memory. ARS functions manage all NVDIMMs > +present in the system. Only one scrub can be in progress system wide > +at any given time. > +Following functions are supported as per the specification. > + > +1. Query ARS Capabilities for a given address range, indicates platform > + supports the ACPI NVDIMM Root Device Unconsumed Error Notification. > + > +2. Start ARS triggers an Address Range Scrub for the given memory range. > + Address scrubbing can be done for volatile memory, persistent memory, > + or both. > + > +3. Query ARS Status command allows software to get the status of ARS, > + including the progress of ARS and ARS error record. > + > +4. Clear Uncorrectable Error. > + > +5. Translate SPA > + > +6. ARS Error Inject etc. > + > +The kernel supports an existing control for ARS and ARS is currently not > +supported in EDAC. > + > +.. [1] https://computeexpresslink.org/cxl-specification/ > + > +.. [2] https://uefi.org/specs/ACPI/6.5/ > + > +Comparison of various scrubbing features > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > + > + +--------------+-----------+-----------+-----------+-----------+ > + | | ACPI | CXL patrol| CXL ECS | ARS | > + | Name | RAS2 | scrub | | | > + +--------------+-----------+-----------+-----------+-----------+ > + | | | | | | > + | On-demand | Supported | No | No | Supported | > + | Scrubbing | | | | | > + | | | | | | > + +--------------+-----------+-----------+-----------+-----------+ > + | | | | | | > + | Background | Supported | Supported | Supported | No | > + | scrubbing | | | | | > + | | | | | | > + +--------------+-----------+-----------+-----------+-----------+ > + | | | | | | > + | Mode of | Scrub ctrl| per device| per memory| Unknown | > + | scrubbing | per NUMA | | media | | > + | | domain. | | | | > + +--------------+-----------+-----------+-----------+-----------+ > + | | | | | | > + | Query scrub | Supported | Supported | Supported | Supported | > + | capabilities | | | | | > + | | | | | | > + +--------------+-----------+-----------+-----------+-----------+ > + | | | | | | > + | Setting | Supported | No | No | Supported | > + | address range| | | | | > + | | | | | | > + +--------------+-----------+-----------+-----------+-----------+ > + | | | | | | > + | Setting | Supported | Supported | No | No | > + | scrub rate | | | | | > + | | | | | | > + +--------------+-----------+-----------+-----------+-----------+ > + | | | | | | > + | Unit for | Not | in hours | No | No | > + | scrub rate | Defined | | | | > + | | | | | | > + +--------------+-----------+-----------+-----------+-----------+ > + | | Supported | | | | > + | Scrub | on-demand | No | No | Supported | > + | status/ | scrubbing | | | | > + | Completion | only | | | | > + +--------------+-----------+-----------+-----------+-----------+ > + | UC error | |CXL general|CXL general| ACPI UCE | > + | reporting | Exception |media/DRAM |media/DRAM | notify and| > + | | |event/media|event/media| query | > + | | |scan? |scan? | ARS status| > + +--------------+-----------+-----------+-----------+-----------+ > + | | | | | | > + | Support for | Supported | Supported | Supported | No | > + | EDAC control | | | | | > + | | | | | | > + +--------------+-----------+-----------+-----------+-----------+ > + > +The File System > +--------------- > + > +The control attributes of a registered scrubber instance could be > +accessed in the > + > +/sys/bus/edac/devices//scrubX/ > + > +sysfs > +----- > + > +Sysfs files are documented in > +`Documentation/ABI/testing/sysfs-edac-scrub` > diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig > index 2051a7c944a5..175d706168ab 100644 > --- a/drivers/edac/Kconfig > +++ b/drivers/edac/Kconfig > @@ -75,6 +75,15 @@ config EDAC_GHES > > In doubt, say 'Y'. > > +config EDAC_SCRUB > + bool "EDAC scrub feature" > + help > + The EDAC scrub feature is optional and is designed to control the > + memory scrubbers in the system. The common sysfs scrub interface > + abstracts the control of various arbitrary scrubbing functionalities > + into a unified set of functions. > + Say 'y/n' to enable/disable EDAC scrub feature. > + > config EDAC_AMD64 > tristate "AMD64 (Opteron, Athlon64)" > depends on AMD_NB && EDAC_DECODE_MCE > diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile > index 89789ba8275f..f2a86ed997b7 100644 > --- a/drivers/edac/Makefile > +++ b/drivers/edac/Makefile > @@ -13,6 +13,8 @@ edac_core-y += edac_module.o edac_device_sysfs.o wq.o > > edac_core-$(CONFIG_EDAC_DEBUG) += debugfs.o > > +edac_core-$(CONFIG_EDAC_SCRUB) += scrub.o > + > ifdef CONFIG_PCI > edac_core-y += edac_pci.o edac_pci_sysfs.o > endif > diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c > index 142a661ff543..40407f0ee600 100644 > --- a/drivers/edac/edac_device.c > +++ b/drivers/edac/edac_device.c > @@ -575,6 +575,7 @@ static void edac_dev_release(struct device *dev) > { > struct edac_dev_feat_ctx *ctx = container_of(dev, struct edac_dev_feat_ctx, dev); > > + kfree(ctx->scrub); > kfree(ctx->dev.groups); > kfree(ctx); > } > @@ -610,8 +611,10 @@ int edac_dev_register(struct device *parent, char *name, > const struct edac_dev_feature *ras_features) > { > const struct attribute_group **ras_attr_groups; > + struct edac_dev_data *dev_data; > struct edac_dev_feat_ctx *ctx; > int attr_gcnt = 0; > + int scrub_cnt = 0; > int ret, feat; > > if (!parent || !name || !num_features || !ras_features) > @@ -620,7 +623,10 @@ int edac_dev_register(struct device *parent, char *name, > /* Double parse to make space for attributes */ > for (feat = 0; feat < num_features; feat++) { > switch (ras_features[feat].ft_type) { > - /* Add feature specific code */ > + case RAS_FEAT_SCRUB: > + attr_gcnt++; > + scrub_cnt++; > + break; > default: > return -EINVAL; > } > @@ -636,13 +642,38 @@ int edac_dev_register(struct device *parent, char *name, > goto ctx_free; > } > > + if (scrub_cnt) { > + ctx->scrub = kcalloc(scrub_cnt, sizeof(*ctx->scrub), GFP_KERNEL); > + if (!ctx->scrub) { > + ret = -ENOMEM; > + goto groups_free; > + } > + } > + > attr_gcnt = 0; > + scrub_cnt = 0; > for (feat = 0; feat < num_features; feat++, ras_features++) { > switch (ras_features->ft_type) { > - /* Add feature specific code */ > + case RAS_FEAT_SCRUB: > + if (!ras_features->scrub_ops || > + scrub_cnt != ras_features->instance) > + goto data_mem_free; > + > + dev_data = &ctx->scrub[scrub_cnt]; > + dev_data->instance = scrub_cnt; > + dev_data->scrub_ops = ras_features->scrub_ops; > + dev_data->private = ras_features->ctx; > + ret = edac_scrub_get_desc(parent, &ras_attr_groups[attr_gcnt], > + ras_features->instance); > + if (ret) > + goto data_mem_free; > + > + scrub_cnt++; > + attr_gcnt++; > + break; > default: > ret = -EINVAL; > - goto groups_free; > + goto data_mem_free; > } > } > > @@ -655,7 +686,7 @@ int edac_dev_register(struct device *parent, char *name, > > ret = dev_set_name(&ctx->dev, name); > if (ret) > - goto groups_free; > + goto data_mem_free; > > ret = device_register(&ctx->dev); > if (ret) { > @@ -665,6 +696,8 @@ int edac_dev_register(struct device *parent, char *name, > > return devm_add_action_or_reset(parent, edac_dev_unreg, &ctx->dev); > > +data_mem_free: > + kfree(ctx->scrub); > groups_free: > kfree(ras_attr_groups); > ctx_free: > diff --git a/drivers/edac/scrub.c b/drivers/edac/scrub.c > new file mode 100755 > index 000000000000..e421d3ebd959 > --- /dev/null > +++ b/drivers/edac/scrub.c > @@ -0,0 +1,209 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * The generic EDAC scrub driver controls the memory scrubbers in the > + * system. The common sysfs scrub interface abstracts the control of > + * various arbitrary scrubbing functionalities into a unified set of > + * functions. > + * > + * Copyright (c) 2024-2025 HiSilicon Limited. > + */ > + > +#include > + > +enum edac_scrub_attributes { > + SCRUB_ADDRESS, > + SCRUB_SIZE, > + SCRUB_ENABLE_BACKGROUND, > + SCRUB_MIN_CYCLE_DURATION, > + SCRUB_MAX_CYCLE_DURATION, > + SCRUB_CUR_CYCLE_DURATION, > + SCRUB_MAX_ATTRS > +}; > + > +struct edac_scrub_dev_attr { > + struct device_attribute dev_attr; > + u8 instance; > +}; > + > +struct edac_scrub_context { > + char name[EDAC_FEAT_NAME_LEN]; > + struct edac_scrub_dev_attr scrub_dev_attr[SCRUB_MAX_ATTRS]; > + struct attribute *scrub_attrs[SCRUB_MAX_ATTRS + 1]; > + struct attribute_group group; > +}; > + > +#define TO_SCRUB_DEV_ATTR(_dev_attr) \ > + container_of(_dev_attr, struct edac_scrub_dev_attr, dev_attr) > + > +#define EDAC_SCRUB_ATTR_SHOW(attrib, cb, type, format) \ > +static ssize_t attrib##_show(struct device *ras_feat_dev, \ > + struct device_attribute *attr, char *buf) \ > +{ \ > + u8 inst = TO_SCRUB_DEV_ATTR(attr)->instance; \ > + struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev); \ > + const struct edac_scrub_ops *ops = ctx->scrub[inst].scrub_ops; \ > + type data; \ > + int ret; \ > + \ > + ret = ops->cb(ras_feat_dev->parent, ctx->scrub[inst].private, &data); \ > + if (ret) \ > + return ret; \ > + \ > + return sysfs_emit(buf, format, data); \ > +} > + > +EDAC_SCRUB_ATTR_SHOW(addr, read_addr, u64, "0x%llx\n") > +EDAC_SCRUB_ATTR_SHOW(size, read_size, u64, "0x%llx\n") > +EDAC_SCRUB_ATTR_SHOW(enable_background, get_enabled_bg, bool, "%u\n") > +EDAC_SCRUB_ATTR_SHOW(min_cycle_duration, get_min_cycle, u32, "%u\n") > +EDAC_SCRUB_ATTR_SHOW(max_cycle_duration, get_max_cycle, u32, "%u\n") > +EDAC_SCRUB_ATTR_SHOW(current_cycle_duration, get_cycle_duration, u32, "%u\n") > + > +#define EDAC_SCRUB_ATTR_STORE(attrib, cb, type, conv_func) \ > +static ssize_t attrib##_store(struct device *ras_feat_dev, \ > + struct device_attribute *attr, \ > + const char *buf, size_t len) \ > +{ \ > + u8 inst = TO_SCRUB_DEV_ATTR(attr)->instance; \ > + struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev); \ > + const struct edac_scrub_ops *ops = ctx->scrub[inst].scrub_ops; \ > + type data; \ > + int ret; \ > + \ > + ret = conv_func(buf, 0, &data); \ > + if (ret < 0) \ > + return ret; \ > + \ > + ret = ops->cb(ras_feat_dev->parent, ctx->scrub[inst].private, data); \ > + if (ret) \ > + return ret; \ > + \ > + return len; \ > +} > + > +EDAC_SCRUB_ATTR_STORE(addr, write_addr, u64, kstrtou64) > +EDAC_SCRUB_ATTR_STORE(size, write_size, u64, kstrtou64) > +EDAC_SCRUB_ATTR_STORE(enable_background, set_enabled_bg, unsigned long, kstrtoul) > +EDAC_SCRUB_ATTR_STORE(current_cycle_duration, set_cycle_duration, unsigned long, kstrtoul) > + > +static umode_t scrub_attr_visible(struct kobject *kobj, struct attribute *a, int attr_id) > +{ > + struct device *ras_feat_dev = kobj_to_dev(kobj); > + struct device_attribute *dev_attr = container_of(a, struct device_attribute, attr); > + u8 inst = TO_SCRUB_DEV_ATTR(dev_attr)->instance; > + struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev); > + const struct edac_scrub_ops *ops = ctx->scrub[inst].scrub_ops; > + > + switch (attr_id) { > + case SCRUB_ADDRESS: > + if (ops->read_addr) { > + if (ops->write_addr) > + return a->mode; > + else > + return 0444; > + } > + break; > + case SCRUB_SIZE: > + if (ops->read_size) { > + if (ops->write_size) > + return a->mode; > + else > + return 0444; > + } > + break; > + case SCRUB_ENABLE_BACKGROUND: > + if (ops->get_enabled_bg) { > + if (ops->set_enabled_bg) > + return a->mode; > + else > + return 0444; > + } > + break; > + case SCRUB_MIN_CYCLE_DURATION: > + if (ops->get_min_cycle) > + return a->mode; > + break; > + case SCRUB_MAX_CYCLE_DURATION: > + if (ops->get_max_cycle) > + return a->mode; > + break; > + case SCRUB_CUR_CYCLE_DURATION: > + if (ops->get_cycle_duration) { > + if (ops->set_cycle_duration) > + return a->mode; > + else > + return 0444; > + } > + break; > + default: > + break; > + } > + > + return 0; > +} > + > +#define EDAC_SCRUB_ATTR_RO(_name, _instance) \ > + ((struct edac_scrub_dev_attr) { .dev_attr = __ATTR_RO(_name), \ > + .instance = _instance }) > + > +#define EDAC_SCRUB_ATTR_WO(_name, _instance) \ > + ((struct edac_scrub_dev_attr) { .dev_attr = __ATTR_WO(_name), \ > + .instance = _instance }) > + > +#define EDAC_SCRUB_ATTR_RW(_name, _instance) \ > + ((struct edac_scrub_dev_attr) { .dev_attr = __ATTR_RW(_name), \ > + .instance = _instance }) > + > +static int scrub_create_desc(struct device *scrub_dev, > + const struct attribute_group **attr_groups, u8 instance) > +{ > + struct edac_scrub_context *scrub_ctx; > + struct attribute_group *group; > + int i; > + struct edac_scrub_dev_attr dev_attr[] = { > + [SCRUB_ADDRESS] = EDAC_SCRUB_ATTR_RW(addr, instance), > + [SCRUB_SIZE] = EDAC_SCRUB_ATTR_RW(size, instance), > + [SCRUB_ENABLE_BACKGROUND] = EDAC_SCRUB_ATTR_RW(enable_background, instance), > + [SCRUB_MIN_CYCLE_DURATION] = EDAC_SCRUB_ATTR_RO(min_cycle_duration, instance), > + [SCRUB_MAX_CYCLE_DURATION] = EDAC_SCRUB_ATTR_RO(max_cycle_duration, instance), > + [SCRUB_CUR_CYCLE_DURATION] = EDAC_SCRUB_ATTR_RW(current_cycle_duration, instance) > + }; > + > + scrub_ctx = devm_kzalloc(scrub_dev, sizeof(*scrub_ctx), GFP_KERNEL); > + if (!scrub_ctx) > + return -ENOMEM; > + > + group = &scrub_ctx->group; > + for (i = 0; i < SCRUB_MAX_ATTRS; i++) { > + memcpy(&scrub_ctx->scrub_dev_attr[i], &dev_attr[i], sizeof(dev_attr[i])); > + scrub_ctx->scrub_attrs[i] = &scrub_ctx->scrub_dev_attr[i].dev_attr.attr; > + } > + sprintf(scrub_ctx->name, "%s%d", "scrub", instance); > + group->name = scrub_ctx->name; > + group->attrs = scrub_ctx->scrub_attrs; > + group->is_visible = scrub_attr_visible; > + > + attr_groups[0] = group; > + > + return 0; > +} > + > +/** > + * edac_scrub_get_desc - get EDAC scrub descriptors > + * @scrub_dev: client device, with scrub support > + * @attr_groups: pointer to attribute group container > + * @instance: device's scrub instance number. > + * > + * Return: > + * * %0 - Success. > + * * %-EINVAL - Invalid parameters passed. > + * * %-ENOMEM - Dynamic memory allocation failed. > + */ > +int edac_scrub_get_desc(struct device *scrub_dev, > + const struct attribute_group **attr_groups, u8 instance) > +{ > + if (!scrub_dev || !attr_groups) > + return -EINVAL; > + > + return scrub_create_desc(scrub_dev, attr_groups, instance); > +} > diff --git a/include/linux/edac.h b/include/linux/edac.h > index 8c4b6ca2a994..1cbab08720df 100644 > --- a/include/linux/edac.h > +++ b/include/linux/edac.h > @@ -662,13 +662,54 @@ static inline struct dimm_info *edac_get_dimm(struct mem_ctl_info *mci, > return mci->dimms[index]; > } > > +#define EDAC_FEAT_NAME_LEN 128 > + > /* RAS feature type */ > enum edac_dev_feat { > + RAS_FEAT_SCRUB, > RAS_FEAT_MAX > }; > > +/** > + * struct edac_scrub_ops - scrub device operations (all elements optional) > + * @read_addr: read base address of scrubbing range. > + * @read_size: read offset of scrubbing range. > + * @write_addr: set base address of the scrubbing range. > + * @write_size: set offset of the scrubbing range. > + * @get_enabled_bg: check if currently performing background scrub. > + * @set_enabled_bg: start or stop a bg-scrub. > + * @get_min_cycle: get minimum supported scrub cycle duration in seconds. > + * @get_max_cycle: get maximum supported scrub cycle duration in seconds. > + * @get_cycle_duration: get current scrub cycle duration in seconds. > + * @set_cycle_duration: set current scrub cycle duration in seconds. > + */ > +struct edac_scrub_ops { > + int (*read_addr)(struct device *dev, void *drv_data, u64 *base); > + int (*read_size)(struct device *dev, void *drv_data, u64 *size); > + int (*write_addr)(struct device *dev, void *drv_data, u64 base); > + int (*write_size)(struct device *dev, void *drv_data, u64 size); > + int (*get_enabled_bg)(struct device *dev, void *drv_data, bool *enable); > + int (*set_enabled_bg)(struct device *dev, void *drv_data, bool enable); > + int (*get_min_cycle)(struct device *dev, void *drv_data, u32 *min); > + int (*get_max_cycle)(struct device *dev, void *drv_data, u32 *max); > + int (*get_cycle_duration)(struct device *dev, void *drv_data, u32 *cycle); > + int (*set_cycle_duration)(struct device *dev, void *drv_data, u32 cycle); > +}; > + > +#if IS_ENABLED(CONFIG_EDAC_SCRUB) > +int edac_scrub_get_desc(struct device *scrub_dev, > + const struct attribute_group **attr_groups, > + u8 instance); > +#else > +static inline int edac_scrub_get_desc(struct device *scrub_dev, > + const struct attribute_group **attr_groups, > + u8 instance) > +{ return -EOPNOTSUPP; } > +#endif /* CONFIG_EDAC_SCRUB */ > + > /* EDAC device feature information structure */ > struct edac_dev_data { > + const struct edac_scrub_ops *scrub_ops; > u8 instance; > void *private; > }; > @@ -676,11 +717,13 @@ struct edac_dev_data { > struct edac_dev_feat_ctx { > struct device dev; > void *private; > + struct edac_dev_data *scrub; > }; > > struct edac_dev_feature { > enum edac_dev_feat ft_type; > u8 instance; > + const struct edac_scrub_ops *scrub_ops; > void *ctx; > }; > > -- > 2.43.0 >