From: Srirangan Madhavan <smadhavan@nvidia.com>
To: linux-cxl@vger.kernel.org, linux-pci@vger.kernel.org,
	linux-kernel@vger.kernel.org
Cc: vsethi@nvidia.com, alwilliamson@nvidia.com,
	Dan Williams <danwilliams@nvidia.com>,
	Sai Yashwanth Reddy Kancherla <skancherla@nvidia.com>,
	Vishal Aslot <vaslot@nvidia.com>,
	Manish Honap <mhonap@nvidia.com>, Jiandi An <jan@nvidia.com>,
	Richard Cheng <icheng@nvidia.com>,
	linux-tegra@vger.kernel.org,
	Srirangan Madhavan <smadhavan@nvidia.com>
Subject: [PATCH v6 0/9] cxl: Add cxl_reset sysfs attribute for memdevs
Date: Thu, 28 May 2026 08:31:45 +0000
Message-ID: <20260528083154.137979-1-smadhavan@nvidia.com>

Hi folks!

This patch series introduces support for the CXL Reset method for CXL
Type 2 devices, implementing the reset procedure outlined in the CXL
Specification r3.2 [1], Sections 8.1.3, 9.6, and 9.7.

The userspace ABI is a write-only cxl_reset attribute under the CXL
memdev device:

    /sys/bus/cxl/devices/memX/cxl_reset

The memdev is the userspace handle, while the implementation coordinates
the target PCI function, affected sibling PCI functions, active CXL
memdevs, and any CXL regions reachable through those memdevs.

v6 changes (from v5 [2]):
- Rebased on the current CXL tree used for v7.1-rc4 development.
- Move the ABI from /sys/bus/pci/devices/.../cxl_reset to
  /sys/bus/cxl/devices/memX/cxl_reset.
- Use the memdev as the userspace handle while keeping the reset
  orchestration confined to the CXL device reset scope.
- Reduce the earlier PCI/CXL save/restore series [3] to a single CXL HDM
  decoder restore/commit helper patch, included here as patch 1.
- Do not offline or hot-remove memory as part of reset. Return -EBUSY
  if an affected CXL region is online as System RAM or has an active
  region driver bound.
- Add reset-idle validation and CPU cache invalidation for affected CXL
  regions.
- Add CXL sibling PCI function discovery using the Non-CXL Function Map
  DVSEC and CXL.cache/CXL.mem capability bits.
- Coordinate PCI save/disable/restore and IOMMU reset prepare/done for
  the target and affected sibling functions.
- Add CXL DVSEC reset sequencing, including CXL.cache disable,
  writeback-invalidate, a minimum 100ms quiet period, reset-complete
  polling, and Reset Error reporting.
- Track affected memdevs, lock active memdevs across reset, restore and
  commit decoder state, re-enable CXL.mem, and wait for media ready
  after reset.
- Cache reset capability at memdev registration time for sysfs
  visibility.
- Document the reset scope, the fact that Memory Clear is not requested,
  and the -EBUSY behavior for active CXL regions.

Motivation:
-----------
- As support for Type 2 devices is being introduced, more devices need a
  CXL-specific reset mechanism beyond bus-wide PCI reset methods.

- FLR does not affect CXL.cache or CXL.mem protocol state, making CXL
  Reset the appropriate mechanism for cases where those protocols must
  be reset.

- The CXL specification highlights use cases such as function rebinding
  and error recovery where CXL Reset is explicitly required.

Change Description:
-------------------

Patch 1: cxl/hdm: Add helpers to restore and commit memdev decoders
- Restore endpoint decoder programming from CXL core's cached decoder
  objects while keeping CXL.mem disabled.
- Commit restored HDM decoders as a separate step so reset orchestration
  can re-enable CXL.mem only after safety checks complete.

Patch 2: PCI: Export pci_dev_save_and_disable() and pci_dev_restore()
- Export PCI reset lifecycle helpers so CXL reset orchestration can save,
  disable, restore, and invoke reset callbacks for affected functions.

Patch 3: cxl: Add reset-idle and cache flush helpers
- Collect CXL regions affected by a memdev reset.
- Fail reset if affected regions are not idle.
- Invalidate CPU caches for each affected region once.
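
For illustration only, the collect / validate / invalidate-once flow can
be modeled in userspace Python. This is a sketch under stated
assumptions, not the kernel implementation: the Region class, the
memdev-to-region dictionary, and the invalidate callback are all
hypothetical stand-ins. The busy condition mirrors the -EBUSY rule from
the changelog (online as System RAM, or a region driver bound).

```python
from dataclasses import dataclass
import errno


@dataclass(frozen=True)
class Region:
    """Hypothetical stand-in for a CXL region; not a kernel structure."""
    name: str
    online_as_system_ram: bool = False
    driver_bound: bool = False


def collect_affected_regions(memdev_regions, memdevs):
    """Return each region reachable through the given memdevs exactly once."""
    seen = {}
    for memdev in memdevs:
        for region in memdev_regions.get(memdev, ()):
            seen.setdefault(region.name, region)
    return list(seen.values())


def prepare_regions_for_reset(memdev_regions, memdevs, invalidate):
    """Raise EBUSY if any region is busy, else invalidate each region once."""
    regions = collect_affected_regions(memdev_regions, memdevs)
    # Validate everything first so nothing is flushed when reset must fail.
    for region in regions:
        if region.online_as_system_ram or region.driver_bound:
            raise OSError(errno.EBUSY, f"{region.name} is not idle")
    for region in regions:
        invalidate(region)
    return [region.name for region in regions]
```

A region shared by two memdevs is returned, checked, and invalidated a
single time in this model.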

Patch 4: PCI/CXL: Add sibling function coordination for reset
- Identify CXL.cache/CXL.mem sibling functions in the reset scope.
- Use the Non-CXL Function Map DVSEC to exclude non-CXL functions.
- Save, disable, restore, and unlock affected PCI sibling functions.
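
As a rough userspace model of the sibling selection described above,
the filter can be sketched in Python. The data layout is an assumption
made for the example: the function map is flattened to a single integer
bitmap in which bit N set means function N is non-CXL, and capabilities
are plain strings. The real DVSEC layout and the kernel data structures
are not reproduced here.

```python
def cxl_sibling_functions(functions, non_cxl_map, target):
    """Pick sibling functions that fall inside the reset scope.

    functions:   dict of function number -> set of capability strings,
                 e.g. {"cache", "mem"} (hypothetical representation).
    non_cxl_map: integer bitmap; bit N set means function N is non-CXL
                 (simplified view of the Non-CXL Function Map DVSEC).
    target:      function number the reset was requested on.
    """
    siblings = []
    for fn, caps in sorted(functions.items()):
        if fn == target:
            continue  # the target is handled separately from its siblings
        if non_cxl_map & (1 << fn):
            continue  # excluded by the function map
        if caps & {"cache", "mem"}:
            siblings.append(fn)
    return siblings
```

With functions 0 to 3 present, function 2 marked non-CXL, and function 3
advertising neither protocol, only function 1 is returned as a sibling
of target function 0.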

Patch 5: cxl/pci: Add CXL DVSEC reset helper
- Execute CXL Reset through the CXL Device DVSEC.
- Disable CXL.cache and request writeback-invalidate where supported.
- Enforce the post-reset quiet period and poll for reset completion.
- Block and restore IOMMU traffic while reset is active.
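
The sequencing can be illustrated with a small Python model. Everything
here is hypothetical: FakeCxlDevice, its method names, and the polling
limits are invented for the example, and placing the reset request
between writeback-invalidate and the quiet period is an assumption
drawn from the bullet order above. The only number taken from this
series is the 100ms minimum quiet period. IOMMU handling is left out.

```python
import time


class FakeCxlDevice:
    """Hypothetical device model; names are illustrative only."""

    def __init__(self, polls_until_complete=2, reset_error=False):
        self.log = []
        self._polls_left = polls_until_complete
        self._reset_error = reset_error

    def disable_cache(self):
        self.log.append("cache-disable")

    def writeback_invalidate(self):
        self.log.append("wb-invalidate")

    def initiate_reset(self):
        self.log.append("initiate-reset")

    def reset_complete(self):
        self._polls_left -= 1
        return self._polls_left <= 0

    def reset_error(self):
        return self._reset_error


def cxl_dvsec_reset(dev, quiet_s=0.1, poll_s=0.01, max_polls=100,
                    sleep=time.sleep):
    """Run the modeled sequence; raise on timeout or reported error."""
    dev.disable_cache()
    dev.writeback_invalidate()
    dev.initiate_reset()
    sleep(quiet_s)  # minimum quiet period before polling starts
    for _ in range(max_polls):
        if dev.reset_complete():
            break
        sleep(poll_s)
    else:
        raise TimeoutError("reset did not complete")
    if dev.reset_error():
        raise OSError("device reported Reset Error")
```

The sleep function is injectable so the sequence can be exercised
without real delays.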

Patch 6: cxl/pci: Track memdevs affected by CXL reset
- Track the target memdev and any sibling-function memdevs affected by
  reset.
- Revalidate and lock active memdevs before reset proceeds.

Patch 7: cxl/pci: Orchestrate CXL reset for affected memdevs
- Coordinate region validation, CPU cache invalidation, PCI function
  preparation, DVSEC reset, decoder restore and commit, CXL.mem enable,
  and media-ready wait.
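
The ordering above can be expressed as a short Python sketch. The step
names and the handler dictionary are hypothetical labels for the stages
listed in this patch description, and no unwind or error-recovery
behavior is modeled. The point is only that a failure in an early stage
stops the sequence before the hardware reset is requested.

```python
# Step labels are illustrative; they follow the order given in the text.
RESET_STEPS = (
    "validate-regions",
    "invalidate-cpu-caches",
    "prepare-pci-functions",
    "dvsec-reset",
    "restore-decoders",
    "commit-decoders",
    "enable-cxl-mem",
    "wait-media-ready",
)


def run_reset(handlers):
    """Run handlers in RESET_STEPS order.

    handlers maps each step name to a callable returning True on success.
    Returns (completed_steps, failed_step), where failed_step is None
    when every step succeeded.
    """
    done = []
    for step in RESET_STEPS:
        if not handlers[step]():
            return done, step
        done.append(step)
    return done, None
```

For example, if the cache invalidation handler reports failure, the
returned list of completed steps contains only "validate-regions" and
the "dvsec-reset" handler is never called.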

Patch 8: cxl/memdev: Add cxl_reset sysfs attribute
- Expose /sys/bus/cxl/devices/memX/cxl_reset.
- Only make the attribute visible when the underlying PCI function is
  Type 2 and reset capable.
- Write a boolean true value, such as "1" or "true", to trigger reset.

Patch 9: Documentation/ABI: Document CXL memdev cxl_reset
- Document the new memdev sysfs ABI, reset scope, Memory Clear behavior,
  and idle-region requirement.

The CPU cache invalidation step depends on
cpu_cache_invalidate_memregion() support for the affected address ranges.
If no provider is available, reset fails before hardware reset is
requested.

To test CXL reset on a capable memdev, run:

    echo 1 > /sys/bus/cxl/devices/memX/cxl_reset
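
For scripted testing, the same write can be wrapped in a small Python
helper. This is a sketch, not part of the series: the function name,
the returned status strings, and the sysfs_root parameter (present so
the helper can be pointed at a scratch directory) are assumptions. It
relies only on the behavior described in this cover letter: the
attribute is hidden when the device is not reset capable, and the write
fails with EBUSY when an affected region is not idle.

```python
import errno
import os


def trigger_cxl_reset(memdev, sysfs_root="/sys/bus/cxl/devices"):
    """Write "1" to <memdev>/cxl_reset and return a short status string."""
    path = os.path.join(sysfs_root, memdev, "cxl_reset")
    if not os.path.exists(path):
        # Attribute not visible: not Type 2 or not reset capable.
        return "unsupported"
    try:
        # The attribute is write-only, so open for writing only.
        with open(path, "w") as attr:
            attr.write("1")
    except OSError as err:
        if err.errno == errno.EBUSY:
            return "busy"  # an affected region is online or driver-bound
        raise
    return "ok"
```

A call such as trigger_cxl_reset("mem0") needs the same privileges as
the echo command above.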

Basic CXL DVSEC reset testing was done on a CXL Type 2 device. The reset
sequence completed successfully and ResetComplete was observed. Full
memdev/region integration testing is still in progress.

References:
[1] https://computeexpresslink.org/wp-content/uploads/2024/12/CXL_3.2-Spec-Announcement_FINAL-1.pdf
[2] https://lore.kernel.org/linux-cxl/20260306092322.148765-1-smadhavan@nvidia.com/
[3] https://lore.kernel.org/linux-cxl/20260306080026.116789-1-smadhavan@nvidia.com/

Srirangan Madhavan (9):
  cxl/hdm: Add helpers to restore and commit memdev decoders
  PCI: Export pci_dev_save_and_disable() and pci_dev_restore()
  cxl: Add reset-idle and cache flush helpers
  PCI/CXL: Add sibling function coordination for reset
  cxl/pci: Add CXL DVSEC reset helper
  cxl/pci: Track memdevs affected by CXL reset
  cxl/pci: Orchestrate CXL reset for affected memdevs
  cxl/memdev: Add cxl_reset sysfs attribute
  Documentation/ABI: Document CXL memdev cxl_reset

 Documentation/ABI/testing/sysfs-bus-cxl |   28 +
 drivers/cxl/core/hdm.c                  |  318 ++++++-
 drivers/cxl/core/memdev.c               |   30 +
 drivers/cxl/core/pci.c                  | 1140 +++++++++++++++++++++++
 drivers/cxl/cxl.h                       |    5 +
 drivers/cxl/cxlmem.h                    |    2 +
 drivers/pci/pci.c                       |   22 +-
 include/linux/pci.h                     |    2 +
 include/uapi/linux/pci_regs.h           |   15 +
 9 files changed, 1557 insertions(+), 5 deletions(-)

base-commit: abb3c0de119032f4c0c81177884a3bb0a133e6ca
-- 
2.43.0
