Linux PCI subsystem development
From: Alex Williamson <alwilliamson@nvidia.com>
To: "Dan Williams (nvidia)" <djbw@kernel.org>
Cc: Srirangan Madhavan <smadhavan@nvidia.com>,
	Alison Schofield <alison.schofield@intel.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Dave Jiang <dave.jiang@intel.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Ira Weiny <ira.weiny@intel.com>,
	Jonathan Cameron <jic23@kernel.org>,
	Vishal Verma <vishal.l.verma@intel.com>,
	linux-cxl@vger.kernel.org, linux-pci@vger.kernel.org,
	linux-kernel@vger.kernel.org, vsethi@nvidia.com,
	Dan Williams <danwilliams@nvidia.com>,
	Sai Yashwanth Reddy Kancherla <skancherla@nvidia.com>,
	Vishal Aslot <vaslot@nvidia.com>,
	Manish Honap <mhonap@nvidia.com>, Jiandi An <jan@nvidia.com>,
	Richard Cheng <icheng@nvidia.com>,
	linux-tegra@vger.kernel.org
Subject: Re: [PATCH v7 08/11] cxl: Coordinate sibling functions for CXL reset
Date: Fri, 26 Jun 2026 16:08:46 -0600
Message-ID: <20260626160846.426e71b6@nvidia.com>
In-Reply-To: <6a3b100798704_3c9f100a9@djbw-dev.notmuch>

On Tue, 23 Jun 2026 16:00:23 -0700
"Dan Williams (nvidia)" <djbw@kernel.org> wrote:

> Srirangan Madhavan wrote:
> > CXL Device Reset affects all CXL.cache and CXL.mem functions in the reset
> > scope. Lock same-scope siblings with pci_dev_trylock(), save/disable them,
> > drain pending transactions, and hold IOMMU reset blocks until recovery.
> > 
> > Also include mem-capable siblings in HDM range validation and CPU cache
> > invalidation. Cache-only siblings are quiesced, but skipped for HDM range
> > handling.  
> 
> PCI reset locking and ordering is already a source of some burden
> without adding this new sibling model to consider.
> 
> Is there evidence that multi-function CXL devices, where most of the
> functions are non-CXL, are going to be a common occurrence?
> 
> In other words if CXL reset borrowed the bus reset locking model:
> 
>     if (pci_bus_trylock(bus)) {
>         pci_bus_save_and_disable_locked(bus);
>         might_sleep();
>         rc = cxl_request_and_flush_hdm(bus);
>         if (rc == 0) {
>             rc = cxl_reset_execute(pdev);
>             cxl_release_and_flush_hdm(bus);
>         }
>         pci_bus_restore_locked(bus);
>         pci_bus_unlock(bus);
>     }
> 
> The cost is disturbing some non-CXL functions, the benefit is reusing an
> existing reset order / locking model.

I'd say further that this exceeds the boundaries of what
pci_reset_function(), or the @reset sysfs attribute per pci_dev, is
scoped to do.  pci_reset_function() must limit the scope to the
pci_dev (and in this case the CXL state associated with only that
pci_dev).  See for instance how bus and slot use cases through
pci_reset_function() are limited to non-multifunction devices.
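
As a rough illustration of that limit, here is a minimal user-space
sketch, assuming the rule is "offer a bus-level reset through the
per-function path only when the target is the sole function on its
bus".  struct bus_fn and bus_reset_allowed() are hypothetical names for
this example, not kernel API; the real checks sit in the bus and slot
reset helpers in pci-core.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PCI function; not the kernel's struct pci_dev. */
struct bus_fn {
	unsigned int devfn;	/* device/function number on the bus */
};

/*
 * Return true when a bus-level reset may stand in for a reset of @target,
 * i.e. when @target is the only function present on the bus.
 */
static bool bus_reset_allowed(const struct bus_fn *bus, size_t nr,
			      const struct bus_fn *target)
{
	size_t i;

	if (nr == 0)
		return false;
	for (i = 0; i < nr; i++)
		if (&bus[i] != target)
			return false;	/* a sibling would be disturbed */
	return true;
}
```

With a single entry on the bus the helper accepts the reset; as soon
as a second function is present it declines, which is the behavior a
function-scoped interface needs.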

For multiple functions, the precedent is something more like
pci_reset_bus(), where the caller is responsible for coordinating the
set of affected devices.  The locking is still complicated, but at
least it's managed in vfio-pci-core, with a variant driver that
actually owns the device, rather than pci-core.
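
The caller-coordinated pattern can be sketched as an all-or-nothing
trylock over the affected set.  This is a single-threaded user-space
model under stated assumptions, not the vfio-pci-core implementation:
struct locked_fn, fn_trylock(), set_trylock() and set_unlock() are
hypothetical names, and a plain flag stands in for the device lock.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a function with a device lock; not kernel API. */
struct locked_fn {
	bool locked;
};

/* Non-blocking lock attempt, in the spirit of pci_dev_trylock(). */
static bool fn_trylock(struct locked_fn *f)
{
	if (f->locked)
		return false;
	f->locked = true;
	return true;
}

static void fn_unlock(struct locked_fn *f)
{
	f->locked = false;
}

/*
 * Lock every function in the set or none of them: on the first failure,
 * release the locks already taken, in reverse order, and report failure.
 */
static bool set_trylock(struct locked_fn *set, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (!fn_trylock(&set[i])) {
			while (i-- > 0)
				fn_unlock(&set[i]);
			return false;
		}
	}
	return true;
}

/* Release the whole set in reverse acquisition order. */
static void set_unlock(struct locked_fn *set, size_t nr)
{
	while (nr-- > 0)
		fn_unlock(&set[nr]);
}
```

The point of the rollback is that a failed attempt leaves no function
locked by the caller, so whoever owns the whole set can retry or give
up without stranding a sibling.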

Also note that there's currently no mechanism for performing a
multi-function scoped reset through sysfs (excluding raw access to the
parent bridge that bypasses all save/restore mechanics).  I'd suggest
that cxl_reset can only be available as a function scoped reset when
only function 0 supports CXL.mem or CXL.cache, but that may also lead
to the question of whether the reset sysfs attribute should be exposed
at all if it only resets the CXL.io state, for example via FLR.  Thanks,

Alex
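
One possible reading of the function 0 rule suggested above, as a
minimal sketch: the reset stays function scoped only if function 0 is
the sole function advertising CXL.mem or CXL.cache.  struct fn_caps
and cxl_reset_is_function_scoped() are hypothetical names for
illustration; how the capabilities would actually be discovered is not
covered here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-function capability summary; not a kernel structure. */
struct fn_caps {
	unsigned int func;	/* function number within the device */
	bool cxl_mem;
	bool cxl_cache;
};

/*
 * Return true when function 0 is the only function with CXL.mem or
 * CXL.cache, so a CXL reset would not reach past a single pci_dev.
 */
static bool cxl_reset_is_function_scoped(const struct fn_caps *fns, size_t nr)
{
	bool fn0_capable = false;
	size_t i;

	for (i = 0; i < nr; i++) {
		bool capable = fns[i].cxl_mem || fns[i].cxl_cache;

		if (!capable)
			continue;
		if (fns[i].func != 0)
			return false;	/* reset would reach past function 0 */
		fn0_capable = true;
	}
	return fn0_capable;
}
```

A device whose only CXL.mem/CXL.cache function is function 0 passes;
one with a cache-capable function 1 does not, and neither does a device
with no such function at all.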


Thread overview: 32+ messages
2026-06-23  3:24 [PATCH v7 00/11] PCI/CXL: Add CXL reset support for Type 2 devices Srirangan Madhavan
2026-06-23  3:24 ` [PATCH v7 01/11] cxl: Split decoder programming into a reusable helper Srirangan Madhavan
2026-06-23  3:42   ` sashiko-bot
2026-06-23  3:24 ` [PATCH v7 02/11] cxl: Cache decoder settings on PCI devices Srirangan Madhavan
2026-06-23  3:42   ` sashiko-bot
2026-06-23 23:13   ` Dan Williams (nvidia)
2026-06-23  3:24 ` [PATCH v7 03/11] cxl: Cache endpoint decoder settings during PCI enumeration Srirangan Madhavan
2026-06-23  3:45   ` sashiko-bot
2026-06-24  2:15   ` Dan Williams (nvidia)
2026-06-23  3:24 ` [PATCH v7 04/11] PCI: Export pci_dev_save_and_disable() and pci_dev_restore() Srirangan Madhavan
2026-06-23  3:34   ` sashiko-bot
2026-06-24  2:17   ` Dan Williams (nvidia)
2026-06-23  3:24 ` [PATCH v7 05/11] cxl: Add CXL Device Reset helper Srirangan Madhavan
2026-06-23  3:36   ` sashiko-bot
2026-06-24 14:33   ` Richard Cheng
2026-06-23  3:24 ` [PATCH v7 06/11] cxl: Validate HDM ranges before CXL reset Srirangan Madhavan
2026-06-23  3:33   ` sashiko-bot
2026-06-23  3:24 ` [PATCH v7 07/11] PCI/cxl: Discover the CXL reset scope Srirangan Madhavan
2026-06-23  3:34   ` sashiko-bot
2026-06-23  3:24 ` [PATCH v7 08/11] cxl: Coordinate sibling functions for CXL reset Srirangan Madhavan
2026-06-23  3:42   ` sashiko-bot
2026-06-23 23:00   ` Dan Williams (nvidia)
2026-06-26 22:08     ` Alex Williamson [this message]
2026-06-23  3:24 ` [PATCH v7 09/11] cxl: Restore CXL HDM state after PCI reset Srirangan Madhavan
2026-06-23  3:39   ` sashiko-bot
2026-06-24 14:55   ` Richard Cheng
2026-06-23  3:24 ` [PATCH v7 10/11] PCI/cxl: Expose CXL Reset as a PCI reset method Srirangan Madhavan
2026-06-23  3:47   ` sashiko-bot
2026-06-26 22:08   ` Alex Williamson
2026-06-23  3:24 ` [PATCH v7 11/11] Documentation/ABI: Document CXL Reset " Srirangan Madhavan
2026-06-23  3:35   ` sashiko-bot
2026-06-24 14:26 ` [PATCH v7 00/11] PCI/CXL: Add CXL reset support for Type 2 devices Richard Cheng
