From: Jonathan Cameron <jonathan.cameron@huawei.com>
To: Vishal Aslot <vaslot@nvidia.com>
Cc: Srirangan Madhavan <smadhavan@nvidia.com>,
"bhelgaas@google.com" <bhelgaas@google.com>,
"dan.j.williams@intel.com" <dan.j.williams@intel.com>,
"dave.jiang@intel.com" <dave.jiang@intel.com>,
"ira.weiny@intel.com" <ira.weiny@intel.com>,
"vishal.l.verma@intel.com" <vishal.l.verma@intel.com>,
"alison.schofield@intel.com" <alison.schofield@intel.com>,
"dave@stgolabs.net" <dave@stgolabs.net>,
Alex Williamson <alwilliamson@nvidia.com>,
Jeshua Smith <jeshuas@nvidia.com>,
Vikram Sethi <vsethi@nvidia.com>,
"Sai Yashwanth Reddy Kancherla" <skancherla@nvidia.com>,
Shanker Donthineni <sdonthineni@nvidia.com>,
Manish Honap <mhonap@nvidia.com>, Vidya Sagar <vidyas@nvidia.com>,
Jiandi An <jan@nvidia.com>, Matt Ochs <mochs@nvidia.com>,
Derek Schumacher <dschumacher@nvidia.com>,
"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>,
"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 0/5] PCI/CXL: Save and restore CXL DVSEC and HDM state across resets
Date: Mon, 16 Mar 2026 17:28:07 +0000
Message-ID: <20260316172807.00000abc@huawei.com>
In-Reply-To: <SA3PR12MB9131917FA823F0F659BEBDB6BB40A@SA3PR12MB9131.namprd12.prod.outlook.com>
On Mon, 16 Mar 2026 13:59:59 +0000
Vishal Aslot <vaslot@nvidia.com> wrote:
> Hi Jonathan,
> Replying on behalf of Srirangan. Srirangan, feel free to add to this.
Hi Vishal,
I've hacked the thread around a bit to make it more readable going
forwards (it's tricky to spot replies in plain text otherwise!) If you
can persuade your email client to do quoting in a similar fashion, that
would be great.
>
> > In general I think
> > we have a problem if a driver is relying on the bios having set up the
> > decoders and simply doesn't function if the bios didn't do it (and
> > that applies in this reset case as well).
>
> The issue for us (and possibly for other devices as well) is that it
> is possible to reset a device without the device driver being loaded
> at all. One usecase is in-band fw update followed by reset to cause
> the device to go through its own secure boot. Now, arguably the
> device driver can restore DVSEC + HDM decoder + whatever else once it
> loads. If the driver is not loaded, it is not possible to use the
> device anyway. The main thing here is that without the driver
> present, reset_prep() and reset_done() cannot be handled, so we
> wanted this to be restored by the kernel itself.
Understood on the use case. I'm just concerned about how vague the spec
leaves things.
I think we may need to go and get some specification clarifications,
or we are going to end up with a long-running quirk list for when
it is safe to do this restoration. One thing I did think about is
restoring the stuff the BIOS set up, with the exception of some of
the 'enables'. That might mean not turning on mem.enable or something
similar (maybe not committing the decoders). Thus 'safe', but
retaining the info in the hardware state.
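To make the "restore everything except the enables" idea a bit more
concrete, here is a rough userspace sketch of the masking step. The bit
positions are my reading of the CXL DVSEC control register (Cache_Enable
in bit 0, Mem_Enable in bit 2) and the HDM decoder control register
(COMMIT in bit 9), so please check them against the spec. The macro and
function names are made up for illustration and are not taken from this
series.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit layout; verify against the CXL specification. */
#define SKETCH_DVSEC_CTRL_CACHE_ENABLE	(1u << 0)
#define SKETCH_DVSEC_CTRL_MEM_ENABLE	(1u << 2)
#define SKETCH_HDM_CTRL_COMMIT		(1u << 9)

static_assert((SKETCH_DVSEC_CTRL_CACHE_ENABLE &
	       SKETCH_DVSEC_CTRL_MEM_ENABLE) == 0,
	      "enable bits must not overlap");

/*
 * Return the saved DVSEC control value with the cache and mem enables
 * cleared, so the rest of the BIOS configuration can be written back
 * without turning traffic on.
 */
static inline uint16_t sketch_dvsec_ctrl_safe(uint16_t saved)
{
	return saved & (uint16_t)~(SKETCH_DVSEC_CTRL_CACHE_ENABLE |
				   SKETCH_DVSEC_CTRL_MEM_ENABLE);
}

/*
 * Return the saved HDM decoder control value with COMMIT cleared, so
 * the decoder is programmed but not committed.
 */
static inline uint32_t sketch_hdm_ctrl_safe(uint32_t saved)
{
	return saved & ~SKETCH_HDM_CTRL_COMMIT;
}
```

Whoever owns the device (the driver, or some later kernel step) would
then set the enables and commit the decoders once it knows the device
has recovered. Whether leaving them clear is actually 'safe' for a given
device is exactly the part the spec doesn't pin down.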
Dan mentioned in another thread that he was looking at how we retain
some information on bios config over resets - that might help as well.
We can probably also 'steal' info from the RP, assuming it's not
a single RP host bridge with pass through. The host window is
visible there, for instance.
>
> There is a virtualization as well as confidential computing angle
> here as well but I am not fully versed in that path to comment.
That's messy as well, as we need to extract some data but also reset
to a clean state. Confidential compute is more fun in general, as I'm
not sure we can restore TSP after a reset. We are certainly nowhere
near doing so today.
>
> > I'm thinking the reset flow is a good deal more complex than simply
> > putting the bios programmed values back. In some cases that might
> > be a very bad idea as autonomous traffic can hit the type 2 device
> > the moment these decoders are enabled and I'm guessing that may be
> > before the device has fully recovered.
>
> Yes, completely agree. In fact for our GPUs it's a combination of
> kernel-first and firmware-first. But we, at least, for our use-case
> ensure that the device will not be used before reset is completed by
> means outside of the OS control.
I'm curious about this. How? Do you mean the device is safe to be hit
by CPU-side prefetchers going after cachelines from CXL.mem immediately
after reset? Or are the checks in here good enough for that to be safe?
>
> > Note it would be good to document what is restored ....more clearly
> >
> I agree. Srirangan, could you please list out the DVSECs and
> capabilities that you are saving and restoring? In general, we should
> be saving and restoring anything that was locked but possibly got
> unlocked or was set back to its reset value. Things like mem.enable,
> cache.enable in DVSEC 0, range registers, HDM decoders would be at
> least unlocked if not wiped out, etc.
> ________________________________________
> From: Jonathan Cameron <jonathan.cameron@huawei.com>
> Sent: Thursday, March 12, 2026 7:34 AM
> To: Srirangan Madhavan
> Cc: bhelgaas@google.com; dan.j.williams@intel.com;
>     dave.jiang@intel.com; ira.weiny@intel.com;
>     vishal.l.verma@intel.com; alison.schofield@intel.com;
>     dave@stgolabs.net; Alex Williamson; Jeshua Smith; Vikram Sethi;
>     Sai Yashwanth Reddy Kancherla; Vishal Aslot; Shanker Donthineni;
>     Manish Honap; Vidya Sagar; Jiandi An; Matt Ochs; Derek Schumacher;
>     linux-cxl@vger.kernel.org; linux-pci@vger.kernel.org;
>     linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 0/5] PCI/CXL: Save and restore CXL DVSEC and HDM
>     state across resets
>
> On Fri, 6 Mar 2026 08:00:14 +0000
> smadhavan@nvidia.com wrote:
>
> > From: Srirangan Madhavan <smadhavan@nvidia.com>
> >
> > CXL devices could lose their DVSEC configuration and HDM decoder
> > programming after multiple reset methods (whenever link
> > disable/enable). This means a device that was fully configured —
> > with DVSEC control/range registers set and HDM decoders committed —
> > loses that state after reset. In cases where these are programmed
> > by firmware, downstream drivers are unable to re-initialize the
> > device because CXL memory ranges are no longer mapped.
>
> Hi Srirangan,
>
> Firstly this might be because I'm behind on patch review and there is
> a lot going on right now! So this might be addressed in a different
> series.
>
> I'd like to understand the whole use case + flow here. In general I
> think we have a problem if a driver is relying on the bios having set
> up the decoders and simply doesn't function if the bios didn't do it
> (and that applies in this reset case as well). For starters, no
> hotplug. Anyhow, that's a different issue, so we can leave that for
> now.
>
> I'm thinking the reset flow is a good deal more complex than simply
> putting the bios programmed values back. In some cases that might
> be a very bad idea as autonomous traffic can hit the type 2 device
> the moment these decoders are enabled and I'm guessing that may be
> before the device has fully recovered. There are very few spec rules
> about this that I can recall. On the setup path the BIOS presumably
> got the device into a state where enabling such traffic was fine
> and hopefully the driver bind doesn't break that state.
>
> I think you are restoring CXL.mem as well so that gate isn't
> going to save us. Note it would be good to document what is restored
> and why more clearly. Sure we can figure it out from the code, but
> a document might make life easier.
>
> A device might handle this mess for us, but I doubt that this is
> universal. For type 3 devices, I'm not sure what we want to do on
> reset in general.
>
> Anyhow, this is really a request for a more detailed description of
> the expected reset flow that goes into what the spec constrains and
> what it doesn't. Probably something worthy of going in Documentation.
>
> Thanks,
>
> Jonathan
>
> >
> > This series adds CXL state save/restore logic to the PCI core so
> > that DVSEC and HDM decoder state is preserved across any PCI reset
> > path that calls pci_save_state() / pci_restore_state(), for a CXL
> > capable device.
> >
> > HDM decoder defines and the cxl_register_map infrastructure are
> > moved from internal CXL driver headers to a new public
> > include/cxl/pci.h, allowing drivers/pci/cxl.c to use them.
> > This layout aligns with Alejandro Lucero's CXL Type-2 device series
> > [1] to minimize conflicts when both land. When he rebases to
> > 7.0-rc2, I can move my changes on top of his.
> >
> > These patches were previously part of the CXL reset series and have
> > been split out [2] to allow independent review and merging. Review
> > feedback on the save/restore portions from v4 has been addressed.
> >
> > Tested on a CXL Type-2 device. DVSEC and HDM state is correctly
> > saved before reset and restored after, with decoder commit
> > confirmed via the COMMITTED status bit. Type-3 device testing is in
> > progress.
> >
> > This series is based on v7.0-rc1.
> >
> > [1] https://lore.kernel.org/linux-cxl/20260201155438.2664640-1-alejandro.lucero-palau@amd.com/
> > [2] https://lore.kernel.org/linux-cxl/aa8d4f6a-e7bd-4a20-8d34-4376ea314b8f@intel.com/T/#m825c6bdd1934022123807e86d235358a63b08dbc
> >
> > Srirangan Madhavan (5):
> >   PCI: Add CXL DVSEC control, lock, and range register definitions
> >   cxl: Move HDM decoder and register map definitions to
> >     include/cxl/pci.h
> >   PCI: Add virtual extended cap save buffer for CXL state
> >   PCI: Add cxl DVSEC state save/restore across resets
> >   PCI/CXL: Add HDM decoder state save/restore
> >
> >  drivers/cxl/cxl.h             | 107 +-------
> >  drivers/cxl/cxlpci.h          |  10 -
> >  drivers/pci/Kconfig           |   4 +
> >  drivers/pci/Makefile          |   1 +
> >  drivers/pci/cxl.c             | 468 ++++++++++++++++++++++++++++++++++
> >  drivers/pci/pci.c             |  23 ++
> >  drivers/pci/pci.h             |  18 ++
> >  include/cxl/pci.h             | 129 ++++++++++
> >  include/uapi/linux/pci_regs.h |   6 +
> >  9 files changed, 650 insertions(+), 116 deletions(-)
> >  create mode 100644 drivers/pci/cxl.c
> >  create mode 100644 include/cxl/pci.h
> >
> > base-commit: 6de23f81a5e0
> > --
> > 2.43.0
> >
> >
>
>
>
Thread overview: 25+ messages
2026-03-06 8:00 [PATCH 0/5] PCI/CXL: Save and restore CXL DVSEC and HDM state across resets smadhavan
2026-03-06 8:00 ` [PATCH 1/5] PCI: Add CXL DVSEC control, lock, and range register definitions smadhavan
2026-03-06 17:45 ` Alex Williamson
2026-03-07 0:37 ` Srirangan Madhavan
2026-03-10 21:44 ` Dan Williams
2026-03-16 14:02 ` Vishal Aslot
2026-03-06 8:00 ` [PATCH 2/5] cxl: Move HDM decoder and register map definitions to include/cxl/pci.h smadhavan
2026-03-06 17:45 ` Alex Williamson
2026-03-07 0:35 ` Srirangan Madhavan
2026-03-10 16:13 ` Dave Jiang
2026-03-06 8:00 ` [PATCH 3/5] PCI: Add virtual extended cap save buffer for CXL state smadhavan
2026-03-10 21:45 ` Dan Williams
2026-03-06 8:00 ` [PATCH 4/5] PCI: Add cxl DVSEC state save/restore across resets smadhavan
2026-03-06 17:45 ` Alex Williamson
2026-03-12 12:28 ` Jonathan Cameron
2026-03-06 8:00 ` [PATCH 5/5] PCI: Add HDM decoder state save/restore smadhavan
2026-03-10 21:39 ` [PATCH 0/5] PCI/CXL: Save and restore CXL DVSEC and HDM state across resets Dan Williams
2026-03-10 22:46 ` Alex Williamson
2026-03-11 1:45 ` Dan Williams
2026-03-17 14:51 ` Manish Honap
2026-03-17 17:03 ` Dan Williams
2026-03-17 18:19 ` Alex Williamson
2026-03-12 12:34 ` Jonathan Cameron
2026-03-16 13:59 ` Vishal Aslot
2026-03-16 17:28 ` Jonathan Cameron [this message]