From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Chris Li <chrisl@kernel.org>, Jason Gunthorpe <jgg@ziepe.ca>,
Bjorn Helgaas <bhelgaas@google.com>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Danilo Krummrich <dakr@kernel.org>, Len Brown <lenb@kernel.org>,
linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
linux-acpi@vger.kernel.org, David Matlack <dmatlack@google.com>,
Pasha Tatashin <tatashin@google.com>,
Jason Miu <jasonmiu@google.com>,
Vipin Sharma <vipinsh@google.com>,
Saeed Mahameed <saeedm@nvidia.com>,
Adithya Jayachandran <ajayachandra@nvidia.com>,
Parav Pandit <parav@nvidia.com>, William Tu <witu@nvidia.com>,
Mike Rapoport <rppt@kernel.org>,
Leon Romanovsky <leon@kernel.org>
Subject: Re: [PATCH v2 06/10] PCI/LUO: Save and restore driver name
Date: Tue, 30 Sep 2025 15:41:40 +0200 [thread overview]
Message-ID: <2025093030-shrewdly-defiant-1f3e@gregkh> (raw)
In-Reply-To: <CA+CK2bBFZn7EGOJakvQs3SX3i-b_YiTLf5_RhW_B4pLjm2WBuw@mail.gmail.com>
On Tue, Sep 30, 2025 at 09:02:44AM -0400, Pasha Tatashin wrote:
> On Mon, Sep 29, 2025 at 10:10 PM Chris Li <chrisl@kernel.org> wrote:
> >
> > On Mon, Sep 29, 2025 at 10:57 AM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > >
> > > On Tue, Sep 16, 2025 at 12:45:14AM -0700, Chris Li wrote:
> > > > Save the PCI driver name into "struct pci_dev_ser" during the PCI
> > > > prepare callback.
> > > >
> > > > After kexec, use driver_set_override() to ensure the device is
> > > > bound only to the saved driver.
> > >
> > > This doesn't seem like a great idea, driver name should not be made
> > > ABI.
> >
> > Let's break it down with baby steps.
> >
> > 1) Do you agree the liveupdated PCI device needs to bind to the exact
> > same driver after kexec?
> > To me that is a firm yes. If the driver binds to another driver, we
> > can't expect the other driver will understand the original driver's
> > saved state.
>
> Hi Chris,
>
> Driver name does not have to be an ABI.
A driver name can NEVER be an abi, please don't do that.
> Drivers that support live
> updates should provide a live update-specific ABI to detect
> compatibility with the preserved data. We can use a preservation
> schema GUID for this.
>
> > 2) Assume the 1) is yes from you. Are you just not happy that the
> > kernel saves the driver name? You want user space to save it, is that
> > it?
> > How does it reference the driver after kexec otherwise?
>
> If we use GUID, drivers would advertise the GUIDs they support and we
> would modify the core device-driver matching process to use this
> information.
>
> Each driver that supports this mechanism would need to declare an
> array of GUIDs it is compatible with. This would be a new field in its
> struct pci_driver.
>
> static const guid_t my_driver_guids[] = {
> GUID_INIT(0x123e4567, ...), // Schema V1
> GUID_INIT(0x987a6543, ...), // Schema V2
> {},
> };
That's crazy, who is going to be adding all of that to all drivers? And
knowing to bump this if the internal data representaion changes? And it
will change underneath it without the driver even knowing? This feels
really really wrong, unless I'm missing something.
> static struct pci_driver my_pci_driver = {
> .name = "my_driver",
> .id_table = my_pci_ids,
> .probe = my_probe,
> .live_update_guids = my_driver_guids,
> };
>
> The kernel's PCI core would perform an extra check before falling back
> to the standard PCI ID matching.
> 1. When a PCI device is discovered, the core first asks the Live
> Update framework: "Is there a preserved GUID for this device?"
> 2. If a GUID is found, the core will only attempt to bind drivers that
> both match the device's PCI ID and have that specific GUID in their
> live_update_guids list.
What "core" is doing this? And how exactly?
And why is PCI somehow special here?
> 3. If no GUID is preserved for the device, the core proceeds with the
> normal matching logic
> 4. If no driver matches the GUID, the device is left unbound. The
> state gets removed during finish(), and the device is reset.
How do you reset a device you are not bound to? That feels ripe for
causing problems (think multi-function devices...)
And what about PCI drivers that are really just a aux-bus "root" point?
How is the sharing of all of the child devices going to work?
This feels really rough and might possibly work if you squint hard
enough and test it in a very limited way with almost no real hardware :)
good luck!
greg k-h
next prev parent reply other threads:[~2025-09-30 13:41 UTC|newest]
Thread overview: 84+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-16 7:45 [PATCH v2 00/10] LUO: PCI subsystem (phase I) Chris Li
2025-09-16 7:45 ` [PATCH v2 01/10] PCI/LUO: Register with Liveupdate Orchestrator Chris Li
2025-09-30 15:15 ` Greg Kroah-Hartman
2025-09-30 23:41 ` Chris Li
2025-09-30 15:17 ` Greg Kroah-Hartman
2025-09-30 23:38 ` Chris Li
2025-09-16 7:45 ` [PATCH v2 02/10] PCI/LUO: Create requested liveupdate device list Chris Li
2025-09-29 17:46 ` Jason Gunthorpe
2025-09-30 2:13 ` Chris Li
2025-09-30 16:47 ` Jason Gunthorpe
2025-10-03 7:09 ` Chris Li
2025-10-03 5:33 ` Chris Li
2025-10-03 14:04 ` Jason Gunthorpe
2025-10-03 21:06 ` Chris Li
2025-09-30 15:26 ` Greg Kroah-Hartman
2025-10-03 6:57 ` Chris Li
2025-09-16 7:45 ` [PATCH v2 03/10] PCI/LUO: Forward prepare()/freeze()/cancel() callbacks to driver Chris Li
2025-09-29 17:48 ` Jason Gunthorpe
2025-09-30 2:11 ` Chris Li
2025-09-30 16:38 ` Jason Gunthorpe
2025-10-02 18:54 ` David Matlack
2025-10-02 20:57 ` Chris Li
2025-10-02 21:31 ` David Matlack
2025-10-02 23:21 ` Jason Gunthorpe
2025-10-02 23:42 ` David Matlack
2025-10-03 12:03 ` Jason Gunthorpe
2025-10-03 16:03 ` David Matlack
2025-10-03 16:16 ` Jason Gunthorpe
2025-10-03 16:28 ` Pasha Tatashin
2025-10-03 16:56 ` David Matlack
2025-10-03 5:24 ` Chris Li
2025-10-03 12:06 ` Jason Gunthorpe
2025-10-03 16:27 ` David Matlack
2025-10-03 16:41 ` Vipin Sharma
2025-10-03 17:44 ` Chris Li
2025-10-03 5:17 ` Chris Li
2025-10-02 20:44 ` Chris Li
2025-09-30 15:27 ` Greg Kroah-Hartman
2025-10-02 20:38 ` Chris Li
2025-10-03 6:18 ` Greg Kroah-Hartman
2025-10-03 7:26 ` Chris Li
2025-10-03 12:26 ` Greg Kroah-Hartman
2025-10-03 17:49 ` Chris Li
2025-10-03 18:27 ` David Matlack
2025-10-03 21:10 ` Chris Li
2025-09-16 7:45 ` [PATCH v2 04/10] PCI/LUO: Restore state at PCI enumeration Chris Li
2025-09-16 7:45 ` [PATCH v2 05/10] PCI/LUO: Forward finish callbacks to drivers Chris Li
2025-09-16 7:45 ` [PATCH v2 06/10] PCI/LUO: Save and restore driver name Chris Li
2025-09-29 17:57 ` Jason Gunthorpe
2025-09-30 2:10 ` Chris Li
2025-09-30 13:02 ` Pasha Tatashin
2025-09-30 13:41 ` Greg Kroah-Hartman [this message]
2025-09-30 14:53 ` Pasha Tatashin
2025-09-30 15:08 ` Greg Kroah-Hartman
2025-09-30 15:56 ` Pasha Tatashin
2025-10-01 5:06 ` Greg Kroah-Hartman
2025-10-01 21:03 ` Pasha Tatashin
2025-10-02 6:09 ` Greg Kroah-Hartman
2025-10-02 13:23 ` Jason Gunthorpe
2025-10-02 22:30 ` Chris Li
2025-09-30 15:41 ` Chris Li
2025-10-01 5:13 ` Greg Kroah-Hartman
2025-10-02 22:05 ` Chris Li
2025-09-30 16:37 ` Jason Gunthorpe
2025-10-02 21:39 ` Chris Li
2025-10-03 14:28 ` Jason Gunthorpe
2025-09-16 7:45 ` [PATCH v2 07/10] PCI/LUO: Add liveupdate to pcieport driver Chris Li
2025-09-16 7:45 ` [PATCH v2 08/10] PCI/LUO: Add pci_liveupdate_get_driver_data() Chris Li
2025-09-16 7:45 ` [PATCH v2 09/10] PCI/LUO: Avoid write to bus master at boot Chris Li
2025-09-29 17:14 ` Bjorn Helgaas
2025-09-16 7:45 ` [PATCH v2 10/10] PCI: pci-lu-stub: Add a stub driver for Live Update testing Chris Li
2025-09-27 17:13 ` [PATCH v2 00/10] LUO: PCI subsystem (phase I) Bjorn Helgaas
2025-09-27 18:05 ` Pasha Tatashin
2025-09-29 15:04 ` Bjorn Helgaas
2025-09-29 18:13 ` Chris Li
2025-10-07 23:32 ` Chris Li
2025-10-08 23:00 ` David Matlack
2025-10-09 17:12 ` Chris Li
2025-10-09 23:21 ` Pratyush Yadav
2025-10-10 4:19 ` Chris Li
2025-10-10 23:49 ` Jason Miu
2025-10-13 13:58 ` Pratyush Yadav
2025-10-14 16:11 ` Pratyush Yadav
2025-10-14 20:44 ` Chris Li
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2025093030-shrewdly-defiant-1f3e@gregkh \
--to=gregkh@linuxfoundation.org \
--cc=ajayachandra@nvidia.com \
--cc=bhelgaas@google.com \
--cc=chrisl@kernel.org \
--cc=dakr@kernel.org \
--cc=dmatlack@google.com \
--cc=jasonmiu@google.com \
--cc=jgg@ziepe.ca \
--cc=lenb@kernel.org \
--cc=leon@kernel.org \
--cc=linux-acpi@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=parav@nvidia.com \
--cc=pasha.tatashin@soleen.com \
--cc=rafael@kernel.org \
--cc=rppt@kernel.org \
--cc=saeedm@nvidia.com \
--cc=tatashin@google.com \
--cc=vipinsh@google.com \
--cc=witu@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).