linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: Chris Li <chrisl@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Bjorn Helgaas <bhelgaas@google.com>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Danilo Krummrich <dakr@kernel.org>, Len Brown <lenb@kernel.org>,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
	linux-acpi@vger.kernel.org, David Matlack <dmatlack@google.com>,
	Pasha Tatashin <tatashin@google.com>,
	Jason Miu <jasonmiu@google.com>,
	Vipin Sharma <vipinsh@google.com>,
	Saeed Mahameed <saeedm@nvidia.com>,
	Adithya Jayachandran <ajayachandra@nvidia.com>,
	Parav Pandit <parav@nvidia.com>, William Tu <witu@nvidia.com>,
	Mike Rapoport <rppt@kernel.org>,
	Leon Romanovsky <leon@kernel.org>
Subject: Re: [PATCH v2 06/10] PCI/LUO: Save and restore driver name
Date: Wed, 1 Oct 2025 07:13:15 +0200	[thread overview]
Message-ID: <2025100142-slick-deserving-4aed@gregkh> (raw)
In-Reply-To: <CACePvbXrbR=A43UveqPrBmQHAfvjuJGtw9XyUQvpYe941KwzuA@mail.gmail.com>

On Tue, Sep 30, 2025 at 08:41:29AM -0700, Chris Li wrote:
> On Tue, Sep 30, 2025 at 6:41 AM Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > On Tue, Sep 30, 2025 at 09:02:44AM -0400, Pasha Tatashin wrote:
> > > On Mon, Sep 29, 2025 at 10:10 PM Chris Li <chrisl@kernel.org> wrote:
> > > >
> > > > On Mon, Sep 29, 2025 at 10:57 AM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > > > >
> > > > > On Tue, Sep 16, 2025 at 12:45:14AM -0700, Chris Li wrote:
> > > > > > Save the PCI driver name into "struct pci_dev_ser" during the PCI
> > > > > > prepare callback.
> > > > > >
> > > > > > After kexec, use driver_set_override() to ensure the device is
> > > > > > bound only to the saved driver.
> > > > >
> > > > > This doesn't seem like a great idea, driver name should not be made
> > > > > ABI.
> > > >
> > > > Let's break it down with baby steps.
> > > >
> > > > 1) Do you agree the liveupdated PCI device needs to bind to the exact
> > > > same driver after kexec?
> > > > To me that is a firm yes. If the driver binds to another driver, we
> > > > can't expect the other driver will understand the original driver's
> > > > saved state.
> > >
> > > Hi Chris,
> > >
> > > Driver name does not have to be an ABI.
> >
> > A driver name can NEVER be an abi, please don't do that.
> 
> Can you please clarify that.
> 
> for example, the pci has this sysfs control api:
> 
> "/sys/bus/pci/devices/0000:04:00.0/driver_override" which takes the
> *driver name* as data to override what driver is allowed to bind to
> this device.
> Does this driver_override consider it as using the driver name as part
> of the abi? If not, why?

Because the bind/unbind/override was created as a debug facility for
doing kernel development and then people have turned it into a "let's
operate our massive cloud systems with this fragile feature".

We have never said that driver names will remain the same across
releases, and they have changed over time.  Device ids have also moved
from one driver to another as well, making the "control" of the device
seem to have changed names.

> What live update wants is to make that driver_override persistent over
> kexec. It does not introduce the "driver_override" API. That is
> pre-existing conditions. The PCI liveupdate just wants to use it.

That does not mean that this is the correct api to use at all.  Again,
this was a debugging aid, to help with users who wanted to add a device
id to a driver without having to rebuild it.  Don't make it something
that it was never intended to be.

Why not just make a new api as you are doing something new here?  That
way you get to define it to work exactly the way you need?

> I want to get some basic understanding before adventure into the more
> complex solutions.

You mean "real" solutions :)

> > > Drivers that support live
> > > updates should provide a live update-specific ABI to detect
> > > compatibility with the preserved data. We can use a preservation
> > > schema GUID for this.
> > >
> > > > 2) Assume the 1) is yes from you. Are you just not happy that the
> > > > kernel saves the driver name? You want user space to save it, is that
> > > > it?
> > > > How does it reference the driver after kexec otherwise?
> > >
> > > If we use GUID, drivers would advertise the GUIDs they support and we
> > > would modify the core device-driver matching process to use this
> > > information.
> > >
> > > Each driver that supports this mechanism would need to declare an
> > > array of GUIDs it is compatible with. This would be a new field in its
> > > struct pci_driver.
> > >
> > > static const guid_t my_driver_guids[] = {
> > >     GUID_INIT(0x123e4567, ...), // Schema V1
> > >     GUID_INIT(0x987a6543, ...), // Schema V2
> > >     {},
> > > };
> >
> > That's crazy, who is going to be adding all of that to all drivers?  And
> > knowing to bump this if the internal data representaion changes?  And it
> > will change underneath it without the driver even knowing?  This feels
> > really really wrong, unless I'm missing something.
> 
> The GUID is more complex than a driver name. I am fine with not using
> GUID if you are so strongly opposed to it.
> 
> You are saying don't do A(driver name) and B(GUID). I am waiting for
> the part where you say "please do C instead".

It's not my requirement to say "here is C", but rather I am saying "B is
not going to scale over time as GUIDs are a pain to manage".

> Do you have any other suggestion how to prevent the live update PCI
> device bind to a different driver after kexec? I am happy to work on
> the direction you point out and turn that into a patch for the
> discussion purpose.

Why prevent it?  Why not just have a special api just for drivers that
want to use this new feature?

thanks,

greg k-h

  reply	other threads:[~2025-10-01  5:13 UTC|newest]

Thread overview: 84+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-16  7:45 [PATCH v2 00/10] LUO: PCI subsystem (phase I) Chris Li
2025-09-16  7:45 ` [PATCH v2 01/10] PCI/LUO: Register with Liveupdate Orchestrator Chris Li
2025-09-30 15:15   ` Greg Kroah-Hartman
2025-09-30 23:41     ` Chris Li
2025-09-30 15:17   ` Greg Kroah-Hartman
2025-09-30 23:38     ` Chris Li
2025-09-16  7:45 ` [PATCH v2 02/10] PCI/LUO: Create requested liveupdate device list Chris Li
2025-09-29 17:46   ` Jason Gunthorpe
2025-09-30  2:13     ` Chris Li
2025-09-30 16:47       ` Jason Gunthorpe
2025-10-03  7:09         ` Chris Li
2025-10-03  5:33     ` Chris Li
2025-10-03 14:04       ` Jason Gunthorpe
2025-10-03 21:06         ` Chris Li
2025-09-30 15:26   ` Greg Kroah-Hartman
2025-10-03  6:57     ` Chris Li
2025-09-16  7:45 ` [PATCH v2 03/10] PCI/LUO: Forward prepare()/freeze()/cancel() callbacks to driver Chris Li
2025-09-29 17:48   ` Jason Gunthorpe
2025-09-30  2:11     ` Chris Li
2025-09-30 16:38       ` Jason Gunthorpe
2025-10-02 18:54         ` David Matlack
2025-10-02 20:57           ` Chris Li
2025-10-02 21:31             ` David Matlack
2025-10-02 23:21               ` Jason Gunthorpe
2025-10-02 23:42                 ` David Matlack
2025-10-03 12:03                   ` Jason Gunthorpe
2025-10-03 16:03                     ` David Matlack
2025-10-03 16:16                       ` Jason Gunthorpe
2025-10-03 16:28                         ` Pasha Tatashin
2025-10-03 16:56                           ` David Matlack
2025-10-03  5:24                 ` Chris Li
2025-10-03 12:06                   ` Jason Gunthorpe
2025-10-03 16:27                     ` David Matlack
2025-10-03 16:41                       ` Vipin Sharma
2025-10-03 17:44                     ` Chris Li
2025-10-03  5:17               ` Chris Li
2025-10-02 20:44         ` Chris Li
2025-09-30 15:27   ` Greg Kroah-Hartman
2025-10-02 20:38     ` Chris Li
2025-10-03  6:18       ` Greg Kroah-Hartman
2025-10-03  7:26         ` Chris Li
2025-10-03 12:26           ` Greg Kroah-Hartman
2025-10-03 17:49             ` Chris Li
2025-10-03 18:27               ` David Matlack
2025-10-03 21:10                 ` Chris Li
2025-09-16  7:45 ` [PATCH v2 04/10] PCI/LUO: Restore state at PCI enumeration Chris Li
2025-09-16  7:45 ` [PATCH v2 05/10] PCI/LUO: Forward finish callbacks to drivers Chris Li
2025-09-16  7:45 ` [PATCH v2 06/10] PCI/LUO: Save and restore driver name Chris Li
2025-09-29 17:57   ` Jason Gunthorpe
2025-09-30  2:10     ` Chris Li
2025-09-30 13:02       ` Pasha Tatashin
2025-09-30 13:41         ` Greg Kroah-Hartman
2025-09-30 14:53           ` Pasha Tatashin
2025-09-30 15:08             ` Greg Kroah-Hartman
2025-09-30 15:56               ` Pasha Tatashin
2025-10-01  5:06                 ` Greg Kroah-Hartman
2025-10-01 21:03                   ` Pasha Tatashin
2025-10-02  6:09                     ` Greg Kroah-Hartman
2025-10-02 13:23                       ` Jason Gunthorpe
2025-10-02 22:30                       ` Chris Li
2025-09-30 15:41           ` Chris Li
2025-10-01  5:13             ` Greg Kroah-Hartman [this message]
2025-10-02 22:05               ` Chris Li
2025-09-30 16:37         ` Jason Gunthorpe
2025-10-02 21:39           ` Chris Li
2025-10-03 14:28             ` Jason Gunthorpe
2025-09-16  7:45 ` [PATCH v2 07/10] PCI/LUO: Add liveupdate to pcieport driver Chris Li
2025-09-16  7:45 ` [PATCH v2 08/10] PCI/LUO: Add pci_liveupdate_get_driver_data() Chris Li
2025-09-16  7:45 ` [PATCH v2 09/10] PCI/LUO: Avoid write to bus master at boot Chris Li
2025-09-29 17:14   ` Bjorn Helgaas
2025-09-16  7:45 ` [PATCH v2 10/10] PCI: pci-lu-stub: Add a stub driver for Live Update testing Chris Li
2025-09-27 17:13 ` [PATCH v2 00/10] LUO: PCI subsystem (phase I) Bjorn Helgaas
2025-09-27 18:05   ` Pasha Tatashin
2025-09-29 15:04     ` Bjorn Helgaas
2025-09-29 18:13       ` Chris Li
2025-10-07 23:32         ` Chris Li
2025-10-08 23:00           ` David Matlack
2025-10-09 17:12             ` Chris Li
2025-10-09 23:21           ` Pratyush Yadav
2025-10-10  4:19             ` Chris Li
2025-10-10 23:49               ` Jason Miu
2025-10-13 13:58                 ` Pratyush Yadav
2025-10-14 16:11                   ` Pratyush Yadav
2025-10-14 20:44                   ` Chris Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2025100142-slick-deserving-4aed@gregkh \
    --to=gregkh@linuxfoundation.org \
    --cc=ajayachandra@nvidia.com \
    --cc=bhelgaas@google.com \
    --cc=chrisl@kernel.org \
    --cc=dakr@kernel.org \
    --cc=dmatlack@google.com \
    --cc=jasonmiu@google.com \
    --cc=jgg@ziepe.ca \
    --cc=lenb@kernel.org \
    --cc=leon@kernel.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=parav@nvidia.com \
    --cc=pasha.tatashin@soleen.com \
    --cc=rafael@kernel.org \
    --cc=rppt@kernel.org \
    --cc=saeedm@nvidia.com \
    --cc=tatashin@google.com \
    --cc=vipinsh@google.com \
    --cc=witu@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).