From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Chris Li <chrisl@kernel.org>, Jason Gunthorpe <jgg@ziepe.ca>,
Bjorn Helgaas <bhelgaas@google.com>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Danilo Krummrich <dakr@kernel.org>, Len Brown <lenb@kernel.org>,
linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
linux-acpi@vger.kernel.org, David Matlack <dmatlack@google.com>,
Pasha Tatashin <tatashin@google.com>,
Jason Miu <jasonmiu@google.com>,
Vipin Sharma <vipinsh@google.com>,
Saeed Mahameed <saeedm@nvidia.com>,
Adithya Jayachandran <ajayachandra@nvidia.com>,
Parav Pandit <parav@nvidia.com>, William Tu <witu@nvidia.com>,
Mike Rapoport <rppt@kernel.org>,
Leon Romanovsky <leon@kernel.org>
Subject: Re: [PATCH v2 06/10] PCI/LUO: Save and restore driver name
Date: Thu, 2 Oct 2025 08:09:11 +0200
Message-ID: <2025100225-abridge-shifty-3d50@gregkh>
In-Reply-To: <CA+CK2bA0acjg-CEKufERu_ov4up3E4XTkJ6kbEDCny0iASrFVQ@mail.gmail.com>
On Wed, Oct 01, 2025 at 05:03:19PM -0400, Pasha Tatashin wrote:
> On Wed, Oct 1, 2025 at 1:06 AM Greg Kroah-Hartman wrote:
> > On Tue, Sep 30, 2025 at 11:56:58AM -0400, Pasha Tatashin wrote:
> > > > > A driver that preserves state across a reboot already has an implicit
> > > > > contract with its future self about that data's format. The GUID
> > > > > simply makes that contract explicit and machine-checkable. It does not
> > > > > have to be GUID, but nevertheless there has to be a specific contract.
> > > >
> > > > So how are you going to "version" these GUID? I see you use "schema Vx"
> > >
> > > Driver developer who changes a driver to support live-update.
> >
> > I do not understand this response, sorry.
>
> Sorry for the confusion, I misunderstood your question. I thought you
> were asking who would add a new field to a driver. My answer was that
> it would be the developer who is adding support for the Live Update
> feature to that specific driver.
> I now realize you were asking about how the GUID would be versioned.
> Using a GUID was just one of several ideas. My main point is that we
> need some form of versioned compatibility identifier, whether it's a
> string or a number. This would allow the system to verify that the new
> driver can understand the preserved data for this device from the
> previous kernel before it binds to the device.
Again, "versioned" identifiers will not work over time, as you can never
drop old versions, AND a driver author does not know whether the
underlying structures outside of the driver have changed, whether the
compiler settings have changed, or whether anything else that could
affect the preserved data has changed.
> > > > And when can you delete an old "schema"? This feels like you are
> > > > forcing future developers to maintain things "for forever"...
> > >
> > > This won't be an issue because of how live update support is planned.
> > > The support model will be phased and limited:
> > >
> > > Initially, and for a while, there will be no stability guarantees
> > > between different kernel versions.
> > > Eventually, we will support specific, narrow upgrade paths (e.g.,
> > > minor-to-minor, or stable-A to stable-A+1).
> > > Downgrades and arbitrary version jumps ("any-to-any") will not be
> > > supported upstream. Since we only ever need to handle a well-defined
> > > forward path, the code for old, irrelevant schemas can always be
> > > removed. There is no "forever".
> >
> > This is kernel code, it is always "forever", sorry.
>
> I'm sorry, but I don't quite understand what you mean. There is no
> stable internal kernel API; the upstream tree is constantly evolving
> with features being added, improved, and removed.
Yes, that is very true, but you can not remove user-visible
functionality, which is what you are saying you are going to do here.
> > If you want "minor to minor" updates, how is that going to work, given
> > that you do not add changes only to "minor" releases (the "y" number
> > in 6.12.y)?
>
> You are correct. Initially, our plan is to allow live updates to break
> between any kernel version.
Then there is no such thing as live updates :)
> However, it is my hope that we will
> eventually stabilize this process and only allow breakages between,
> for example, versions 6.n and 6.n+2, and eventually from one stable
> release to stable+2. This would create a well-defined window for
> safely removing deprecated data formats and the code that handles them
> from the kernel.
How are you going to define this? We can not break old users when they
upgrade, and so you are going to have to support this "upgrade path" for
forever.
> > Remember, Linux does not use "semantic versioning" as its release
> > numbering is older than that scheme. It just does "this version is
> > newer than that version" and that's it. You can't really take anything
> > else from the number.
>
> Understood. If that's the case, we could use stable releases as the
> basis for defining when a live update can break.
So every single release?
> It would take longer
> to achieve, but it is a possibility. These are the kinds of questions
> that will be discussed at the LPC Liveupdate MC. If you are attending
> LPC, I encourage you to join the discussion, as your thoughts on how
> we can frame long-term live update support would be very valuable.
I will be at LPC, but can't guarantee I can make it to that MC, it all
depends on scheduling.
> > And if this isn't for "upstream" at all, then why have it? We can't add
> > new features and support it if we can't actually use it and it's only
> > for out-of-tree vendor kernels.
>
> Our goal is to have full support in the upstream kernel. Downstream
> users will then need to adapt live updates to their specific needs.
> For example, if a live update from version A to version C is broken, a
> downstream user would either have to update incrementally from A to B
> and then to C, or they would have to internally fix whatever is
> causing the breakage before performing the live update.
What does "internally fix" mean exactly here?
> > And how will you document properly a "well defined forward path"? That
> > should be done first, before you have any code here that we are
> > reviewing.
>
> Currently, and for the near future, live updates will only be
> supported within the same kernel version.
Ok, then no need for any GUID at all. Just update and pray! :)
> > Please do that, get people to agree on the idea and how it will work
> > before asking us to review code.
>
> This is an industry-wide effort. We have engineers from Amazon,
> Google, Microsoft, Nvidia, and other companies meeting bi-weekly to
> discuss Live Update support, and sending and landing patches upstream.
> We are also organizing an LPC Live Update Micro Conference where the
> versioning strategy will be a topic.
>
> For now, we have agreed that the live update can break between any
> kernel versions or with any commit while the feature is under active
> development. This approach allows us the flexibility to build the core
> functionality while we collaboratively define the long-term versioning
> and stability model.
Just keeping a device "alive" while rebooting into the same exact kernel
image seems odd to me given that this is almost never what people
actually do. They update their kernel with the weekly stable release to
get the new bugfixes (remember we fix 13 CVEs a day), and away you go.
You are saying that this workload would not actually be supported, so
why do you want live update at all? Who needs this?
thanks,
greg k-h