From: Pratyush Yadav <pratyush@kernel.org>
To: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Mike Rapoport <rppt@kernel.org>,
	 linux-kselftest@vger.kernel.org, shuah@kernel.org,
	 akpm@linux-foundation.org,  linux-mm@kvack.org,
	skhan@linuxfoundation.org,  linux-doc@vger.kernel.org,
	jasonmiu@google.com,  linux-kernel@vger.kernel.org,
	 corbet@lwn.net, ran.xiaokai@zte.com.cn,
	 kexec@lists.infradead.org, pratyush@kernel.org,
	 graf@amazon.com, Logan Odell <loganodell@google.com>
Subject: Re: [RFC v1 0/9] kho: granular compatibility and header decoupling
Date: Tue, 09 Jun 2026 16:28:12 +0200
Message-ID: <2vxzo6hjss8z.fsf@kernel.org>
In-Reply-To: <aibYJvzQQnpoN6YW@plex> (Pasha Tatashin's message of "Mon, 8 Jun 2026 16:12:56 +0000")

On Mon, Jun 08 2026, Pasha Tatashin wrote:

> On 06-08 13:26, Mike Rapoport wrote:
>> On 2026-06-07 13:43:09+00:00, Pasha Tatashin wrote:
>> > On 06-07 14:58, Mike Rapoport wrote:
>> > 
>> > > On Fri, 05 Jun 2026 03:32:26 +0000, Pasha Tatashin <pasha.tatashin@soleen.com> wrote:
[...]
>> > External users only need to include the headers they actually use. For
>> > example, LUO shouldn't have to pull vmalloc or radix tree KHO
>> > declarations, and memfd does not need block.
>> > 
>> > From a maintenance point of view, it is much easier to catch ABI
>> > changes when the file with the appropriate version has been changed,
>> > and most likely the version of that file should be updated. If a single
>> > header contains compatibility versions for several different data
>> > structures, it is easier to miss the correct version update.
>> 
>> No matter in what files the definition lives, someone can forget to
>> update version and we may miss it during review.

Perhaps we should have some tests (maybe with KUnit?) that can catch
this? If you change the format, the test fails. So you'd have to go and
update the test, and at that point it should be more obvious that the
ABI version needs bumping.
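
A rough sketch of what such a test could pin down, written as plain
userspace C rather than KUnit so that it stands alone. The struct, its
fields, the version number and all demo_* names are hypothetical and
are not the real kho_block layout; the idea is only that the recorded
layout lives right next to the version it belongs to:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical serialized header; NOT the real kho_block layout. */
struct demo_block_hdr {
	uint64_t next_pa;	/* physical address of the next block */
	uint32_t nr_entries;	/* number of valid entries in this block */
	uint32_t flags;
};

/* Hypothetical ABI version of the layout above. */
#define DEMO_BLOCK_VERSION 2

/* Layout as recorded the last time the version was bumped. */
struct demo_golden {
	unsigned int version;
	size_t size;
	size_t off_next_pa;
	size_t off_nr_entries;
	size_t off_flags;
};

static const struct demo_golden demo_block_golden = {
	.version	= 2,
	.size		= 16,
	.off_next_pa	= 0,
	.off_nr_entries	= 8,
	.off_flags	= 12,
};

/* True only if the current layout and version match the recorded ones. */
static bool demo_block_abi_unchanged(void)
{
	const struct demo_golden *g = &demo_block_golden;

	return g->version == DEMO_BLOCK_VERSION &&
	       g->size == sizeof(struct demo_block_hdr) &&
	       g->off_next_pa == offsetof(struct demo_block_hdr, next_pa) &&
	       g->off_nr_entries == offsetof(struct demo_block_hdr, nr_entries) &&
	       g->off_flags == offsetof(struct demo_block_hdr, flags);
}
```

Adding or moving a field makes the check fail until demo_block_golden
is edited, and the version sits in the same initializer. Note that
this only catches layout changes; a field that keeps its offset but
changes meaning would still slip through.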

[...]
>> 
>> Sorry I wasn't clear. I agree that kho_vmalloc, block and radix tree
>> should have their own versioning rather than rely on global KHO version.
>> 
>> What I don't like in your proposal is mixing versioning of a component
>> with its dependencies.
>> 
>> I think that versioning should be completely local to each component.
>> LUO should not care about kho_block "on wire" layout. This should be
>> encapsulated in kho_block.
>
> That is a fair point.
>
> As I mentioned in my previous reply, we can definitely look into making 
> the version checking more modular. For example, each component could 
> implement a standard compatibility-checking interface.
>
> These checks could run early in boot to determine whether each component 
> is capable of accepting the incoming preserved data format.
>
> Whenever the component is later used by LUO, memfd, etc., we can query 
> that cached status. This achieves four key benefits:
>
> 1. It avoids delaying the compatibility check to the actual time of data 
> retrieval, which is too late to safely abort.
>
> 2. It prevents a local incompatibility from triggering a global kernel 
> panic, allowing us to handle failures gracefully for just that specific 
> component or session.

I think the right time to do the compatibility check is _before_ kexec.
That is the only point where you can safely abort. Once you boot into
the new kernel and discover you can't understand the passed data, you
are in a bad spot already and should reboot. I don't think you can
really handle these failures gracefully.

For example, say you fail to understand the incoming PCI data. So you
have no idea which devices are participating in live update and cannot
correctly probe any of them. Which effectively means you cannot resume
any of your guests since you have no idea how to restore their device
state. The only path you are left with is to reboot. I haven't read the
IOMMU series, but I imagine the same story applies there.

For a more benign example, let's assume one of the memfds that back VM
memory fails to restore. In this case, you can safely leak that memory
and run the other guests, but at that point the host is in an impaired
state. You don't want to keep running it in this state. You likely
either do a reboot or, if you feel more adventurous, do another live
update.

In either case, there is no "safely abort" after the kexec happens.

So I think our energy is better spent solving the versioning story
_before_ kexec. After kexec I think it is perfectly fine to error out
and panic or expect a reboot. You can't salvage much at that point
anyway.

And I think the versioning format should also be based on the design
of this pre-kexec check, not the other way around.
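
To make the shape of such a check concrete, here is a minimal sketch
in standalone C. It assumes the outgoing kernel knows the (name,
version) of everything it is about to serialize and has somehow
obtained the version range the next kernel accepts per component. How
the next kernel image would advertise that is an open question, and
every demo_* name below is made up for illustration:

```c
#include <stddef.h>
#include <string.h>

/* What the running kernel is about to serialize (hypothetical). */
struct demo_serialized {
	const char *name;
	unsigned int version;
};

/* What the next kernel says it can parse (hypothetical). */
struct demo_accepts {
	const char *name;
	unsigned int min_version;
	unsigned int max_version;
};

/*
 * Returns NULL if the next kernel can parse every component, otherwise
 * the name of the first component it cannot. A component missing from
 * the accept list counts as incompatible.
 */
static const char *demo_prekexec_check(const struct demo_serialized *out,
				       size_t nr_out,
				       const struct demo_accepts *in,
				       size_t nr_in)
{
	for (size_t i = 0; i < nr_out; i++) {
		int ok = 0;

		for (size_t j = 0; j < nr_in; j++) {
			if (strcmp(out[i].name, in[j].name))
				continue;
			ok = out[i].version >= in[j].min_version &&
			     out[i].version <= in[j].max_version;
			break;
		}
		if (!ok)
			return out[i].name;
	}
	return NULL;
}
```

A non-NULL result would be the point to refuse the kexec, while the
old kernel is still fully functional and nothing has been torn down.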

>
> 3. It keeps the local version local, as you suggested, so it is checked 
> only by the consumers of that specific component.
>
> 4. It provides a clean path for backward compatibility, as components 
> can individually decide whether they understand the incoming data 
> format.
>
[...]
>> 
>> Actually FDT "compatible" handles versioning nicer than composite strings.
>> You can have
>> 
>> 	compatible="kho-v4", "vmalloc-v1", "radix-v1", "block-v2";
>> 
>> and check fdt_node_check_compatible("vmalloc-v1") for vmalloc and
>> fdt_node_check_compatible("block-v2") for block.

I agree. Even if we don't use FDT, something more structured than
composite strings would be nice to have.

>
> That is actually very similar to what I am proposing—individual version 
> tokens (which in my current series are concatenated into a composite 
> compatibility string separated by ';').
>
> But let's not get too fixated on the composite string formatting. I 
> actually really like what you are proposing: using integers for versions 
> and having each registered component carry its own "NAME" and version 
> number in the KHO FDT.

There is another nice thing about numbers that Logan (+cc) recently
pointed out. You can tell which one is bigger.

At some point I think we will support multiple versions of a data
structure to allow for upgrades. At that point, it will help to know
which one is "newer". So if both kernels support versions 3 and 4, you
can use 4 to serialize.

This of course is harder to do with strings.
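
With integers, picking the newest common version is a couple of
comparisons. A small sketch, assuming each kernel supports a contiguous
[min, max] range per data structure and that version 0 is never a valid
version (both are assumptions made for this example, as are the demo_*
names):

```c
/* Contiguous range of versions one kernel can read and write. */
struct demo_range {
	unsigned int min;
	unsigned int max;
};

/*
 * Returns the highest version supported by both kernels, or 0 if the
 * two ranges do not overlap.
 */
static unsigned int demo_pick_version(struct demo_range cur,
				      struct demo_range next)
{
	unsigned int lo = cur.min > next.min ? cur.min : next.min;
	unsigned int hi = cur.max < next.max ? cur.max : next.max;

	return lo <= hi ? hi : 0;
}
```

For the case above, where both sides support 3 and 4, this returns 4.
Doing the same with strings would need an ordering convention baked
into the names.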

>
>> And we wouldn't need to reimplement string parsing ;-)
>> 
>> But yeah, I do see value of making components versioning and KHO global
>> versioning independent. I just don't like composite strings and I don't
>> like mixing versioning with dependencies.
>> 
>> Since we are moving from FDT for the most things, version should become
>> a number rather than a string and version compatibility should be
[...]

-- 
Regards,
Pratyush Yadav

