From: Pasha Tatashin <pasha.tatashin@soleen.com>
To: Pratyush Yadav <pratyush@kernel.org>
Cc: jasonmiu@google.com, graf@amazon.com, changyuanl@google.com,
rppt@kernel.org, dmatlack@google.com, rientjes@google.com,
corbet@lwn.net, rdunlap@infradead.org,
ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com,
ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org,
akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr,
mmaurer@google.com, roman.gushchin@linux.dev,
chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com,
jannh@google.com, vincent.guittot@linaro.org,
hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com,
joel.granados@kernel.org, rostedt@goodmis.org,
anna.schumaker@oracle.com, song@kernel.org,
zhangguopeng@kylinos.cn, linux@weissschuh.net,
linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
linux-mm@kvack.org, gregkh@linuxfoundation.org,
tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
rafael@kernel.org, dakr@kernel.org,
bartosz.golaszewski@linaro.org, cw00.choi@samsung.com,
myungjoo.ham@samsung.com, yesanishhere@gmail.com,
Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com,
aleksander.lobakin@intel.com, ira.weiny@intel.com,
andriy.shevchenko@linux.intel.com, leon@kernel.org,
lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org,
djeffery@redhat.com, stuart.w.hayes@gmail.com,
lennart@poettering.net, brauner@kernel.org,
linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org,
saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com,
parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com,
hughd@google.com, skhawaja@google.com, chrisl@kernel.org,
steven.sistare@oracle.com
Subject: Re: [PATCH v4 00/30] Live Update Orchestrator
Date: Thu, 9 Oct 2025 19:50:12 -0400
Message-ID: <CA+CK2bB6F634HCw_N5z9E5r_LpbGJrucuFb_5fL4da5_W99e4Q@mail.gmail.com>
In-Reply-To: <mafs0ms5zn0nm.fsf@kernel.org>
On Thu, Oct 9, 2025 at 6:58 PM Pratyush Yadav <pratyush@kernel.org> wrote:
>
> On Tue, Oct 07 2025, Pasha Tatashin wrote:
>
> > On Sun, Sep 28, 2025 at 9:03 PM Pasha Tatashin
> > <pasha.tatashin@soleen.com> wrote:
> >>
> [...]
> > 4. New File-Lifecycle-Bound Global State
> > ----------------------------------------
> > A new mechanism for managing global state was proposed, designed to be
> > tied to the lifecycle of the preserved files themselves. This would
> > allow a file owner (e.g., the IOMMU subsystem) to save and retrieve
> > global state that is only relevant when one or more of its FDs are
> > being managed by LUO.
>
> Is this going to replace LUO subsystems? If yes, then why? The global
> state will likely need to have its own lifecycle just like the FDs, and
> subsystems are a simple and clean abstraction to control that. I get the
> idea of only "activating" a subsystem when one or more of its FDs are
> participating in LUO, but we can do that while keeping subsystems
> around.
Thanks for the feedback. The FLB Global State is not replacing the LUO
subsystems. On the contrary, it's a higher-level abstraction that is
itself implemented as a LUO subsystem. The goal is to provide a
solution for a pattern that emerged during the PCI and IOMMU
discussions.

You can see the WIP implementation here, which shows it registering as
a subsystem named "luo-fh-states-v1-struct":
https://github.com/soleen/linux/commit/94e191aab6b355d83633718bc4a1d27dda390001

The existing subsystem API is a low-level tool that provides for the
preservation of a raw 8-byte handle. It doesn't provide locking, nor
is it explicitly tied to the lifecycle of any higher-level object like
a file handler. The new API is designed to solve a more specific
problem: allowing global components (like IOMMU or PCI) to
automatically track when resources relevant to them are added to or
removed from preservation. If HugeTLB requires a subsystem, it can
still use it, but I suspect it might benefit from FLB Global State as
well.
> Here is how I imagine the proposed API would compare against subsystems
> with hugetlb as an example (hugetlb support is still WIP, so I'm still
> not clear on specifics, but this is how I imagine it will work):
>
> - Hugetlb subsystem needs to track its huge page pools and which pages
> are allocated and free. This is its global state. The pools get
> reconstructed after kexec. Post-kexec, the free pages are ready for
> allocation from other "regular" files and the pages used in LUO files
> are reserved.
>
> - Pre-kexec, when a hugetlb FD is preserved, it marks that as preserved
> in hugetlb's global data structure tracking this. This is runtime data
> (say xarray), and _not_ serialized data. Reason being, there are
> likely more FDs to come so no point in wasting time serializing just
> yet.
>
> This can look something like:
>
> hugetlb_luo_preserve_folio(folio, ...);
>
> Nice and simple.
>
> Compare this with the new proposed API:
>
> liveupdate_fh_global_state_get(h, &hugetlb_data);
> // This will have update serialized state now.
> hugetlb_luo_preserve_folio(hugetlb_data, folio, ...);
> liveupdate_fh_global_state_put(h);
>
> We do the same thing but in a very complicated way.
>
> - When the system-wide preserve happens, the hugetlb subsystem gets a
> callback to serialize. It converts its runtime global state to
> serialized state since now it knows no more FDs will be added.
>
> With the new API, this doesn't need to be done since each FD prepare
> already updates serialized state.
>
> - If there are no hugetlb FDs, then the hugetlb subsystem doesn't put
> anything in LUO. This is same as new API.
>
> - If some hugetlb FDs are not restored after liveupdate and the finish
> event is triggered, the subsystem gets its finish() handler called and
> it can free things up.
>
> I don't get how that would work with the new API.
The new API isn't more complicated; it codifies the common pattern of
"create on first use, destroy on last use" into a reusable helper,
saving each file handler from having to reinvent the same reference
counting and locking scheme. But, as you point out, subsystems provide
more control: they handle the full creation and freeing of state
themselves instead of relying on file handlers for that.
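For illustration only, the "create on first use, destroy on last use"
pattern can be sketched in plain userspace C roughly as follows. Every
name here (struct flb_demo_state, flb_demo_get(), flb_demo_put()) is
hypothetical and is not the LUO or FLB API; a pthread mutex stands in
for whatever locking the kernel code would use.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical shared state owned by one global component. */
struct flb_demo_state {
	pthread_mutex_t lock;	/* protects refs and data */
	unsigned int refs;	/* files currently using the state */
	void *data;		/* allocated on first use, freed on last use */
	size_t size;		/* number of bytes to allocate for data */
};

#define FLB_DEMO_STATE_INIT(sz) \
	{ .lock = PTHREAD_MUTEX_INITIALIZER, .size = (sz) }

/*
 * Take a reference. The first caller allocates the zeroed state;
 * later callers share it. Returns NULL if the allocation fails.
 */
static void *flb_demo_get(struct flb_demo_state *st)
{
	void *data;

	pthread_mutex_lock(&st->lock);
	if (st->refs == 0) {
		st->data = calloc(1, st->size);
		if (!st->data) {
			pthread_mutex_unlock(&st->lock);
			return NULL;
		}
	}
	st->refs++;
	data = st->data;
	pthread_mutex_unlock(&st->lock);
	return data;
}

/* Drop a reference. The last caller frees the state. */
static void flb_demo_put(struct flb_demo_state *st)
{
	pthread_mutex_lock(&st->lock);
	assert(st->refs > 0);
	if (--st->refs == 0) {
		free(st->data);
		st->data = NULL;
	}
	pthread_mutex_unlock(&st->lock);
}
```

In this sketch a file handler would call flb_demo_get() when one of its
files is preserved and flb_demo_put() when that file is unpreserved or
finished, so the shared state exists only while at least one such file
is being tracked.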
> My point is, I see subsystems working perfectly fine here and I don't
> get how the proposed API is any better.
>
> Am I missing something?
No, I don't think you are. Your analysis is correct that this is
achievable with subsystems. The goal of the new API is to make that
specific, common use case simpler.
Pasha