linux-doc.vger.kernel.org archive mirror
From: Jason Gunthorpe <jgg@nvidia.com>
To: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com,
	changyuanl@google.com, rppt@kernel.org, dmatlack@google.com,
	rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org,
	ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com,
	ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org,
	akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr,
	mmaurer@google.com, roman.gushchin@linux.dev,
	chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com,
	jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org,
	dan.j.williams@intel.com, david@redhat.com,
	joel.granados@kernel.org, rostedt@goodmis.org,
	anna.schumaker@oracle.com, song@kernel.org,
	zhangguopeng@kylinos.cn, linux@weissschuh.net,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-mm@kvack.org, gregkh@linuxfoundation.org,
	tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
	rafael@kernel.org, dakr@kernel.org,
	bartosz.golaszewski@linaro.org, cw00.choi@samsung.com,
	myungjoo.ham@samsung.com, yesanishhere@gmail.com,
	Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com,
	aleksander.lobakin@intel.com, ira.weiny@intel.com,
	andriy.shevchenko@linux.intel.com, leon@kernel.org,
	lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org,
	djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de,
	lennart@poettering.net, brauner@kernel.org,
	linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	saeedm@nvidia.com, ajayachandra@nvidia.com, parav@nvidia.com,
	leonro@nvidia.com, witu@nvidia.com
Subject: Re: [PATCH v2 10/32] liveupdate: luo_core: Live Update Orchestrator
Date: Tue, 29 Jul 2025 14:28:12 -0300
Message-ID: <20250729172812.GP36037@nvidia.com>
In-Reply-To: <20250723144649.1696299-11-pasha.tatashin@soleen.com>

On Wed, Jul 23, 2025 at 02:46:23PM +0000, Pasha Tatashin wrote:
> Introduce LUO, a mechanism intended to facilitate kernel updates while
> keeping designated devices operational across the transition (e.g., via
> kexec). The primary use case is updating hypervisors with minimal
> disruption to running virtual machines. For userspace side of hypervisor
> update we have copyless migration. LUO is for updating the kernel.
> 
> This initial patch lays the groundwork for the LUO subsystem.
> 
> Further functionality, including the implementation of state transition
> logic, integration with KHO, and hooks for subsystems and file
> descriptors, will be added in subsequent patches.
> 
> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> ---
>  include/linux/liveupdate.h       | 140 ++++++++++++++
>  kernel/liveupdate/Kconfig        |  27 +++
>  kernel/liveupdate/Makefile       |   1 +
>  kernel/liveupdate/luo_core.c     | 301 +++++++++++++++++++++++++++++++
>  kernel/liveupdate/luo_internal.h |  21 +++
>  5 files changed, 490 insertions(+)
>  create mode 100644 include/linux/liveupdate.h
>  create mode 100644 kernel/liveupdate/luo_core.c
>  create mode 100644 kernel/liveupdate/luo_internal.h
> 
> diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h
> new file mode 100644
> index 000000000000..da8f05c81e51
> --- /dev/null
> +++ b/include/linux/liveupdate.h
> @@ -0,0 +1,140 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +/*
> + * Copyright (c) 2025, Google LLC.
> + * Pasha Tatashin <pasha.tatashin@soleen.com>
> + */
> +#ifndef _LINUX_LIVEUPDATE_H
> +#define _LINUX_LIVEUPDATE_H
> +
> +#include <linux/bug.h>
> +#include <linux/types.h>
> +#include <linux/list.h>
> +
> +/**
> + * enum liveupdate_event - Events that trigger live update callbacks.
> + * @LIVEUPDATE_PREPARE: PREPARE should happen *before* the blackout window.
> + *                      Subsystems should prepare for an upcoming reboot by
> + *                      serializing their states. However, it must be considered
> + *                      that user applications, e.g. virtual machines are still
> + *                      running during this phase.
> + * @LIVEUPDATE_FREEZE:  FREEZE sent from the reboot() syscall, when the current
> + *                      kernel is on its way out. This is the final opportunity
> + *                      for subsystems to save any state that must persist
> + *                      across the reboot. Callbacks for this event should be as
> + *                      fast as possible since they are on the critical path of
> + *                      rebooting into the next kernel.
> + * @LIVEUPDATE_FINISH:  FINISH is sent in the newly booted kernel after a
> + *                      successful live update and normally *after* the blackout
> + *                      window. Subsystems should perform any final cleanup
> + *                      during this phase. This phase also provides an
> + *                      opportunity to clean up devices that were preserved but
> + *                      never explicitly reclaimed during the live update
> + *                      process. State restoration should have already occurred
> + *                      before this event. Callbacks for this event must not
> + *                      fail. The completion of this call transitions the
> + *                      machine from ``updated`` to ``normal`` state.
> + * @LIVEUPDATE_CANCEL:  CANCEL the live update and go back to normal state. This
> + *                      event is user initiated, or is done automatically when
> + *                      LIVEUPDATE_PREPARE or LIVEUPDATE_FREEZE stage fails.
> + *                      Subsystems should revert any actions taken during the
> + *                      corresponding prepare event. Callbacks for this event
> + *                      must not fail.
> + *
> + * These events represent the different stages and actions within the live
> + * update process that subsystems (like device drivers and bus drivers)
> + * need to be aware of to correctly serialize and restore their state.
> + *
> + */
> +enum liveupdate_event {
> +	LIVEUPDATE_PREPARE,
> +	LIVEUPDATE_FREEZE,
> +	LIVEUPDATE_FINISH,
> +	LIVEUPDATE_CANCEL,
> +};

I saw that a later patch moves these hunks; that is poor patch planning.

Ideally an ioctl subsystem should start out with the first patch
introducing the basic cdev, file open, ioctl dispatch, ioctl uapi
header and related simple infrastructure.

Then you'd go basically ioctl by ioctl adding the new ioctls and
explaining what they do in the patch commit messages.
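
To make the shape concrete, here is a rough userspace model of an
ioctl dispatch table. It is only a sketch: the sketch_* names and the
command number are made up for illustration, and a real series would
define the commands with the _IO*() macros in the uapi header and hook
the dispatcher up to the cdev's unlocked_ioctl.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical command number, for illustration only. */
enum { SKETCH_CMD_GET_STATE = 1 };

struct sketch_ioctl_op {
	unsigned int cmd;
	long (*handler)(void *arg);
};

/* First "ioctl": reports a placeholder state value to the caller. */
long sketch_get_state(void *arg)
{
	assert(arg != NULL);
	*(int *)arg = 0;	/* placeholder, not a real LUO state */
	return 0;
}

/*
 * The first patch would add this table and the dispatcher below.
 * Each later patch appends one entry and explains that ioctl.
 */
static const struct sketch_ioctl_op sketch_ioctl_ops[] = {
	{ SKETCH_CMD_GET_STATE, sketch_get_state },
};

long sketch_ioctl_dispatch(unsigned int cmd, void *arg)
{
	size_t i;
	size_t n = sizeof(sketch_ioctl_ops) / sizeof(sketch_ioctl_ops[0]);

	for (i = 0; i < n; i++) {
		if (sketch_ioctl_ops[i].cmd == cmd)
			return sketch_ioctl_ops[i].handler(arg);
	}
	return -ENOTTY;	/* conventional errno for an unknown ioctl */
}
```

With that in place, adding an ioctl is one handler plus one table
entry, which keeps each patch small and reviewable.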

> +/**
> + * liveupdate_state_updated - Check if the system is in the live update
> + * 'updated' state.
> + *
> + * This function checks if the live update orchestrator is in the
> + * ``LIVEUPDATE_STATE_UPDATED`` state. This state indicates that the system has
> + * successfully rebooted into a new kernel as part of a live update, and the
> + * preserved devices are expected to be in the process of being reclaimed.
> + *
> + * This is typically used by subsystems during early boot of the new kernel
> + * to determine if they need to attempt to restore state from a previous
> + * live update.
> + *
> + * @return true if the system is in the ``LIVEUPDATE_STATE_UPDATED`` state,
> + * false otherwise.
> + */
> +bool liveupdate_state_updated(void)
> +{
> +	return is_current_luo_state(LIVEUPDATE_STATE_UPDATED);
> +}
> +EXPORT_SYMBOL_GPL(liveupdate_state_updated);

Unless there are existing in-tree users, there should not be exports.

I'm also not really sure why there is global state. I would expect the
fd and session objects to record what kind of things they are, rather
than relying on weird globals.

Take the liveupdate_register_subsystem() stuff: it already has a lock,
&luo_subsystem_list_mutex. If you want to block mutation of the list
then, IMHO, it makes more sense to stick a specific variable
'luo_subsystems_list_immutable' under that lock and make it very
obvious.
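
Something along these lines, as a userspace sketch only. A pthread
mutex stands in for the kernel mutex, and the sketch_* types and
functions are hypothetical; only the lock and flag names come from the
discussion above.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the real subsystem structure. */
struct sketch_subsys {
	const char *name;
	struct sketch_subsys *next;
};

static pthread_mutex_t luo_subsystem_list_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Both protected by luo_subsystem_list_mutex. */
static struct sketch_subsys *luo_subsystems_list;
static bool luo_subsystems_list_immutable;

/* Hypothetical: called once the list must stop changing. */
void sketch_freeze_subsystem_list(void)
{
	pthread_mutex_lock(&luo_subsystem_list_mutex);
	luo_subsystems_list_immutable = true;
	pthread_mutex_unlock(&luo_subsystem_list_mutex);
}

int sketch_register_subsystem(struct sketch_subsys *s)
{
	int ret = 0;

	assert(s != NULL);

	pthread_mutex_lock(&luo_subsystem_list_mutex);
	if (luo_subsystems_list_immutable) {
		/* The reason for refusing is visible right here. */
		ret = -EBUSY;
	} else {
		s->next = luo_subsystems_list;
		luo_subsystems_list = s;
	}
	pthread_mutex_unlock(&luo_subsystem_list_mutex);

	return ret;
}
```

The flag is read and written only under the lock that already guards
the list, so there is no separate global state machine to consult.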

Stuff like luo_files_startup() feels clunky to me:

+       ret = liveupdate_register_subsystem(&luo_file_subsys);
+       if (ret) {
+               pr_warn("Failed to register luo_file subsystem [%d]\n", ret);
+               return ret;
+       }
+
+       if (liveupdate_state_updated()) {

That's going to be a standard pattern. I would expect that
liveupdate_register_subsystem() would do the check for updated and
then arrange to call back something like
liveupdate_subsystem.ops.post_update().

And then post_update() would get the info that is currently under
liveupdate_get_subsystem_data() as arguments instead of having to make
more function calls.

Maybe even the fdt_node_check_compatible() can be hoisted.

That would remove a bunch more liveupdate_state_updated() calls.
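
A userspace sketch of that registration flow, under stated
assumptions: every sketch_* and example_* name is hypothetical, the
two globals stand in for state the core would keep privately, and a
plain uint64_t stands in for whatever liveupdate_get_subsystem_data()
returns today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_subsystem;

struct sketch_subsystem_ops {
	/* Hypothetical hook: called by the core only after a live update. */
	int (*post_update)(struct sketch_subsystem *s, uint64_t data);
};

struct sketch_subsystem {
	const char *name;
	const struct sketch_subsystem_ops *ops;
	uint64_t restored;	/* filled in by the example callback below */
};

/* Stand-ins for state the core would keep privately. */
bool sketch_booted_via_update;
uint64_t sketch_preserved_data;

int sketch_register_subsystem(struct sketch_subsystem *s)
{
	assert(s != NULL && s->ops != NULL);

	/* ... a real version would add 's' to the subsystem list here ... */

	/* The core, not each caller, checks for the updated state. */
	if (!sketch_booted_via_update || !s->ops->post_update)
		return 0;

	/* Hand the preserved data over as an argument. */
	return s->ops->post_update(s, sketch_preserved_data);
}

/* Example subsystem: its startup path needs no state check of its own. */
int example_post_update(struct sketch_subsystem *s, uint64_t data)
{
	s->restored = data;
	return 0;
}

const struct sketch_subsystem_ops example_ops = {
	.post_update = example_post_update,
};
```

The caller's startup function shrinks to a single registration call,
and the restore path only runs when the core decides it should.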

etc.

Jason
