Linux userland API discussions
* Re: [PATCH v4 3/3] init: remove /proc/sys/kernel/real-root-dev
From: Christoph Hellwig @ 2025-11-24 14:28 UTC
  To: Askar Safin
  Cc: linux-fsdevel, linux-kernel, Linus Torvalds, Greg Kroah-Hartman,
	Christian Brauner, Al Viro, Jan Kara, Christoph Hellwig,
	Jens Axboe, Andy Shevchenko, Aleksa Sarai, Thomas Weißschuh,
	Julian Stecklina, Gao Xiang, Art Nikpal, Andrew Morton,
	Alexander Graf, Rob Landley, Lennart Poettering, linux-arch,
	linux-block, initramfs, linux-api, linux-doc, Michal Simek,
	Luis Chamberlain, Kees Cook, Thorsten Blum, Heiko Carstens,
	Arnd Bergmann, Dave Young, Christophe Leroy, Krzysztof Kozlowski,
	Borislav Petkov, Jessica Clarke, Nicolas Schichan,
	David Disseldorp, patches
In-Reply-To: <20251119222407.3333257-4-safinaskar@gmail.com>

On Wed, Nov 19, 2025 at 10:24:07PM +0000, Askar Safin wrote:
> It is not used anymore.

Let's hope whacky userspace agrees with that :)

But we'll have to try this to find out, so:

Reviewed-by: Christoph Hellwig <hch@lst.de>



* Re: [PATCH v4 2/3] initrd: remove deprecated code path (linuxrc)
From: Christoph Hellwig @ 2025-11-24 14:27 UTC
  To: Askar Safin
  Cc: linux-fsdevel, linux-kernel, Linus Torvalds, Greg Kroah-Hartman,
	Christian Brauner, Al Viro, Jan Kara, Christoph Hellwig,
	Jens Axboe, Andy Shevchenko, Aleksa Sarai, Thomas Weißschuh,
	Julian Stecklina, Gao Xiang, Art Nikpal, Andrew Morton,
	Alexander Graf, Rob Landley, Lennart Poettering, linux-arch,
	linux-block, initramfs, linux-api, linux-doc, Michal Simek,
	Luis Chamberlain, Kees Cook, Thorsten Blum, Heiko Carstens,
	Arnd Bergmann, Dave Young, Christophe Leroy, Krzysztof Kozlowski,
	Borislav Petkov, Jessica Clarke, Nicolas Schichan,
	David Disseldorp, patches
In-Reply-To: <20251119222407.3333257-3-safinaskar@gmail.com>

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>



* Re: [PATCH v4 1/3] init: remove deprecated "load_ramdisk" and "prompt_ramdisk" command line parameters
From: Christoph Hellwig @ 2025-11-24 14:27 UTC
  To: Askar Safin
  Cc: linux-fsdevel, linux-kernel, Linus Torvalds, Greg Kroah-Hartman,
	Christian Brauner, Al Viro, Jan Kara, Christoph Hellwig,
	Jens Axboe, Andy Shevchenko, Aleksa Sarai, Thomas Weißschuh,
	Julian Stecklina, Gao Xiang, Art Nikpal, Andrew Morton,
	Alexander Graf, Rob Landley, Lennart Poettering, linux-arch,
	linux-block, initramfs, linux-api, linux-doc, Michal Simek,
	Luis Chamberlain, Kees Cook, Thorsten Blum, Heiko Carstens,
	Arnd Bergmann, Dave Young, Christophe Leroy, Krzysztof Kozlowski,
	Borislav Petkov, Jessica Clarke, Nicolas Schichan,
	David Disseldorp, patches
In-Reply-To: <20251119222407.3333257-2-safinaskar@gmail.com>

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>



* Re: [PATCH v7 02/22] liveupdate: luo_core: integrate with KHO
From: Pratyush Yadav @ 2025-11-24 14:21 UTC
  To: Pasha Tatashin
  Cc: pratyush, jasonmiu, graf, rppt, dmatlack, rientjes, corbet,
	rdunlap, ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm,
	tj, yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
	mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
	david, joel.granados, rostedt, anna.schumaker, song, linux,
	linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
	dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
	cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
	quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
	leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, lennart,
	brauner, linux-api, linux-fsdevel, saeedm, ajayachandra, jgg,
	parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <20251122222351.1059049-3-pasha.tatashin@soleen.com>

On Sat, Nov 22 2025, Pasha Tatashin wrote:

> Integrate the LUO with the KHO framework to enable passing LUO state
> across a kexec reboot.
>
> This patch implements the lifecycle integration with KHO:
>
> 1. Incoming State: During early boot (`early_initcall`), LUO checks if
>    KHO is active. If so, it retrieves the "LUO" subtree, verifies the
>    "luo-v1" compatibility string, and reads the `liveupdate-number` to
>    track the update count.
>
> 2. Outgoing State: During late initialization (`late_initcall`), LUO
>    allocates a new FDT for the next kernel, populates it with the basic
>    header (compatible string and incremented update number), and
>    registers it with KHO (`kho_add_subtree`).
>
> 3. Finalization: The `liveupdate_reboot()` notifier is updated to invoke
>    `kho_finalize()`. This ensures that all memory segments marked for
>    preservation are properly serialized before the kexec jump.
>
> LUO now depends on `CONFIG_KEXEC_HANDOVER`.

Nit: This patch does not add the dependency. That is done by patch 1. I
guess that change needs to be moved here or the comment removed?

Other than this,

Reviewed-by: Pratyush Yadav <pratyush@kernel.org>

[...]

-- 
Regards,
Pratyush Yadav


* Re: [PATCH v7 06/22] liveupdate: luo_file: implement file systems callbacks
From: Mike Rapoport @ 2025-11-24  8:18 UTC
  To: Pasha Tatashin
  Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
	ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
	yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
	mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
	david, joel.granados, rostedt, anna.schumaker, song, linux,
	linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
	dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
	cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
	quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
	leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
	lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
	jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <20251122222351.1059049-7-pasha.tatashin@soleen.com>

On Sat, Nov 22, 2025 at 05:23:33PM -0500, Pasha Tatashin wrote:
> This patch implements the core mechanism for managing preserved
> files throughout the live update lifecycle. It provides the logic to
> invoke the file handler callbacks (preserve, unpreserve, freeze,
> unfreeze, retrieve, and finish) at the appropriate stages.
> 
> During the reboot phase, luo_file_freeze() serializes the final
> metadata for each file (handler compatible string, token, and data
> handle) into a memory region preserved by KHO. In the new kernel,
> luo_file_deserialize() reconstructs the in-memory file list from this
> data, preparing the session for retrieval.
> 
> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>

With some comments below
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

> ---
>  include/linux/kho/abi/luo.h      |  39 +-
>  include/linux/liveupdate.h       |  98 ++++
>  kernel/liveupdate/Makefile       |   1 +
>  kernel/liveupdate/luo_file.c     | 882 +++++++++++++++++++++++++++++++
>  kernel/liveupdate/luo_internal.h |  38 ++
>  5 files changed, 1057 insertions(+), 1 deletion(-)
>  create mode 100644 kernel/liveupdate/luo_file.c
> 

...

> +int luo_preserve_file(struct luo_file_set *file_set, u64 token, int fd)
> +{
> +	struct liveupdate_file_op_args args = {0};
> +	struct liveupdate_file_handler *fh;
> +	struct luo_file *luo_file;
> +	struct file *file;
> +	int err;
> +
> +	if (luo_token_is_used(file_set, token))
> +		return -EEXIST;
> +
> +	file = fget(fd);
> +	if (!file)
> +		return -EBADF;
> +
> +	err = luo_alloc_files_mem(file_set);
> +	if (err)
> +		goto  err_files_mem;
> +
> +	if (file_set->count == LUO_FILE_MAX) {

This can be checked before getting the file and allocating memory, can't it?

> +		err = -ENOSPC;
> +		goto err_files_mem;

The goto label should say what it does, not what the error was.

> +	}
> +
> +	err = -ENOENT;
> +	luo_list_for_each_private(fh, &luo_file_handler_list, list) {
> +		if (fh->ops->can_preserve(fh, file)) {
> +			err = 0;
> +			break;
> +		}
> +	}
> +
> +	/* err is still -ENOENT if no handler was found */
> +	if (err)
> +		goto err_files_mem;
> +
> +	luo_file = kzalloc(sizeof(*luo_file), GFP_KERNEL);
> +	if (!luo_file) {
> +		err = -ENOMEM;
> +		goto err_files_mem;
> +	}
> +
> +	luo_file->file = file;
> +	luo_file->fh = fh;
> +	luo_file->token = token;
> +	luo_file->retrieved = false;
> +	mutex_init(&luo_file->mutex);
> +
> +	args.handler = fh;
> +	args.file = file;
> +	err = fh->ops->preserve(&args);
> +	if (err)
> +		goto err_kfree;
> +
> +	luo_file->serialized_data = args.serialized_data;
> +	list_add_tail(&luo_file->list, &file_set->files_list);
> +	file_set->count++;
> +
> +	return 0;
> +
> +err_kfree:
> +	mutex_destroy(&luo_file->mutex);

Don't think we need this, luo_file is freed in the next line.

> +	kfree(luo_file);
> +err_files_mem:
> +	fput(file);
> +	luo_free_files_mem(file_set);

I'd have the error path as

err_free_luo_file:
	kfree(luo_file);
err_free_files_mem:
	luo_free_files_mem(file_set);
err_put_file:
	fput(file);

> +
> +	return err;
> +}
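
As a rough userspace illustration of the unwinding order suggested above
(not the LUO code itself): the sketch below replaces fget(),
luo_alloc_files_mem() and kzalloc() with plain counters, and demo_set,
demo_preserve() and fail_at are hypothetical names. Each label names the
action it performs, and the limit check runs before anything is acquired.

```c
#include <errno.h>

/* Hypothetical counters standing in for the real resources. */
struct demo_set {
	int count;	/* preserved entries */
	int max;	/* stand-in for LUO_FILE_MAX */
	int file_refs;	/* outstanding file references (fget) */
	int mem_refs;	/* outstanding files-memory allocations */
	int entries;	/* outstanding per-file allocations (kzalloc) */
};

/*
 * fail_at: 0 = succeed, 1 = files-memory allocation fails,
 *          2 = entry allocation fails, 3 = preserve callback fails.
 */
static int demo_preserve(struct demo_set *set, int fail_at)
{
	int err;

	/* Limit check first: nothing has been acquired, nothing to undo. */
	if (set->count == set->max)
		return -ENOSPC;

	set->file_refs++;			/* acquire file */

	if (fail_at == 1) {
		err = -ENOMEM;
		goto err_put_file;
	}
	set->mem_refs++;			/* acquire files memory */

	if (fail_at == 2) {
		err = -ENOMEM;
		goto err_free_files_mem;
	}
	set->entries++;				/* allocate entry */

	if (fail_at == 3) {
		err = -EIO;
		goto err_free_luo_file;
	}

	set->count++;
	return 0;

	/* Release in reverse order of acquisition. */
err_free_luo_file:
	set->entries--;
err_free_files_mem:
	set->mem_refs--;
err_put_file:
	set->file_refs--;
	return err;
}
```

With this layout every failure point jumps to the label that undoes
exactly what has been acquired so far, so adding a new step only needs
one new label.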

...

> +void luo_file_unpreserve_files(struct luo_file_set *file_set)
> +{
> +	struct luo_file *luo_file;
> +
> +	while (!list_empty(&file_set->files_list)) {

list_for_each_entry_safe_reverse()?

> +		struct liveupdate_file_op_args args = {0};
> +
> +		luo_file = list_last_entry(&file_set->files_list,
> +					   struct luo_file, list);
> +
> +		args.handler = luo_file->fh;
> +		args.file = luo_file->file;
> +		args.serialized_data = luo_file->serialized_data;
> +		luo_file->fh->ops->unpreserve(&args);
> +
> +		list_del(&luo_file->list);
> +		file_set->count--;
> +
> +		fput(luo_file->file);
> +		mutex_destroy(&luo_file->mutex);
> +		kfree(luo_file);
> +	}
> +
> +	luo_free_files_mem(file_set);
> +}

...

> +int luo_file_finish(struct luo_file_set *file_set)
> +{
> +	struct list_head *files_list = &file_set->files_list;
> +	struct luo_file *luo_file;
> +	int err;
> +
> +	if (!file_set->count)
> +		return 0;
> +
> +	list_for_each_entry(luo_file, files_list, list) {
> +		err = luo_file_can_finish_one(file_set, luo_file);
> +		if (err)
> +			return err;
> +	}
> +
> +	while (!list_empty(&file_set->files_list)) {

list_for_each_entry_safe_reverse()?

> +		luo_file = list_last_entry(&file_set->files_list,
> +					   struct luo_file, list);
> +
> +		luo_file_finish_one(file_set, luo_file);
> +
> +		if (luo_file->file)
> +			fput(luo_file->file);
> +		list_del(&luo_file->list);
> +		file_set->count--;
> +		mutex_destroy(&luo_file->mutex);
> +		kfree(luo_file);
> +	}
> +

...

> diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_internal.h
> index 1292ac47eef8..c8973b543d1d 100644
> --- a/kernel/liveupdate/luo_internal.h
> +++ b/kernel/liveupdate/luo_internal.h
> @@ -40,6 +40,28 @@ static inline int luo_ucmd_respond(struct luo_ucmd *ucmd,
>   */
>  #define luo_restore_fail(__fmt, ...) panic(__fmt, ##__VA_ARGS__)
>  
> +/* Mimics list_for_each_entry() but for private list head entries */
> +#define luo_list_for_each_private(pos, head, member)				\
> +	for (struct list_head *__iter = (head)->next;				\
> +	     __iter != (head) &&						\
> +	     ({ pos = container_of(__iter, typeof(*(pos)), member); 1; });	\
> +	     __iter = __iter->next)

Ideally something like this should go to include/linux/list.h, but it can
be done later to avoid bikeshedding about the name :)

And you can reuse most of list_for_each_entry(), just replace the line that
accesses the __private member:

#define luo_list_for_each_private(pos, head, member)			\
	for (pos = list_first_entry(head, typeof(*pos), member);	\
	     &ACCESS_PRIVATE(pos, member) != head;			\
	     pos = list_next_entry(pos, member))
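
For illustration only, a userspace sketch of how such an iterator
behaves. It assumes ACCESS_PRIVATE() reduces to plain member access
outside of sparse builds, and it re-creates minimal stand-ins for the
kernel list helpers; demo_handler, demo_list_add_tail() and
demo_sum_ids() are hypothetical names.

```c
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Minimal stand-ins for the kernel helpers (assumptions for this sketch). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#define ACCESS_PRIVATE(p, member) ((p)->member)
#define list_first_entry(head, type, member) \
	container_of((head)->next, type, member)
#define list_next_entry(pos, member) \
	container_of(ACCESS_PRIVATE(pos, member).next, __typeof__(*(pos)), member)

/* Same shape as list_for_each_entry(), with the private access isolated. */
#define demo_list_for_each_private(pos, head, member)			\
	for (pos = list_first_entry(head, __typeof__(*pos), member);	\
	     &ACCESS_PRIVATE(pos, member) != (head);			\
	     pos = list_next_entry(pos, member))

struct demo_handler {
	struct list_head list;
	int id;
};

static void demo_list_add_tail(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Walk the list and add up the ids of all handlers. */
static int demo_sum_ids(struct list_head *head)
{
	struct demo_handler *fh;
	int sum = 0;

	demo_list_for_each_private(fh, head, list)
		sum += fh->id;

	return sum;
}
```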

-- 
Sincerely yours,
Mike.


* Re: [PATCH v7 19/22] selftests/liveupdate: add test infrastructure and scripts
From: Mike Rapoport @ 2025-11-24  7:54 UTC
  To: Pasha Tatashin
  Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
	ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
	yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
	mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
	david, joel.granados, rostedt, anna.schumaker, song, linux,
	linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
	dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
	cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
	quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
	leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
	lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
	jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <20251122222351.1059049-20-pasha.tatashin@soleen.com>

On Sat, Nov 22, 2025 at 05:23:46PM -0500, Pasha Tatashin wrote:
> Subject: [PATCH v7 19/22] selftests/liveupdate: add test infrastructure and scripts

Maybe                                                ^ end to end

> Add the testing infrastructure required to verify the liveupdate
> feature. This includes a custom init process, a test orchestration
> script, and a batch runner.

And say here that it's an end-to-end test.
 
> The framework consists of:
> 
> init.c:
> A lightweight init process that manages the kexec lifecycle.
> It mounts necessary filesystems, determines the current execution
> stage (1 or 2) via the kernel command line, and handles the
> kexec_file_load() sequence to transition between kernels.
> 
> luo_test.sh:
> The primary KTAP-compliant test driver. It handles:
> - Kernel configuration merging and building.
> - Cross-compilation detection for x86_64 and arm64.
> - Generation of the initrd containing the test binary and init.
> - QEMU execution with automatic accelerator detection (KVM, HVF,
>  or TCG).
> 
> run.sh:
> A wrapper script to discover and execute all `luo_*.c`
> tests across supported architectures, providing a summary of
> pass/fail/skip results.
> 
> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> ---
>  tools/testing/selftests/liveupdate/init.c     | 174 ++++++++++
>  .../testing/selftests/liveupdate/luo_test.sh  | 296 ++++++++++++++++++
>  tools/testing/selftests/liveupdate/run.sh     |  68 ++++
>  3 files changed, 538 insertions(+)
>  create mode 100644 tools/testing/selftests/liveupdate/init.c
>  create mode 100755 tools/testing/selftests/liveupdate/luo_test.sh
>  create mode 100755 tools/testing/selftests/liveupdate/run.sh
> 

...

> +static int is_stage_2(void)
> +{
> +	char cmdline[COMMAND_LINE_SIZE];
> +	ssize_t len;
> +	int fd;
> +
> +	fd = open("/proc/cmdline", O_RDONLY);
> +	if (fd < 0)
> +		return 0;
> +
> +	len = read(fd, cmdline, sizeof(cmdline) - 1);
> +	close(fd);
> +
> +	if (len < 0)
> +		return 0;

Shouldn't we bail out of the test if reading the command line failed?

> +
> +	cmdline[len] = 0;
> +
> +	return !!strstr(cmdline, "luo_stage=2");
> +}
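
One possible shape for that, as a hedged sketch rather than the actual
init.c: the helper returns a third value when the command line cannot
be read, so the caller can abort instead of silently assuming stage 1.
The name demo_stage_from_file(), the path parameter and the 4096-byte
buffer are assumptions made for this example; treating an open()
failure the same way as a read() failure is also an assumption.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Returns 1 for stage 2, 0 for stage 1, and -1 if the command line
 * could not be read at all.
 */
static int demo_stage_from_file(const char *path)
{
	char cmdline[4096];	/* assumed size, stands in for COMMAND_LINE_SIZE */
	ssize_t len;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	len = read(fd, cmdline, sizeof(cmdline) - 1);
	close(fd);
	if (len < 0)
		return -1;

	cmdline[len] = '\0';

	return strstr(cmdline, "luo_stage=2") ? 1 : 0;
}
```

A caller would then check for a negative return value and fail the test
before deciding which stage to run.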
> +

...

> +function cleanup() {
> +	local exit_code=$?
> +
> +	if [ -z "$workspace_dir" ]; then
> +		ktap_finished
> +		return
> +	fi
> +
> +	if [ $exit_code -ne 0 ]; then
> +		echo "# Test failed (exit code $exit_code)."
> +		echo "# Workspace preserved at: $workspace_dir"
> +	elif [ "$KEEP_WORKSPACE" -eq 1 ]; then
> +		echo "# Workspace preserved (user request) at: $workspace_dir"
> +	else
> +		rm -fr "$workspace_dir"
> +	fi
> +	ktap_finished

	exit $exit_code

> +}

...

> +function build_kernel() {
> +	local build_dir=$1
> +	local make_cmd=$2
> +	local kimage=$3
> +	local target_arch=$4
> +
> +	local kconfig="$build_dir/.config"
> +	local common_conf="$test_dir/config"
> +	local arch_conf="$test_dir/config.$target_arch"
> +
> +	echo "# Building kernel in: $build_dir"
> +	$make_cmd defconfig
> +
> +	local fragments=""
> +	if [[ -f "$common_conf" ]]; then
> +		fragments="$fragments $common_conf"
> +	fi

Without this CONFIG_LIVEUPDATE won't be set
> +
> +	if [[ -f "$arch_conf" ]]; then
> +		fragments="$fragments $arch_conf"
> +	fi
> +
> +	if [[ -n "$fragments" ]]; then
> +		"$kernel_dir/scripts/kconfig/merge_config.sh" \
> +			-Q -m -O "$build_dir" "$kconfig" $fragments >> /dev/null
> +	fi

I believe you can just

	cat $common_conf $fragments >  $build_dir/.config
	make olddefconfig

without running defconfig at the beginning.
It will build faster; just make sure to add CONFIG_SERIAL_ to $arch_conf.

> +	$make_cmd olddefconfig
> +	$make_cmd "$kimage"
> +	$make_cmd headers_install INSTALL_HDR_PATH="$headers_dir"
> +}
> +
> +function mkinitrd() {
> +	local build_dir=$1
> +	local kernel_path=$2
> +	local test_name=$3
> +
> +	# 1. Compile the test binary and the init process

Didn't find 2. ;-)
Don't think we want the numbering here, plain comments are fine

> +	"$CROSS_COMPILE"gcc -static -O2 \
> +		-I "$headers_dir/include" \
> +		-I "$test_dir" \
> +		-o "$workspace_dir/test_binary" \
> +		"$test_dir/$test_name.c" "$test_dir/luo_test_utils.c"

This will have a hard time cross-compiling with -nolibc toolchains.

> +
> +	"$CROSS_COMPILE"gcc -s -static -Os -nostdinc -nostdlib		\
> +			-fno-asynchronous-unwind-tables -fno-ident	\
> +			-fno-stack-protector				\
> +			-I "$headers_dir/include"			\
> +			-I "$kernel_dir/tools/include/nolibc"		\
> +			-o "$workspace_dir/init" "$test_dir/init.c"

This failed for me with gcc 14.2.0 (Debian 14.2.0-19):

/home/mike/git/linux/tools/testing/selftests/liveupdate/init.c: In function ‘run_test’:
/home/mike/git/linux/tools/testing/selftests/liveupdate/init.c:111:65: error: initializer element is not constant
  111 |             static const char *const argv[] = {TEST_BINARY, stage_arg, NULL};
      |                                                             ^~~~~~~~~

/home/mike/git/linux/tools/testing/selftests/liveupdate/init.c:111:65: note: (near initialization for ‘argv[1]’)
/home/mike/git/linux/tools/testing/selftests/liveupdate/init.c:113:37: error: passing argument 2 of ‘execve’ from incompatible pointer type [-Wincompatible-pointer-types]
  113 |                 execve(TEST_BINARY, argv, NULL);
      |                                     ^~~~
      |                                     |
      |                                     const char * const*
In file included from /home/mike/git/linux/tools/testing/selftests/liveupdate/init.c:16:
/usr/include/unistd.h:572:52: note: expected ‘char * const*’ but argument is of type ‘const char * const*’
  572 | extern int execve (const char *__path, char *const __argv[],
      |                                        ~~~~~~~~~~~~^~~~~~~~
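
Both errors come from the argv declaration: a static array cannot be
initialized from a run-time value, and execve() takes `char *const *`,
not `const char *const *`. A minimal sketch of one way to declare it,
assuming stage_arg is a writable run-time string; demo_exec_test() and
its parameters are hypothetical names, not the ones in init.c:

```c
#include <stddef.h>
#include <unistd.h>

/* Build argv on the stack at run time and hand it to execve(). */
static int demo_exec_test(const char *binary, char *stage_arg)
{
	/*
	 * Automatic (non-static) array: run-time initializers are allowed.
	 * Element type is char *, which converts to the char *const *
	 * parameter of execve() without a warning.
	 */
	char *argv[] = { (char *)binary, stage_arg, NULL };

	/* Only returns on failure, with -1. */
	return execve(binary, argv, NULL);
}
```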

> +
> +	cat > "$workspace_dir/cpio_list_inner" <<EOF
> +dir /dev 0755 0 0
> +dir /proc 0755 0 0
> +dir /debugfs 0755 0 0
> +nod /dev/console 0600 0 0 c 5 1

Don't you need a /dev/liveupdate node?

> +file /init $workspace_dir/init 0755 0 0
> +file /test_binary $workspace_dir/test_binary 0755 0 0
> +EOF
> +
> +	# Generate inner_initrd.cpio
> +	"$build_dir/usr/gen_init_cpio" "$workspace_dir/cpio_list_inner" > "$workspace_dir/inner_initrd.cpio"
> +
> +	cat > "$workspace_dir/cpio_list" <<EOF
> +dir /dev 0755 0 0
> +dir /proc 0755 0 0
> +dir /debugfs 0755 0 0
> +nod /dev/console 0600 0 0 c 5 1

And here as well.

> +file /init $workspace_dir/init 0755 0 0
> +file /kernel $kernel_path 0644 0 0
> +file /test_binary $workspace_dir/test_binary 0755 0 0
> +file /initrd.img $workspace_dir/inner_initrd.cpio 0644 0 0
> +EOF
> +
> +	# Generate the final initrd
> +	"$build_dir/usr/gen_init_cpio" "$workspace_dir/cpio_list" > "$initrd"
> +	local size=$(du -h "$initrd" | cut -f1)
> +}
> +
> +function run_qemu() {
> +	local qemu_cmd=$1
> +	local cmdline=$2
> +	local kernel_path=$3
> +	local serial="$workspace_dir/qemu.serial"
> +
> +	local accel="-accel tcg"
> +	local host_machine=$(uname -m)
> +
> +	[[ "$host_machine" == "arm64" ]] && host_machine="aarch64"
> +	[[ "$host_machine" == "x86_64" ]] && host_machine="x86_64"
> +
> +	if [[ "$qemu_cmd" == *"$host_machine"* ]]; then
> +		if [ -w /dev/kvm ]; then
> +			accel="-accel kvm"

Just pass both kvm and tcg and let qemu complain.

> +		fi
> +	fi
> +
> +	cmdline="$cmdline liveupdate=on panic=-1"
> +
> +	echo "# Serial Log: $serial"
> +	timeout 30s $qemu_cmd -m 1G -smp 2 -no-reboot -nographic -nodefaults	\
> +		  $accel							\
> +		  -serial file:"$serial"					\
> +		  -append "$cmdline"						\
> +		  -kernel "$kernel_path"					\
> +		  -initrd "$initrd"
> +
> +	local ret=$?
> +
> +	if [ $ret -eq 124 ]; then
> +		fail "QEMU timed out"
> +	fi
> +
> +	grep "TEST PASSED" "$serial" &> /dev/null || fail "Liveupdate failed. Check $serial for details."
> +}

-- 
Sincerely yours,
Mike.


* Re: [PATCH v7 18/22] selftests/liveupdate: Add kexec test for multiple and empty sessions
From: Mike Rapoport @ 2025-11-24  5:30 UTC
  To: Pasha Tatashin
  Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
	ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
	yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
	mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
	david, joel.granados, rostedt, anna.schumaker, song, linux,
	linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
	dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
	cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
	quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
	leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
	lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
	jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <20251122222351.1059049-19-pasha.tatashin@soleen.com>

On Sat, Nov 22, 2025 at 05:23:45PM -0500, Pasha Tatashin wrote:
> Introduce a new kexec-based selftest, luo_kexec_multi_session, to
> validate the end-to-end lifecycle of a more complex LUO scenario.
> 
> While the existing luo_kexec_simple test covers the basic end-to-end
> lifecycle, it is limited to a single session with one preserved file.
> This new test significantly expands coverage by verifying LUO's ability
> to handle a mixed workload involving multiple sessions, some of which
> are intentionally empty. This ensures that the LUO core correctly
> preserves and restores the state of all session types across a reboot.
> 
> The test validates the following sequence:
> 
> Stage 1 (Pre-kexec):
> 
>   - Creates two empty test sessions (multi-test-empty-1,
>     multi-test-empty-2).
>   - Creates a session with one preserved memfd (multi-test-files-1).
>   - Creates another session with two preserved memfds
>     (multi-test-files-2), each containing unique data.
>   - Creates a state-tracking session to manage the transition to
>     Stage 2.
>   - Executes a kexec reboot via the helper script.
> 
> Stage 2 (Post-kexec):
> 
>   - Retrieves the state-tracking session to confirm it is in the
>     post-reboot stage.
>   - Retrieves all four test sessions (both the empty and non-empty
>     ones).
>   - For the non-empty sessions, restores the preserved memfds and
>     verifies their contents match the original data patterns.
>   - Finalizes all test sessions and the state session to ensure a clean
>     teardown and that all associated kernel resources are correctly
>     released.
> 
> This test provides greater confidence in the robustness of the LUO
> framework by validating its behavior in a more realistic, multi-faceted
> scenario.
> 
> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>

Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

> ---
>  tools/testing/selftests/liveupdate/Makefile   |   1 +
>  .../selftests/liveupdate/luo_multi_session.c  | 162 ++++++++++++++++++
>  2 files changed, 163 insertions(+)
>  create mode 100644 tools/testing/selftests/liveupdate/luo_multi_session.c

-- 
Sincerely yours,
Mike.


* Re: [PATCH v7 17/22] selftests/liveupdate: Add kexec-based selftest for
From: Mike Rapoport @ 2025-11-24  5:29 UTC
  To: Pasha Tatashin
  Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
	ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
	yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
	mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
	david, joel.granados, rostedt, anna.schumaker, song, linux,
	linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
	dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
	cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
	quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
	leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
	lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
	jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <20251122222351.1059049-18-pasha.tatashin@soleen.com>

On Sat, Nov 22, 2025 at 05:23:44PM -0500, Pasha Tatashin wrote:
> Subject: selftests/liveupdate: Add kexec-based selftest for

                                                         ^ for what? ;-)

> Introduce a kexec-based selftest, luo_kexec_simple, to validate the
> end-to-end lifecycle of a Live Update Orchestrator (LUO) session across
> a reboot.
> 
> While existing tests verify the uAPI in a pre-reboot context, this test
> ensures that the core functionality—preserving state via Kexec Handover
> and restoring it in a new kernel—works as expected.
> 
> The test operates in two stages, managing its state across the reboot by
> preserving a dedicated "state session" containing a memfd. This
> mechanism dogfoods the LUO feature itself for state tracking, making the
> test self-contained.
> 
> The test validates the following sequence:
> 
> Stage 1 (Pre-kexec):
>  - Creates a test session (test-session).
>  - Creates and preserves a memfd with a known data pattern into the test
>    session.
>  - Creates the state-tracking session to signal progression to Stage 2.
>  - Executes a kexec reboot via a helper script.
> 
> Stage 2 (Post-kexec):
>  - Retrieves the state-tracking session to confirm it is in the
>    post-reboot stage.
>  - Retrieves the preserved test session.
>  - Restores the memfd from the test session and verifies its contents
>    match the original data pattern written in Stage 1.
>  - Finalizes both the test and state sessions to ensure a clean
>    teardown.
> 
> The test relies on a helper script (do_kexec.sh) to perform the reboot
> and a shared utility library (luo_test_utils.c) for common LUO
> operations, keeping the main test logic clean and focused.
> 
> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>

Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

> ---
>  tools/testing/selftests/liveupdate/Makefile   |   6 +
>  .../testing/selftests/liveupdate/do_kexec.sh  |  16 ++
>  .../selftests/liveupdate/luo_kexec_simple.c   |  89 ++++++
>  .../selftests/liveupdate/luo_test_utils.c     | 266 ++++++++++++++++++
>  .../selftests/liveupdate/luo_test_utils.h     |  44 +++
>  5 files changed, 421 insertions(+)
>  create mode 100755 tools/testing/selftests/liveupdate/do_kexec.sh
>  create mode 100644 tools/testing/selftests/liveupdate/luo_kexec_simple.c
>  create mode 100644 tools/testing/selftests/liveupdate/luo_test_utils.c
>  create mode 100644 tools/testing/selftests/liveupdate/luo_test_utils.h

-- 
Sincerely yours,
Mike.


* Re: [PATCH v7 16/22] selftests/liveupdate: Add userspace API selftests
From: Mike Rapoport @ 2025-11-24  5:24 UTC
  To: Pasha Tatashin
  Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
	ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
	yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
	mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
	david, joel.granados, rostedt, anna.schumaker, song, linux,
	linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
	dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
	cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
	quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
	leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
	lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
	jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <20251122222351.1059049-17-pasha.tatashin@soleen.com>

On Sat, Nov 22, 2025 at 05:23:43PM -0500, Pasha Tatashin wrote:
> Introduce a selftest suite for LUO. These tests validate the core
> userspace-facing API provided by the /dev/liveupdate device and its
> associated ioctls.
> 
> The suite covers fundamental device behavior, session management, and
> the file preservation mechanism using memfd as a test case. This
> provides regression testing for the LUO uAPI.
> 
> The following functionality is verified:
> 
> Device Access:
>     Basic open and close operations on /dev/liveupdate.
>     Enforcement of exclusive device access (verifying EBUSY on a
>     second open).
> 
> Session Management:
>     Successful creation of sessions with unique names.
>     Failure to create sessions with duplicate names.
> 
> File Preservation:
>     Preserving a single memfd and verifying its content remains
>     intact post-preservation.
>     Preserving multiple memfds within a single session, each with
>     unique data.
>     A complex scenario involving multiple sessions, each containing
>     a mix of empty and data-filled memfds.
> 
> Note: This test suite is limited to verifying the pre-kexec
> functionality of LUO (e.g., session creation, file preservation).
> The post-kexec restoration of resources is not covered, as the kselftest
> framework does not currently support orchestrating a reboot and
> continuing execution in the new kernel.
> 
> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>

Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

> ---
>  MAINTAINERS                                   |   1 +
>  tools/testing/selftests/Makefile              |   1 +
>  tools/testing/selftests/liveupdate/.gitignore |   9 +
>  tools/testing/selftests/liveupdate/Makefile   |  27 ++
>  tools/testing/selftests/liveupdate/config     |  11 +
>  .../testing/selftests/liveupdate/liveupdate.c | 348 ++++++++++++++++++
>  6 files changed, 397 insertions(+)
>  create mode 100644 tools/testing/selftests/liveupdate/.gitignore
>  create mode 100644 tools/testing/selftests/liveupdate/Makefile
>  create mode 100644 tools/testing/selftests/liveupdate/config
>  create mode 100644 tools/testing/selftests/liveupdate/liveupdate.c

-- 
Sincerely yours,
Mike.


* Re: [PATCH v7 07/22] liveupdate: luo_session: Add ioctls for file preservation
From: Mike Rapoport @ 2025-11-24  5:20 UTC
  To: Pasha Tatashin
  Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
	ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
	yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
	mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
	david, joel.granados, rostedt, anna.schumaker, song, linux,
	linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
	dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
	cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
	quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
	leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
	lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
	jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <20251122222351.1059049-8-pasha.tatashin@soleen.com>

On Sat, Nov 22, 2025 at 05:23:34PM -0500, Pasha Tatashin wrote:
> Introducing the userspace interface and internal logic required to
> manage the lifecycle of file descriptors within a session. Previously, a
> session was merely a container; this change makes it a functional
> management unit.
> 
> The following capabilities are added:
> 
> A new set of ioctl commands are added, which operate on the file
> descriptor returned by CREATE_SESSION. This allows userspace to:
> - LIVEUPDATE_SESSION_PRESERVE_FD: Add a file descriptor to a session
>   to be preserved across the live update.
> - LIVEUPDATE_SESSION_RETRIEVE_FD: Retrieve a preserved file in the
>   new kernel using its unique token.
> - LIVEUPDATE_SESSION_FINISH: finish session
> 
> The session's .release handler is enhanced to be state-aware. When a
> session's file descriptor is closed, it correctly unpreserves
> the session based on its current state before freeing all
> associated file resources.
> 
> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> Reviewed-by: Pratyush Yadav <pratyush@kernel.org>

Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

> ---
>  include/uapi/linux/liveupdate.h | 103 ++++++++++++++++++
>  kernel/liveupdate/luo_session.c | 187 +++++++++++++++++++++++++++++++-
>  2 files changed, 288 insertions(+), 2 deletions(-)

-- 
Sincerely yours,
Mike.

^ permalink raw reply

* Re: [PATCH v7 01/22] liveupdate: luo_core: Live Update Orchestrator
From: Mike Rapoport @ 2025-11-24  5:07 UTC (permalink / raw)
  To: Pasha Tatashin
  Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
	ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
	yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
	mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
	david, joel.granados, rostedt, anna.schumaker, song, linux,
	linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
	dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
	cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
	quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
	leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
	lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
	jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <CA+CK2bCN7x=eMwfTXF-2+vR=Gn3=41z6Xxx6wM1m7i-rxzug9w@mail.gmail.com>

On Sun, Nov 23, 2025 at 07:15:44AM -0500, Pasha Tatashin wrote:
> On Sun, Nov 23, 2025 at 6:12 AM Mike Rapoport <rppt@kernel.org> wrote:
> >
> > On Sat, Nov 22, 2025 at 05:23:28PM -0500, Pasha Tatashin wrote:
> > > Introduce LUO, a mechanism intended to facilitate kernel updates while
> > > keeping designated devices operational across the transition (e.g., via
> > > kexec). The primary use case is updating hypervisors with minimal
> > > disruption to running virtual machines. For userspace side of hypervisor
> > > update we have copyless migration. LUO is for updating the kernel.
> > >
> > > This initial patch lays the groundwork for the LUO subsystem.
> > >
> > > Further functionality, including the implementation of state transition
> > > logic, integration with KHO, and hooks for subsystems and file
> > > descriptors, will be added in subsequent patches.
> > >
> > > Create a character device at /dev/liveupdate.
> > >
> > > A new uAPI header, <uapi/linux/liveupdate.h>, will define the necessary
> > > structures. The magic number for IOCTL is registered in
> > > Documentation/userspace-api/ioctl/ioctl-number.rst.
> > >
> > > Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> > > Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
> >
> > Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> 
> Thank you
> 
> >
> > with a few nits below
> >
> > > ---
> >
> > > diff --git a/kernel/liveupdate/Kconfig b/kernel/liveupdate/Kconfig
> > > index a973a54447de..90857dccb359 100644
> > > --- a/kernel/liveupdate/Kconfig
> > > +++ b/kernel/liveupdate/Kconfig
> > > @@ -1,4 +1,10 @@
> > >  # SPDX-License-Identifier: GPL-2.0-only
> > > +#
> > > +# Copyright (c) 2025, Google LLC.
> > > +# Pasha Tatashin <pasha.tatashin@soleen.com>
> > > +#
> > > +# Live Update Orchestrator
> > > +#
> >
> > If you are adding copyrights it should have Amazon and Microsoft as well.
> > I believe those from kexec_handover.c would work.
> >
> > @Alex?
> 
> Sure, or I can remove all of them from Kconfig, whatever you prefer :-)

Quick grepping shows that the vast majority of Kconfigs do not have a
copyright notice; let's just drop it.

> > >  menu "Live Update and Kexec HandOver"
> > >       depends on !DEFERRED_STRUCT_PAGE_INIT
> > > @@ -51,4 +57,25 @@ config KEXEC_HANDOVER_ENABLE_DEFAULT
> > >         The default behavior can still be overridden at boot time by
> > >         passing 'kho=off'.
> > >
> > > +config LIVEUPDATE
> > > +     bool "Live Update Orchestrator"
> > > +     depends on KEXEC_HANDOVER
> > > +     help
> > > +       Enable the Live Update Orchestrator. Live Update is a mechanism,
> > > +       typically based on kexec, that allows the kernel to be updated
> > > +       while keeping selected devices operational across the transition.
> > > +       These devices are intended to be reclaimed by the new kernel and
> > > +       re-attached to their original workload without requiring a device
> > > +       reset.
> > > +
> > > +       Ability to handover a device from current to the next kernel depends
> > > +       on specific support within device drivers and related kernel
> > > +       subsystems.
> >
> > Sorry, somehow this slipped during v6 review.
> > These days LUO is less about devices and more about file descriptors :)
> 
> Device preservation through file descriptors: memfd, iommufd, vfiofd
> are all dependencies for preserving devices.
> 
> That Kconfig description is correct and essential because the core
> complexity of the LUO is the preservation of device state and I/O
> across a kernel transition, which is a harder problem than just
> preserving memory or files. For that we could have used a file system
> instead of inventing something new with the can_preserve() logic etc.
> 
> Device preservation requires exactly what is stated in the description
> for this config:
> "Ability to handover a device from current to the next kernel depends
> on specific support within device drivers and related kernel
> subsystems." The only subsystem that is getting upstreamed with this
> series is MEMFD; it is a hard prerequisite for iommufd
> preservation. The other subsystems (VFIO, PCI, IOMMU) are WIP.
 
Ok.

-- 
Sincerely yours,
Mike.

^ permalink raw reply

* Re: [PATCH v7 14/22] mm: memfd_luo: allow preserving memfd
From: Pasha Tatashin @ 2025-11-24  3:13 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
	ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
	yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
	mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
	david, joel.granados, rostedt, anna.schumaker, song, linux,
	linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
	dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
	cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
	quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
	leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
	lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
	jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <aSMsqD5mB2mHHH9v@kernel.org>

> > +unlock_folio:
> > +     folio_unlock(folio);
> > +     folio_put(folio);
> > +     i++;
>
> I'd add a counter and use it in the for loop below.

Done.

>
> > +put_folios:
> > +     /*
> > +      * Note: don't free the folios already added to the file. They will be
> > +      * freed when the file is freed. Free the ones not added yet here.
> > +      */
> > +     for (; i < nr_folios; i++) {
> > +             const struct memfd_luo_folio_ser *pfolio = &folios_ser[i];
> > +
> > +             folio = kho_restore_folio(pfolio->pfn);
> > +             if (folio)
> > +                     folio_put(folio);
> > +     }
> > +
> > +     return err;
> > +}
>
> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

Thanks!

Pasha

>
> --
> Sincerely yours,
> Mike.

^ permalink raw reply

* Re: [PATCH v6 6/9] futex: Wire up get_robust_list2 syscall
From: Thomas Gleixner @ 2025-11-23 20:02 UTC (permalink / raw)
  To: Arnd Bergmann, kernel test robot, André Almeida, Ingo Molnar,
	Peter Zijlstra, Darren Hart, Davidlohr Bueso,
	Sebastian Andrzej Siewior, Waiman Long, Ryan Houdek
  Cc: oe-kbuild-all, linux-kernel, linux-kselftest, linux-api,
	kernel-dev
In-Reply-To: <326957b0-fbce-4850-a8fb-8eed90fc4fae@app.fastmail.com>

On Sun, Nov 23 2025 at 20:19, Arnd Bergmann wrote:
> On Sun, Nov 23, 2025, at 19:47, Thomas Gleixner wrote:
>> On Sat, Nov 22 2025 at 14:49, kernel test robot wrote:
>>> kernel test robot noticed the following build warnings:
>>>
>>> [auto build test WARNING on c42ba5a87bdccbca11403b7ca8bad1a57b833732]
>>>
>>> url:    https://github.com/intel-lab-lkp/linux/commits/Andr-Almeida/futex-Use-explicit-sizes-for-compat_robust_list-structs/20251122-135406
>>> base:   c42ba5a87bdccbca11403b7ca8bad1a57b833732
>>> patch link:    https://lore.kernel.org/r/20251122-tonyk-robust_futex-v6-6-05fea005a0fd%40igalia.com
>>> patch subject: [PATCH v6 6/9] futex: Wire up get_robust_list2 syscall
>>> config: arc-allnoconfig (https://download.01.org/0day-ci/archive/20251122/202511221454.rsysOoSt-lkp@intel.com/config)
>>> compiler: arc-linux-gcc (GCC) 15.1.0
>>> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251122/202511221454.rsysOoSt-lkp@intel.com/reproduce)
>>>
>>> If you fix the issue in a separate patch/commit (i.e. not just a new version of
>>> the same patch/commit), kindly add following tags
>>> | Reported-by: kernel test robot <lkp@intel.com>
>>> | Closes: https://lore.kernel.org/oe-kbuild-all/202511221454.rsysOoSt-lkp@intel.com/
>>>
>>> All warnings (new ones prefixed by >>):
>>>
>>>>> <stdin>:1627:2: warning: #warning syscall get_robust_list2 not implemented [-Wcpp]
>>> --
>>>>> <stdin>:1627:2: warning: #warning syscall get_robust_list2 not implemented [-Wcpp]
>>
>> Lacks a COND_SYSCALL()
>
> No, it's actually
>
> scripts/syscall.tbl
>
> that is missing, which means that the newer architectures
> are missing the update. This used to be include/uapi/asm/unistd.h,
> which still exists but is now unused.

So it's both. That syscall depends on CONFIG_FUTEX, which means
COND_SYSCALL() is required and it's actually added in patch 9/9 while 5/9
which adds the set() variant adds it right away :)
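
For readers who have not met COND_SYSCALL() before: its purpose is to give
an optional syscall a fallback that returns -ENOSYS when the real
implementation is configured out. A minimal user-space sketch of that idea
follows, using the GCC/Clang weak alias extension on ELF targets. The names
sys_ni_sketch, sys_optional_sketch and call_optional are hypothetical, and
the kernel's actual macro differs per architecture, so treat this as an
illustration of the mechanism rather than the kernel's implementation.

```c
#include <errno.h>

/* Stand-in for the kernel's "not implemented" syscall handler. */
long sys_ni_sketch(void)
{
	return -ENOSYS;
}

/*
 * Weak alias: if no other object file provides a strong definition of
 * sys_optional_sketch(), calls resolve to sys_ni_sketch() instead of
 * failing at link time.
 */
long sys_optional_sketch(void) __attribute__((weak, alias("sys_ni_sketch")));

/* Caller does not need to know whether the feature was built in. */
long call_optional(void)
{
	return sys_optional_sketch();
}
```

In this sketch nothing supplies a strong definition, so call_optional()
falls through to the -ENOSYS stub, which is the situation a kernel built
without CONFIG_FUTEX would be in.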

Thanks,

        tglx

^ permalink raw reply

* Re: [PATCH v7 11/22] mm: shmem: allow freezing inode mapping
From: Pasha Tatashin @ 2025-11-23 19:43 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
	ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
	yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
	mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
	david, joel.granados, rostedt, anna.schumaker, song, linux,
	linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
	dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
	cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
	quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
	leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
	lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
	jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <aSMoRRtanMkHo9Tr@kernel.org>

On Sun, Nov 23, 2025 at 10:29 AM Mike Rapoport <rppt@kernel.org> wrote:
>
> On Sat, Nov 22, 2025 at 05:23:38PM -0500, Pasha Tatashin wrote:
> > From: Pratyush Yadav <ptyadav@amazon.de>
> >
> > To prepare a shmem inode for live update, its index -> folio mappings
> > must be serialized. Once the mappings are serialized, they cannot change
> > since it would cause the serialized data to become inconsistent. This
> > can be done by pinning the folios to avoid migration, and by making sure
> > no folios can be added to or removed from the inode.
> >
> > While mechanisms to pin folios already exist, the only way to stop
> > folios being added or removed are the grow and shrink file seals. But
> > file seals come with their own semantics, one of which is that they
> > can't be removed. This doesn't work with liveupdate since it can be
> > cancelled or error out, which would need the seals to be removed and the
> > file's normal functionality to be restored.
> >
> > Introduce SHMEM_F_MAPPING_FROZEN to indicate this instead. It is
> > internal to shmem and is not directly exposed to userspace. It functions
> > similar to F_SEAL_GROW | F_SEAL_SHRINK, but additionally disallows hole
> > punching, and can be removed.
> >
> > Signed-off-by: Pratyush Yadav <ptyadav@amazon.de>
> > Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> > ---
> >  include/linux/shmem_fs.h | 17 +++++++++++++++++
> >  mm/shmem.c               | 19 ++++++++++++++++---
> >  2 files changed, 33 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
> > index 650874b400b5..d34a64eafe60 100644
> > --- a/include/linux/shmem_fs.h
> > +++ b/include/linux/shmem_fs.h
> > @@ -24,6 +24,14 @@ struct swap_iocb;
> >  #define SHMEM_F_NORESERVE    BIT(0)
> >  /* Disallow swapping. */
> >  #define SHMEM_F_LOCKED               BIT(1)
> > +/*
> > + * Disallow growing, shrinking, or hole punching in the inode. Combined with
> > + * folio pinning, makes sure the inode's mapping stays fixed.
> > + *
> > + * In some ways similar to F_SEAL_GROW | F_SEAL_SHRINK, but can be removed and
> > + * isn't directly visible to userspace.
> > + */
> > +#define SHMEM_F_MAPPING_FROZEN       BIT(2)
> >
> >  struct shmem_inode_info {
> >       spinlock_t              lock;
> > @@ -186,6 +194,15 @@ static inline bool shmem_file(struct file *file)
> >       return shmem_mapping(file->f_mapping);
> >  }
> >
> > +/* Must be called with inode lock taken exclusive. */
> > +static inline void shmem_freeze(struct inode *inode, bool freeze)
> > +{
> > +     if (freeze)
> > +             SHMEM_I(inode)->flags |= SHMEM_F_MAPPING_FROZEN;
> > +     else
> > +             SHMEM_I(inode)->flags &= ~SHMEM_F_MAPPING_FROZEN;
> > +}
> > +
> >  /*
> >   * If fallocate(FALLOC_FL_KEEP_SIZE) has been used, there may be pages
> >   * beyond i_size's notion of EOF, which fallocate has committed to reserving:
> > diff --git a/mm/shmem.c b/mm/shmem.c
> > index 1d5036dec08a..cb74a5d202ac 100644
> > --- a/mm/shmem.c
> > +++ b/mm/shmem.c
> > @@ -1292,9 +1292,13 @@ static int shmem_setattr(struct mnt_idmap *idmap,
> >               loff_t newsize = attr->ia_size;
> >
> >               /* protected by i_rwsem */
> > -             if ((newsize < oldsize && (info->seals & F_SEAL_SHRINK)) ||
> > -                 (newsize > oldsize && (info->seals & F_SEAL_GROW)))
> > -                     return -EPERM;
> > +             if (newsize != oldsize) {
> > +                     if (info->flags & SHMEM_F_MAPPING_FROZEN)
> > +                             return -EPERM;
> > +                     if ((newsize < oldsize && (info->seals & F_SEAL_SHRINK)) ||
> > +                         (newsize > oldsize && (info->seals & F_SEAL_GROW)))
> > +                             return -EPERM;
> > +             }
> >
> >               if (newsize != oldsize) {
>
> I'd stick
>
>                         if (info->flags & SHMEM_F_MAPPING_FROZEN)
>                                 return -EPERM;
>
> here and leave the seals check alone.

Done.
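
A rough user-space model of the check ordering suggested above, with the
seal checks left untouched and the frozen check placed inside the existing
"size changed" branch. The function name check_resize and the SKETCH_*
constants are hypothetical stand-ins, not the kernel definitions, and the
real shmem_setattr() does considerably more than this.

```c
#include <errno.h>

/* Illustrative flag values; the real ones live in kernel headers. */
#define SKETCH_F_MAPPING_FROZEN	(1u << 2)
#define SKETCH_SEAL_SHRINK	(1u << 1)
#define SKETCH_SEAL_GROW	(1u << 2)

int check_resize(unsigned int flags, unsigned int seals,
		 long long oldsize, long long newsize)
{
	/* Seal checks, unchanged from before the patch. */
	if ((newsize < oldsize && (seals & SKETCH_SEAL_SHRINK)) ||
	    (newsize > oldsize && (seals & SKETCH_SEAL_GROW)))
		return -EPERM;

	if (newsize != oldsize) {
		/* A frozen mapping rejects any size change. */
		if (flags & SKETCH_F_MAPPING_FROZEN)
			return -EPERM;
	}

	return 0;
}
```

The behaviour is the same as in the posted hunk; only the diff against the
existing seal check gets smaller.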

>
> Other than that
>
> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

Thanks.

>
> --
> Sincerely yours,
> Mike.

^ permalink raw reply

* Re: [PATCH v7 08/22] docs: add luo documentation
From: Pasha Tatashin @ 2025-11-23 19:29 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
	ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
	yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
	mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
	david, joel.granados, rostedt, anna.schumaker, song, linux,
	linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
	dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
	cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
	quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
	leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
	lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
	jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <aSMwsLstAutayHbC@kernel.org>

On Sun, Nov 23, 2025 at 11:05 AM Mike Rapoport <rppt@kernel.org> wrote:
>
> On Sat, Nov 22, 2025 at 05:23:35PM -0500, Pasha Tatashin wrote:
> > Add the documentation files for the Live Update Orchestrator
> >
> > Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> > ---
>
> > +Public API
> > +==========
> > +.. kernel-doc:: include/linux/liveupdate.h
> > +
> > +.. kernel-doc:: include/linux/kho/abi/luo.h
>
> Please add
>
>    :functions:
>
> here, otherwise "DOC: Live Update Orchestrator ABI" is repeated here as
> well in the generated html.

Done, thanks!

>
> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
>
> --
> Sincerely yours,
> Mike.

^ permalink raw reply

* Re: [PATCH v7 05/22] liveupdate: luo_core: add user interface
From: Pasha Tatashin @ 2025-11-23 19:25 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
	ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
	yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
	mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
	david, joel.granados, rostedt, anna.schumaker, song, linux,
	linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
	dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
	cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
	quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
	leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
	lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
	jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <aSMX_pnUShilO_sj@kernel.org>

On Sun, Nov 23, 2025 at 9:20 AM Mike Rapoport <rppt@kernel.org> wrote:
>
> On Sat, Nov 22, 2025 at 05:23:32PM -0500, Pasha Tatashin wrote:
> > Introduce the user-space interface for the Live Update Orchestrator
> > via ioctl commands, enabling external control over the live update
> > process and management of preserved resources.
> >
> > The idea is that there is going to be a single userspace agent driving
> > the live update, therefore, only a single process can ever hold this
> > device opened at a time.
> >
> > The following ioctl commands are introduced:
> >
> > LIVEUPDATE_IOCTL_CREATE_SESSION
> > Provides a way for userspace to create a named session for grouping file
> > descriptors that need to be preserved. It returns a new file descriptor
> > representing the session.
> >
> > LIVEUPDATE_IOCTL_RETRIEVE_SESSION
> > Allows the userspace agent in the new kernel to reclaim a preserved
> > session by its name, receiving a new file descriptor to manage the
> > restored resources.
> >
> > Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
>
> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

Thanks

>
> > ---
> >  include/uapi/linux/liveupdate.h  |  64 +++++++++++
> >  kernel/liveupdate/luo_core.c     | 179 ++++++++++++++++++++++++++++++-
> >  kernel/liveupdate/luo_internal.h |  21 ++++
> >  3 files changed, 263 insertions(+), 1 deletion(-)
>
> --
> Sincerely yours,
> Mike.

^ permalink raw reply

* Re: [PATCH v6 6/9] futex: Wire up get_robust_list2 syscall
From: Arnd Bergmann @ 2025-11-23 19:19 UTC (permalink / raw)
  To: Thomas Gleixner, kernel test robot, André Almeida,
	Ingo Molnar, Peter Zijlstra, Darren Hart, Davidlohr Bueso,
	Sebastian Andrzej Siewior, Waiman Long, Ryan Houdek
  Cc: oe-kbuild-all, linux-kernel, linux-kselftest, linux-api,
	kernel-dev
In-Reply-To: <87ms4cio14.ffs@tglx>

On Sun, Nov 23, 2025, at 19:47, Thomas Gleixner wrote:
> On Sat, Nov 22 2025 at 14:49, kernel test robot wrote:
>> kernel test robot noticed the following build warnings:
>>
>> [auto build test WARNING on c42ba5a87bdccbca11403b7ca8bad1a57b833732]
>>
>> url:    https://github.com/intel-lab-lkp/linux/commits/Andr-Almeida/futex-Use-explicit-sizes-for-compat_robust_list-structs/20251122-135406
>> base:   c42ba5a87bdccbca11403b7ca8bad1a57b833732
>> patch link:    https://lore.kernel.org/r/20251122-tonyk-robust_futex-v6-6-05fea005a0fd%40igalia.com
>> patch subject: [PATCH v6 6/9] futex: Wire up get_robust_list2 syscall
>> config: arc-allnoconfig (https://download.01.org/0day-ci/archive/20251122/202511221454.rsysOoSt-lkp@intel.com/config)
>> compiler: arc-linux-gcc (GCC) 15.1.0
>> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251122/202511221454.rsysOoSt-lkp@intel.com/reproduce)
>>
>> If you fix the issue in a separate patch/commit (i.e. not just a new version of
>> the same patch/commit), kindly add following tags
>> | Reported-by: kernel test robot <lkp@intel.com>
>> | Closes: https://lore.kernel.org/oe-kbuild-all/202511221454.rsysOoSt-lkp@intel.com/
>>
>> All warnings (new ones prefixed by >>):
>>
>>>> <stdin>:1627:2: warning: #warning syscall get_robust_list2 not implemented [-Wcpp]
>> --
>>>> <stdin>:1627:2: warning: #warning syscall get_robust_list2 not implemented [-Wcpp]
>
> Lacks a COND_SYSCALL()

No, it's actually

scripts/syscall.tbl

that is missing, which means that the newer architectures
are missing the update. This used to be include/uapi/asm/unistd.h,
which still exists but is now unused.

     Arnd

^ permalink raw reply

* Re: [PATCH v7 04/22] liveupdate: luo_session: add sessions support
From: Pasha Tatashin @ 2025-11-23 19:07 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
	ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
	yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
	mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
	david, joel.granados, rostedt, anna.schumaker, song, linux,
	linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
	dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
	cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
	quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
	leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
	lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
	jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <aSMXM8ayzV2kx6Ws@kernel.org>

> > +     outgoing_buffer = kho_alloc_preserve(LUO_SESSION_PGCNT << PAGE_SHIFT);
> > +     if (IS_ERR(outgoing_buffer))
> > +             return PTR_ERR(header_ser);
>
> Should be
>                 return PTR_ERR(outgoing_buffer);

Thanks, fixed!
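
The bug class here is easy to reproduce outside the kernel. The sketch below
uses simplified user-space models of ERR_PTR(), IS_ERR() and PTR_ERR();
alloc_buffer(), setup_buggy() and setup_fixed() are hypothetical names. It
also assumes the unrelated pointer happens to be NULL, whereas in the quoted
code it may simply be uninitialized.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified user-space models of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical allocator that always fails with -ENOMEM. */
static void *alloc_buffer(void)
{
	return ERR_PTR(-ENOMEM);
}

/* Buggy: converts a different pointer, so the error code is lost. */
long setup_buggy(void)
{
	void *header = NULL;
	void *buf = alloc_buffer();

	if (IS_ERR(buf))
		return PTR_ERR(header);	/* yields 0: failure looks like success */
	return 0;
}

/* Fixed: converts the pointer that was actually checked. */
long setup_fixed(void)
{
	void *buf = alloc_buffer();

	if (IS_ERR(buf))
		return PTR_ERR(buf);
	return 0;
}
```

Using a single variable for both the check and the conversion, as suggested
in the review, removes the opportunity for this mismatch altogether.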

>
> Or, preferably, just drop outgoing_buffer and use header_ser everywhere.
>
> > +     header_ser = outgoing_buffer;
> > +     header_ser_pa = virt_to_phys(header_ser);
> > +
> > +     err = fdt_begin_node(fdt_out, LUO_FDT_SESSION_NODE_NAME);
> > +     err |= fdt_property_string(fdt_out, "compatible",
> > +                                LUO_FDT_SESSION_COMPATIBLE);
> > +     err |= fdt_property(fdt_out, LUO_FDT_SESSION_HEADER, &header_ser_pa,
> > +                         sizeof(header_ser_pa));
> > +     err |= fdt_end_node(fdt_out);
> > +
> > +     if (err)
> > +             goto err_unpreserve;
> > +
> > +     luo_session_global.outgoing.header_ser = header_ser;
> > +     luo_session_global.outgoing.ser = (void *)(header_ser + 1);
> > +     luo_session_global.outgoing.active = true;
> > +
> > +     return 0;
> > +
> > +err_unpreserve:
> > +     kho_unpreserve_free(header_ser);
> > +     return err;
> > +}
>
> ...
>
> > +int luo_session_deserialize(void)
> > +{
> > +     struct luo_session_header *sh = &luo_session_global.incoming;
> > +     static bool is_deserialized;
> > +     static int err;
> > +
> > +     /* If has been deserialized, always return the same error code */
> > +     if (is_deserialized)
> > +             return err;
>
> is_deserialized and err are uninitialized here.

These are static local variables. They are automatically initialized
to zero, and it is preferred in Linux source code to not set them to
zero explicitly.
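
For reference, the C rule in question is that objects with static storage
duration, including function-local ones, start out zeroed without an explicit
initializer. A minimal user-space sketch of the "run once, then keep
returning the cached result" pattern; deserialize_once(), do_deserialize()
and deserialize_calls() are hypothetical names, not LUO code.

```c
#include <stdbool.h>

/* Counts how often the expensive step actually ran. */
static int nr_calls;

/* Hypothetical stand-in for the real work; always fails with -5. */
static int do_deserialize(void)
{
	nr_calls++;
	return -5;
}

int deserialize_once(void)
{
	/* Static storage duration: guaranteed to start as false and 0. */
	static bool is_deserialized;
	static int err;

	/* Later calls return the result cached by the first call. */
	if (is_deserialized)
		return err;

	is_deserialized = true;
	err = do_deserialize();
	return err;
}

int deserialize_calls(void)
{
	return nr_calls;
}
```

Calling deserialize_once() twice runs do_deserialize() only once, and both
calls report the same error code.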

> > +
> > +     is_deserialized = true;
> > +     if (!sh->active)
> > +             return 0;
> > +
>
> ...
>
> > +/**
> > + * luo_session_quiesce - Ensure no active sessions exist and lock session lists.
> > + *
> > + * Acquires exclusive write locks on both incoming and outgoing session lists.
> > + * It then validates no sessions exist in either list.
> > + *
> > + * This mechanism is used during file handler un/registration to ensure that no
> > + * sessions are currently using the handler, and no new sessions can be created
> > + * while un/registration is in progress.
>
> It makes sense to add something like this comment from luo_file.c here as well:
>
>          * This prevents registering new handlers while sessions are active or
>          * while deserialization is in progress.

Done

>
> > + *
> > + * Return:
> > + * true  - System is quiescent (0 sessions) and locked.
> > + * false - Active sessions exist. The locks are released internally.
> > + */
>
> With those addressed:
>
> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
>
>
> --
> Sincerely yours,
> Mike.

^ permalink raw reply

* Re: [PATCH v6 6/9] futex: Wire up get_robust_list2 syscall
From: Thomas Gleixner @ 2025-11-23 18:47 UTC (permalink / raw)
  To: kernel test robot, André Almeida, Ingo Molnar,
	Peter Zijlstra, Darren Hart, Davidlohr Bueso, Arnd Bergmann,
	Sebastian Andrzej Siewior, Waiman Long, Ryan Houdek
  Cc: oe-kbuild-all, linux-kernel, linux-kselftest, linux-api,
	kernel-dev, André Almeida
In-Reply-To: <202511221454.rsysOoSt-lkp@intel.com>

On Sat, Nov 22 2025 at 14:49, kernel test robot wrote:
> kernel test robot noticed the following build warnings:
>
> [auto build test WARNING on c42ba5a87bdccbca11403b7ca8bad1a57b833732]
>
> url:    https://github.com/intel-lab-lkp/linux/commits/Andr-Almeida/futex-Use-explicit-sizes-for-compat_robust_list-structs/20251122-135406
> base:   c42ba5a87bdccbca11403b7ca8bad1a57b833732
> patch link:    https://lore.kernel.org/r/20251122-tonyk-robust_futex-v6-6-05fea005a0fd%40igalia.com
> patch subject: [PATCH v6 6/9] futex: Wire up get_robust_list2 syscall
> config: arc-allnoconfig (https://download.01.org/0day-ci/archive/20251122/202511221454.rsysOoSt-lkp@intel.com/config)
> compiler: arc-linux-gcc (GCC) 15.1.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251122/202511221454.rsysOoSt-lkp@intel.com/reproduce)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202511221454.rsysOoSt-lkp@intel.com/
>
> All warnings (new ones prefixed by >>):
>
>>> <stdin>:1627:2: warning: #warning syscall get_robust_list2 not implemented [-Wcpp]
> --
>>> <stdin>:1627:2: warning: #warning syscall get_robust_list2 not implemented [-Wcpp]

Lacks a COND_SYSCALL()

^ permalink raw reply

* Re: [PATCH v7 02/22] liveupdate: luo_core: integrate with KHO
From: Pasha Tatashin @ 2025-11-23 18:23 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
	ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
	yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
	mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
	david, joel.granados, rostedt, anna.schumaker, song, linux,
	linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
	dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
	cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
	quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
	leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
	lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
	jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <aSMXUKMhroThYrlU@kernel.org>

On Sun, Nov 23, 2025 at 9:17 AM Mike Rapoport <rppt@kernel.org> wrote:
>
> On Sun, Nov 23, 2025 at 07:03:19AM -0500, Pasha Tatashin wrote:
> > On Sun, Nov 23, 2025 at 6:27 AM Mike Rapoport <rppt@kernel.org> wrote:
> > >
> > > On Sat, Nov 22, 2025 at 05:23:29PM -0500, Pasha Tatashin wrote:
> > > > Integrate the LUO with the KHO framework to enable passing LUO state
> > > > across a kexec reboot.
> > > >
> > > > This patch implements the lifecycle integration with KHO:
> > > >
> > > > 1. Incoming State: During early boot (`early_initcall`), LUO checks if
> > > >    KHO is active. If so, it retrieves the "LUO" subtree, verifies the
> > > >    "luo-v1" compatibility string, and reads the `liveupdate-number` to
> > > >    track the update count.
> > > >
> > > > 2. Outgoing State: During late initialization (`late_initcall`), LUO
> > > >    allocates a new FDT for the next kernel, populates it with the basic
> > > >    header (compatible string and incremented update number), and
> > > >    registers it with KHO (`kho_add_subtree`).
> > > >
> > > > 3. Finalization: The `liveupdate_reboot()` notifier is updated to invoke
> > > >    `kho_finalize()`. This ensures that all memory segments marked for
> > > >    preservation are properly serialized before the kexec jump.
> > > >
> > > > LUO now depends on `CONFIG_KEXEC_HANDOVER`.
> > > >
> > > > Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> > > > ---
> > > >  include/linux/kho/abi/luo.h      |  54 +++++++++++
> > > >  kernel/liveupdate/luo_core.c     | 154 ++++++++++++++++++++++++++++++-
> > > >  kernel/liveupdate/luo_internal.h |  22 +++++
> > > >  3 files changed, 229 insertions(+), 1 deletion(-)
> > > >  create mode 100644 include/linux/kho/abi/luo.h
> > > >  create mode 100644 kernel/liveupdate/luo_internal.h
> > > >
> > > > diff --git a/include/linux/kho/abi/luo.h b/include/linux/kho/abi/luo.h
> > > > new file mode 100644
> > > > index 000000000000..8523b3ff82d1
> > > > --- /dev/null
> > > > +++ b/include/linux/kho/abi/luo.h
> > > > @@ -0,0 +1,54 @@
> > > > +/* SPDX-License-Identifier: GPL-2.0 */
> > > > +
> > > > +/*
> > > > + * Copyright (c) 2025, Google LLC.
> > > > + * Pasha Tatashin <pasha.tatashin@soleen.com>
> > > > + */
> > > > +
> > > > +/**
> > > > + * DOC: Live Update Orchestrator ABI
> > > > + *
> > > > + * This header defines the stable Application Binary Interface used by the
> > > > + * Live Update Orchestrator to pass state from a pre-update kernel to a
> > > > + * post-update kernel. The ABI is built upon the Kexec HandOver framework
> > > > + * and uses a Flattened Device Tree to describe the preserved data.
> > > > + *
> > > > + * This interface is a contract. Any modification to the FDT structure, node
> > > > + * properties, compatible strings, or the layout of the `__packed` serialization
> > > > + * structures defined here constitutes a breaking change. Such changes require
> > > > + * incrementing the version number in the relevant `_COMPATIBLE` string to
> > > > + * prevent a new kernel from misinterpreting data from an old kernel.
> > >
> > > From v6 thread:
> > >
> > > > > I'd add a sentence that stresses that ABI changes are possible as long they
> > > > > include changes to the FDT version.
> > > > > This is indeed implied by the last paragraph, but I think it's worth
> > > > > spelling it explicitly.
> > > > >
> > > > > Another thing that I think this should mention is that compatibility is
> > > > > only guaranteed for the kernels that use the same ABI version.
> > > >
> > > > Sure, I will add both.
> > >
> > > Looks like it fell between the cracks :/
> >
> > Hm, when I was updating the patches, I included the first part, and
> > then re-read the content, and I think it covers all points:
> >
> > 1. Changes are possible
> > This interface is a contract. Any modification to the FDT structure, node
> >  * properties, compatible strings, or the layout of the `__packed` serialization
> >  * structures defined here constitutes a breaking change. Such changes require
> >  * incrementing the version number in the relevant `_COMPATIBLE` string
> >
> > So, change as long as you update versioning number
> >
> > 2. Breaking if version is different:
> > to prevent a new kernel from misinterpreting data from an old kernel.
> >
> > So, the next kernel can interpret only if the version is the same.
> >
> > Which point do you think is not covered?
>
> As I said, it's covered, but it's implied. I'd prefer these stated
> explicitly.

Added, thanks.
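
The versioning rule discussed above can be sketched in a few lines of userspace C. This is only an illustration of the "exact match or refuse" policy: the string "luo-v1" and the helper name example_compat_matches() are assumptions made for this sketch and are not taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical compatible string; the real value lives in the ABI header. */
#define EXAMPLE_LUO_COMPATIBLE "luo-v1"

/*
 * Accept incoming state only when its compatible string is identical to
 * the one this kernel was built with. Any layout change bumps the string,
 * so a mismatch means the data must not be interpreted.
 */
bool example_compat_matches(const char *incoming)
{
	return incoming != NULL &&
	       strcmp(incoming, EXAMPLE_LUO_COMPATIBLE) == 0;
}
```

With this policy, compatibility is guaranteed only between kernels that agree on the string; a kernel that sees any other value treats the incoming data as unusable.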

>
> > > > +static int __init liveupdate_early_init(void)
> > > > +{
> > > > +     int err;
> > > > +
> > > > +     err = luo_early_startup();
> > > > +     if (err) {
> > > > +             luo_global.enabled = false;
> > > > +             luo_restore_fail("The incoming tree failed to initialize properly [%pe], disabling live update\n",
> > > > +                              ERR_PTR(err));
> > >
> > > What's wrong with a plain panic()?
> >
> > Jason suggested using the luo_restore_fail() function instead of
> > inserting panic() right in code somewhere in LUOv3 or earlier. It
> > helps avoid sprinkling panics in different places, and also in case if
> > we add the maintenance mode that we have discussed in LUOv6, we could
> > update this function as a place where that mode would be switched on.
>
> I'd agree if we were to have a bunch of panic()s sprinkled in the code.
> With a single one it's easier to parse panic() than lookup what
> luo_restore_fail() means.

The issue is that removing luo_restore_fail() removes the only
dependency on luo_internal.h in this patch. This would require me to
move the introduction of that header file to a later patch in the
series, which is difficult to handle via a simple fix-up.

Additionally, I still believe the abstraction is cleaner for future
extensibility (like the maintenance mode), even if it currently wraps
a single panic (which is actually a good thing: I have cleaned things
up substantially since v2 to get down to a single point of panic).
Therefore, my preference is to keep it as is, unless the full series
needs to be re-sent.
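
A minimal userspace sketch of the "single point of failure handling" idea, assuming a hypothetical maintenance-mode switch. The names example_restore_fail(), EXAMPLE_PANIC and EXAMPLE_MAINTENANCE are invented for illustration, and the function returns the chosen action instead of panicking so that it can run outside the kernel.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical outcomes of an unrecoverable restore error. */
enum example_action { EXAMPLE_PANIC, EXAMPLE_MAINTENANCE };

/*
 * Single funnel for unrecoverable restore errors. Callers never decide
 * the policy themselves, so adding a maintenance mode later means
 * editing this one function rather than every call site.
 */
enum example_action example_restore_fail(bool maintenance_mode, char *msg,
					 size_t len, const char *fmt, ...)
{
	va_list ap;

	/* Format the diagnostic once, whatever the policy turns out to be. */
	va_start(ap, fmt);
	vsnprintf(msg, len, fmt, ap);
	va_end(ap);

	return maintenance_mode ? EXAMPLE_MAINTENANCE : EXAMPLE_PANIC;
}
```

The design choice is the usual one for wrappers: the call site states that restore failed, and the wrapper alone decides what that means.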

Pasha

^ permalink raw reply

* Re: [PATCH v7 15/22] docs: add documentation for memfd preservation via LUO
From: Mike Rapoport @ 2025-11-23 16:07 UTC (permalink / raw)
  To: Pasha Tatashin
  Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
	ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
	yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
	mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
	david, joel.granados, rostedt, anna.schumaker, song, linux,
	linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
	dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
	cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
	quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
	leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
	lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
	jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <20251122222351.1059049-16-pasha.tatashin@soleen.com>

On Sat, Nov 22, 2025 at 05:23:42PM -0500, Pasha Tatashin wrote:
> From: Pratyush Yadav <ptyadav@amazon.de>
> 
> Add the documentation under the "Preserving file descriptors" section of
> LUO's documentation.
> 
> Signed-off-by: Pratyush Yadav <ptyadav@amazon.de>
> Co-developed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>

Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

> ---
>  Documentation/core-api/liveupdate.rst   |  7 +++++++
>  Documentation/mm/index.rst              |  1 +
>  Documentation/mm/memfd_preservation.rst | 23 +++++++++++++++++++++++
>  MAINTAINERS                             |  1 +
>  4 files changed, 32 insertions(+)
>  create mode 100644 Documentation/mm/memfd_preservation.rst

-- 
Sincerely yours,
Mike.

^ permalink raw reply

* Re: [PATCH v7 08/22] docs: add luo documentation
From: Mike Rapoport @ 2025-11-23 16:05 UTC (permalink / raw)
  To: Pasha Tatashin
  Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
	ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
	yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
	mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
	david, joel.granados, rostedt, anna.schumaker, song, linux,
	linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
	dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
	cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
	quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
	leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
	lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
	jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <20251122222351.1059049-9-pasha.tatashin@soleen.com>

On Sat, Nov 22, 2025 at 05:23:35PM -0500, Pasha Tatashin wrote:
> Add the documentation files for the Live Update Orchestrator
> 
> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> ---

> +Public API
> +==========
> +.. kernel-doc:: include/linux/liveupdate.h
> +
> +.. kernel-doc:: include/linux/kho/abi/luo.h

Please add 

   :functions:

here; otherwise the "DOC: Live Update Orchestrator ABI" section is
repeated in the generated HTML as well.

Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

-- 
Sincerely yours,
Mike.

^ permalink raw reply

* Re: [PATCH v7 14/22] mm: memfd_luo: allow preserving memfd
From: Mike Rapoport @ 2025-11-23 15:47 UTC (permalink / raw)
  To: Pasha Tatashin
  Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
	ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
	yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
	mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
	david, joel.granados, rostedt, anna.schumaker, song, linux,
	linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
	dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
	cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
	quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
	leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
	lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
	jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <20251122222351.1059049-15-pasha.tatashin@soleen.com>

On Sat, Nov 22, 2025 at 05:23:41PM -0500, Pasha Tatashin wrote:
> From: Pratyush Yadav <ptyadav@amazon.de>
> 
> The ability to preserve a memfd allows userspace to use KHO and LUO to
> transfer its memory contents to the next kernel. This is useful in many
> ways. For one, it can be used with IOMMUFD as the backing store for
> IOMMU page tables. Preserving IOMMUFD is essential for performing a
> hypervisor live update with passthrough devices. memfd support provides
> the first building block for making that possible.
> 
> For another, applications with a large amount of memory that takes time
> to reconstruct, reboots to consume kernel upgrades can be very
> expensive. memfd with LUO gives those applications reboot-persistent
> memory that they can use to quickly save and reconstruct that state.
> 
> While memfd is backed by either hugetlbfs or shmem, currently only
> support on shmem is added. To be more precise, support for anonymous
> shmem files is added.
> 
> The handover to the next kernel is not transparent. All the properties
> of the file are not preserved; only its memory contents, position, and
> size. The recreated file gets the UID and GID of the task doing the
> restore, and the task's cgroup gets charged with the memory.
> 
> Once preserved, the file cannot grow or shrink, and all its pages are
> pinned to avoid migrations and swapping. The file can still be read from
> or written to.
> 
> Use vmalloc to get the buffer to hold the folios, and preserve
> it using kho_preserve_vmalloc(). This doesn't have the size limit.
> 
> Signed-off-by: Pratyush Yadav <ptyadav@amazon.de>
> Co-developed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> ---

...

> +static int memfd_luo_retrieve_folios(struct file *file,
> +				     struct memfd_luo_folio_ser *folios_ser,
> +				     u64 nr_folios)
> +{
> +	struct inode *inode = file_inode(file);
> +	struct address_space *mapping = inode->i_mapping;
> +	struct folio *folio;
> +	long i = 0;
> +	int err;
> +
> +	for (; i < nr_folios; i++) {
> +		const struct memfd_luo_folio_ser *pfolio = &folios_ser[i];
> +		phys_addr_t phys;
> +		u64 index;
> +		int flags;
> +
> +		if (!pfolio->pfn)
> +			continue;
> +
> +		phys = PFN_PHYS(pfolio->pfn);
> +		folio = kho_restore_folio(phys);
> +		if (!folio) {
> +			pr_err("Unable to restore folio at physical address: %llx\n",
> +			       phys);
> +			goto put_folios;
> +		}
> +		index = pfolio->index;
> +		flags = pfolio->flags;
> +
> +		/* Set up the folio for insertion. */
> +		__folio_set_locked(folio);
> +		__folio_set_swapbacked(folio);
> +
> +		err = mem_cgroup_charge(folio, NULL, mapping_gfp_mask(mapping));
> +		if (err) {
> +			pr_err("shmem: failed to charge folio index %ld: %d\n",
> +			       i, err);
> +			goto unlock_folio;
> +		}
> +
> +		err = shmem_add_to_page_cache(folio, mapping, index, NULL,
> +					      mapping_gfp_mask(mapping));
> +		if (err) {
> +			pr_err("shmem: failed to add to page cache folio index %ld: %d\n",
> +			       i, err);
> +			goto unlock_folio;
> +		}
> +
> +		if (flags & MEMFD_LUO_FOLIO_UPTODATE)
> +			folio_mark_uptodate(folio);
> +		if (flags & MEMFD_LUO_FOLIO_DIRTY)
> +			folio_mark_dirty(folio);
> +
> +		err = shmem_inode_acct_blocks(inode, 1);
> +		if (err) {
> +			pr_err("shmem: failed to account folio index %ld: %d\n",
> +			       i, err);
> +			goto unlock_folio;
> +		}
> +
> +		shmem_recalc_inode(inode, 1, 0);
> +		folio_add_lru(folio);
> +		folio_unlock(folio);
> +		folio_put(folio);
> +	}
> +
> +	return 0;
> +
> +unlock_folio:
> +	folio_unlock(folio);
> +	folio_put(folio);
> +	i++;
 
I'd add a counter and use it in the for loop below.
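
One possible reading of this suggestion, as a userspace sketch: the error path records where cleanup should start in a dedicated counter instead of bumping the main loop index before falling through. The struct and function names are hypothetical stand-ins, not the kernel's folio API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a preserved folio record. */
struct example_item {
	bool fails;	/* simulate a failure while inserting this item */
	bool inserted;	/* ownership has passed to the "file" */
	bool released;	/* reference dropped by the error path */
};

/*
 * Insert items in order. On failure, drop the failing item and every
 * item after it. 'next' is the dedicated cleanup counter, so the main
 * loop index is never adjusted by hand.
 */
int example_retrieve(struct example_item *items, size_t nr)
{
	size_t i, next = 0;

	for (i = 0; i < nr; i++) {
		if (items[i].fails) {
			/* Drop the failing item here; cleanup starts after it. */
			items[i].released = true;
			next = i + 1;
			goto release_rest;
		}
		items[i].inserted = true;
	}
	return 0;

release_rest:
	/* Items already inserted belong to the file; release only the rest. */
	for (; next < nr; next++)
		items[next].released = true;
	return -1;
}
```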

> +put_folios:
> +	/*
> +	 * Note: don't free the folios already added to the file. They will be
> +	 * freed when the file is freed. Free the ones not added yet here.
> +	 */
> +	for (; i < nr_folios; i++) {
> +		const struct memfd_luo_folio_ser *pfolio = &folios_ser[i];
> +
> +		folio = kho_restore_folio(pfolio->pfn);
> +		if (folio)
> +			folio_put(folio);
> +	}
> +
> +	return err;
> +}

Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

-- 
Sincerely yours,
Mike.

^ permalink raw reply

* Re: [PATCH v7 09/22] MAINTAINERS: add liveupdate entry
From: Mike Rapoport @ 2025-11-23 15:29 UTC (permalink / raw)
  To: Pasha Tatashin
  Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
	ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
	yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
	mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
	david, joel.granados, rostedt, anna.schumaker, song, linux,
	linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
	dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
	cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
	quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
	leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
	lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
	jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <20251122222351.1059049-10-pasha.tatashin@soleen.com>

On Sat, Nov 22, 2025 at 05:23:36PM -0500, Pasha Tatashin wrote:
> Add a MAINTAINERS file entry for the new Live Update Orchestrator
> introduced in previous patches.
> 
> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>

Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

> ---
>  MAINTAINERS | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index b46425e3b4d3..868d3d23fdea 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -14466,6 +14466,18 @@ F:	kernel/module/livepatch.c
>  F:	samples/livepatch/
>  F:	tools/testing/selftests/livepatch/
>  
> +LIVE UPDATE
> +M:	Pasha Tatashin <pasha.tatashin@soleen.com>
> +M:	Mike Rapoport <rppt@kernel.org>
> +L:	linux-kernel@vger.kernel.org
> +S:	Maintained
> +F:	Documentation/core-api/liveupdate.rst
> +F:	Documentation/userspace-api/liveupdate.rst
> +F:	include/linux/liveupdate.h
> +F:	include/linux/liveupdate/
> +F:	include/uapi/linux/liveupdate.h
> +F:	kernel/liveupdate/
> +
>  LLC (802.2)
>  L:	netdev@vger.kernel.org
>  S:	Odd fixes
> -- 
> 2.52.0.rc2.455.g230fcf2819-goog
> 
> 

-- 
Sincerely yours,
Mike.

^ permalink raw reply

* Re: [PATCH v7 11/22] mm: shmem: allow freezing inode mapping
From: Mike Rapoport @ 2025-11-23 15:29 UTC (permalink / raw)
  To: Pasha Tatashin
  Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
	ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
	yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
	mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
	david, joel.granados, rostedt, anna.schumaker, song, linux,
	linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
	dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
	cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
	quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
	leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
	lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
	jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <20251122222351.1059049-12-pasha.tatashin@soleen.com>

On Sat, Nov 22, 2025 at 05:23:38PM -0500, Pasha Tatashin wrote:
> From: Pratyush Yadav <ptyadav@amazon.de>
> 
> To prepare a shmem inode for live update, its index -> folio mappings
> must be serialized. Once the mappings are serialized, they cannot change
> since it would cause the serialized data to become inconsistent. This
> can be done by pinning the folios to avoid migration, and by making sure
> no folios can be added to or removed from the inode.
> 
> While mechanisms to pin folios already exist, the only way to stop
> folios being added or removed are the grow and shrink file seals. But
> file seals come with their own semantics, one of which is that they
> can't be removed. This doesn't work with liveupdate since it can be
> cancelled or error out, which would need the seals to be removed and the
> file's normal functionality to be restored.
> 
> Introduce SHMEM_F_MAPPING_FROZEN to indicate this instead. It is
> internal to shmem and is not directly exposed to userspace. It functions
> similar to F_SEAL_GROW | F_SEAL_SHRINK, but additionally disallows hole
> punching, and can be removed.
> 
> Signed-off-by: Pratyush Yadav <ptyadav@amazon.de>
> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> ---
>  include/linux/shmem_fs.h | 17 +++++++++++++++++
>  mm/shmem.c               | 19 ++++++++++++++++---
>  2 files changed, 33 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
> index 650874b400b5..d34a64eafe60 100644
> --- a/include/linux/shmem_fs.h
> +++ b/include/linux/shmem_fs.h
> @@ -24,6 +24,14 @@ struct swap_iocb;
>  #define SHMEM_F_NORESERVE	BIT(0)
>  /* Disallow swapping. */
>  #define SHMEM_F_LOCKED		BIT(1)
> +/*
> + * Disallow growing, shrinking, or hole punching in the inode. Combined with
> + * folio pinning, makes sure the inode's mapping stays fixed.
> + *
> + * In some ways similar to F_SEAL_GROW | F_SEAL_SHRINK, but can be removed and
> + * isn't directly visible to userspace.
> + */
> +#define SHMEM_F_MAPPING_FROZEN	BIT(2)
>  
>  struct shmem_inode_info {
>  	spinlock_t		lock;
> @@ -186,6 +194,15 @@ static inline bool shmem_file(struct file *file)
>  	return shmem_mapping(file->f_mapping);
>  }
>  
> +/* Must be called with inode lock taken exclusive. */
> +static inline void shmem_freeze(struct inode *inode, bool freeze)
> +{
> +	if (freeze)
> +		SHMEM_I(inode)->flags |= SHMEM_F_MAPPING_FROZEN;
> +	else
> +		SHMEM_I(inode)->flags &= ~SHMEM_F_MAPPING_FROZEN;
> +}
> +
>  /*
>   * If fallocate(FALLOC_FL_KEEP_SIZE) has been used, there may be pages
>   * beyond i_size's notion of EOF, which fallocate has committed to reserving:
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 1d5036dec08a..cb74a5d202ac 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1292,9 +1292,13 @@ static int shmem_setattr(struct mnt_idmap *idmap,
>  		loff_t newsize = attr->ia_size;
>  
>  		/* protected by i_rwsem */
> -		if ((newsize < oldsize && (info->seals & F_SEAL_SHRINK)) ||
> -		    (newsize > oldsize && (info->seals & F_SEAL_GROW)))
> -			return -EPERM;
> +		if (newsize != oldsize) {
> +			if (info->flags & SHMEM_F_MAPPING_FROZEN)
> +				return -EPERM;
> +			if ((newsize < oldsize && (info->seals & F_SEAL_SHRINK)) ||
> +			    (newsize > oldsize && (info->seals & F_SEAL_GROW)))
> +				return -EPERM;
> +		}
>  
>  		if (newsize != oldsize) {

I'd stick 

			if (info->flags & SHMEM_F_MAPPING_FROZEN)
				return -EPERM;

here and leave the seals check alone.
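
A self-contained sketch of the resulting control flow, assuming the frozen check moves into the existing size-change branch. The EXAMPLE_* constants and example_check_resize() are illustrative names only; the kernel uses its own definitions.

```c
#include <errno.h>
#include <stdint.h>

/* Illustrative values; the kernel defines the real flags and seals. */
#define EXAMPLE_F_MAPPING_FROZEN	(1u << 2)
#define EXAMPLE_SEAL_SHRINK		0x0002u
#define EXAMPLE_SEAL_GROW		0x0004u

/*
 * The seal check is left exactly as it was. The frozen check sits at
 * the top of the branch that already handles a size change.
 */
int example_check_resize(int64_t oldsize, int64_t newsize,
			 unsigned int flags, unsigned int seals)
{
	if ((newsize < oldsize && (seals & EXAMPLE_SEAL_SHRINK)) ||
	    (newsize > oldsize && (seals & EXAMPLE_SEAL_GROW)))
		return -EPERM;

	if (newsize != oldsize) {
		if (flags & EXAMPLE_F_MAPPING_FROZEN)
			return -EPERM;
		/* The actual resize would happen here. */
	}

	return 0;
}
```

The behavior matches the posted hunk for every input; only the diff is smaller, because the seal condition is not re-indented.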

Other than that,

Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

-- 
Sincerely yours,
Mike.

^ permalink raw reply

