* Re: [PATCH v6 20/20] tests/liveupdate: Add in-kernel liveupdate test
From: Pasha Tatashin @ 2025-11-17 19:00 UTC (permalink / raw)
To: Mike Rapoport
Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
david, joel.granados, rostedt, anna.schumaker, song, linux,
linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <aRsDb-4bXFQ9Zmtu@kernel.org>
> > #endif /* _LINUX_LIVEUPDATE_ABI_LUO_H */
> > diff --git a/kernel/liveupdate/luo_file.c b/kernel/liveupdate/luo_file.c
> > index df337c9c4f21..9a531096bdb5 100644
> > --- a/kernel/liveupdate/luo_file.c
> > +++ b/kernel/liveupdate/luo_file.c
> > @@ -834,6 +834,8 @@ int liveupdate_register_file_handler(struct liveupdate_file_handler *fh)
> > INIT_LIST_HEAD(&fh->flb_list);
> > list_add_tail(&fh->list, &luo_file_handler_list);
> >
> > + liveupdate_test_register(fh);
> > +
>
> Why can't this be called from the test?
Because the test does not have access to all the file_handlers that are
being registered with LUO.
Pasha
* Re: [PATCH v6 18/20] selftests/liveupdate: Add kexec-based selftest for session lifecycle
From: David Matlack @ 2025-11-17 19:27 UTC (permalink / raw)
To: Pasha Tatashin
Cc: pratyush, jasonmiu, graf, rppt, rientjes, corbet, rdunlap,
ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
david, joel.granados, rostedt, anna.schumaker, song, linux,
linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <20251115233409.768044-19-pasha.tatashin@soleen.com>
On Sat, Nov 15, 2025 at 3:34 PM Pasha Tatashin
<pasha.tatashin@soleen.com> wrote:
> diff --git a/tools/testing/selftests/liveupdate/Makefile b/tools/testing/selftests/liveupdate/Makefile
> index 2a573c36016e..1563ac84006a 100644
> --- a/tools/testing/selftests/liveupdate/Makefile
> +++ b/tools/testing/selftests/liveupdate/Makefile
> @@ -1,7 +1,39 @@
> # SPDX-License-Identifier: GPL-2.0-only
> +
> +KHDR_INCLUDES ?= -I../../../../usr/include
You shouldn't need to set this variable or $(OUTPUT); both should be
provided by lib.mk. Maybe the include is too far down?
> CFLAGS += -Wall -O2 -Wno-unused-function
> CFLAGS += $(KHDR_INCLUDES)
> +LDFLAGS += -static
Is a static build really required, or is it just for your setup? If it's
setup-specific, I would recommend letting the user pass in -static via
EXTRA_CFLAGS. That's what we do in the KVM and VFIO selftests.
CFLAGS += $(EXTRA_CFLAGS)
Then the user can pass EXTRA_CFLAGS=-static on the command line.
> +OUTPUT ?= .
> +
> +# --- Test Configuration (Edit this section when adding new tests) ---
> +LUO_SHARED_SRCS := luo_test_utils.c
> +LUO_SHARED_HDRS += luo_test_utils.h
I would suggest using the -MD flag and Make's -include directive to
automatically handle headers. That way you don't need to add every
header to the Makefile for Make to detect changes. See the end of my
email for how to do this.
> +
> +LUO_MANUAL_TESTS += luo_kexec_simple
> +
> +TEST_FILES += do_kexec.sh
>
> TEST_GEN_PROGS += liveupdate
>
> +# --- Automatic Rule Generation (Do not edit below) ---
> +
> +TEST_GEN_PROGS_EXTENDED += $(LUO_MANUAL_TESTS)
> +
> +# Define the full list of sources for each manual test.
> +$(foreach test,$(LUO_MANUAL_TESTS), \
> + $(eval $(test)_SOURCES := $(test).c $(LUO_SHARED_SRCS)))
This does not build with Google's gbuild wrapper around make. I get
these errors (after fixing the semi-colon issue below):
clang: error: no such file or directory: 'luo_kexec_simple.c'
clang: error: no such file or directory: 'luo_test_utils.c'
clang: error: no such file or directory: 'luo_test_utils.h'
> +
> +# This loop automatically generates an explicit build rule for each manual test.
> +# It includes dependencies on the shared headers and makes the output
> +# executable.
> +# Note the use of '$$' to escape automatic variables for the 'eval' command.
> +$(foreach test,$(LUO_MANUAL_TESTS), \
> + $(eval $(OUTPUT)/$(test): $($(test)_SOURCES) $(LUO_SHARED_HDRS) \
> + $(call msg,LINK,,$$@) ; \
> + $(Q)$(LINK.c) $$^ $(LDLIBS) -o $$@ ; \
> + $(Q)chmod +x $$@ \
These semi-colons swallow any errors. I would recommend against using
a foreach and eval. Make supports pattern-based targets so there's
really no need for loops. See below.
> + ) \
> +)
> +
> include ../lib.mk
Putting it all together, here is what I'd recommend for this Makefile
(drop-in replacement for the current Makefile). This will also make it
easier for me to share the library code with VFIO selftests, which
I'll need to do in the VFIO series.
(Sorry in advance for the line wrap. I had to send this through gmail.)
# SPDX-License-Identifier: GPL-2.0-only

LIBLIVEUPDATE_C += luo_test_utils.c

TEST_GEN_PROGS_EXTENDED += luo_kexec_simple
TEST_GEN_PROGS_EXTENDED += luo_multi_session

TEST_FILES += do_kexec.sh

include ../lib.mk

CFLAGS += $(KHDR_INCLUDES)
CFLAGS += -Wall -O2 -Wno-unused-function
CFLAGS += -MD
CFLAGS += $(EXTRA_CFLAGS)

LIBLIVEUPDATE_O := $(patsubst %.c, $(OUTPUT)/%.o, $(LIBLIVEUPDATE_C))
TEST_GEN_PROGS_EXTENDED_O += $(patsubst %, %.o, $(TEST_GEN_PROGS_EXTENDED))

TEST_DEP_FILES += $(patsubst %.o, %.d, $(LIBLIVEUPDATE_O))
TEST_DEP_FILES += $(patsubst %.o, %.d, $(TEST_GEN_PROGS_EXTENDED_O))
-include $(TEST_DEP_FILES)

$(LIBLIVEUPDATE_O): $(OUTPUT)/%.o: %.c
	$(CC) $(CFLAGS) $(CPPFLAGS) $(TARGET_ARCH) -c $< -o $@

$(TEST_GEN_PROGS_EXTENDED): %: %.o $(LIBLIVEUPDATE_O)
	$(CC) $(CFLAGS) $(CPPFLAGS) $(LDFLAGS) $(TARGET_ARCH) $< $(LIBLIVEUPDATE_O) $(LDLIBS) -o $@

EXTRA_CLEAN += $(LIBLIVEUPDATE_O) $(TEST_GEN_PROGS_EXTENDED_O) $(TEST_DEP_FILES)
* Re: [PATCH v6 17/20] selftests/liveupdate: Add userspace API selftests
From: David Matlack @ 2025-11-17 19:38 UTC (permalink / raw)
To: Pasha Tatashin
Cc: pratyush, jasonmiu, graf, rppt, rientjes, corbet, rdunlap,
ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
david, joel.granados, rostedt, anna.schumaker, song, linux,
linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <20251115233409.768044-18-pasha.tatashin@soleen.com>
On Sat, Nov 15, 2025 at 3:34 PM Pasha Tatashin
<pasha.tatashin@soleen.com> wrote:
> diff --git a/tools/testing/selftests/liveupdate/.gitignore b/tools/testing/selftests/liveupdate/.gitignore
> new file mode 100644
> index 000000000000..af6e773cf98f
> --- /dev/null
> +++ b/tools/testing/selftests/liveupdate/.gitignore
> @@ -0,0 +1 @@
> +/liveupdate
I would recommend the following .gitignore so you don't have to keep
updating it every time there's a new executable or other build
artifact. This is what we use in the KVM and VFIO selftests.
# SPDX-License-Identifier: GPL-2.0-only
*
!/**/
!*.c
!*.h
!*.S
!*.sh
!*.mk
!.gitignore
!config
!Makefile
* Re: [PATCH v6 18/20] selftests/liveupdate: Add kexec-based selftest for session lifecycle
From: David Matlack @ 2025-11-17 20:08 UTC (permalink / raw)
To: Pasha Tatashin
Cc: pratyush, jasonmiu, graf, rppt, rientjes, corbet, rdunlap,
ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
david, joel.granados, rostedt, anna.schumaker, song, linux,
linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <CALzav=edxTsa7uO7XxiUSx+DZiX169T4WL39vYsn3_WcUuVKrg@mail.gmail.com>
On Mon, Nov 17, 2025 at 11:27 AM David Matlack <dmatlack@google.com> wrote:
> Putting it all together, here is what I'd recommend for this Makefile
> (drop-in replacement for the current Makefile). This will also make it
> easier for me to share the library code with VFIO selftests, which
> I'll need to do in the VFIO series.
>
> (Sorry in advance for the line wrap. I had to send this through gmail.)
Oops I dropped the build rule for liveupdate.c. Here it is with that included:
# SPDX-License-Identifier: GPL-2.0-only

LIBLIVEUPDATE_C += luo_test_utils.c

TEST_GEN_PROGS += liveupdate
TEST_GEN_PROGS_EXTENDED += luo_kexec_simple
TEST_GEN_PROGS_EXTENDED += luo_multi_session

TEST_FILES += do_kexec.sh

include ../lib.mk

CFLAGS += $(KHDR_INCLUDES)
CFLAGS += -Wall -O2 -Wno-unused-function
CFLAGS += -MD
CFLAGS += $(EXTRA_CFLAGS)

LIBLIVEUPDATE_O := $(patsubst %.c, $(OUTPUT)/%.o, $(LIBLIVEUPDATE_C))
TEST_PROGS := $(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED)
TEST_PROGS_O := $(patsubst %, %.o, $(TEST_PROGS))

TEST_DEP_FILES += $(patsubst %.o, %.d, $(LIBLIVEUPDATE_O))
TEST_DEP_FILES += $(patsubst %.o, %.d, $(TEST_PROGS_O))
-include $(TEST_DEP_FILES)

$(LIBLIVEUPDATE_O): $(OUTPUT)/%.o: %.c
	$(CC) $(CFLAGS) $(CPPFLAGS) $(TARGET_ARCH) -c $< -o $@

$(TEST_PROGS): %: %.o $(LIBLIVEUPDATE_O)
	$(CC) $(CFLAGS) $(CPPFLAGS) $(LDFLAGS) $(TARGET_ARCH) $< $(LIBLIVEUPDATE_O) $(LDLIBS) -o $@

EXTRA_CLEAN += $(LIBLIVEUPDATE_O)
EXTRA_CLEAN += $(TEST_PROGS_O)
EXTRA_CLEAN += $(TEST_DEP_FILES)
* Re: [PATCH v6 17/20] selftests/liveupdate: Add userspace API selftests
From: Pasha Tatashin @ 2025-11-17 20:16 UTC (permalink / raw)
To: David Matlack
Cc: pratyush, jasonmiu, graf, rppt, rientjes, corbet, rdunlap,
ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
david, joel.granados, rostedt, anna.schumaker, song, linux,
linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <CALzav=eskApQk6kstsQWThwV=h4Qmd85kAw3CxZt=6hj=JS-Xw@mail.gmail.com>
On Mon, Nov 17, 2025 at 2:39 PM David Matlack <dmatlack@google.com> wrote:
>
> On Sat, Nov 15, 2025 at 3:34 PM Pasha Tatashin
> <pasha.tatashin@soleen.com> wrote:
>
> > diff --git a/tools/testing/selftests/liveupdate/.gitignore b/tools/testing/selftests/liveupdate/.gitignore
> > new file mode 100644
> > index 000000000000..af6e773cf98f
> > --- /dev/null
> > +++ b/tools/testing/selftests/liveupdate/.gitignore
> > @@ -0,0 +1 @@
> > +/liveupdate
>
> I would recommend the following .gitignore so you don't have to keep
> updating it every time there's a new executable or other build
> artifact. This is what we use in the KVM and VFIO selftests.
Good idea, I will do that.
Thanks,
Pasha
>
> # SPDX-License-Identifier: GPL-2.0-only
> *
> !/**/
> !*.c
> !*.h
> !*.S
> !*.sh
> !*.mk
> !.gitignore
> !config
> !Makefile
* Re: [PATCH v6 02/20] liveupdate: luo_core: integrate with KHO
From: Mike Rapoport @ 2025-11-17 21:05 UTC (permalink / raw)
To: Pasha Tatashin
Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
david, joel.granados, rostedt, anna.schumaker, song, linux,
linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <CA+CK2bBEs2nr0TmsaV18S-xJTULkobYgv0sU9=RCdReiS0CbPQ@mail.gmail.com>
On Mon, Nov 17, 2025 at 01:29:47PM -0500, Pasha Tatashin wrote:
> On Sun, Nov 16, 2025 at 2:16 PM Mike Rapoport <rppt@kernel.org> wrote:
> >
> > On Sun, Nov 16, 2025 at 09:55:30AM -0500, Pasha Tatashin wrote:
> > > On Sun, Nov 16, 2025 at 7:43 AM Mike Rapoport <rppt@kernel.org> wrote:
> > > >
> > > > > +static int __init liveupdate_early_init(void)
> > > > > +{
> > > > > + int err;
> > > > > +
> > > > > + err = luo_early_startup();
> > > > > + if (err) {
> > > > > + pr_err("The incoming tree failed to initialize properly [%pe], disabling live update\n",
> > > > > + ERR_PTR(err));
> > > >
> > > > How do we report this to the userspace?
> > > > I think the decision what to do in this case belongs there. Even if it's
> > > > down to choosing between plain kexec and full reboot, it's still a policy
> > > > that should be implemented in userspace.
> > >
> > > I agree that policy belongs in userspace, and that is how we designed
> > > it. In this specific failure case (ABI mismatch or corrupt FDT), the
> > > preserved state is unrecoverable by the kernel. We cannot parse the
> > > incoming data, so we cannot offer it to userspace.
> > >
> > > We report this state by not registering the /dev/liveupdate device.
> > > When the userspace agent attempts to initialize, it receives ENOENT.
> > > At that point, the agent exercises its policy:
> > >
> > > - Check dmesg for the specific error and report the failure to the
> > > fleet control plane.
> >
> > Hmm, this is not nice. I think we still should register /dev/liveupdate and
> > let userspace discover this error via /dev/liveupdate ABIs.
>
> Not registering the device is the correct approach here for two reasons:
>
> 1. This follows the standard Linux driver pattern. If a driver fails
> to initialize its underlying resources (hardware, firmware, or in this
> case, the incoming FDT), it does not register a character device.
> 2. Registering a "zombie" device that exists solely to return errors
> adds significant complexity. We would need to introduce a specific
> "broken" state to the state machine and add checks to IOCTLs to reject
> commands with a specific error code.
You can avoid that complexity if you register the device with different
fops, but that's a technicality.
Your point about treating the incoming FDT as an underlying resource that
failed to initialize makes sense. Nevertheless, userspace needs a reliable
way to detect it, and parsing dmesg is not something we should rely on.
> Pasha
--
Sincerely yours,
Mike.
* Re: [PATCH v6 18/20] selftests/liveupdate: Add kexec-based selftest for session lifecycle
From: David Matlack @ 2025-11-17 21:06 UTC (permalink / raw)
To: Pasha Tatashin
Cc: pratyush, jasonmiu, graf, rppt, rientjes, corbet, rdunlap,
ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
david, joel.granados, rostedt, anna.schumaker, song, linux,
linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <CALzav=f+6hQ-UYBpwmAyKHPmtvEq-Q=mOL20_rZmAcTyd87+Vg@mail.gmail.com>
On Mon, Nov 17, 2025 at 12:08 PM David Matlack <dmatlack@google.com> wrote:
>
> On Mon, Nov 17, 2025 at 11:27 AM David Matlack <dmatlack@google.com> wrote:
>
> > Putting it all together, here is what I'd recommend for this Makefile
> > (drop-in replacement for the current Makefile). This will also make it
> > easier for me to share the library code with VFIO selftests, which
> > I'll need to do in the VFIO series.
> >
> > (Sorry in advance for the line wrap. I had to send this through gmail.)
>
> Oops I dropped the build rule for liveupdate.c. Here it is with that included:
>
> # SPDX-License-Identifier: GPL-2.0-only
>
> LIBLIVEUPDATE_C += luo_test_utils.c
>
> TEST_GEN_PROGS += liveupdate
> TEST_GEN_PROGS_EXTENDED += luo_kexec_simple
> TEST_GEN_PROGS_EXTENDED += luo_multi_session
>
> TEST_FILES += do_kexec.sh
>
> include ../lib.mk
>
> CFLAGS += $(KHDR_INCLUDES)
> CFLAGS += -Wall -O2 -Wno-unused-function
> CFLAGS += -MD
> CFLAGS += $(EXTRA_CFLAGS)
>
> LIBLIVEUPDATE_O := $(patsubst %.c, $(OUTPUT)/%.o, $(LIBLIVEUPDATE_C))
> TEST_PROGS := $(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED)
Correction: I forgot that TEST_PROGS is reserved for test shell
scripts, so this variable needs a different name.
> TEST_PROGS_O := $(patsubst %, %.o, $(TEST_PROGS))
>
> TEST_DEP_FILES += $(patsubst %.o, %.d, $(LIBLIVEUPDATE_O))
> TEST_DEP_FILES += $(patsubst %.o, %.d, $(TEST_PROGS_O))
> -include $(TEST_DEP_FILES)
>
> $(LIBLIVEUPDATE_O): $(OUTPUT)/%.o: %.c
> $(CC) $(CFLAGS) $(CPPFLAGS) $(TARGET_ARCH) -c $< -o $@
>
> $(TEST_PROGS): %: %.o $(LIBLIVEUPDATE_O)
> $(CC) $(CFLAGS) $(CPPFLAGS) $(LDFLAGS) $(TARGET_ARCH) $< $(LIBLIVEUPDATE_O) $(LDLIBS) -o $@
>
> EXTRA_CLEAN += $(LIBLIVEUPDATE_O)
> EXTRA_CLEAN += $(TEST_PROGS_O)
> EXTRA_CLEAN += $(TEST_DEP_FILES)
* Re: [PATCH v6 04/20] liveupdate: luo_session: add sessions support
From: Mike Rapoport @ 2025-11-17 21:11 UTC (permalink / raw)
To: Pasha Tatashin
Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
david, joel.granados, rostedt, anna.schumaker, song, linux,
linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <CA+CK2bC_z_6hgYu_qB7cBK2LrBSs8grjw7HCC+QrtUSrFuN5ZQ@mail.gmail.com>
On Mon, Nov 17, 2025 at 10:09:28AM -0500, Pasha Tatashin wrote:
>
> > > + }
> > > +
> > > + for (int i = 0; i < sh->header_ser->count; i++) {
> > > + struct luo_session *session;
> > > +
> > > + session = luo_session_alloc(sh->ser[i].name);
> > > + if (IS_ERR(session)) {
> > > + pr_warn("Failed to allocate session [%s] during deserialization %pe\n",
> > > + sh->ser[i].name, session);
> > > + return PTR_ERR(session);
> > > + }
> >
> > The allocated sessions still need to be freed if an insert fails ;-)
>
> No. We have failed to deserialize, so anyways the machine will need to
> be rebooted by the user in order to release the preserved resources.
>
> This is something that Jason Gunthorpe also mentioned regarding IOMMU:
> if something is not correct (i.e., if a session cannot finish for some
> reason), don't add complicated "undo" code that cleans up all
> resources. Instead, treat them as a memory leak and allow a reboot to
> perform the cleanup.
>
> While in this particular patch the clean-up looks simple, later in the
> series we are adding file deserialization to each session to this
> function. So, the clean-up will look like this: we would have to free
> the resources for each session we deserialized, and also free the
> resources for files that were deserialized for those sessions, only to
> still boot into a "maintenance" mode where a bunch of resources are not
> accessible from which the machine would have to be rebooted to get
> back to a normal state. This code will never be tested, and never be
> used, so let's use reboot to solve this problem, where devices are
> going to be properly reset, and memory is going to be properly freed.
A part of this explanation should be a comment in the code.
--
Sincerely yours,
Mike.
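As a rough idea of what such a comment could say, here is a userspace toy
that mirrors the shape of the deserialization loop. Everything in it
(struct toy_session, toy_deserialize(), the 32-byte name limit) is made up
for illustration and is not the code from the series; only the reasoning
in the comment is taken from the explanation quoted above.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a deserialized session; not the kernel structure. */
struct toy_session {
	char name[32];
	struct toy_session *next;
};

static struct toy_session *toy_session_list;

/* Rebuild @count sessions from @names; returns 0 or a negative errno. */
static int toy_deserialize(const char *const *names, int count)
{
	for (int i = 0; i < count; i++) {
		struct toy_session *s;

		/*
		 * On failure, return without undoing the sessions that were
		 * already restored. A failed deserialization means the
		 * machine has to be rebooted anyway to release the preserved
		 * resources, so they are deliberately left in place rather
		 * than cleaned up by an "undo" path that would never be
		 * tested or used.
		 */
		if (strlen(names[i]) >= sizeof(s->name))
			return -EINVAL;

		s = calloc(1, sizeof(*s));
		if (!s)
			return -ENOMEM;

		strcpy(s->name, names[i]);	/* length checked above */
		s->next = toy_session_list;
		toy_session_list = s;
	}

	return 0;
}
```

The point of the sketch is the comment, not the list handling: a reader
who sees an early return with no cleanup should be told that it is
intentional.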
* Re: [PATCH v6 18/20] selftests/liveupdate: Add kexec-based selftest for session lifecycle
From: David Matlack @ 2025-11-18 0:06 UTC (permalink / raw)
To: Pasha Tatashin
Cc: pratyush, jasonmiu, graf, rppt, rientjes, corbet, rdunlap,
ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
david, joel.granados, rostedt, anna.schumaker, song, linux,
linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <20251115233409.768044-19-pasha.tatashin@soleen.com>
On 2025-11-15 06:34 PM, Pasha Tatashin wrote:
> +/* Stage 1: Executed before the kexec reboot. */
> +static void run_stage_1(int luo_fd)
> +{
> + int session_fd;
> +
> + ksft_print_msg("[STAGE 1] Starting pre-kexec setup...\n");
> +
> + ksft_print_msg("[STAGE 1] Creating state file for next stage (2)...\n");
> + create_state_file(luo_fd, STATE_SESSION_NAME, STATE_MEMFD_TOKEN, 2);
> +
> + ksft_print_msg("[STAGE 1] Creating session '%s' and preserving memfd...\n",
> + TEST_SESSION_NAME);
> + session_fd = luo_create_session(luo_fd, TEST_SESSION_NAME);
> + if (session_fd < 0)
> + fail_exit("luo_create_session for '%s'", TEST_SESSION_NAME);
> +
> + if (create_and_preserve_memfd(session_fd, TEST_MEMFD_TOKEN,
> + TEST_MEMFD_DATA) < 0) {
> + fail_exit("create_and_preserve_memfd for token %#x",
> + TEST_MEMFD_TOKEN);
> + }
> +
> + ksft_print_msg("[STAGE 1] Executing kexec...\n");
> + if (system(KEXEC_SCRIPT) != 0)
> + fail_exit("kexec script failed");
> + exit(EXIT_FAILURE);
Can we separate the kexec from the test and allow the user/automation to
trigger it however is appropriate for their system? The current
do_kexec.sh script does not do any sort of graceful shutdown, and I bet
everyone will have different ways of initiating kexec on their systems.
For example, something like this (but sleeping in the child instead of
busy waiting):
diff --git a/tools/testing/selftests/liveupdate/luo_kexec_simple.c b/tools/testing/selftests/liveupdate/luo_kexec_simple.c
index 67ab6ebf9eec..513693bfb77b 100644
--- a/tools/testing/selftests/liveupdate/luo_kexec_simple.c
+++ b/tools/testing/selftests/liveupdate/luo_kexec_simple.c
@@ -24,6 +24,7 @@
static void run_stage_1(int luo_fd)
{
int session_fd;
+ int ret;
ksft_print_msg("[STAGE 1] Starting pre-kexec setup...\n");
@@ -42,10 +43,17 @@ static void run_stage_1(int luo_fd)
TEST_MEMFD_TOKEN);
}
- ksft_print_msg("[STAGE 1] Executing kexec...\n");
- if (system(KEXEC_SCRIPT) != 0)
- fail_exit("kexec script failed");
- exit(EXIT_FAILURE);
+ ksft_print_msg("[STAGE 1] Forking child process to hold session open\n");
+ ret = fork();
+ if (ret < 0)
+ fail_exit("fork() failed");
+ if (!ret)
+ for (;;) {}
+
+ ksft_print_msg("[STAGE 1] Child Process: %d\n", ret);
+ ksft_print_msg("[STAGE 1] Complete!\n");
+ ksft_print_msg("[STAGE 1] Execute kexec to continue\n");
+ exit(0);
}
/* Stage 2: Executed after the kexec reboot. */
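The diff above busy-waits in the child. A sleeping variant, as suggested in
the parenthetical, could be sketched roughly as follows. spawn_holder() and
stop_holder() are hypothetical helper names, not part of the test; the
child simply inherits whatever file descriptors the caller has open.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that inherits the caller's open file descriptors and then
 * blocks in pause() until it is killed. Returns the child's pid in the
 * parent, or -1 if fork() failed.
 */
static pid_t spawn_holder(void)
{
	pid_t pid = fork();

	if (pid == 0) {
		/* pause() returns only after a signal handler ran; loop anyway. */
		for (;;)
			pause();
	}

	return pid;
}

/* Kill and reap the holder; returns 0 if it was terminated by SIGKILL. */
static int stop_holder(pid_t pid)
{
	int status;

	if (kill(pid, SIGKILL) < 0 || waitpid(pid, &status, 0) != pid)
		return -1;

	return WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL ? 0 : -1;
}
```

In the test itself the holder would normally just be left running until
the kexec; stop_holder() is only there so the sketch can be exercised
without leaving a stray process behind.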
> +int main(int argc, char *argv[])
> +{
> + int luo_fd;
> + int state_session_fd;
> +
> + luo_fd = luo_open_device();
> + if (luo_fd < 0)
> + ksft_exit_skip("Failed to open %s. Is the luo module loaded?\n",
> + LUO_DEVICE);
> +
> + /*
> + * Determine the stage by attempting to retrieve the state session.
> + * If it doesn't exist (ENOENT), we are in Stage 1 (pre-kexec).
> + */
> + state_session_fd = luo_retrieve_session(luo_fd, STATE_SESSION_NAME);
I don't think the test should try to infer the stage from the state of
the system. If a user runs this test, then does the kexec, then runs
this test again and the session can't be retrieved, that should be a
test failure (not just run stage 1 again).
I think it'd be better to require the user to pass in what stage of the
test should be run when invoking the test. e.g.
$ ./luo_kexec_simple stage_2
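A minimal sketch of how the stage argument could be parsed is shown below.
parse_stage() is a hypothetical helper, and the "stage_1" spelling is
assumed by analogy with the "stage_2" example above.

```c
#include <string.h>

/* Returns 1 or 2 for "stage_1"/"stage_2", or -1 for anything else. */
static int parse_stage(const char *arg)
{
	if (!arg)
		return -1;
	if (strcmp(arg, "stage_1") == 0)
		return 1;
	if (strcmp(arg, "stage_2") == 0)
		return 2;

	return -1;
}
```

main() would then call something like parse_stage(argc > 1 ? argv[1] : NULL)
and fail on -1. With the stage made explicit, a stage 2 run that cannot
retrieve the session is reported as a test failure instead of silently
restarting stage 1.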
* Re: [PATCH v6 18/20] selftests/liveupdate: Add kexec-based selftest for session lifecycle
From: Pasha Tatashin @ 2025-11-18 1:01 UTC (permalink / raw)
To: David Matlack
Cc: pratyush, jasonmiu, graf, rppt, rientjes, corbet, rdunlap,
ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
david, joel.granados, rostedt, anna.schumaker, song, linux,
linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <CALzav=ekHM8a3yYHHUJNgtYVwLYf1hFhEmrXJjHUXRt=xrSy4A@mail.gmail.com>
> > TEST_PROGS_O := $(patsubst %, %.o, $(TEST_PROGS))
> >
> > TEST_DEP_FILES += $(patsubst %.o, %.d, $(LIBLIVEUPDATE_O))
> > TEST_DEP_FILES += $(patsubst %.o, %.d, $(TEST_PROGS_O))
> > -include $(TEST_DEP_FILES)
> >
> > $(LIBLIVEUPDATE_O): $(OUTPUT)/%.o: %.c
> > $(CC) $(CFLAGS) $(CPPFLAGS) $(TARGET_ARCH) -c $< -o $@
> >
> > $(TEST_PROGS): %: %.o $(LIBLIVEUPDATE_O)
> > $(CC) $(CFLAGS) $(CPPFLAGS) $(LDFLAGS) $(TARGET_ARCH) $< $(LIBLIVEUPDATE_O) $(LDLIBS) -o $@
> >
> > EXTRA_CLEAN += $(LIBLIVEUPDATE_O)
> > EXTRA_CLEAN += $(TEST_PROGS_O)
> > EXTRA_CLEAN += $(TEST_DEP_FILES)
Took your suggestion, thank you!
* Re: [PATCH v6 18/20] selftests/liveupdate: Add kexec-based selftest for session lifecycle
From: Pasha Tatashin @ 2025-11-18 1:08 UTC (permalink / raw)
To: David Matlack
Cc: pratyush, jasonmiu, graf, rppt, rientjes, corbet, rdunlap,
ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
david, joel.granados, rostedt, anna.schumaker, song, linux,
linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <aRu4hBPz2g-cealt@google.com>
On Mon, Nov 17, 2025 at 7:06 PM David Matlack <dmatlack@google.com> wrote:
>
> On 2025-11-15 06:34 PM, Pasha Tatashin wrote:
>
> > +/* Stage 1: Executed before the kexec reboot. */
> > +static void run_stage_1(int luo_fd)
> > +{
> > + int session_fd;
> > +
> > + ksft_print_msg("[STAGE 1] Starting pre-kexec setup...\n");
> > +
> > + ksft_print_msg("[STAGE 1] Creating state file for next stage (2)...\n");
> > + create_state_file(luo_fd, STATE_SESSION_NAME, STATE_MEMFD_TOKEN, 2);
> > +
> > + ksft_print_msg("[STAGE 1] Creating session '%s' and preserving memfd...\n",
> > + TEST_SESSION_NAME);
> > + session_fd = luo_create_session(luo_fd, TEST_SESSION_NAME);
> > + if (session_fd < 0)
> > + fail_exit("luo_create_session for '%s'", TEST_SESSION_NAME);
> > +
> > + if (create_and_preserve_memfd(session_fd, TEST_MEMFD_TOKEN,
> > + TEST_MEMFD_DATA) < 0) {
> > + fail_exit("create_and_preserve_memfd for token %#x",
> > + TEST_MEMFD_TOKEN);
> > + }
> > +
> > + ksft_print_msg("[STAGE 1] Executing kexec...\n");
> > + if (system(KEXEC_SCRIPT) != 0)
> > + fail_exit("kexec script failed");
> > + exit(EXIT_FAILURE);
>
> Can we separate the kexec from the test and allow the user/automation to
> trigger it however is appropriate for their system? The current
> do_kexec.sh script does not do any sort of graceful shutdown, and I bet
> everyone will have different ways of initiating kexec on their systems.
Yes, this is a good idea. I am going to do what you suggested:
1. Take the stage as an argument.
2. Let the user run the kexec command.
Thank you,
Pasha
* Re: [PATCH v6 07/20] liveupdate: luo_session: Add ioctls for file preservation
From: Pasha Tatashin @ 2025-11-18 2:58 UTC (permalink / raw)
To: Mike Rapoport
Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
david, joel.granados, rostedt, anna.schumaker, song, linux,
linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <aRoXGYC4GeAoNKPl@kernel.org>
> > static int luo_session_release(struct inode *inodep, struct file *filep)
> > {
> > struct luo_session *session = filep->private_data;
> > struct luo_session_header *sh;
> > + int err = 0;
> >
> > /* If retrieved is set, it means this session is from incoming list */
> > - if (session->retrieved)
> > + if (session->retrieved) {
> > sh = &luo_session_global.incoming;
> > - else
> > +
> > + err = luo_session_finish_one(session);
> > + if (err) {
> > + pr_warn("Unable to finish session [%s] on release\n",
> > + session->name);
>
> return err;
>
> and then else can go away here and luo_session_remove() and
> luo_session_free() can be moved outside if (session->retrieved).
Done.
Thanks,
Pasha
* Re: [PATCH v6 08/20] liveupdate: luo_flb: Introduce File-Lifecycle-Bound global state
From: Pasha Tatashin @ 2025-11-18 3:54 UTC (permalink / raw)
To: Mike Rapoport
Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
david, joel.granados, rostedt, anna.schumaker, song, linux,
linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <aRrtRfJaaIHw5DZN@kernel.org>
>
> The concept makes sense to me, but it's hard to review the implementation
> without an actual user.
There are three users: we will have HugeTLB support that is going to
be posted as RFC in a few weeks. Also, in two weeks we are going to
have an updated VFIO and IOMMU series posted both using FLBs. In the
meantime, this series provides an FLB in-kernel test that verifies
that multiple FLBs can be attached to File-Handlers, and the basic
interfaces are working.
> > +struct liveupdate_flb {
> > + const struct liveupdate_flb_ops *ops;
> > + const char compatible[LIVEUPDATE_FLB_COMPAT_LENGTH];
> > + struct list_head list;
> > + void *internal;
>
> Can't list be a part of internal?
Yes, I moved it inside internal. I also removed the
liveupdate_init_flb() function (initialization is now automatic),
switched to __private as you suggested earlier, and removed the
kmalloc() for the internal data, so FLBs can be safely used early in boot.
> And don't we usually call this .private rather than .internal?
Renamed.
>
> > };
> >
> > #ifdef CONFIG_LIVEUPDATE
> > @@ -111,6 +187,17 @@ int liveupdate_get_file_incoming(struct liveupdate_session *s, u64 token,
> > int liveupdate_get_token_outgoing(struct liveupdate_session *s,
> > struct file *file, u64 *tokenp);
> >
> > +/* Before using FLB for the first time it should be initialized */
> > +int liveupdate_init_flb(struct liveupdate_flb *flb);
> > +
> > +int liveupdate_register_flb(struct liveupdate_file_handler *h,
> > + struct liveupdate_flb *flb);
>
> While these are obvious ...
>
> > +
> > +int liveupdate_flb_incoming_locked(struct liveupdate_flb *flb, void **objp);
> > +void liveupdate_flb_incoming_unlock(struct liveupdate_flb *flb, void *obj);
> > +int liveupdate_flb_outgoing_locked(struct liveupdate_flb *flb, void **objp);
> > +void liveupdate_flb_outgoing_unlock(struct liveupdate_flb *flb, void *obj);
> > +
>
> ... it's not very clear what these APIs are for and how they are going to be
> used.
They lock and return the FLB's global shared object, which is
accessible either while a file is being preserved or at any time
during boot.
>
> > #else /* CONFIG_LIVEUPDATE */
>
> ...
>
> > +int liveupdate_register_flb(struct liveupdate_file_handler *h,
> > + struct liveupdate_flb *flb)
> > +{
> > + struct luo_flb_internal *internal = flb->internal;
> > + struct luo_flb_link *link __free(kfree) = NULL;
> > + static DEFINE_MUTEX(register_flb_lock);
> > + struct liveupdate_flb *gflb;
> > + struct luo_flb_link *iter;
> > +
> > + if (!liveupdate_enabled())
> > + return -EOPNOTSUPP;
> > +
> > + if (WARN_ON(!h || !flb || !internal))
> > + return -EINVAL;
> > +
> > + if (WARN_ON(!flb->ops->preserve || !flb->ops->unpreserve ||
> > + !flb->ops->retrieve || !flb->ops->finish)) {
> > + return -EINVAL;
> > + }
> > +
> > + /*
> > + * Once session/files have been deserialized, FLBs cannot be registered,
> > + * it is too late. Deserialization uses file handlers, and FLB registers
> > + * to file handlers.
> > + */
> > + if (WARN_ON(luo_session_is_deserialized()))
> > + return -EBUSY;
> > +
> > + /*
> > + * File handler must already be registered, as it is initializes the
> > + * flb_list
> > + */
> > + if (WARN_ON(list_empty(&h->list)))
> > + return -EINVAL;
> > +
> > + link = kzalloc(sizeof(*link), GFP_KERNEL);
> > + if (!link)
> > + return -ENOMEM;
> > +
> > + guard(mutex)(&register_flb_lock);
> > +
> > + /* Check that this FLB is not already linked to this file handler */
> > + list_for_each_entry(iter, &h->flb_list, list) {
> > + if (iter->flb == flb)
> > + return -EEXIST;
> > + }
> > +
> > + /* Is this FLB linked to global list ? */
>
> Maybe:
>
> /*
> * If this FLB is not linked to global list it's first time the FLB
> * is registered
> */
Done
> > +/**
> > + * liveupdate_flb_incoming_unlock - Unlock an incoming FLB object.
> > + * @flb: The FLB definition.
> > + * @obj: The object that was returned by the _locked call (used for validation).
> > + *
> > + * Releases the internal lock acquired by liveupdate_flb_incoming_locked().
> > + */
> > +void liveupdate_flb_incoming_unlock(struct liveupdate_flb *flb, void *obj)
> > +{
> > + struct luo_flb_internal *internal = flb->internal;
> > +
> > + lockdep_assert_held(&internal->incoming.lock);
> > + internal->incoming.obj = obj;
>
> The comment says obj is for validation and here it's assigned to flb.
> Something is off here :)
Thank you for catching the stale comment; fixed.
> > + mutex_unlock(&internal->incoming.lock);
> > +}
> > +
> > +/**
> > + * liveupdate_flb_outgoing_locked - Lock and retrieve the outgoing FLB object.
> > + * @flb: The FLB definition.
> > + * @objp: Output parameter; will be populated with the live shared object.
> > + *
> > + * Acquires the FLB's internal lock and returns a pointer to its shared live
> > + * object for the outgoing (pre-reboot) path.
> > + *
> > + * This function assumes the object has already been created by the FLB's
> > + * .preserve() callback, which is triggered when the first dependent file
> > + * is preserved.
> > + *
> > + * The caller MUST call liveupdate_flb_outgoing_unlock() to release the lock.
> > + *
> > + * Return: 0 on success, or a negative errno on failure.
> > + */
> > +int liveupdate_flb_outgoing_locked(struct liveupdate_flb *flb, void **objp)
> > +{
> > + struct luo_flb_internal *internal = flb->internal;
> > +
> > + if (!liveupdate_enabled())
> > + return -EOPNOTSUPP;
> > +
> > + if (WARN_ON(!internal))
> > + return -EINVAL;
> > +
> > + mutex_lock(&internal->outgoing.lock);
> > +
> > + /* The object must exist if any file is being preserved */
> > + if (WARN_ON_ONCE(!internal->outgoing.obj)) {
> > + mutex_unlock(&internal->outgoing.lock);
> > + return -ENOENT;
> > + }
>
> _incoming_locked() and outgoing_locked() are nearly identical, it seems we
> can have the common part in a
> static liveupdate_flb_locked(struct luo_flb_state *state).
>
> liveupdate_flb_incoming_locked() will be oneline wrapper and
> liveupdate_flb_outgoing_locked() will have this WARN_ON if obj is NULL.
Done
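As an illustration of the refactoring agreed on above, here is a minimal userspace sketch of a shared helper with thin incoming and outgoing wrappers. Everything in it is an assumption made for the example: the sketch_lock type stands in for a kernel mutex, the layout of struct luo_flb_state is guessed from the quoted code, and the -ENOENT handling in the common helper may differ from what the updated series does.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a kernel mutex: tracks only held/not held. */
struct sketch_lock { bool held; };

static void sketch_lock_acquire(struct sketch_lock *l) { l->held = true; }
static void sketch_lock_release(struct sketch_lock *l) { l->held = false; }

/* Hypothetical per-direction state; the real struct may differ. */
struct luo_flb_state {
	struct sketch_lock lock;
	void *obj;
};

struct luo_flb_internal {
	struct luo_flb_state incoming;
	struct luo_flb_state outgoing;
};

/*
 * Common part: take the lock and hand back the object.
 * On success the lock stays held; on failure it is dropped.
 */
static int flb_locked(struct luo_flb_state *state, void **objp)
{
	sketch_lock_acquire(&state->lock);
	if (!state->obj) {
		sketch_lock_release(&state->lock);
		return -ENOENT;
	}
	*objp = state->obj;
	return 0;
}

/* One-line wrapper for the incoming side. */
static int flb_incoming_locked(struct luo_flb_internal *in, void **objp)
{
	return flb_locked(&in->incoming, objp);
}

/* The outgoing side additionally complains when the object is missing. */
static int flb_outgoing_locked(struct luo_flb_internal *in, void **objp)
{
	int err = flb_locked(&in->outgoing, objp);

	if (err == -ENOENT)	/* stands in for WARN_ON_ONCE() */
		fprintf(stderr, "outgoing FLB object missing\n");
	return err;
}

static void flb_unlock(struct luo_flb_state *state)
{
	sketch_lock_release(&state->lock);
}
```

The point of the split is that only the warning differs between the two directions, so the locking and the NULL check live in one place.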
* Re: [PATCH v6 12/20] mm: shmem: allow freezing inode mapping
From: Pasha Tatashin @ 2025-11-18 4:13 UTC (permalink / raw)
To: Mike Rapoport
Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
david, joel.granados, rostedt, anna.schumaker, song, linux,
linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <aRr0CQsV16usRW1J@kernel.org>
> > +/* Must be called with inode lock taken exclusive. */
> > +static inline void shmem_i_mapping_freeze(struct inode *inode, bool freeze)
>
> _mapping usually refers to operations on struct address_space.
> It seems that all shmem methods that take inode are just shmem_<operation>,
> so shmem_freeze() looks more appropriate.
Done, renamed to shmem_freeze()
>
> > +{
> > + if (freeze)
> > + SHMEM_I(inode)->flags |= SHMEM_F_MAPPING_FROZEN;
> > + else
> > + SHMEM_I(inode)->flags &= ~SHMEM_F_MAPPING_FROZEN;
> > +}
> > +
> > /*
> > * If fallocate(FALLOC_FL_KEEP_SIZE) has been used, there may be pages
> > * beyond i_size's notion of EOF, which fallocate has committed to reserving:
> > diff --git a/mm/shmem.c b/mm/shmem.c
> > index 1d5036dec08a..05c3db840257 100644
> > --- a/mm/shmem.c
> > +++ b/mm/shmem.c
> > @@ -1292,7 +1292,8 @@ static int shmem_setattr(struct mnt_idmap *idmap,
> > loff_t newsize = attr->ia_size;
> >
> > /* protected by i_rwsem */
> > - if ((newsize < oldsize && (info->seals & F_SEAL_SHRINK)) ||
> > + if ((info->flags & SHMEM_F_MAPPING_FROZEN) ||
>
> A corner case: if newsize == oldsize this will be a false positive
Added a fix.
Thanks,
Pasha
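The fix itself is not quoted in this message. One plausible shape for it, shown here purely as a guess, is to reject a frozen mapping only when the size actually changes. The function name and the SKETCH_* constants below are hypothetical and their values are not the kernel's.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag values, for illustration only. */
#define SKETCH_F_MAPPING_FROZEN	0x1u
#define SKETCH_SEAL_SHRINK	0x2u
#define SKETCH_SEAL_GROW	0x4u

/*
 * Returns 0 if a size change from oldsize to newsize may proceed,
 * -EPERM otherwise. Setting the same size on a frozen mapping is
 * allowed, which avoids the false positive described above.
 */
static int size_change_allowed(long long oldsize, long long newsize,
			       unsigned int flags, unsigned int seals)
{
	bool frozen = flags & SKETCH_F_MAPPING_FROZEN;

	if ((newsize != oldsize && frozen) ||
	    (newsize < oldsize && (seals & SKETCH_SEAL_SHRINK)) ||
	    (newsize > oldsize && (seals & SKETCH_SEAL_GROW)))
		return -EPERM;

	return 0;
}
```

The seal checks already have this property, because each of them tests for a strict size difference.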
>
> > + (newsize < oldsize && (info->seals & F_SEAL_SHRINK)) ||
> > (newsize > oldsize && (info->seals & F_SEAL_GROW)))
> > return -EPERM;
> >
> > @@ -3289,6 +3290,10 @@ shmem_write_begin(const struct kiocb *iocb, struct address_space *mapping,
> > return -EPERM;
> > }
> >
> > + if (unlikely((info->flags & SHMEM_F_MAPPING_FROZEN) &&
> > + pos + len > inode->i_size))
> > + return -EPERM;
> > +
> > ret = shmem_get_folio(inode, index, pos + len, &folio, SGP_WRITE);
> > if (ret)
> > return ret;
> > @@ -3662,6 +3667,11 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset,
> >
> > inode_lock(inode);
> >
> > + if (info->flags & SHMEM_F_MAPPING_FROZEN) {
> > + error = -EPERM;
> > + goto out;
> > + }
> > +
> > if (mode & FALLOC_FL_PUNCH_HOLE) {
> > struct address_space *mapping = file->f_mapping;
> > loff_t unmap_start = round_up(offset, PAGE_SIZE);
> > --
> > 2.52.0.rc1.455.g30608eb744-goog
> >
>
> --
> Sincerely yours,
> Mike.
* Re: [PATCH v6 02/20] liveupdate: luo_core: integrate with KHO
From: Pasha Tatashin @ 2025-11-18 4:22 UTC (permalink / raw)
To: Mike Rapoport
Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
david, joel.granados, rostedt, anna.schumaker, song, linux,
linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <aRuODFfqP-qsxa-j@kernel.org>
> You can avoid that complexity if you register the device with a different
> fops, but that's technicality.
>
> Your point about treating the incoming FDT as an underlying resource that
> failed to initialize makes sense, but nevertheless userspace needs a
> reliable way to detect it and parsing dmesg is not something we should rely
> on.
I see two solutions:
1. LUO fails to retrieve the preserved data, the user gets informed by
not finding /dev/liveupdate, and studying the dmesg for what has
happened (in reality in fleets version mismatches should not be
happening, those should be detected in quals).
2. Create a zombie device to return some errno on open, and still
study dmesg to understand what really happened.
I think that 1 is better
Pasha
* Re: [PATCH v6 04/20] liveupdate: luo_session: add sessions support
From: Pasha Tatashin @ 2025-11-18 4:28 UTC (permalink / raw)
To: Mike Rapoport
Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
david, joel.granados, rostedt, anna.schumaker, song, linux,
linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <aRuPcjyNBZqlZuEm@kernel.org>
On Mon, Nov 17, 2025 at 4:11 PM Mike Rapoport <rppt@kernel.org> wrote:
>
> On Mon, Nov 17, 2025 at 10:09:28AM -0500, Pasha Tatashin wrote:
> >
> > > > + }
> > > > +
> > > > + for (int i = 0; i < sh->header_ser->count; i++) {
> > > > + struct luo_session *session;
> > > > +
> > > > + session = luo_session_alloc(sh->ser[i].name);
> > > > + if (IS_ERR(session)) {
> > > > + pr_warn("Failed to allocate session [%s] during deserialization %pe\n",
> > > > + sh->ser[i].name, session);
> > > > + return PTR_ERR(session);
> > > > + }
> > >
> > > The allocated sessions still need to be freed if an insert fails ;-)
> >
> > No. We have failed to deserialize, so anyways the machine will need to
> > be rebooted by the user in order to release the preserved resources.
> >
> > This is something that Jason Gunthorpe also mentioned regarding IOMMU:
> > if something is not correct (i.e., if a session cannot finish for some
> > reason), don't add complicated "undo" code that cleans up all
> > resources. Instead, treat them as a memory leak and allow a reboot to
> > perform the cleanup.
> >
> > While in this particular patch the clean-up looks simple, later in the
> > series we are adding file deserialization to each session to this
> > function. So, the clean-up will look like this: we would have to free
> > the resources for each session we deserialized, and also free the
> > resources for files that were deserialized for those sessions, only to
> > still boot into a "maintenance" mode where bunch of resources are not
> > accessible from which the machine would have to be rebooted to get
> > back to a normal state. This code will never be tested, and never be
> > used, so let's use reboot to solve this problem, where devices are
> > going to be properly reset, and memory is going to be properly freed.
>
> A part of this explanation should be a comment in the code.
Done.
>
> --
> Sincerely yours,
> Mike.
* Re: [PATCH v6 02/20] liveupdate: luo_core: integrate with KHO
From: Mike Rapoport @ 2025-11-18 11:21 UTC (permalink / raw)
To: Pasha Tatashin
Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
david, joel.granados, rostedt, anna.schumaker, song, linux,
linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <CA+CK2bAEdNE0Rs1i7GdHz8Q3DK9Npozm8sRL8Epa+o50NOMY7A@mail.gmail.com>
On Mon, Nov 17, 2025 at 11:22:54PM -0500, Pasha Tatashin wrote:
> > You can avoid that complexity if you register the device with a different
> > fops, but that's technicality.
> >
> > Your point about treating the incoming FDT as an underlying resource that
> > failed to initialize makes sense, but nevertheless userspace needs a
> > reliable way to detect it and parsing dmesg is not something we should rely
> > on.
>
> I see two solutions:
>
> 1. LUO fails to retrieve the preserved data, the user gets informed by
> not finding /dev/liveupdate, and studying the dmesg for what has
> happened (in reality in fleets version mismatches should not be
> happening, those should be detected in quals).
> 2. Create a zombie device to return some errno on open, and still
> study dmesg to understand what really happened.
User should not study dmesg. We need another solution.
What's wrong with e.g. ioctl()?
> I think that 1 is better
>
> Pasha
--
Sincerely yours,
Mike.
* Re: [PATCH v6 08/20] liveupdate: luo_flb: Introduce File-Lifecycle-Bound global state
From: Mike Rapoport @ 2025-11-18 11:28 UTC (permalink / raw)
To: Pasha Tatashin
Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
david, joel.granados, rostedt, anna.schumaker, song, linux,
linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <CA+CK2bBxVNRkJ-8Qv1AzfHEwpxnc4fSxdzKCL_7ku0TMd6Rjow@mail.gmail.com>
On Mon, Nov 17, 2025 at 10:54:29PM -0500, Pasha Tatashin wrote:
> >
> > The concept makes sense to me, but it's hard to review the implementation
> > without an actual user.
>
> There are three users: we will have HugeTLB support that is going to
> be posted as RFC in a few weeks. Also, in two weeks we are going to
> have an updated VFIO and IOMMU series posted both using FLBs. In the
> mean time, this series provides an FLB in-kernel test that verifies
> that multiple FLBs can be attached to File-Handlers, and the basic
> interfaces are working.
Which means that essentially there won't be a real kernel user for FLB for
a while.
We usually don't merge dead code because some future patchset depends on
it.
I think it should stay in mm-nonmm-unstable if Andrew does not mind keeping
it there until the first user is going to land and then FLB will move
upstream along with that user.
If keeping FLB in mm tree is an issue we can set up an integration tree for
LUO/KHO.
--
Sincerely yours,
Mike.
* Re: [PATCH v6 20/20] tests/liveupdate: Add in-kernel liveupdate test
From: Mike Rapoport @ 2025-11-18 11:30 UTC (permalink / raw)
To: Pasha Tatashin
Cc: pratyush, jasonmiu, graf, dmatlack, rientjes, corbet, rdunlap,
ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm, tj,
yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
david, joel.granados, rostedt, anna.schumaker, song, linux,
linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav,
lennart, brauner, linux-api, linux-fsdevel, saeedm, ajayachandra,
jgg, parav, leonro, witu, hughd, skhawaja, chrisl
In-Reply-To: <CA+CK2bCfPeY558f499JHKN7aekDzsxQkZJ9Uz4e+saR0qtXyfg@mail.gmail.com>
On Mon, Nov 17, 2025 at 02:00:15PM -0500, Pasha Tatashin wrote:
> > > #endif /* _LINUX_LIVEUPDATE_ABI_LUO_H */
> > > diff --git a/kernel/liveupdate/luo_file.c b/kernel/liveupdate/luo_file.c
> > > index df337c9c4f21..9a531096bdb5 100644
> > > --- a/kernel/liveupdate/luo_file.c
> > > +++ b/kernel/liveupdate/luo_file.c
> > > @@ -834,6 +834,8 @@ int liveupdate_register_file_handler(struct liveupdate_file_handler *fh)
> > > INIT_LIST_HEAD(&fh->flb_list);
> > > list_add_tail(&fh->list, &luo_file_handler_list);
> > >
> > > + liveupdate_test_register(fh);
> > > +
> >
> > Why this cannot be called from the test?
>
> Because test does not have access to all file_handlers that are being
> registered with LUO.
Unless I'm missing something, an FLB user registers a file handler and
lets LUO know that it will need FLB. Why can't the test do the same?
> Pasha
--
Sincerely yours,
Mike.
* Re: [PATCH 0/2] man7/ip.7: Clarify PKTINFO's docs
From: Alejandro Colomar @ 2025-11-18 13:51 UTC (permalink / raw)
To: Jakub Głogowski; +Cc: linux-man, LKML, Linux API, ej
In-Reply-To: <cover.1763130571.git.not@dzwdz.net>
Hi Jakub,
On Fri, Nov 14, 2025 at 03:29:29PM +0100, Jakub Głogowski wrote:
> I found the PKTINFO docs pretty confusing, so I tried clarifying them:
> - being more specific about each field in the struct
> (e.g. "local address of the packet" for a received packet could've
> been interpreted in myriad ways),
> - making the differences between sendmsg(2)'s and recvmsg(2)'s handling
> of that struct more explicit,
> - and some other slight rewording to make it (IMO) more readable - I cut
> out most of a paragraph that wasn't really saying anything, etc.
>
> I'm not sure if this should even be documented in ip(7) together with
> the other sockopts, though? sendmsg(2)'s handling of in_pktinfo is
> completely unrelated to the IP_PKTINFO sockopt. Documenting it in its
> own manual page would also give us more room for subsection headings and
> other formatting, examples, etc - instead of trying to cram it into
> what's already an enormous manpage.
>
> Same goes for some of the other more complex sockopts, I guess.
Do you suggest moving each socket option to a manual page under
man2const/? I think I agree with that. There's precedent, and it makes
the pages more readable.
I'll try to do that soon. I'll ping you when I've finished, in case you
want to apply further changes.
> PS. sorry for not signing this email, but neomutt didn't want to
> cooperate :/ I'll try to figure it out for any followup patches.
Ok.
Have a lovely day!
Alex
> Jakub Głogowski (2):
> man/man7/ip.7: Clarify PKTINFO's semantics depending on packet
> direction
> man/man7/ip.7: Reword IP_PKTINFO's description
>
> man/man7/ip.7 | 57 +++++++++++++++++++++++++++------------------------
> 1 file changed, 30 insertions(+), 27 deletions(-)
>
> --
> 2.47.3
>
--
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
* Re: [PATCH v6 02/20] liveupdate: luo_core: integrate with KHO
From: Jason Gunthorpe @ 2025-11-18 14:03 UTC (permalink / raw)
To: Mike Rapoport
Cc: Pasha Tatashin, pratyush, jasonmiu, graf, dmatlack, rientjes,
corbet, rdunlap, ilpo.jarvinen, kanie, ojeda, aliceryhl,
masahiroy, akpm, tj, yoann.congal, mmaurer, roman.gushchin,
chenridong, axboe, mark.rutland, jannh, vincent.guittot, hannes,
dan.j.williams, david, joel.granados, rostedt, anna.schumaker,
song, linux, linux-kernel, linux-doc, linux-mm, gregkh, tglx,
mingo, bp, dave.hansen, x86, hpa, rafael, dakr,
bartosz.golaszewski, cw00.choi, myungjoo.ham, yesanishhere,
Jonathan.Cameron, quic_zijuhu, aleksander.lobakin, ira.weiny,
andriy.shevchenko, leon, lukas, bhelgaas, wagi, djeffery,
stuart.w.hayes, ptyadav, lennart, brauner, linux-api,
linux-fsdevel, saeedm, ajayachandra, parav, leonro, witu, hughd,
skhawaja, chrisl
In-Reply-To: <aRxWvsdv1dQz8oZ4@kernel.org>
On Tue, Nov 18, 2025 at 01:21:34PM +0200, Mike Rapoport wrote:
> On Mon, Nov 17, 2025 at 11:22:54PM -0500, Pasha Tatashin wrote:
> > > You can avoid that complexity if you register the device with a different
> > > fops, but that's technicality.
> > >
> > > Your point about treating the incoming FDT as an underlying resource that
> > > failed to initialize makes sense, but nevertheless userspace needs a
> > > reliable way to detect it and parsing dmesg is not something we should rely
> > > on.
> >
> > I see two solutions:
> >
> > 1. LUO fails to retrieve the preserved data, the user gets informed by
> > not finding /dev/liveupdate, and studying the dmesg for what has
> > happened (in reality in fleets version mismatches should not be
> > happening, those should be detected in quals).
> > 2. Create a zombie device to return some errno on open, and still
> > study dmesg to understand what really happened.
>
> User should not study dmesg. We need another solution.
> What's wrong with e.g. ioctl()?
It seems very dangerous to even boot at all if the next kernel doesn't
understand the serialization information.
IMHO I think we should not even be thinking about this, it is up to
the predecessor environment to prevent it from happening. The ideas to
use ELF metadata/etc to allow a pre-flight validation are the right
solution.
If we get into the next kernel and it receives information it cannot
process it should just BUG_ON and die, or some broad equivalent.
It is a catastrophic orchestration error, and we don't need some fine
grain recovery or userspace visibility. Crash dump the system and
reboot it.
IOW, I would not invest time in this.
Jason
* Re: [PATCH 1/2] man/man7/ip.7: Clarify PKTINFO's semantics depending on packet direction
From: Alejandro Colomar @ 2025-11-18 14:31 UTC (permalink / raw)
To: Jakub Głogowski; +Cc: linux-man, LKML, Linux API, ej
In-Reply-To: <fb3980b64d1c827ad59726bb30761d735396e109.1763130571.git.not@dzwdz.net>
Hi Jakub,
On Fri, Nov 14, 2025 at 03:29:30PM +0100, Jakub Głogowski wrote:
> For recvmsg(2), ipi_spec_dst is set by ipv4_pktinfo_prepare() to the
> result of fib_compute_spec_dst(). The latter was introduced in
> linux.git 35ebf65e851c6d97 ("ipv4: Create and use fib_compute_spec_dst() helper.").
>
> Quoting its commit message:
>
> > The specific destination is the host we direct unicast replies to.
> > Usually this is the original packet source address, but if we are
> > responding to a multicast or broadcast packet we have to use something
> > different.
> >
> > Specifically we must use the source address we would use if we were to
> > send a packet to the unicast source of the original packet.
>
> Experimentation seems to confirm that behavior.
>
> As for the note about ipi_spec_dst being on a different interface:
> - For unicast packets (for which ipi_spec_dst is the original
> destination address), I believe this is trivially true because Linux
> uses the weak host model (unless there's some interaction with
> RTCF_LOCAL that I'm missing).
> - For multicast/broadcast packets, fib_compute_spec_dst() only passes the
> original interface to the lookup in the context of L3M. In
> particular, the original implementation (cited above) set iif and oif
> to 0. Also, citing
> linux.git e7372197e15856ec ("net/ipv4: Set oif in fib_compute_spec_dst"),
> > If the device is not enslaved, oif is still 0 so no affect.
>
> It doesn't seem like using an address specifically from the interface
> the packet was received on was ever the intention. I've also confirmed
> this behavior (sending a multicast packet from another machine, whose IP
> I've routed to a dummy interface).
>
> I'm focusing on this because that's a misconception I've had before
> digging into the code - the sendmsg behavior explained in the same
> paragraph made me think ipi_spec_dst was the (primary?) address of
> ipi_ifindex. I think this is worth clarifying.
>
> I've made it explicit that ipi_addr isn't used by sendmsg because that's
> another possible misconception.
>
> The (first) extra comma in sendmsg's ipi_spec_dst's description is meant
> to emphasize that it's used as the local source address _and_ for the
> routing table lookup, as opposed to just affecting the routing table
> lookup.
> Stylistically it might be a bit weird but idk how to convey this better.
>
> Apart from the cited commits I was referencing the linux-6.17.7 tarball.
>
> __fib_validate_source (and the comment near it) might also be of
> interest to people trying to figure out what "specific destinations"
> are, exactly.
>
> Signed-off-by: Jakub Głogowski <not@dzwdz.net>
Thanks! I've applied the patch. I've added CC tags (please, copy those
yourself in future patches).
I've also s/PKTINFO/IP_PKTINFO/ in the subject.
And I've applied minor wording and source improvements.
I've pushed here:
<https://www.alejandro-colomar.es/src/alx/linux/man-pages/man-pages.git/commit/?h=contrib&id=b8f472450f6607e2d5bd68a1b60615a91ed3d111>
(use port 80).
> ---
> man/man7/ip.7 | 16 +++++++++++++---
> 1 file changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/man/man7/ip.7 b/man/man7/ip.7
> index a92939cd0..a7f118b42 100644
> --- a/man/man7/ip.7
> +++ b/man/man7/ip.7
> @@ -809,12 +809,20 @@ .SS Socket options
> .EE
> .in
> .IP
> +When returned by
> +.BR recvmsg (2) ,
> .I ipi_ifindex
> is the unique index of the interface the packet was received on.
> .I ipi_spec_dst
> -is the local address of the packet and
> +is the preferred source address for replies to the given packet, and
> .I ipi_addr
> is the destination address in the packet header.
> +These addresses are usually the same,
> +but can differ for broadcast or multicast packets.
> +Note that, depending on the configured routes,
I've removed 'Note that,'. It's redundant. Everything in a manual page
should be noteworthy.
Have a lovely day!
Alex
> +.I ipi_spec_dst
> +might belong to a different interface from the one that received the packet.
> +.IP
> If
> .B IP_PKTINFO
> is passed to
> @@ -822,14 +830,16 @@ .SS Socket options
> and
> .\" This field is grossly misnamed
> .I ipi_spec_dst
> -is not zero, then it is used as the local source address for the routing
> -table lookup and for setting up IP source route options.
> +is not zero, then it is used as the local source address, for the routing
> +table lookup, and for setting up IP source route options.
> When
> .I ipi_ifindex
> is not zero, the primary local address of the interface specified by the
> index overwrites
> .I ipi_spec_dst
> for the routing table lookup.
> +.I ipi_addr
> +is ignored.
> .IP
> Not supported for
> .B SOCK_STREAM
> --
> 2.47.3
>
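For anyone reading the patch without the page at hand, the receive side it documents can be sketched like this. extract_pktinfo() is a hypothetical helper name chosen for the example; the CMSG_* macros, struct in_pktinfo, IPPROTO_IP and IP_PKTINFO are the standard interfaces described in cmsg(3) and ip(7).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Copy the IP_PKTINFO ancillary data out of a msghdr that recvmsg(2)
 * has filled in. Returns 0 if the control message was found, -1 if not.
 */
static int extract_pktinfo(struct msghdr *msg, struct in_pktinfo *out)
{
	struct cmsghdr *cmsg;

	for (cmsg = CMSG_FIRSTHDR(msg); cmsg != NULL;
	     cmsg = CMSG_NXTHDR(msg, cmsg)) {
		if (cmsg->cmsg_level == IPPROTO_IP &&
		    cmsg->cmsg_type == IP_PKTINFO) {
			/* memcpy avoids alignment assumptions on CMSG_DATA */
			memcpy(out, CMSG_DATA(cmsg), sizeof(*out));
			return 0;
		}
	}
	return -1;
}
```

The kernel only attaches this control message after the socket has enabled it with setsockopt(fd, IPPROTO_IP, IP_PKTINFO, &on, sizeof(on)). On return, ipi_ifindex, ipi_spec_dst and ipi_addr carry the meanings given in the patched text above.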
--
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
* Re: [PATCH v6 02/20] liveupdate: luo_core: integrate with KHO
From: Mike Rapoport @ 2025-11-18 15:06 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Pasha Tatashin, pratyush, jasonmiu, graf, dmatlack, rientjes,
corbet, rdunlap, ilpo.jarvinen, kanie, ojeda, aliceryhl,
masahiroy, akpm, tj, yoann.congal, mmaurer, roman.gushchin,
chenridong, axboe, mark.rutland, jannh, vincent.guittot, hannes,
dan.j.williams, david, joel.granados, rostedt, anna.schumaker,
song, linux, linux-kernel, linux-doc, linux-mm, gregkh, tglx,
mingo, bp, dave.hansen, x86, hpa, rafael, dakr,
bartosz.golaszewski, cw00.choi, myungjoo.ham, yesanishhere,
Jonathan.Cameron, quic_zijuhu, aleksander.lobakin, ira.weiny,
andriy.shevchenko, leon, lukas, bhelgaas, wagi, djeffery,
stuart.w.hayes, ptyadav, lennart, brauner, linux-api,
linux-fsdevel, saeedm, ajayachandra, parav, leonro, witu, hughd,
skhawaja, chrisl
In-Reply-To: <20251118140300.GK10864@nvidia.com>
On Tue, Nov 18, 2025 at 10:03:00AM -0400, Jason Gunthorpe wrote:
> On Tue, Nov 18, 2025 at 01:21:34PM +0200, Mike Rapoport wrote:
> > On Mon, Nov 17, 2025 at 11:22:54PM -0500, Pasha Tatashin wrote:
> > > > You can avoid that complexity if you register the device with a different
> > > > fops, but that's technicality.
> > > >
> > > > Your point about treating the incoming FDT as an underlying resource that
> > > > failed to initialize makes sense, but nevertheless userspace needs a
> > > > reliable way to detect it and parsing dmesg is not something we should rely
> > > > on.
> > >
> > > I see two solutions:
> > >
> > > 1. LUO fails to retrieve the preserved data, the user gets informed by
> > > not finding /dev/liveupdate, and studying the dmesg for what has
> > > happened (in reality in fleets version mismatches should not be
> > > happening, those should be detected in quals).
> > > 2. Create a zombie device to return some errno on open, and still
> > > study dmesg to understand what really happened.
> >
> > User should not study dmesg. We need another solution.
> > What's wrong with e.g. ioctl()?
>
> It seems very dangerous to even boot at all if the next kernel doesn't
> understand the serialization information..
>
> IMHO I think we should not even be thinking about this, it is up to
> the predecessor environment to prevent it from happening. The ideas to
> use ELF metadata/etc to allow a pre-flight validation are the right
> solution.
>
> If we get into the next kernel and it receives information it cannot
> process it should just BUG_ON and die, or some broad equivalent.
> It is a catastrophic orchestration error, and we don't need some fine
> grain recovery or userspace visibility. Crash dump the system and
> reboot it.
I was under the impression Pasha wanted to get up to userspace no matter
what.
panic() in liveupdate_early_init() makes perfect sense to me. Parsing dmesg
does not.
> IOW, I would not invest time in this.
>
> Jason
--
Sincerely yours,
Mike.
^ permalink raw reply
* Re: [PATCH v6 02/20] liveupdate: luo_core: integrate with KHO
From: Pasha Tatashin @ 2025-11-18 15:18 UTC (permalink / raw)
To: Mike Rapoport
Cc: Jason Gunthorpe, pratyush, jasonmiu, graf, dmatlack, rientjes,
corbet, rdunlap, ilpo.jarvinen, kanie, ojeda, aliceryhl,
masahiroy, akpm, tj, yoann.congal, mmaurer, roman.gushchin,
chenridong, axboe, mark.rutland, jannh, vincent.guittot, hannes,
dan.j.williams, david, joel.granados, rostedt, anna.schumaker,
song, linux, linux-kernel, linux-doc, linux-mm, gregkh, tglx,
mingo, bp, dave.hansen, x86, hpa, rafael, dakr,
bartosz.golaszewski, cw00.choi, myungjoo.ham, yesanishhere,
Jonathan.Cameron, quic_zijuhu, aleksander.lobakin, ira.weiny,
andriy.shevchenko, leon, lukas, bhelgaas, wagi, djeffery,
stuart.w.hayes, ptyadav, lennart, brauner, linux-api,
linux-fsdevel, saeedm, ajayachandra, parav, leonro, witu, hughd,
skhawaja, chrisl
In-Reply-To: <aRyLbB8yoQwUJ3dh@kernel.org>
On Tue, Nov 18, 2025 at 10:06 AM Mike Rapoport <rppt@kernel.org> wrote:
>
> On Tue, Nov 18, 2025 at 10:03:00AM -0400, Jason Gunthorpe wrote:
> > On Tue, Nov 18, 2025 at 01:21:34PM +0200, Mike Rapoport wrote:
> > > On Mon, Nov 17, 2025 at 11:22:54PM -0500, Pasha Tatashin wrote:
> > > > > You can avoid that complexity if you register the device with a different
> > > > > fops, but that's technicality.
> > > > >
> > > > > Your point about treating the incoming FDT as an underlying resource that
> > > > > failed to initialize makes sense, but nevertheless userspace needs a
> > > > > reliable way to detect it and parsing dmesg is not something we should rely
> > > > > on.
> > > >
> > > > I see two solutions:
> > > >
> > > > 1. LUO fails to retrieve the preserved data, the user gets informed by
> > > > not finding /dev/liveupdate, and studying the dmesg for what has
> > > > happened (in reality in fleets version mismatches should not be
> > > > happening, those should be detected in quals).
> > > > 2. Create a zombie device to return some errno on open, and still
> > > > study dmesg to understand what really happened.
> > >
> > > User should not study dmesg. We need another solution.
> > > What's wrong with e.g. ioctl()?
> >
> > It seems very dangerous to even boot at all if the next kernel doesn't
> > understand the serialization information..
> >
> > IMHO I think we should not even be thinking about this, it is up to
> > the predecessor environment to prevent it from happening. The ideas to
> > use ELF metadata/etc to allow a pre-flight validation are the right
> > solution.
100% agreed, this is the goal.
> > If we get into the next kernel and it receives information it cannot
> > process it should just BUG_ON and die, or some broad equivalent.
I initially had a panic() that would kill the kernel, but after
further consideration, I realized that we can still boot into
"maintenance" mode and allow the user to decide when and how to reboot
the machine back to a normal state.
Crashing during early boot has its own disadvantages: the crash kernel
is not available. Also, because live-update has to be very fast, the
console is likely to be disabled. Therefore, getting to userspace and
allowing the user to investigate what happened (e.g., automatically
retrieving dmesg or a core dump and filing a bug) before rebooting
seems like the most sensible approach.
This won't leak data, as /dev/liveupdate is completely disabled, so
nothing preserved in memory will be recoverable.
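
For illustration only, a userspace check for the "maintenance mode" case described above could look like the minimal sketch below. It assumes the only signal available is whether the /dev/liveupdate node exists. The enum and the luo_probe() helper are hypothetical names, not part of any LUO ABI.

```c
#include <errno.h>
#include <sys/stat.h>

/* Hypothetical result codes for this sketch; not part of any LUO ABI. */
enum luo_probe_result {
	LUO_PROBE_AVAILABLE,	/* path exists and is a character device */
	LUO_PROBE_ABSENT,	/* path does not exist (LUO disabled or failed to init) */
	LUO_PROBE_ERROR,	/* path exists but is not a char device, or stat() failed */
};

/*
 * Probe a device node path such as "/dev/liveupdate" without opening it,
 * so the check has no side effects on the device itself.
 */
static enum luo_probe_result luo_probe(const char *path)
{
	struct stat st;

	if (stat(path, &st) == 0)
		return S_ISCHR(st.st_mode) ? LUO_PROBE_AVAILABLE : LUO_PROBE_ERROR;

	return errno == ENOENT ? LUO_PROBE_ABSENT : LUO_PROBE_ERROR;
}
```

An agent could call luo_probe("/dev/liveupdate") after kexec and, on LUO_PROBE_ABSENT, collect logs and schedule a normal reboot. Note that an absent node only says LUO is unavailable; it does not say why, which is the gap raised earlier in the thread.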
Pasha
^ permalink raw reply
* Re: [PATCH v6 02/20] liveupdate: luo_core: integrate with KHO
From: Jason Gunthorpe @ 2025-11-18 15:36 UTC (permalink / raw)
To: Pasha Tatashin
Cc: Mike Rapoport, pratyush, jasonmiu, graf, dmatlack, rientjes,
corbet, rdunlap, ilpo.jarvinen, kanie, ojeda, aliceryhl,
masahiroy, akpm, tj, yoann.congal, mmaurer, roman.gushchin,
chenridong, axboe, mark.rutland, jannh, vincent.guittot, hannes,
dan.j.williams, david, joel.granados, rostedt, anna.schumaker,
song, linux, linux-kernel, linux-doc, linux-mm, gregkh, tglx,
mingo, bp, dave.hansen, x86, hpa, rafael, dakr,
bartosz.golaszewski, cw00.choi, myungjoo.ham, yesanishhere,
Jonathan.Cameron, quic_zijuhu, aleksander.lobakin, ira.weiny,
andriy.shevchenko, leon, lukas, bhelgaas, wagi, djeffery,
stuart.w.hayes, ptyadav, lennart, brauner, linux-api,
linux-fsdevel, saeedm, ajayachandra, parav, leonro, witu, hughd,
skhawaja, chrisl
In-Reply-To: <CA+CK2bBFtG3LWmCtLs-5vfS8FYm_r24v=jJra9gOGPKKcs=55g@mail.gmail.com>
On Tue, Nov 18, 2025 at 10:18:28AM -0500, Pasha Tatashin wrote:
> On Tue, Nov 18, 2025 at 10:06 AM Mike Rapoport <rppt@kernel.org> wrote:
> >
> > On Tue, Nov 18, 2025 at 10:03:00AM -0400, Jason Gunthorpe wrote:
> > > On Tue, Nov 18, 2025 at 01:21:34PM +0200, Mike Rapoport wrote:
> > > > On Mon, Nov 17, 2025 at 11:22:54PM -0500, Pasha Tatashin wrote:
> > > > > > You can avoid that complexity if you register the device with a different
> > > > > > fops, but that's technicality.
> > > > > >
> > > > > > Your point about treating the incoming FDT as an underlying resource that
> > > > > > failed to initialize makes sense, but nevertheless userspace needs a
> > > > > > reliable way to detect it and parsing dmesg is not something we should rely
> > > > > > on.
> > > > >
> > > > > I see two solutions:
> > > > >
> > > > > 1. LUO fails to retrieve the preserved data, the user gets informed by
> > > > > not finding /dev/liveupdate, and studying the dmesg for what has
> > > > > happened (in reality in fleets version mismatches should not be
> > > > > happening, those should be detected in quals).
> > > > > 2. Create a zombie device to return some errno on open, and still
> > > > > study dmesg to understand what really happened.
> > > >
> > > > User should not study dmesg. We need another solution.
> > > > What's wrong with e.g. ioctl()?
> > >
> > > It seems very dangerous to even boot at all if the next kernel doesn't
> > > understand the serialization information..
> > >
> > > IMHO I think we should not even be thinking about this, it is up to
> > > the predecessor environment to prevent it from happening. The ideas to
> > > use ELF metadata/etc to allow a pre-flight validation are the right
> > > solution.
>
> 100% agreed, this is the goal.
>
> > > If we get into the next kernel and it receives information it cannot
> > > process it should just BUG_ON and die, or some broad equivalent.
>
> I initially had a panic() that would kill the kernel, but after
> further consideration, I realized that we can still boot into
> "maintenance" mode and allow the user to decide when and how to reboot
> the machine back to a normal state.
> This won't leak data, as /dev/liveupdate is completely disabled, so
> nothing preserved in memory will be recoverable.
This seems reasonable, but it is still dangerous.
At a minimum, the KHO startup needs to either succeed, panic, or fail
to online most of the memory (i.e. run from the safe region only).
The above approach works better for things like VFIO or memfd where
you can boot reasonably safely. Not sure about iommu though: if
iommu doesn't deserialize properly then it probably corrupts all
memory too.
Jason
^ permalink raw reply