* [PATCH v3 0/2] liveupdate: fix incoming error handling and teardown paths
@ 2026-03-26 4:25 Leo Timmins
2026-03-26 4:25 ` [PATCH v3 1/2] liveupdate: propagate file deserialization failures Leo Timmins
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Leo Timmins @ 2026-03-26 4:25 UTC (permalink / raw)
To: pasha.tatashin, rppt, linux-kernel; +Cc: pratyush, akpm, Leo Timmins
Hi,
This series fixes two issues in LUO's incoming-side error handling and
teardown paths.
The first patch makes session deserialization fail when file
deserialization fails, instead of silently continuing with a partially
restored session.
The second patch (formerly patch 3 in v1) initializes incoming FLB
state before the finish path decrements its refcount, so the last-user
cleanup path does not run from an uninitialized count. It now uses
pr_warn() instead of WARN_ON().
Changes in v2:
- drop the previous patch 2 after review
- patch 2/2: replace WARN_ON(err) with pr_warn() in
luo_flb_file_finish_one()
Changes in v3:
- patch 2/2: minor formatting change; removed braces from scoped_guard
Leo Timmins (2):
liveupdate: propagate file deserialization failures
liveupdate: initialize incoming FLB state before finish
kernel/liveupdate/luo_flb.c | 19 ++++++++++++++++++-
kernel/liveupdate/luo_session.c | 9 +++++++--
2 files changed, 25 insertions(+), 3 deletions(-)
base-commit: e3c33bc767b5512dbfec643a02abf58ce608f3b2
--
2.53.0
* [PATCH v3 1/2] liveupdate: propagate file deserialization failures
From: Leo Timmins @ 2026-03-26 4:25 UTC
To: pasha.tatashin, rppt, linux-kernel; +Cc: pratyush, akpm, Leo Timmins

luo_session_deserialize() ignored the return value from
luo_file_deserialize(). As a result, a session could be left partially
restored even though the /dev/liveupdate open path treats deserialization
failures as fatal.

Propagate the error so a failed file deserialization aborts session
deserialization instead of silently continuing.

Fixes: 16cec0d26521 ("liveupdate: luo_session: add ioctls for file preservation")
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Leo Timmins <leotimmins1974@gmail.com>
---
 kernel/liveupdate/luo_session.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/kernel/liveupdate/luo_session.c b/kernel/liveupdate/luo_session.c
index 783677295640..25ae704d7787 100644
--- a/kernel/liveupdate/luo_session.c
+++ b/kernel/liveupdate/luo_session.c
@@ -558,8 +558,13 @@ int luo_session_deserialize(void)
 		}
 
 		scoped_guard(mutex, &session->mutex) {
-			luo_file_deserialize(&session->file_set,
-					     &sh->ser[i].file_set_ser);
+			err = luo_file_deserialize(&session->file_set,
+						   &sh->ser[i].file_set_ser);
+		}
+		if (err) {
+			pr_warn("Failed to deserialize files for session [%s] %pe\n",
+				session->name, ERR_PTR(err));
+			return err;
 		}
 	}
 
--
2.53.0
* Re: [PATCH v3 1/2] liveupdate: propagate file deserialization failures
From: Pratyush Yadav @ 2026-04-02 12:17 UTC
To: Leo Timmins; +Cc: pasha.tatashin, rppt, linux-kernel, pratyush, akpm

On Thu, Mar 26 2026, Leo Timmins wrote:

> luo_session_deserialize() ignored the return value from
> luo_file_deserialize(). As a result, a session could be left partially
> restored even though the /dev/liveupdate open path treats deserialization
> failures as fatal.
>
> Propagate the error so a failed file deserialization aborts session
> deserialization instead of silently continuing.
>
> Fixes: 16cec0d26521 ("liveupdate: luo_session: add ioctls for file preservation")
> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> Signed-off-by: Leo Timmins <leotimmins1974@gmail.com>

Reviewed-by: Pratyush Yadav <pratyush@kernel.org>

[...]

--
Regards,
Pratyush Yadav
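
Editorial sketch: the behaviour change in patch 1/2 can be illustrated with a small user-space model. This is not kernel code and not the LUO API; fake_file_deserialize(), session_deserialize_ignoring() and session_deserialize_propagating() are hypothetical names, and the "negative input fails" rule is an assumption made only so there is something to fail on. It shows how dropping the return value lets the loop report success after a failed step, while propagating it stops at the first failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-session step: pretend negative inputs are corrupt. */
static int fake_file_deserialize(int ser)
{
	return ser < 0 ? -EINVAL : 0;
}

/* Old behaviour: the return value is dropped, so the loop always "succeeds". */
static int session_deserialize_ignoring(const int *ser, size_t n, size_t *restored)
{
	assert(ser && restored);
	*restored = 0;
	for (size_t i = 0; i < n; i++) {
		(void)fake_file_deserialize(ser[i]);
		(*restored)++;	/* counted even if the step failed */
	}
	return 0;
}

/* New behaviour: stop at the first failure and hand the error to the caller. */
static int session_deserialize_propagating(const int *ser, size_t n, size_t *restored)
{
	assert(ser && restored);
	*restored = 0;
	for (size_t i = 0; i < n; i++) {
		int err = fake_file_deserialize(ser[i]);

		if (err)
			return err;
		(*restored)++;
	}
	return 0;
}
```

With the input {0, -1, 0}, the first variant returns 0 and counts all three sessions as restored, whereas the second returns -EINVAL after one session, which is the "abort instead of silently continuing" behaviour the commit message describes.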
* [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish
From: Leo Timmins @ 2026-03-26 4:25 UTC
To: pasha.tatashin, rppt, linux-kernel; +Cc: pratyush, akpm, Leo Timmins

luo_flb_file_finish_one() decremented incoming.count before making sure
that the incoming FLB state had been materialized. If no earlier incoming
retrieval had populated that state, the first decrement ran from zero and
skipped the last-user finish path.

Initialize the incoming FLB state before the first decrement so finish
uses the serialized refcount instead of an uninitialized value.

Fixes: cab056f2aae7 ("liveupdate: luo_flb: introduce File-Lifecycle-Bound global state")
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Leo Timmins <leotimmins1974@gmail.com>
---
 kernel/liveupdate/luo_flb.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/kernel/liveupdate/luo_flb.c b/kernel/liveupdate/luo_flb.c
index f52e8114837e..855af655b09b 100644
--- a/kernel/liveupdate/luo_flb.c
+++ b/kernel/liveupdate/luo_flb.c
@@ -192,10 +192,27 @@ static int luo_flb_retrieve_one(struct liveupdate_flb *flb)
 static void luo_flb_file_finish_one(struct liveupdate_flb *flb)
 {
 	struct luo_flb_private *private = luo_flb_get_private(flb);
+	bool needs_retrieve = false;
 	u64 count;
 
-	scoped_guard(mutex, &private->incoming.lock)
+	scoped_guard(mutex, &private->incoming.lock) {
+		if (!private->incoming.count && !private->incoming.finished)
+			needs_retrieve = true;
+	}
+
+	if (needs_retrieve) {
+		int err = luo_flb_retrieve_one(flb);
+
+		if (err) {
+			pr_warn("Failed to retrieve FLB '%s' during finish: %pe\n",
+				flb->compatible, ERR_PTR(err));
+			return;
+		}
+	}
+
+	scoped_guard(mutex, &private->incoming.lock) {
 		count = --private->incoming.count;
+	}
 
 	if (!count) {
 		struct liveupdate_flb_op_args args = {0};
--
2.53.0
* Re: [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish
From: Pasha Tatashin @ 2026-03-26 14:50 UTC
To: Leo Timmins; +Cc: rppt, linux-kernel, pratyush, akpm

On Thu, Mar 26, 2026 at 12:26 AM Leo Timmins <leotimmins1974@gmail.com> wrote:
>
> luo_flb_file_finish_one() decremented incoming.count before making sure
> that the incoming FLB state had been materialized. If no earlier incoming
> retrieval had populated that state, the first decrement ran from zero and
> skipped the last-user finish path.
>
> Initialize the incoming FLB state before the first decrement so finish
> uses the serialized refcount instead of an uninitialized value.
>
> Fixes: cab056f2aae7 ("liveupdate: luo_flb: introduce File-Lifecycle-Bound global state")
> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> Signed-off-by: Leo Timmins <leotimmins1974@gmail.com>
> ---
>  kernel/liveupdate/luo_flb.c | 19 ++++++++++++++++++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/liveupdate/luo_flb.c b/kernel/liveupdate/luo_flb.c
> index f52e8114837e..855af655b09b 100644
> --- a/kernel/liveupdate/luo_flb.c
> +++ b/kernel/liveupdate/luo_flb.c
> @@ -192,10 +192,27 @@ static int luo_flb_retrieve_one(struct liveupdate_flb *flb)
>  static void luo_flb_file_finish_one(struct liveupdate_flb *flb)
>  {
>  	struct luo_flb_private *private = luo_flb_get_private(flb);
> +	bool needs_retrieve = false;
>  	u64 count;
>
> -	scoped_guard(mutex, &private->incoming.lock)
> +	scoped_guard(mutex, &private->incoming.lock) {
> +		if (!private->incoming.count && !private->incoming.finished)
> +			needs_retrieve = true;
> +	}
> +
> +	if (needs_retrieve) {
> +		int err = luo_flb_retrieve_one(flb);
> +
> +		if (err) {
> +			pr_warn("Failed to retrieve FLB '%s' during finish: %pe\n",
> +				flb->compatible, ERR_PTR(err));
> +			return;
> +		}
> +	}
> +
> +	scoped_guard(mutex, &private->incoming.lock) {
>  		count = --private->incoming.count;
> +	}

The braces are still added

>
>  	if (!count) {
>  		struct liveupdate_flb_op_args args = {0};
> --
> 2.53.0
>
* Re: [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish
From: Pratyush Yadav @ 2026-04-02 13:28 UTC
To: Leo Timmins; +Cc: pasha.tatashin, rppt, linux-kernel, pratyush, akpm

On Thu, Mar 26 2026, Leo Timmins wrote:

> luo_flb_file_finish_one() decremented incoming.count before making sure
> that the incoming FLB state had been materialized. If no earlier incoming
> retrieval had populated that state, the first decrement ran from zero and
> skipped the last-user finish path.
>
> Initialize the incoming FLB state before the first decrement so finish
> uses the serialized refcount instead of an uninitialized value.

This commit message makes it very hard to understand what the problem it
fixes. It took me 20 minutes of reading this patch and looking at the
FLB code to figure out what is going on. Here is what I'd suggest:

    The state of an incoming FLB object is initialized when it is first
    used. The initialization is done via luo_flb_retrieve_one(), which looks
    at all the incoming FLBs, matches the FLB to its serialized entry, and
    initializes the incoming data and count.

    luo_flb_file_finish_one() is called when finish is called for a file
    registered with this FLB. If no file handler has used the FLB by this
    point, the count stays un-initialized at 0. luo_flb_file_finish_one()
    then decrements this un-initialized count, leading to an underflow. This
    results in the FLB finish never being called since the count has
    underflowed to a very large value.

    Fix this by making sure the FLB is retrieved before using its count.

>
> Fixes: cab056f2aae7 ("liveupdate: luo_flb: introduce File-Lifecycle-Bound global state")
> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> Signed-off-by: Leo Timmins <leotimmins1974@gmail.com>
> ---
>  kernel/liveupdate/luo_flb.c | 19 ++++++++++++++++++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/liveupdate/luo_flb.c b/kernel/liveupdate/luo_flb.c
> index f52e8114837e..855af655b09b 100644
> --- a/kernel/liveupdate/luo_flb.c
> +++ b/kernel/liveupdate/luo_flb.c
> @@ -192,10 +192,27 @@ static int luo_flb_retrieve_one(struct liveupdate_flb *flb)
>  static void luo_flb_file_finish_one(struct liveupdate_flb *flb)
>  {
>  	struct luo_flb_private *private = luo_flb_get_private(flb);
> +	bool needs_retrieve = false;
>  	u64 count;
>
> -	scoped_guard(mutex, &private->incoming.lock)
> +	scoped_guard(mutex, &private->incoming.lock) {
> +		if (!private->incoming.count && !private->incoming.finished)
> +			needs_retrieve = true;
> +	}
> +
> +	if (needs_retrieve) {
> +		int err = luo_flb_retrieve_one(flb);
> +
> +		if (err) {
> +			pr_warn("Failed to retrieve FLB '%s' during finish: %pe\n",
> +				flb->compatible, ERR_PTR(err));
> +			return;
> +		}
> +	}
> +
> +	scoped_guard(mutex, &private->incoming.lock) {
>  		count = --private->incoming.count;
> +	}

This looks way too overcomplicated. We just need to move the retrieve
call before we check the count.

Pasha, side note: I think FLB suffers from the same problem I fixed for
LUO files with f85b1c6af5bc ("liveupdate: luo_file: remember retrieve()
status"). If retrieve fails, it doesn't remember the error code and will
retry retrieve() on next usage, causing all sorts of problems.

Pasha, another side note: I think incoming.private should be initialized
when the FLB is registered, not when it is first used. This will make
this simpler overall. This would need liveupdate_register_flb() to not
call kzalloc(), but that can be done I think.

Anyway, for this problem, how about the patch below (only
compile-tested) addressing my comments. The diff looks bigger but the
end result is a lot cleaner IMO. In effect it just moves the retrieve
call outside the if (!count). Everything else is just cleaning up the
locking situation.

--- 8< ---
From 775565e72d9b851839d37088549f0fc247cac2e1 Mon Sep 17 00:00:00 2001
From: "Pratyush Yadav (Google)" <pratyush@kernel.org>
Date: Thu, 2 Apr 2026 13:22:05 +0000
Subject: [PATCH] liveupdate: initialize incoming FLB state before finish

The state of an incoming FLB object is initialized when it is first
used. The initialization is done via luo_flb_retrieve_one(), which looks
at all the incoming FLBs, matches the FLB to its serialized entry, and
initializes the incoming data and count.

luo_flb_file_finish_one() is called when finish is called for a file
registered with this FLB. If no file handler has used the FLB by this
point, the count stays un-initialized at 0. luo_flb_file_finish_one()
then decrements this un-initialized count, leading to an underflow. This
results in the FLB finish never being called since the count has
underflowed to a very large value.

Fix this by making sure the FLB is retrieved before using its count.

Fixes: cab056f2aae7 ("liveupdate: luo_flb: introduce File-Lifecycle-Bound global state")
Suggested-by: Leo Timmins <leotimmins1974@gmail.com>
Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
---
 kernel/liveupdate/luo_flb.c | 32 +++++++++++++++-----------------
 1 file changed, 15 insertions(+), 17 deletions(-)

diff --git a/kernel/liveupdate/luo_flb.c b/kernel/liveupdate/luo_flb.c
index f52e8114837e..be141620751e 100644
--- a/kernel/liveupdate/luo_flb.c
+++ b/kernel/liveupdate/luo_flb.c
@@ -194,28 +194,26 @@ static void luo_flb_file_finish_one(struct liveupdate_flb *flb)
 	struct luo_flb_private *private = luo_flb_get_private(flb);
 	u64 count;
 
-	scoped_guard(mutex, &private->incoming.lock)
-		count = --private->incoming.count;
+	if (!private->incoming.retrieved) {
+		int err = luo_flb_retrieve_one(flb);
 
+		if (WARN_ON(err))
+			return;
+	}
+
+	guard(mutex)(&private->incoming.lock);
+
+	count = --private->incoming.count;
 	if (!count) {
 		struct liveupdate_flb_op_args args = {0};
 
-		if (!private->incoming.retrieved) {
-			int err = luo_flb_retrieve_one(flb);
+		args.flb = flb;
+		args.obj = private->incoming.obj;
+		flb->ops->finish(&args);
 
-			if (WARN_ON(err))
-				return;
-		}
-
-		scoped_guard(mutex, &private->incoming.lock) {
-			args.flb = flb;
-			args.obj = private->incoming.obj;
-			flb->ops->finish(&args);
-
-			private->incoming.data = 0;
-			private->incoming.obj = NULL;
-			private->incoming.finished = true;
-		}
+		private->incoming.data = 0;
+		private->incoming.obj = NULL;
+		private->incoming.finished = true;
 	}
 }
 
--
Regards,
Pratyush Yadav
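
Editorial sketch: the underflow described in the commit message above can be reproduced in a few lines of user-space C. This is a simplified model, not the kernel implementation; struct fake_incoming, fake_retrieve(), finish_one_buggy() and finish_one_fixed() are hypothetical names, and the serialized refcount is assumed to be supplied by the caller. No locking is modelled. The only behaviour taken from the thread is that the count is a 64-bit unsigned value that starts at 0 until the state is retrieved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the incoming FLB state discussed above. */
struct fake_incoming {
	uint64_t count;		/* refcount; stays 0 until retrieved */
	bool retrieved;		/* set once the serialized state is loaded */
	bool finished;		/* set when the last user ran finish */
};

/* Hypothetical retrieve step: load the serialized refcount. */
static void fake_retrieve(struct fake_incoming *in, uint64_t serialized_count)
{
	in->count = serialized_count;
	in->retrieved = true;
}

/* Buggy ordering: decrement first, so an unretrieved count wraps around. */
static uint64_t finish_one_buggy(struct fake_incoming *in)
{
	uint64_t count = --in->count;	/* 0 - 1 wraps to UINT64_MAX */

	if (!count)
		in->finished = true;	/* not reached after a wrap */
	return count;
}

/* Fixed ordering: retrieve before the count is ever touched. */
static uint64_t finish_one_fixed(struct fake_incoming *in, uint64_t serialized_count)
{
	uint64_t count;

	if (!in->retrieved)
		fake_retrieve(in, serialized_count);

	assert(in->count > 0);	/* sketch-only precondition */
	count = --in->count;
	if (!count)
		in->finished = true;
	return count;
}
```

Starting from a zeroed struct, finish_one_buggy() returns UINT64_MAX and never sets finished, which mirrors "the FLB finish never being called". With finish_one_fixed() and a serialized count of 2, the second call reaches zero and marks the state finished.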
* Re: [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish
From: Andrew Morton @ 2026-04-02 18:15 UTC
To: Pratyush Yadav; +Cc: Leo Timmins, pasha.tatashin, rppt, linux-kernel

On Thu, 02 Apr 2026 13:28:33 +0000 Pratyush Yadav <pratyush@kernel.org> wrote:

> The state of an incoming FLB object is initialized when it is first
> used. The initialization is done via luo_flb_retrieve_one(), which looks
> at all the incoming FLBs, matches the FLB to its serialized entry, and
> initializes the incoming data and count.
>
> luo_flb_file_finish_one() is called when finish is called for a file
> registered with this FLB. If no file handler has used the FLB by this
> point, the count stays un-initialized at 0. luo_flb_file_finish_one()
> then decrements this un-initialized count, leading to an underflow. This
> results in the FLB finish never being called since the count has
> underflowed to a very large value.
>
> Fix this by making sure the FLB is retrieved before using its count.

I like that the above tells people what the actual bug is!

I still have both Leo's patches in mm.git, in wait-and-see mode. What
to do here? Should I upstream [1/2] and drop [2/2]? Drop both and
revisit after -rc1?

Also, did we consider cc:stable for these two? Perhaps add the
cc:stable if we decide to attend to this after -rc1?
* Re: [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish
From: Pratyush Yadav @ 2026-04-03 9:04 UTC
To: Andrew Morton
Cc: Pratyush Yadav, Leo Timmins, pasha.tatashin, rppt, linux-kernel

On Thu, Apr 02 2026, Andrew Morton wrote:

> On Thu, 02 Apr 2026 13:28:33 +0000 Pratyush Yadav <pratyush@kernel.org> wrote:
>
>> The state of an incoming FLB object is initialized when it is first
>> used. The initialization is done via luo_flb_retrieve_one(), which looks
>> at all the incoming FLBs, matches the FLB to its serialized entry, and
>> initializes the incoming data and count.
>>
>> luo_flb_file_finish_one() is called when finish is called for a file
>> registered with this FLB. If no file handler has used the FLB by this
>> point, the count stays un-initialized at 0. luo_flb_file_finish_one()
>> then decrements this un-initialized count, leading to an underflow. This
>> results in the FLB finish never being called since the count has
>> underflowed to a very large value.
>>
>> Fix this by making sure the FLB is retrieved before using its count.
>
> I like that the above tells people what the actual bug is!
>
> I still have both Leo's patches in mm.git, in wait-and-see mode. What
> to do here? Should I upstream [1/2] and drop [2/2]? Drop both and
> revisit after -rc1?

These are independent fixes, so I would suggest keeping 1/2 regardless
of what we do with 2/2. For 2/2, I would suggest replacing it with the
version I sent in <2vxzmrzlfq4e.fsf@kernel.org>.

Mike/Pasha/Leo, if you could review my version then that would be great.

Also Leo, please help with testing. I don't have a setup ready for
testing this corner case. I can set something up mid next week, but it
would be great if you can test this before that.

>
> Also, did we consider cc:stable for these two? Perhaps add the
> cc:stable if we decide to attend to this after -rc1?

FLB landed in v7.0-rc1 so no need for cc:stable for patch 2/2. For patch
1/2, I think cc:stable does make sense, but it only landed in v6.19 so
not super important given it is not LTS.

--
Regards,
Pratyush Yadav
* Re: [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish
From: Andrew Morton @ 2026-04-03 17:26 UTC
To: Pratyush Yadav; +Cc: Leo Timmins, pasha.tatashin, rppt, linux-kernel

On Fri, 03 Apr 2026 09:04:25 +0000 Pratyush Yadav <pratyush@kernel.org> wrote:

> On Thu, Apr 02 2026, Andrew Morton wrote:
>
> > On Thu, 02 Apr 2026 13:28:33 +0000 Pratyush Yadav <pratyush@kernel.org> wrote:
> >
> >> The state of an incoming FLB object is initialized when it is first
> >> used. The initialization is done via luo_flb_retrieve_one(), which looks
> >> at all the incoming FLBs, matches the FLB to its serialized entry, and
> >> initializes the incoming data and count.
> >>
> >> luo_flb_file_finish_one() is called when finish is called for a file
> >> registered with this FLB. If no file handler has used the FLB by this
> >> point, the count stays un-initialized at 0. luo_flb_file_finish_one()
> >> then decrements this un-initialized count, leading to an underflow. This
> >> results in the FLB finish never being called since the count has
> >> underflowed to a very large value.
> >>
> >> Fix this by making sure the FLB is retrieved before using its count.
> >
> > I like that the above tells people what the actual bug is!
> >
> > I still have both Leo's patches in mm.git, in wait-and-see mode. What
> > to do here? Should I upstream [1/2] and drop [2/2]? Drop both and
> > revisit after -rc1?
>
> These are independent fixes, so I would suggest keeping 1/2 regardless
> of what we do with 2/2. For 2/2, I would suggest replacing it with the
> version I sent in <2vxzmrzlfq4e.fsf@kernel.org>.

OK, thanks. I removed [2/2] "liveupdate: initialize incoming FLB state
before finish" from mm.git.

I added cc:stable to [1/1] "liveupdate: propagate file deserialization
failures" and altered its changelog to present it as a singleton patch
(no "Patch series..." header, no "This patch (of 2)", etc. Below.

> Mike/Pasha/Leo, if you could review my version then that would be great.
>
> Also Leo, please help with testing. I don't have a setup ready for
> testing this corner case. I can set something up mid next week, but it
> would be great if you can test this before that.
>
> >
> > Also, did we consider cc:stable for these two? Perhaps add the
> > cc:stable if we decide to attend to this after -rc1?
>
> FLB landed in v7.0-rc1 so no need for cc:stable for patch 2/2. For patch
> 1/2, I think cc:stable does make sense, but it only landed in v6.19 so
> not super important given it is not LTS.

OK. When considering cc:stable I personally don't pay attention to LTS
release cadence - I believe others do so.

Because it could be that some downstream people are using non-LTS Linus
kernels. But then they should be looking at the Fixes: tags.


From: Leo Timmins <leotimmins1974@gmail.com>
Subject: liveupdate: propagate file deserialization failures
Date: Wed, 25 Mar 2026 12:46:07 +0800

luo_session_deserialize() ignored the return value from
luo_file_deserialize(). As a result, a session could be left partially
restored even though the /dev/liveupdate open path treats deserialization
failures as fatal.

Propagate the error so a failed file deserialization aborts session
deserialization instead of silently continuing.

Link: https://lkml.kernel.org/r/20260325044608.8407-1-leotimmins1974@gmail.com
Link: https://lkml.kernel.org/r/20260325044608.8407-2-leotimmins1974@gmail.com
Fixes: 16cec0d26521 ("liveupdate: luo_session: add ioctls for file preservation")
Signed-off-by: Leo Timmins <leotimmins1974@gmail.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/liveupdate/luo_session.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

--- a/kernel/liveupdate/luo_session.c~liveupdate-propagate-file-deserialization-failures
+++ a/kernel/liveupdate/luo_session.c
@@ -558,8 +558,13 @@ int luo_session_deserialize(void)
 		}
 
 		scoped_guard(mutex, &session->mutex) {
-			luo_file_deserialize(&session->file_set,
-					     &sh->ser[i].file_set_ser);
+			err = luo_file_deserialize(&session->file_set,
+						   &sh->ser[i].file_set_ser);
+		}
+		if (err) {
+			pr_warn("Failed to deserialize files for session [%s] %pe\n",
+				session->name, ERR_PTR(err));
+			return err;
 		}
 	}
 
_
[parent not found: <CA+uuG7Lnc94vTmZPEhbvQXgAzWJDL28Zf=QR=uAkmiWvoW+Uxw@mail.gmail.com>]
* Re: [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish [not found] ` <CA+uuG7Lnc94vTmZPEhbvQXgAzWJDL28Zf=QR=uAkmiWvoW+Uxw@mail.gmail.com> @ 2026-04-04 15:43 ` Leo Timmins 2026-04-05 7:35 ` Pratyush Yadav 0 siblings, 1 reply; 13+ messages in thread From: Leo Timmins @ 2026-04-04 15:43 UTC (permalink / raw) To: Andrew Morton; +Cc: Pratyush Yadav, pasha.tatashin, rppt, linux-kernel Hello All, > Apologies for the duplicate email, I had accidentally sent it out with html instead of plain text. I have done my testing of Pratyush's alternate patch. My tests indicate it successfully fixes the potential underflow if incoming.count is <= 0 Patch successfully compiles with 7.0.0-rc6 and has been tested on a VM (QEMU + VMM) with Arch. I executed a liveupdate with kexec which was successful. I then wrote a test tool to replicate the underflow conditions, and added a check to see if incoming.count is <= 0 after the if statement block Having confirmed it works: Reviewed-by: Leo Timmins <leotimmins1974@gmail.com> _________________________________ From 775565e72d9b851839d37088549f0fc247cac2e1 Mon Sep 17 00:00:00 2001 From: "Pratyush Yadav (Google)" <pratyush@kernel.org> Date: Thu, 2 Apr 2026 13:22:05 +0000 Subject: [PATCH] liveupdate: initialize incoming FLB state before finish The state of an incoming FLB object is initialized when it is first used. The initialization is done via luo_flb_retrieve_one(), which looks at all the incoming FLBs, matches the FLB to its serialized entry, and initializes the incoming data and count. luo_flb_file_finish_one() is called when finish is called for a file registered with this FLB. If no file handler has used the FLB by this point, the count stays un-initialized at 0. luo_flb_file_finish_one() then decrements this un-initialized count, leading to an underflow. This results in the FLB finish never being called since the count has underflowed to a very large value. 
Fix this by making sure the FLB is retrieved before using its count. Fixes: cab056f2aae7 ("liveupdate: luo_flb: introduce File-Lifecycle-Bound global state") Suggested-by: Leo Timmins <leotimmins1974@gmail.com> Reviewed-by: Leo Timmins <leotimmins1974@gmail.com> Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org> --- kernel/liveupdate/luo_flb.c | 32 +++++++++++++++----------------- 1 file changed, 15 insertions(+), 17 deletions(-) diff --git a/kernel/liveupdate/luo_flb.c b/kernel/liveupdate/luo_flb.c index f52e8114837e..be141620751e 100644 --- a/kernel/liveupdate/luo_flb.c +++ b/kernel/liveupdate/luo_flb.c @@ -194,28 +194,26 @@ static void luo_flb_file_finish_one(struct liveupdate_flb *flb) struct luo_flb_private *private = luo_flb_get_private(flb); u64 count; - scoped_guard(mutex, &private->incoming.lock) - count = --private->incoming.count; + if (!private->incoming.retrieved) { + int err = luo_flb_retrieve_one(flb); + if (WARN_ON(err)) + return; + } + + guard(mutex)(&private->incoming.lock); + + count = --private->incoming.count; if (!count) { struct liveupdate_flb_op_args args = {0}; - if (!private->incoming.retrieved) { - int err = luo_flb_retrieve_one(flb); + args.flb = flb; + args.obj = private->incoming.obj; + flb->ops->finish(&args); - if (WARN_ON(err)) - return; - } - - scoped_guard(mutex, &private->incoming.lock) { - args.flb = flb; - args.obj = private->incoming.obj; - flb->ops->finish(&args); - - private->incoming.data = 0; - private->incoming.obj = NULL; - private->incoming.finished = true; - } + private->incoming.data = 0; + private->incoming.obj = NULL; + private->incoming.finished = true; } } -- On Sat, Apr 4, 2026 at 11:39 PM Leo Timmins <leotimmins1974@gmail.com> wrote: > > Hello All, > > I have done my testing of Pratyush's alternate patch. My tests indicate it successfully fixes the potential underflow if incoming.count is <= 0 > > Patch successfully compiles with 7.0.0-rc6 and has been tested on a VM (QEMU + VMM) with Arch. 
> > I executed a liveupdate with kexec which was successful. I then wrote a test tool to replicate the underflow conditions, and added a check to see if incoming.count is <= 0 after the if statement block > > Having confirmed it works: > > Reviewed-by: Leo Timmins <leotimmins1974@gmail.com> > > --------------------------------------------------------------------------------------- > > From 775565e72d9b851839d37088549f0fc247cac2e1 Mon Sep 17 00:00:00 2001 > From: "Pratyush Yadav (Google)" <pratyush@kernel.org> > Date: Thu, 2 Apr 2026 13:22:05 +0000 > Subject: [PATCH] liveupdate: initialize incoming FLB state before finish > > The state of an incoming FLB object is initialized when it is first > used. The initialization is done via luo_flb_retrieve_one(), which looks > at all the incoming FLBs, matches the FLB to its serialized entry, and > initializes the incoming data and count. > > luo_flb_file_finish_one() is called when finish is called for a file > registered with this FLB. If no file handler has used the FLB by this > point, the count stays un-initialized at 0. luo_flb_file_finish_one() > then decrements this un-initialized count, leading to an underflow. This > results in the FLB finish never being called since the count has > underflowed to a very large value. > > Fix this by making sure the FLB is retrieved before using its count. 
>
> Fixes: cab056f2aae7 ("liveupdate: luo_flb: introduce File-Lifecycle-Bound global state")
> Suggested-by: Leo Timmins <leotimmins1974@gmail.com>
> Reviewed-by: Leo Timmins <leotimmins1974@gmail.com>
> Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
> ---
>  kernel/liveupdate/luo_flb.c | 32 +++++++++++++++-----------------
>  1 file changed, 15 insertions(+), 17 deletions(-)
>
> diff --git a/kernel/liveupdate/luo_flb.c b/kernel/liveupdate/luo_flb.c
> index f52e8114837e..be141620751e 100644
> --- a/kernel/liveupdate/luo_flb.c
> +++ b/kernel/liveupdate/luo_flb.c
> @@ -194,28 +194,26 @@ static void luo_flb_file_finish_one(struct liveupdate_flb *flb)
>  	struct luo_flb_private *private = luo_flb_get_private(flb);
>  	u64 count;
>
> -	scoped_guard(mutex, &private->incoming.lock)
> -		count = --private->incoming.count;
> +	if (!private->incoming.retrieved) {
> +		int err = luo_flb_retrieve_one(flb);
>
> +		if (WARN_ON(err))
> +			return;
> +	}
> +
> +	guard(mutex)(&private->incoming.lock);
> +
> +	count = --private->incoming.count;
>  	if (!count) {
>  		struct liveupdate_flb_op_args args = {0};
>
> -		if (!private->incoming.retrieved) {
> -			int err = luo_flb_retrieve_one(flb);
> +		args.flb = flb;
> +		args.obj = private->incoming.obj;
> +		flb->ops->finish(&args);
>
> -			if (WARN_ON(err))
> -				return;
> -		}
> -
> -		scoped_guard(mutex, &private->incoming.lock) {
> -			args.flb = flb;
> -			args.obj = private->incoming.obj;
> -			flb->ops->finish(&args);
> -
> -			private->incoming.data = 0;
> -			private->incoming.obj = NULL;
> -			private->incoming.finished = true;
> -		}
> +		private->incoming.data = 0;
> +		private->incoming.obj = NULL;
> +		private->incoming.finished = true;
>  	}
>  }
>
> --
>
> On Sat, Apr 4, 2026 at 1:26 AM Andrew Morton <akpm@linux-foundation.org> wrote:
>>
>> On Fri, 03 Apr 2026 09:04:25 +0000 Pratyush Yadav <pratyush@kernel.org> wrote:
>>
>> > On Thu, Apr 02 2026, Andrew Morton wrote:
>> >
>> > > On Thu, 02 Apr 2026 13:28:33 +0000 Pratyush Yadav <pratyush@kernel.org> wrote:
>> > >
>> > >> The state of an incoming FLB object is initialized when it is first
>> > >> used. The initialization is done via luo_flb_retrieve_one(), which looks
>> > >> at all the incoming FLBs, matches the FLB to its serialized entry, and
>> > >> initializes the incoming data and count.
>> > >>
>> > >> luo_flb_file_finish_one() is called when finish is called for a file
>> > >> registered with this FLB. If no file handler has used the FLB by this
>> > >> point, the count stays un-initialized at 0. luo_flb_file_finish_one()
>> > >> then decrements this un-initialized count, leading to an underflow. This
>> > >> results in the FLB finish never being called since the count has
>> > >> underflowed to a very large value.
>> > >>
>> > >> Fix this by making sure the FLB is retrieved before using its count.
>> > >
>> > > I like that the above tells people what the actual bug is!
>> > >
>> > > I still have both Leo's patches in mm.git, in wait-and-see mode. What
>> > > to do here? Should I upstream [1/2] and drop [2/2]? Drop both and
>> > > revisit after -rc1?
>> >
>> > These are independent fixes, so I would suggest keeping 1/2 regardless
>> > of what we do with 2/2. For 2/2, I would suggest replacing it with the
>> > version I sent in <2vxzmrzlfq4e.fsf@kernel.org>.
>>
>> OK, thanks. I removed [2/2] "liveupdate: initialize incoming FLB state
>> before finish" from mm.git.
>>
>> I added cc:stable to [1/1] "liveupdate: propagate file deserialization
>> failures" and altered its changelog to present it as a singleton patch
>> (no "Patch series..." header, no "This patch (of 2)", etc. Below.
>>
>> > Mike/Pasha/Leo, if you could review my version then that would be great.
>> >
>> > Also Leo, please help with testing. I don't have a setup ready for
>> > testing this corner case. I can set something up mid next week, but it
>> > would be great if you can test this before that.
>> >
>> > >
>> > > Also, did we consider cc:stable for these two? Perhaps add the
>> > > cc:stable if we decide to attend to this after -rc1?
>> >
>> > FLB landed in v7.0-rc1 so no need for cc:stable for patch 2/2. For patch
>> > 1/2, I think cc:stable does make sense, but it only landed in v6.19 so
>> > not super important given it is not LTS.
>>
>> OK. When considering cc:stable I personally don't pay attention to LTS
>> release cadence - I believe others do so.
>>
>> Because it could be that some downstream people are using non-LTS Linus
>> kernels. But then they should be looking at the Fixes: tags.
>>
>>
>>
>> From: Leo Timmins <leotimmins1974@gmail.com>
>> Subject: liveupdate: propagate file deserialization failures
>> Date: Wed, 25 Mar 2026 12:46:07 +0800
>>
>> luo_session_deserialize() ignored the return value from
>> luo_file_deserialize(). As a result, a session could be left partially
>> restored even though the /dev/liveupdate open path treats deserialization
>> failures as fatal.
>>
>> Propagate the error so a failed file deserialization aborts session
>> deserialization instead of silently continuing.
>>
>> Link: https://lkml.kernel.org/r/20260325044608.8407-1-leotimmins1974@gmail.com
>> Link: https://lkml.kernel.org/r/20260325044608.8407-2-leotimmins1974@gmail.com
>> Fixes: 16cec0d26521 ("liveupdate: luo_session: add ioctls for file preservation")
>> Signed-off-by: Leo Timmins <leotimmins1974@gmail.com>
>> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
>> Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
>> Cc: Mike Rapoport <rppt@kernel.org>
>> Cc: <stable@vger.kernel.org>
>> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>> ---
>>
>>  kernel/liveupdate/luo_session.c | 9 +++++++--
>>  1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> --- a/kernel/liveupdate/luo_session.c~liveupdate-propagate-file-deserialization-failures
>> +++ a/kernel/liveupdate/luo_session.c
>> @@ -558,8 +558,13 @@ int luo_session_deserialize(void)
>>  		}
>>
>>  		scoped_guard(mutex, &session->mutex) {
>> -			luo_file_deserialize(&session->file_set,
>> -					     &sh->ser[i].file_set_ser);
>> +			err = luo_file_deserialize(&session->file_set,
>> +						   &sh->ser[i].file_set_ser);
>> +		}
>> +		if (err) {
>> +			pr_warn("Failed to deserialize files for session [%s] %pe\n",
>> +				session->name, ERR_PTR(err));
>> +			return err;
>>  		}
>>  	}
>>
>> _
>>
^ permalink raw reply related [flat|nested] 13+ messages in thread
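Editorial sketch, not part of the thread: the singleton patch quoted above makes a loop stop and return the first error instead of discarding it. The userspace C below models only that control flow. Every name in it (demo_session, demo_file_deserialize, demo_session_deserialize) is hypothetical and stands in for the kernel code; locking and the pr_warn() call are left out on purpose.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one session and its serialized file set. */
struct demo_session {
	const char *name;
	int file_err;	/* 0 on success, negative errno to simulate a failure */
	int restored;	/* set to 1 once the files have been restored */
};

/* Hypothetical analogue of luo_file_deserialize(): it may fail. */
static int demo_file_deserialize(struct demo_session *s)
{
	if (s->file_err)
		return s->file_err;
	s->restored = 1;
	return 0;
}

/*
 * Hypothetical analogue of luo_session_deserialize(): stop at the first
 * failing session and hand the error back to the caller, rather than
 * carrying on with a partially restored set of sessions.
 */
static int demo_session_deserialize(struct demo_session *sessions, size_t n)
{
	size_t i;

	assert(sessions || n == 0);

	for (i = 0; i < n; i++) {
		int err = demo_file_deserialize(&sessions[i]);

		if (err)
			return err;
	}
	return 0;
}
```

With three sessions where the second one is set up to fail with -EIO, demo_session_deserialize() returns -EIO and the third session is never touched, which is the behaviour the changelog describes as aborting session deserialization.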
* Re: [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish
2026-04-04 15:43 ` Leo Timmins
@ 2026-04-05 7:35 ` Pratyush Yadav
0 siblings, 0 replies; 13+ messages in thread
From: Pratyush Yadav @ 2026-04-05 7:35 UTC (permalink / raw)
To: Leo Timmins
Cc: Andrew Morton, Pratyush Yadav, pasha.tatashin, rppt, linux-kernel

On Sat, Apr 04 2026, Leo Timmins wrote:

> Hello All,
>
>> Apologies for the duplicate email, I had accidentally sent it out with html instead of plain text.
>
> I have done my testing of Pratyush's alternate patch. My tests
> indicate it successfully fixes the potential underflow if
> incoming.count is <= 0
>
> Patch successfully compiles with 7.0.0-rc6 and has been tested on a VM
> (QEMU + VMM) with Arch.
>
> I executed a liveupdate with kexec which was successful. I then wrote
> a test tool to replicate the underflow conditions, and added a check
> to see if incoming.count is <= 0 after the if statement block
>
> Having confirmed it works:
>
> Reviewed-by: Leo Timmins <leotimmins1974@gmail.com>

Thanks!

[...]

-- 
Regards,
Pratyush Yadav
^ permalink raw reply [flat|nested] 13+ messages in thread
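Editorial sketch, not part of the thread: the underflow discussed above can be reproduced in plain userspace C, since unsigned wraparound is well defined. The model below is an assumption-laden simplification, not the kernel code and not Leo's test tool: demo_incoming, demo_retrieve and both demo_finish_one_* functions are hypothetical names, locking is omitted, and the "serialized" count is just a field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of the incoming FLB state. */
struct demo_incoming {
	uint64_t count;			/* stays 0 until the FLB is retrieved */
	uint64_t serialized_count;	/* value carried over from the old kernel */
	bool retrieved;
	bool finished;
};

/* Models luo_flb_retrieve_one(): load the count from the serialized data. */
static void demo_retrieve(struct demo_incoming *in)
{
	in->count = in->serialized_count;
	in->retrieved = true;
}

/* Old ordering: decrement first, so an unretrieved count of 0 wraps around. */
static void demo_finish_one_buggy(struct demo_incoming *in)
{
	uint64_t count = --in->count;

	if (!count) {
		if (!in->retrieved)
			demo_retrieve(in);
		in->finished = true;
	}
}

/* Reordered as in the replacement patch: retrieve before using the count. */
static void demo_finish_one_fixed(struct demo_incoming *in)
{
	uint64_t count;

	if (!in->retrieved)
		demo_retrieve(in);

	/* Model assumption: at least one file references the FLB. */
	assert(in->count > 0);

	count = --in->count;
	if (!count)
		in->finished = true;
}
```

Starting from an unretrieved state with a serialized count of 1, the buggy ordering leaves count at UINT64_MAX and never marks the state finished, while the fixed ordering brings count to 0 and sets finished. That matches the description of finish never being called once the count has underflowed.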
* Re: [PATCH v3 0/2] liveupdate: fix incoming error handling and teardown paths
2026-03-26 4:25 [PATCH v3 0/2] liveupdate: fix incoming error handling and teardown paths Leo Timmins
2026-03-26 4:25 ` [PATCH v3 1/2] liveupdate: propagate file deserialization failures Leo Timmins
2026-03-26 4:25 ` [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish Leo Timmins
@ 2026-03-28 0:32 ` Andrew Morton
2026-03-28 1:47 ` Pasha Tatashin
2 siblings, 1 reply; 13+ messages in thread
From: Andrew Morton @ 2026-03-28 0:32 UTC (permalink / raw)
To: Leo Timmins; +Cc: pasha.tatashin, rppt, linux-kernel, pratyush

On Thu, 26 Mar 2026 12:25:33 +0800 Leo Timmins <leotimmins1974@gmail.com> wrote:

> This series fixes two issues in LUO's incoming-side error handling and
> teardown paths.
>
> The first patch makes session deserialization fail when file
> deserialization fails, instead of silently continuing with a partially
> restored session.
>
> The second patch (formerly patch 3 on the v1) initializes incoming FLB
> state before the finish path decrements its refcount, so the last-user
> cleanup path does not run from an uninitialized count. (and now
> utilises pr_warn instead of WARN_ON)

I'm not clear how we want to schedule these two patches. Into next
merge window? Into 7.0-rcX? Into 7.0-rcX and cc:stable?

Thanks.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v3 0/2] liveupdate: fix incoming error handling and teardown paths
2026-03-28 0:32 ` [PATCH v3 0/2] liveupdate: fix incoming error handling and teardown paths Andrew Morton
@ 2026-03-28 1:47 ` Pasha Tatashin
0 siblings, 0 replies; 13+ messages in thread
From: Pasha Tatashin @ 2026-03-28 1:47 UTC (permalink / raw)
To: Andrew Morton; +Cc: Leo Timmins, rppt, linux-kernel, pratyush

On Fri, Mar 27, 2026 at 8:32 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Thu, 26 Mar 2026 12:25:33 +0800 Leo Timmins <leotimmins1974@gmail.com> wrote:
>
> > This series fixes two issues in LUO's incoming-side error handling and
> > teardown paths.
> >
> > The first patch makes session deserialization fail when file
> > deserialization fails, instead of silently continuing with a partially
> > restored session.
> >
> > The second patch (formerly patch 3 on the v1) initializes incoming FLB
> > state before the finish path decrements its refcount, so the last-user
> > cleanup path does not run from an uninitialized count. (and now
> > utilises pr_warn instead of WARN_ON)
>
> I'm not clear how we want to schedule these two patches. Into next
> merge window? Into 7.0-rcX? Into 7.0-rcX and cc:stable?

I think there is no need for cc:stable: the live update feature is still
very new and actively being developed. However, 7.0-rcX would be
appropriate.

Pasha

>
> Thanks.
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, newest message ~2026-04-05 7:35 UTC
Thread overview: 13+ messages
2026-03-26 4:25 [PATCH v3 0/2] liveupdate: fix incoming error handling and teardown paths Leo Timmins
2026-03-26 4:25 ` [PATCH v3 1/2] liveupdate: propagate file deserialization failures Leo Timmins
2026-04-02 12:17 ` Pratyush Yadav
2026-03-26 4:25 ` [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish Leo Timmins
2026-03-26 14:50 ` Pasha Tatashin
2026-04-02 13:28 ` Pratyush Yadav
2026-04-02 18:15 ` Andrew Morton
2026-04-03 9:04 ` Pratyush Yadav
2026-04-03 17:26 ` Andrew Morton
[not found] ` <CA+uuG7Lnc94vTmZPEhbvQXgAzWJDL28Zf=QR=uAkmiWvoW+Uxw@mail.gmail.com>
2026-04-04 15:43 ` Leo Timmins
2026-04-05 7:35 ` Pratyush Yadav
2026-03-28 0:32 ` [PATCH v3 0/2] liveupdate: fix incoming error handling and teardown paths Andrew Morton
2026-03-28 1:47 ` Pasha Tatashin