* [PATCH v3 0/2] liveupdate: fix incoming error handling and teardown paths
@ 2026-03-26 4:25 Leo Timmins
2026-03-26 4:25 ` [PATCH v3 1/2] liveupdate: propagate file deserialization failures Leo Timmins
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Leo Timmins @ 2026-03-26 4:25 UTC (permalink / raw)
To: pasha.tatashin, rppt, linux-kernel; +Cc: pratyush, akpm, Leo Timmins
Hi,
This series fixes two issues in LUO's incoming-side error handling and
teardown paths.
The first patch makes session deserialization fail when file
deserialization fails, instead of silently continuing with a partially
restored session.
The second patch (formerly patch 3 in v1) initializes incoming FLB
state before the finish path decrements its refcount, so the last-user
cleanup path does not run from an uninitialized count. It now uses
pr_warn() instead of WARN_ON().
Changes in v2:
- drop the previous patch 2 after review
- patch 2/2: replace WARN_ON(err) with pr_warn() in
luo_flb_file_finish_one()
Changes in v3:
- patch 2/2: minor formatting change; removed braces from scoped_guard
Leo Timmins (2):
liveupdate: propagate file deserialization failures
liveupdate: initialize incoming FLB state before finish
kernel/liveupdate/luo_flb.c | 19 ++++++++++++++++++-
kernel/liveupdate/luo_session.c | 9 +++++++--
2 files changed, 25 insertions(+), 3 deletions(-)
base-commit: e3c33bc767b5512dbfec643a02abf58ce608f3b2
--
2.53.0
* [PATCH v3 1/2] liveupdate: propagate file deserialization failures
2026-03-26 4:25 [PATCH v3 0/2] liveupdate: fix incoming error handling and teardown paths Leo Timmins
@ 2026-03-26 4:25 ` Leo Timmins
2026-04-02 12:17 ` Pratyush Yadav
2026-03-26 4:25 ` [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish Leo Timmins
2026-03-28 0:32 ` [PATCH v3 0/2] liveupdate: fix incoming error handling and teardown paths Andrew Morton
2 siblings, 1 reply; 13+ messages in thread
From: Leo Timmins @ 2026-03-26 4:25 UTC (permalink / raw)
To: pasha.tatashin, rppt, linux-kernel; +Cc: pratyush, akpm, Leo Timmins
luo_session_deserialize() ignored the return value from
luo_file_deserialize(). As a result, a session could be left partially
restored even though the /dev/liveupdate open path treats deserialization
failures as fatal.
Propagate the error so a failed file deserialization aborts session
deserialization instead of silently continuing.
Fixes: 16cec0d26521 ("liveupdate: luo_session: add ioctls for file preservation")
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Leo Timmins <leotimmins1974@gmail.com>
---
kernel/liveupdate/luo_session.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/kernel/liveupdate/luo_session.c b/kernel/liveupdate/luo_session.c
index 783677295640..25ae704d7787 100644
--- a/kernel/liveupdate/luo_session.c
+++ b/kernel/liveupdate/luo_session.c
@@ -558,8 +558,13 @@ int luo_session_deserialize(void)
}
scoped_guard(mutex, &session->mutex) {
- luo_file_deserialize(&session->file_set,
- &sh->ser[i].file_set_ser);
+ err = luo_file_deserialize(&session->file_set,
+ &sh->ser[i].file_set_ser);
+ }
+ if (err) {
+ pr_warn("Failed to deserialize files for session [%s] %pe\n",
+ session->name, ERR_PTR(err));
+ return err;
}
}
--
2.53.0
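For readers following the thread, the control-flow change in patch 1/2 can be modeled in userspace roughly as below. This is only a sketch under stated assumptions, not the kernel code: fake_file_deserialize() is a hypothetical stand-in for luo_file_deserialize(), and the two loop functions are illustrative names for the behaviour before and after the patch.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for luo_file_deserialize(): fails for one session. */
static int fake_file_deserialize(size_t idx, size_t failing_idx)
{
	return idx == failing_idx ? -EINVAL : 0;
}

/* Behaviour before the patch: the return value is dropped, so the
 * caller always sees success even when one session failed. */
static int deserialize_ignoring_errors(size_t nr, size_t failing_idx)
{
	for (size_t i = 0; i < nr; i++)
		(void)fake_file_deserialize(i, failing_idx);
	return 0;
}

/* Behaviour after the patch: the first failure aborts the loop and
 * the error code is returned to the caller. */
static int deserialize_propagating_errors(size_t nr, size_t failing_idx)
{
	for (size_t i = 0; i < nr; i++) {
		int err = fake_file_deserialize(i, failing_idx);

		if (err)
			return err;
	}
	return 0;
}
```

With three sessions where the second one fails, the first variant still returns 0 while the second returns -EINVAL, which is the difference the commit message describes.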
* [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish
2026-03-26 4:25 [PATCH v3 0/2] liveupdate: fix incoming error handling and teardown paths Leo Timmins
2026-03-26 4:25 ` [PATCH v3 1/2] liveupdate: propagate file deserialization failures Leo Timmins
@ 2026-03-26 4:25 ` Leo Timmins
2026-03-26 14:50 ` Pasha Tatashin
2026-04-02 13:28 ` Pratyush Yadav
2026-03-28 0:32 ` [PATCH v3 0/2] liveupdate: fix incoming error handling and teardown paths Andrew Morton
2 siblings, 2 replies; 13+ messages in thread
From: Leo Timmins @ 2026-03-26 4:25 UTC (permalink / raw)
To: pasha.tatashin, rppt, linux-kernel; +Cc: pratyush, akpm, Leo Timmins
luo_flb_file_finish_one() decremented incoming.count before making sure
that the incoming FLB state had been materialized. If no earlier incoming
retrieval had populated that state, the first decrement ran from zero and
skipped the last-user finish path.
Initialize the incoming FLB state before the first decrement so finish
uses the serialized refcount instead of an uninitialized value.
Fixes: cab056f2aae7 ("liveupdate: luo_flb: introduce File-Lifecycle-Bound global state")
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Leo Timmins <leotimmins1974@gmail.com>
---
kernel/liveupdate/luo_flb.c | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/kernel/liveupdate/luo_flb.c b/kernel/liveupdate/luo_flb.c
index f52e8114837e..855af655b09b 100644
--- a/kernel/liveupdate/luo_flb.c
+++ b/kernel/liveupdate/luo_flb.c
@@ -192,10 +192,27 @@ static int luo_flb_retrieve_one(struct liveupdate_flb *flb)
 static void luo_flb_file_finish_one(struct liveupdate_flb *flb)
 {
 	struct luo_flb_private *private = luo_flb_get_private(flb);
+	bool needs_retrieve = false;
 	u64 count;

-	scoped_guard(mutex, &private->incoming.lock)
+	scoped_guard(mutex, &private->incoming.lock) {
+		if (!private->incoming.count && !private->incoming.finished)
+			needs_retrieve = true;
+	}
+
+	if (needs_retrieve) {
+		int err = luo_flb_retrieve_one(flb);
+
+		if (err) {
+			pr_warn("Failed to retrieve FLB '%s' during finish: %pe\n",
+				flb->compatible, ERR_PTR(err));
+			return;
+		}
+	}
+
+	scoped_guard(mutex, &private->incoming.lock) {
 		count = --private->incoming.count;
+	}

 	if (!count) {
 		struct liveupdate_flb_op_args args = {0};
--
2.53.0
* Re: [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish
2026-03-26 4:25 ` [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish Leo Timmins
@ 2026-03-26 14:50 ` Pasha Tatashin
2026-04-02 13:28 ` Pratyush Yadav
1 sibling, 0 replies; 13+ messages in thread
From: Pasha Tatashin @ 2026-03-26 14:50 UTC (permalink / raw)
To: Leo Timmins; +Cc: rppt, linux-kernel, pratyush, akpm
On Thu, Mar 26, 2026 at 12:26 AM Leo Timmins <leotimmins1974@gmail.com> wrote:
>
> luo_flb_file_finish_one() decremented incoming.count before making sure
> that the incoming FLB state had been materialized. If no earlier incoming
> retrieval had populated that state, the first decrement ran from zero and
> skipped the last-user finish path.
>
> Initialize the incoming FLB state before the first decrement so finish
> uses the serialized refcount instead of an uninitialized value.
>
> Fixes: cab056f2aae7 ("liveupdate: luo_flb: introduce File-Lifecycle-Bound global state")
> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> Signed-off-by: Leo Timmins <leotimmins1974@gmail.com>
> ---
> kernel/liveupdate/luo_flb.c | 19 ++++++++++++++++++-
> 1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/liveupdate/luo_flb.c b/kernel/liveupdate/luo_flb.c
> index f52e8114837e..855af655b09b 100644
> --- a/kernel/liveupdate/luo_flb.c
> +++ b/kernel/liveupdate/luo_flb.c
> @@ -192,10 +192,27 @@ static int luo_flb_retrieve_one(struct liveupdate_flb *flb)
> static void luo_flb_file_finish_one(struct liveupdate_flb *flb)
> {
> struct luo_flb_private *private = luo_flb_get_private(flb);
> + bool needs_retrieve = false;
> u64 count;
>
> - scoped_guard(mutex, &private->incoming.lock)
> + scoped_guard(mutex, &private->incoming.lock) {
> + if (!private->incoming.count && !private->incoming.finished)
> + needs_retrieve = true;
> + }
> +
> + if (needs_retrieve) {
> + int err = luo_flb_retrieve_one(flb);
> +
> + if (err) {
> + pr_warn("Failed to retrieve FLB '%s' during finish: %pe\n",
> + flb->compatible, ERR_PTR(err));
> + return;
> + }
> + }
> +
> + scoped_guard(mutex, &private->incoming.lock) {
> count = --private->incoming.count;
> + }
The braces are still added
>
> if (!count) {
> struct liveupdate_flb_op_args args = {0};
> --
> 2.53.0
>
* Re: [PATCH v3 0/2] liveupdate: fix incoming error handling and teardown paths
2026-03-26 4:25 [PATCH v3 0/2] liveupdate: fix incoming error handling and teardown paths Leo Timmins
2026-03-26 4:25 ` [PATCH v3 1/2] liveupdate: propagate file deserialization failures Leo Timmins
2026-03-26 4:25 ` [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish Leo Timmins
@ 2026-03-28 0:32 ` Andrew Morton
2026-03-28 1:47 ` Pasha Tatashin
2 siblings, 1 reply; 13+ messages in thread
From: Andrew Morton @ 2026-03-28 0:32 UTC (permalink / raw)
To: Leo Timmins; +Cc: pasha.tatashin, rppt, linux-kernel, pratyush
On Thu, 26 Mar 2026 12:25:33 +0800 Leo Timmins <leotimmins1974@gmail.com> wrote:
> This series fixes two issues in LUO's incoming-side error handling and
> teardown paths.
>
> The first patch makes session deserialization fail when file
> deserialization fails, instead of silently continuing with a partially
> restored session.
>
> The second patch (formerly patch 3 on the v1) initializes incoming FLB
> state before the finish path decrements its refcount, so the last-user
> cleanup path does not run from an uninitialized count. (and now
> utilises pr_warn instead of WARN_ON)
I'm not clear how we want to schedule these two patches. Into the next
merge window? Into 7.0-rcX? Into 7.0-rcX and cc:stable?
Thanks.
* Re: [PATCH v3 0/2] liveupdate: fix incoming error handling and teardown paths
2026-03-28 0:32 ` [PATCH v3 0/2] liveupdate: fix incoming error handling and teardown paths Andrew Morton
@ 2026-03-28 1:47 ` Pasha Tatashin
0 siblings, 0 replies; 13+ messages in thread
From: Pasha Tatashin @ 2026-03-28 1:47 UTC (permalink / raw)
To: Andrew Morton; +Cc: Leo Timmins, rppt, linux-kernel, pratyush
On Fri, Mar 27, 2026 at 8:32 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Thu, 26 Mar 2026 12:25:33 +0800 Leo Timmins <leotimmins1974@gmail.com> wrote:
>
> > This series fixes two issues in LUO's incoming-side error handling and
> > teardown paths.
> >
> > The first patch makes session deserialization fail when file
> > deserialization fails, instead of silently continuing with a partially
> > restored session.
> >
> > The second patch (formerly patch 3 on the v1) initializes incoming FLB
> > state before the finish path decrements its refcount, so the last-user
> > cleanup path does not run from an uninitialized count. (and now
> > utilises pr_warn instead of WARN_ON)
>
> I'm not clear how we want to schedule these two patches. Into next
> merge window? Into 7.0-rcX? Into 7.0-rcX and cc:stable?
I think there is no need for cc:stable; the live update feature is still
very new and actively being developed. However, 7.0-rcX would be
appropriate.
Pasha
>
> Thanks.
* Re: [PATCH v3 1/2] liveupdate: propagate file deserialization failures
2026-03-26 4:25 ` [PATCH v3 1/2] liveupdate: propagate file deserialization failures Leo Timmins
@ 2026-04-02 12:17 ` Pratyush Yadav
0 siblings, 0 replies; 13+ messages in thread
From: Pratyush Yadav @ 2026-04-02 12:17 UTC (permalink / raw)
To: Leo Timmins; +Cc: pasha.tatashin, rppt, linux-kernel, pratyush, akpm
On Thu, Mar 26 2026, Leo Timmins wrote:
> luo_session_deserialize() ignored the return value from
> luo_file_deserialize(). As a result, a session could be left partially
> restored even though the /dev/liveupdate open path treats deserialization
> failures as fatal.
>
> Propagate the error so a failed file deserialization aborts session
> deserialization instead of silently continuing.
>
> Fixes: 16cec0d26521 ("liveupdate: luo_session: add ioctls for file preservation")
> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> Signed-off-by: Leo Timmins <leotimmins1974@gmail.com>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
[...]
--
Regards,
Pratyush Yadav
* Re: [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish
2026-03-26 4:25 ` [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish Leo Timmins
2026-03-26 14:50 ` Pasha Tatashin
@ 2026-04-02 13:28 ` Pratyush Yadav
2026-04-02 18:15 ` Andrew Morton
1 sibling, 1 reply; 13+ messages in thread
From: Pratyush Yadav @ 2026-04-02 13:28 UTC (permalink / raw)
To: Leo Timmins; +Cc: pasha.tatashin, rppt, linux-kernel, pratyush, akpm
On Thu, Mar 26 2026, Leo Timmins wrote:
> luo_flb_file_finish_one() decremented incoming.count before making sure
> that the incoming FLB state had been materialized. If no earlier incoming
> retrieval had populated that state, the first decrement ran from zero and
> skipped the last-user finish path.
>
> Initialize the incoming FLB state before the first decrement so finish
> uses the serialized refcount instead of an uninitialized value.
This commit message makes it very hard to understand what problem it
fixes. It took me 20 minutes of reading this patch and looking at the
FLB code to figure out what is going on. Here is what I'd suggest:
The state of an incoming FLB object is initialized when it is first
used. The initialization is done via luo_flb_retrieve_one(), which looks
at all the incoming FLBs, matches the FLB to its serialized entry, and
initializes the incoming data and count.
luo_flb_file_finish_one() is called when finish is called for a file
registered with this FLB. If no file handler has used the FLB by this
point, the count stays un-initialized at 0. luo_flb_file_finish_one()
then decrements this un-initialized count, leading to an underflow. This
results in the FLB finish never being called since the count has
underflowed to a very large value.
Fix this by making sure the FLB is retrieved before using its count.
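The underflow described above is plain unsigned arithmetic and can be reproduced outside the kernel. The snippet below is a minimal userspace sketch, assuming only that the count is a 64-bit unsigned value as the u64 declaration in the patch indicates; last_user_after_decrement() is a hypothetical helper that mirrors the "count = --private->incoming.count; if (!count)" pattern.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors "count = --private->incoming.count; if (!count) ...":
 * pre-decrement the counter and report whether it reached zero. */
static bool last_user_after_decrement(uint64_t *count)
{
	return --(*count) == 0;
}
```

Starting from 0, the decrement wraps to UINT64_MAX, so the helper reports "not the last user" and the last-user branch is skipped. Starting from 1, it reaches 0 and the branch is taken.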
>
> Fixes: cab056f2aae7 ("liveupdate: luo_flb: introduce File-Lifecycle-Bound global state")
> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> Signed-off-by: Leo Timmins <leotimmins1974@gmail.com>
> ---
> kernel/liveupdate/luo_flb.c | 19 ++++++++++++++++++-
> 1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/liveupdate/luo_flb.c b/kernel/liveupdate/luo_flb.c
> index f52e8114837e..855af655b09b 100644
> --- a/kernel/liveupdate/luo_flb.c
> +++ b/kernel/liveupdate/luo_flb.c
> @@ -192,10 +192,27 @@ static int luo_flb_retrieve_one(struct liveupdate_flb *flb)
> static void luo_flb_file_finish_one(struct liveupdate_flb *flb)
> {
> struct luo_flb_private *private = luo_flb_get_private(flb);
> + bool needs_retrieve = false;
> u64 count;
>
> - scoped_guard(mutex, &private->incoming.lock)
> + scoped_guard(mutex, &private->incoming.lock) {
> + if (!private->incoming.count && !private->incoming.finished)
> + needs_retrieve = true;
> + }
> +
> + if (needs_retrieve) {
> + int err = luo_flb_retrieve_one(flb);
> +
> + if (err) {
> + pr_warn("Failed to retrieve FLB '%s' during finish: %pe\n",
> + flb->compatible, ERR_PTR(err));
> + return;
> + }
> + }
> +
> + scoped_guard(mutex, &private->incoming.lock) {
> count = --private->incoming.count;
> + }
This looks way too overcomplicated. We just need to move the retrieve
call before we check the count.
Pasha, side note: I think FLB suffers from the same problem I fixed for
LUO files with f85b1c6af5bc ("liveupdate: luo_file: remember retrieve()
status"). If retrieve fails, it doesn't remember the error code and will
retry retrieve() on next usage, causing all sorts of problems.
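To illustrate the idea of remembering the retrieve status, here is a small userspace sketch. It is not taken from f85b1c6af5bc and makes no claim about how that commit is implemented; struct retrieve_state, backend_retrieve() and retrieve_once() are hypothetical names used only to show "run once, cache the result, never retry".

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical state: whether retrieve already ran and what it returned. */
struct retrieve_state {
	bool attempted;     /* retrieve has run at least once */
	int status;         /* cached result: 0 or a negative errno */
	unsigned int calls; /* how often the backend was actually invoked */
};

/* Hypothetical backend; the result is injected by the caller. */
static int backend_retrieve(struct retrieve_state *s, int result)
{
	s->calls++;
	return result;
}

/* Run the backend at most once and return the remembered status on
 * every later call, including a remembered failure. */
static int retrieve_once(struct retrieve_state *s, int backend_result)
{
	if (!s->attempted) {
		s->status = backend_retrieve(s, backend_result);
		s->attempted = true;
	}
	return s->status;
}
```

After a first call that fails with -ENOENT, later calls keep returning -ENOENT without invoking the backend again, which avoids the retry-on-next-usage behaviour mentioned above.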
Pasha, another side note: I think incoming.private should be initialized
when the FLB is registered, not when it is first used. This will make
this simpler overall. This would need liveupdate_register_flb() to not
call kzalloc(), but that can be done I think.
Anyway, for this problem, how about the patch below (only
compile-tested) addressing my comments. The diff looks bigger but the
end result is a lot cleaner IMO. In effect it just moves the retrieve
call outside the if (!count). Everything else is just cleaning up the
locking situation.
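The effect of moving the retrieve call can be modeled in userspace as follows. This is a simplified sketch, not the kernel code: locking and the finish() callback are left out, and struct flb_model, model_retrieve(), finish_one_old() and finish_one_new() are hypothetical names that only mirror the ordering of the retrieve and decrement steps.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the incoming FLB state. */
struct flb_model {
	uint64_t serialized_count; /* refcount carried over in serialized form */
	uint64_t count;            /* stays 0 until retrieve runs */
	bool retrieved;
	bool finished;
};

/* Stand-in for luo_flb_retrieve_one(): loads the serialized count. */
static int model_retrieve(struct flb_model *m)
{
	m->count = m->serialized_count;
	m->retrieved = true;
	return 0;
}

/* Old ordering: decrement first, retrieve only inside the last-user
 * branch. With count still 0 the decrement wraps, so the branch is
 * never reached. */
static void finish_one_old(struct flb_model *m)
{
	uint64_t count = --m->count;

	if (!count) {
		if (!m->retrieved && model_retrieve(m))
			return;
		m->finished = true;
	}
}

/* Proposed ordering: retrieve before touching the count. */
static void finish_one_new(struct flb_model *m)
{
	if (!m->retrieved && model_retrieve(m))
		return;

	if (!--m->count)
		m->finished = true;
}
```

With a serialized count of 2 and no earlier retrieve, two calls to finish_one_old() never set finished, while the second call to finish_one_new() does.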
--- 8< ---
From 775565e72d9b851839d37088549f0fc247cac2e1 Mon Sep 17 00:00:00 2001
From: "Pratyush Yadav (Google)" <pratyush@kernel.org>
Date: Thu, 2 Apr 2026 13:22:05 +0000
Subject: [PATCH] liveupdate: initialize incoming FLB state before finish
The state of an incoming FLB object is initialized when it is first
used. The initialization is done via luo_flb_retrieve_one(), which looks
at all the incoming FLBs, matches the FLB to its serialized entry, and
initializes the incoming data and count.
luo_flb_file_finish_one() is called when finish is called for a file
registered with this FLB. If no file handler has used the FLB by this
point, the count stays un-initialized at 0. luo_flb_file_finish_one()
then decrements this un-initialized count, leading to an underflow. This
results in the FLB finish never being called since the count has
underflowed to a very large value.
Fix this by making sure the FLB is retrieved before using its count.
Fixes: cab056f2aae7 ("liveupdate: luo_flb: introduce File-Lifecycle-Bound global state")
Suggested-by: Leo Timmins <leotimmins1974@gmail.com>
Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
---
kernel/liveupdate/luo_flb.c | 32 +++++++++++++++-----------------
1 file changed, 15 insertions(+), 17 deletions(-)
diff --git a/kernel/liveupdate/luo_flb.c b/kernel/liveupdate/luo_flb.c
index f52e8114837e..be141620751e 100644
--- a/kernel/liveupdate/luo_flb.c
+++ b/kernel/liveupdate/luo_flb.c
@@ -194,28 +194,26 @@ static void luo_flb_file_finish_one(struct liveupdate_flb *flb)
 	struct luo_flb_private *private = luo_flb_get_private(flb);
 	u64 count;

-	scoped_guard(mutex, &private->incoming.lock)
-		count = --private->incoming.count;
+	if (!private->incoming.retrieved) {
+		int err = luo_flb_retrieve_one(flb);

+		if (WARN_ON(err))
+			return;
+	}
+
+	guard(mutex)(&private->incoming.lock);
+
+	count = --private->incoming.count;
 	if (!count) {
 		struct liveupdate_flb_op_args args = {0};

-		if (!private->incoming.retrieved) {
-			int err = luo_flb_retrieve_one(flb);
+		args.flb = flb;
+		args.obj = private->incoming.obj;
+		flb->ops->finish(&args);

-			if (WARN_ON(err))
-				return;
-		}
-
-		scoped_guard(mutex, &private->incoming.lock) {
-			args.flb = flb;
-			args.obj = private->incoming.obj;
-			flb->ops->finish(&args);
-
-			private->incoming.data = 0;
-			private->incoming.obj = NULL;
-			private->incoming.finished = true;
-		}
+		private->incoming.data = 0;
+		private->incoming.obj = NULL;
+		private->incoming.finished = true;
 	}
 }
--
Regards,
Pratyush Yadav
* Re: [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish
2026-04-02 13:28 ` Pratyush Yadav
@ 2026-04-02 18:15 ` Andrew Morton
2026-04-03 9:04 ` Pratyush Yadav
0 siblings, 1 reply; 13+ messages in thread
From: Andrew Morton @ 2026-04-02 18:15 UTC (permalink / raw)
To: Pratyush Yadav; +Cc: Leo Timmins, pasha.tatashin, rppt, linux-kernel
On Thu, 02 Apr 2026 13:28:33 +0000 Pratyush Yadav <pratyush@kernel.org> wrote:
> The state of an incoming FLB object is initialized when it is first
> used. The initialization is done via luo_flb_retrieve_one(), which looks
> at all the incoming FLBs, matches the FLB to its serialized entry, and
> initializes the incoming data and count.
>
> luo_flb_file_finish_one() is called when finish is called for a file
> registered with this FLB. If no file handler has used the FLB by this
> point, the count stays un-initialized at 0. luo_flb_file_finish_one()
> then decrements this un-initialized count, leading to an underflow. This
> results in the FLB finish never being called since the count has
> underflowed to a very large value.
>
> Fix this by making sure the FLB is retrieved before using its count.
I like that the above tells people what the actual bug is!
I still have both Leo's patches in mm.git, in wait-and-see mode. What
to do here? Should I upstream [1/2] and drop [2/2]? Drop both and
revisit after -rc1?
Also, did we consider cc:stable for these two? Perhaps add the
cc:stable if we decide to attend to this after -rc1?
* Re: [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish
2026-04-02 18:15 ` Andrew Morton
@ 2026-04-03 9:04 ` Pratyush Yadav
2026-04-03 17:26 ` Andrew Morton
0 siblings, 1 reply; 13+ messages in thread
From: Pratyush Yadav @ 2026-04-03 9:04 UTC (permalink / raw)
To: Andrew Morton
Cc: Pratyush Yadav, Leo Timmins, pasha.tatashin, rppt, linux-kernel
On Thu, Apr 02 2026, Andrew Morton wrote:
> On Thu, 02 Apr 2026 13:28:33 +0000 Pratyush Yadav <pratyush@kernel.org> wrote:
>
>> The state of an incoming FLB object is initialized when it is first
>> used. The initialization is done via luo_flb_retrieve_one(), which looks
>> at all the incoming FLBs, matches the FLB to its serialized entry, and
>> initializes the incoming data and count.
>>
>> luo_flb_file_finish_one() is called when finish is called for a file
>> registered with this FLB. If no file handler has used the FLB by this
>> point, the count stays un-initialized at 0. luo_flb_file_finish_one()
>> then decrements this un-initialized count, leading to an underflow. This
>> results in the FLB finish never being called since the count has
>> underflowed to a very large value.
>>
>> Fix this by making sure the FLB is retrieved before using its count.
>
> I like that the above tells people what the actual bug is!
>
> I still have both Leo's patches in mm.git, in wait-and-see mode. What
> to do here? Should I upstream [1/2] and drop [2/2]? Drop both and
> revisit after -rc1?
These are independent fixes, so I would suggest keeping 1/2 regardless
of what we do with 2/2. For 2/2, I would suggest replacing it with the
version I sent in <2vxzmrzlfq4e.fsf@kernel.org>.
Mike/Pasha/Leo, if you could review my version then that would be great.
Also Leo, please help with testing. I don't have a setup ready for
testing this corner case. I can set something up mid next week, but it
would be great if you can test this before that.
>
> Also, did we consider cc:stable for these two? Perhaps add the
> cc:stable if we decide to attend to this after -rc1?
FLB landed in v7.0-rc1 so no need for cc:stable for patch 2/2. For patch
1/2, I think cc:stable does make sense, but it only landed in v6.19 so
not super important given it is not LTS.
--
Regards,
Pratyush Yadav
* Re: [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish
2026-04-03 9:04 ` Pratyush Yadav
@ 2026-04-03 17:26 ` Andrew Morton
[not found] ` <CA+uuG7Lnc94vTmZPEhbvQXgAzWJDL28Zf=QR=uAkmiWvoW+Uxw@mail.gmail.com>
0 siblings, 1 reply; 13+ messages in thread
From: Andrew Morton @ 2026-04-03 17:26 UTC (permalink / raw)
To: Pratyush Yadav; +Cc: Leo Timmins, pasha.tatashin, rppt, linux-kernel
On Fri, 03 Apr 2026 09:04:25 +0000 Pratyush Yadav <pratyush@kernel.org> wrote:
> On Thu, Apr 02 2026, Andrew Morton wrote:
>
> > On Thu, 02 Apr 2026 13:28:33 +0000 Pratyush Yadav <pratyush@kernel.org> wrote:
> >
> >> The state of an incoming FLB object is initialized when it is first
> >> used. The initialization is done via luo_flb_retrieve_one(), which looks
> >> at all the incoming FLBs, matches the FLB to its serialized entry, and
> >> initializes the incoming data and count.
> >>
> >> luo_flb_file_finish_one() is called when finish is called for a file
> >> registered with this FLB. If no file handler has used the FLB by this
> >> point, the count stays un-initialized at 0. luo_flb_file_finish_one()
> >> then decrements this un-initialized count, leading to an underflow. This
> >> results in the FLB finish never being called since the count has
> >> underflowed to a very large value.
> >>
> >> Fix this by making sure the FLB is retrieved before using its count.
> >
> > I like that the above tells people what the actual bug is!
> >
> > I still have both Leo's patches in mm.git, in wait-and-see mode. What
> > to do here? Should I upstream [1/2] and drop [2/2]? Drop both and
> > revisit after -rc1?
>
> These are independent fixes, so I would suggest keeping 1/2 regardless
> of what we do with 2/2. For 2/2, I would suggest replacing it with the
> version I sent in <2vxzmrzlfq4e.fsf@kernel.org>.
OK, thanks. I removed [2/2] "liveupdate: initialize incoming FLB state
before finish" from mm.git.
I added cc:stable to [1/2] "liveupdate: propagate file deserialization
failures" and altered its changelog to present it as a singleton patch
(no "Patch series..." header, no "This patch (of 2)", etc.). See below.
> Mike/Pasha/Leo, if you could review my version then that would be great.
>
> Also Leo, please help with testing. I don't have a setup ready for
> testing this corner case. I can set something up mid next week, but it
> would be great if you can test this before that.
>
> >
> > Also, did we consider cc:stable for these two? Perhaps add the
> > cc:stable if we decide to attend to this after -rc1?
>
> FLB landed in v7.0-rc1 so no need for cc:stable for patch 2/2. For patch
> 1/2, I think cc:stable does make sense, but it only landed in v6.19 so
> not super important given it is not LTS.
OK. When considering cc:stable I personally don't pay attention to LTS
release cadence - I believe others do so.
Because it could be that some downstream people are using non-LTS Linus
kernels. But then they should be looking at the Fixes: tags.
From: Leo Timmins <leotimmins1974@gmail.com>
Subject: liveupdate: propagate file deserialization failures
Date: Wed, 25 Mar 2026 12:46:07 +0800
luo_session_deserialize() ignored the return value from
luo_file_deserialize(). As a result, a session could be left partially
restored even though the /dev/liveupdate open path treats deserialization
failures as fatal.
Propagate the error so a failed file deserialization aborts session
deserialization instead of silently continuing.
Link: https://lkml.kernel.org/r/20260325044608.8407-1-leotimmins1974@gmail.com
Link: https://lkml.kernel.org/r/20260325044608.8407-2-leotimmins1974@gmail.com
Fixes: 16cec0d26521 ("liveupdate: luo_session: add ioctls for file preservation")
Signed-off-by: Leo Timmins <leotimmins1974@gmail.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
kernel/liveupdate/luo_session.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
--- a/kernel/liveupdate/luo_session.c~liveupdate-propagate-file-deserialization-failures
+++ a/kernel/liveupdate/luo_session.c
@@ -558,8 +558,13 @@ int luo_session_deserialize(void)
}
scoped_guard(mutex, &session->mutex) {
- luo_file_deserialize(&session->file_set,
- &sh->ser[i].file_set_ser);
+ err = luo_file_deserialize(&session->file_set,
+ &sh->ser[i].file_set_ser);
+ }
+ if (err) {
+ pr_warn("Failed to deserialize files for session [%s] %pe\n",
+ session->name, ERR_PTR(err));
+ return err;
}
}
_
* Re: [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish
[not found] ` <CA+uuG7Lnc94vTmZPEhbvQXgAzWJDL28Zf=QR=uAkmiWvoW+Uxw@mail.gmail.com>
@ 2026-04-04 15:43 ` Leo Timmins
2026-04-05 7:35 ` Pratyush Yadav
0 siblings, 1 reply; 13+ messages in thread
From: Leo Timmins @ 2026-04-04 15:43 UTC (permalink / raw)
To: Andrew Morton; +Cc: Pratyush Yadav, pasha.tatashin, rppt, linux-kernel
Hello All,
Apologies for the duplicate email; I accidentally sent it as HTML instead of plain text.
I have tested Pratyush's alternate patch. My tests
indicate it successfully fixes the potential underflow if
incoming.count is <= 0.
The patch compiles successfully against 7.0.0-rc6 and has been tested on a VM
(QEMU + VMM) with Arch.
I executed a liveupdate with kexec, which was successful. I then wrote
a test tool to replicate the underflow conditions, and added a check
to see if incoming.count is <= 0 after the if statement block.
Having confirmed it works:
Reviewed-by: Leo Timmins <leotimmins1974@gmail.com>
_________________________________
From 775565e72d9b851839d37088549f0fc247cac2e1 Mon Sep 17 00:00:00 2001
From: "Pratyush Yadav (Google)" <pratyush@kernel.org>
Date: Thu, 2 Apr 2026 13:22:05 +0000
Subject: [PATCH] liveupdate: initialize incoming FLB state before finish
The state of an incoming FLB object is initialized when it is first
used. The initialization is done via luo_flb_retrieve_one(), which looks
at all the incoming FLBs, matches the FLB to its serialized entry, and
initializes the incoming data and count.
luo_flb_file_finish_one() is called when finish is called for a file
registered with this FLB. If no file handler has used the FLB by this
point, the count stays un-initialized at 0. luo_flb_file_finish_one()
then decrements this un-initialized count, leading to an underflow. This
results in the FLB finish never being called since the count has
underflowed to a very large value.
Fix this by making sure the FLB is retrieved before using its count.
Fixes: cab056f2aae7 ("liveupdate: luo_flb: introduce File-Lifecycle-Bound global state")
Suggested-by: Leo Timmins <leotimmins1974@gmail.com>
Reviewed-by: Leo Timmins <leotimmins1974@gmail.com>
Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
---
kernel/liveupdate/luo_flb.c | 32 +++++++++++++++-----------------
1 file changed, 15 insertions(+), 17 deletions(-)
diff --git a/kernel/liveupdate/luo_flb.c b/kernel/liveupdate/luo_flb.c
index f52e8114837e..be141620751e 100644
--- a/kernel/liveupdate/luo_flb.c
+++ b/kernel/liveupdate/luo_flb.c
@@ -194,28 +194,26 @@ static void luo_flb_file_finish_one(struct liveupdate_flb *flb)
struct luo_flb_private *private = luo_flb_get_private(flb);
u64 count;
- scoped_guard(mutex, &private->incoming.lock)
- count = --private->incoming.count;
+ if (!private->incoming.retrieved) {
+ int err = luo_flb_retrieve_one(flb);
+ if (WARN_ON(err))
+ return;
+ }
+
+ guard(mutex)(&private->incoming.lock);
+
+ count = --private->incoming.count;
if (!count) {
struct liveupdate_flb_op_args args = {0};
- if (!private->incoming.retrieved) {
- int err = luo_flb_retrieve_one(flb);
+ args.flb = flb;
+ args.obj = private->incoming.obj;
+ flb->ops->finish(&args);
- if (WARN_ON(err))
- return;
- }
-
- scoped_guard(mutex, &private->incoming.lock) {
- args.flb = flb;
- args.obj = private->incoming.obj;
- flb->ops->finish(&args);
-
- private->incoming.data = 0;
- private->incoming.obj = NULL;
- private->incoming.finished = true;
- }
+ private->incoming.data = 0;
+ private->incoming.obj = NULL;
+ private->incoming.finished = true;
}
}
--
On Sat, Apr 4, 2026 at 11:39 PM Leo Timmins <leotimmins1974@gmail.com> wrote:
>
> Hello All,
>
> I have done my testing of Pratyush's alternate patch. My tests indicate it successfully fixes the potential underflow if incoming.count is <= 0
>
> Patch successfully compiles with 7.0.0-rc6 and has been tested on a VM (QEMU + VMM) with Arch.
>
> I executed a liveupdate with kexec which was successful. I then wrote a test tool to replicate the underflow conditions, and added a check to see if incoming.count is <= 0 after the if statement block
>
> Having confirmed it works:
>
> Reviewed-by: Leo Timmins <leotimmins1974@gmail.com>
>
> ---------------------------------------------------------------------------------------
>
> From 775565e72d9b851839d37088549f0fc247cac2e1 Mon Sep 17 00:00:00 2001
> From: "Pratyush Yadav (Google)" <pratyush@kernel.org>
> Date: Thu, 2 Apr 2026 13:22:05 +0000
> Subject: [PATCH] liveupdate: initialize incoming FLB state before finish
>
> The state of an incoming FLB object is initialized when it is first
> used. The initialization is done via luo_flb_retrieve_one(), which looks
> at all the incoming FLBs, matches the FLB to its serialized entry, and
> initializes the incoming data and count.
>
> luo_flb_file_finish_one() is called when finish is called for a file
> registered with this FLB. If no file handler has used the FLB by this
> point, the count stays un-initialized at 0. luo_flb_file_finish_one()
> then decrements this un-initialized count, leading to an underflow. This
> results in the FLB finish never being called since the count has
> underflowed to a very large value.
>
> Fix this by making sure the FLB is retrieved before using its count.
>
> Fixes: cab056f2aae7 ("liveupdate: luo_flb: introduce File-Lifecycle-Bound global state")
> Suggested-by: Leo Timmins <leotimmins1974@gmail.com>
> Reviewed-by: Leo Timmins <leotimmins1974@gmail.com>
> Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
> ---
> kernel/liveupdate/luo_flb.c | 32 +++++++++++++++-----------------
> 1 file changed, 15 insertions(+), 17 deletions(-)
>
> diff --git a/kernel/liveupdate/luo_flb.c b/kernel/liveupdate/luo_flb.c
> index f52e8114837e..be141620751e 100644
> --- a/kernel/liveupdate/luo_flb.c
> +++ b/kernel/liveupdate/luo_flb.c
> @@ -194,28 +194,26 @@ static void luo_flb_file_finish_one(struct liveupdate_flb *flb)
> struct luo_flb_private *private = luo_flb_get_private(flb);
> u64 count;
>
> - scoped_guard(mutex, &private->incoming.lock)
> - count = --private->incoming.count;
> + if (!private->incoming.retrieved) {
> + int err = luo_flb_retrieve_one(flb);
>
> + if (WARN_ON(err))
> + return;
> + }
> +
> + guard(mutex)(&private->incoming.lock);
> +
> + count = --private->incoming.count;
> if (!count) {
> struct liveupdate_flb_op_args args = {0};
>
> - if (!private->incoming.retrieved) {
> - int err = luo_flb_retrieve_one(flb);
> + args.flb = flb;
> + args.obj = private->incoming.obj;
> + flb->ops->finish(&args);
>
> - if (WARN_ON(err))
> - return;
> - }
> -
> - scoped_guard(mutex, &private->incoming.lock) {
> - args.flb = flb;
> - args.obj = private->incoming.obj;
> - flb->ops->finish(&args);
> -
> - private->incoming.data = 0;
> - private->incoming.obj = NULL;
> - private->incoming.finished = true;
> - }
> + private->incoming.data = 0;
> + private->incoming.obj = NULL;
> + private->incoming.finished = true;
> }
> }
>
> --
>
> On Sat, Apr 4, 2026 at 1:26 AM Andrew Morton <akpm@linux-foundation.org> wrote:
>>
>> On Fri, 03 Apr 2026 09:04:25 +0000 Pratyush Yadav <pratyush@kernel.org> wrote:
>>
>> > On Thu, Apr 02 2026, Andrew Morton wrote:
>> >
>> > > On Thu, 02 Apr 2026 13:28:33 +0000 Pratyush Yadav <pratyush@kernel.org> wrote:
>> > >
>> > >> The state of an incoming FLB object is initialized when it is first
>> > >> used. The initialization is done via luo_flb_retrieve_one(), which looks
>> > >> at all the incoming FLBs, matches the FLB to its serialized entry, and
>> > >> initializes the incoming data and count.
>> > >>
>> > >> luo_flb_file_finish_one() is called when finish is called for a file
>> > >> registered with this FLB. If no file handler has used the FLB by this
>> > >> point, the count stays un-initialized at 0. luo_flb_file_finish_one()
>> > >> then decrements this un-initialized count, leading to an underflow. This
>> > >> results in the FLB finish never being called since the count has
>> > >> underflowed to a very large value.
>> > >>
>> > >> Fix this by making sure the FLB is retrieved before using its count.
>> > >
>> > > I like that the above tells people what the actual bug is!
>> > >
>> > > I still have both Leo's patches in mm.git, in wait-and-see mode. What
>> > > to do here? Should I upstream [1/2] and drop [2/2]? Drop both and
>> > > revisit after -rc1?
>> >
>> > These are independent fixes, so I would suggest keeping 1/2 regardless
>> > of what we do with 2/2. For 2/2, I would suggest replacing it with the
>> > version I sent in <2vxzmrzlfq4e.fsf@kernel.org>.
>>
>> OK, thanks. I removed [2/2] "liveupdate: initialize incoming FLB state
>> before finish" from mm.git.
>>
>> I added cc:stable to [1/1] "liveupdate: propagate file deserialization
>> failures" and altered its changelog to present it as a singleton patch
>> (no "Patch series..." header, no "This patch (of 2)", etc.). Below.
>>
>> > Mike/Pasha/Leo, if you could review my version then that would be great.
>> >
>> > Also Leo, please help with testing. I don't have a setup ready for
>> > testing this corner case. I can set something up mid next week, but it
>> > would be great if you can test this before that.
>> >
>> > >
>> > > Also, did we consider cc:stable for these two? Perhaps add the
>> > > cc:stable if we decide to attend to this after -rc1?
>> >
>> > FLB landed in v7.0-rc1 so no need for cc:stable for patch 2/2. For patch
>> > 1/2, I think cc:stable does make sense, but it only landed in v6.19 so
>> > not super important given it is not LTS.
>>
>> OK. When considering cc:stable I personally don't pay attention to LTS
>> release cadence - I believe others do so.
>>
>> Because it could be that some downstream people are using non-LTS Linus
>> kernels. But then they should be looking at the Fixes: tags.
>>
>>
>>
>> From: Leo Timmins <leotimmins1974@gmail.com>
>> Subject: liveupdate: propagate file deserialization failures
>> Date: Wed, 25 Mar 2026 12:46:07 +0800
>>
>> luo_session_deserialize() ignored the return value from
>> luo_file_deserialize(). As a result, a session could be left partially
>> restored even though the /dev/liveupdate open path treats deserialization
>> failures as fatal.
>>
>> Propagate the error so a failed file deserialization aborts session
>> deserialization instead of silently continuing.
>>
>> Link: https://lkml.kernel.org/r/20260325044608.8407-1-leotimmins1974@gmail.com
>> Link: https://lkml.kernel.org/r/20260325044608.8407-2-leotimmins1974@gmail.com
>> Fixes: 16cec0d26521 ("liveupdate: luo_session: add ioctls for file preservation")
>> Signed-off-by: Leo Timmins <leotimmins1974@gmail.com>
>> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
>> Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
>> Cc: Mike Rapoport <rppt@kernel.org>
>> Cc: <stable@vger.kernel.org>
>> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>> ---
>>
>> kernel/liveupdate/luo_session.c | 9 +++++++--
>> 1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> --- a/kernel/liveupdate/luo_session.c~liveupdate-propagate-file-deserialization-failures
>> +++ a/kernel/liveupdate/luo_session.c
>> @@ -558,8 +558,13 @@ int luo_session_deserialize(void)
>> }
>>
>> scoped_guard(mutex, &session->mutex) {
>> - luo_file_deserialize(&session->file_set,
>> - &sh->ser[i].file_set_ser);
>> + err = luo_file_deserialize(&session->file_set,
>> + &sh->ser[i].file_set_ser);
>> + }
>> + if (err) {
>> + pr_warn("Failed to deserialize files for session [%s] %pe\n",
>> + session->name, ERR_PTR(err));
>> + return err;
>> }
>> }
>>
>> _
>>
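
The error-propagation change in the patch above can be illustrated outside the kernel. The following is only a minimal userspace sketch: demo_file_deserialize() and both demo_session_deserialize_*() helpers are hypothetical stand-ins, not the LUO API, and locking is omitted. It contrasts a loop that drops the per-session return value with one that aborts on the first failure.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-session file restore.
 * It fails only for the session at index fail_at. */
static int demo_file_deserialize(size_t idx, size_t fail_at)
{
	return idx == fail_at ? -EINVAL : 0;
}

/* Old behaviour: the return value is dropped, so every session is
 * counted as restored even when its files failed to deserialize. */
static int demo_session_deserialize_old(size_t n, size_t fail_at,
					size_t *restored)
{
	*restored = 0;
	for (size_t i = 0; i < n; i++) {
		(void)demo_file_deserialize(i, fail_at);
		(*restored)++;
	}
	return 0;
}

/* Fixed behaviour: the first failure aborts the loop and is returned
 * to the caller, which can then treat it as fatal. */
static int demo_session_deserialize_fixed(size_t n, size_t fail_at,
					  size_t *restored)
{
	*restored = 0;
	for (size_t i = 0; i < n; i++) {
		int err = demo_file_deserialize(i, fail_at);

		if (err)
			return err;
		(*restored)++;
	}
	return 0;
}
```

With three sessions and a failure at index 1, the old variant reports success for all three, while the fixed variant stops after one restored session and returns the error.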
* Re: [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish
2026-04-04 15:43 ` Leo Timmins
@ 2026-04-05 7:35 ` Pratyush Yadav
0 siblings, 0 replies; 13+ messages in thread
From: Pratyush Yadav @ 2026-04-05 7:35 UTC (permalink / raw)
To: Leo Timmins
Cc: Andrew Morton, Pratyush Yadav, pasha.tatashin, rppt, linux-kernel
On Sat, Apr 04 2026, Leo Timmins wrote:
> Hello All,
>
>> Apologies for the duplicate email, I had accidentally sent it out with html instead of plain text.
>
> I have done my testing of Pratyush's alternate patch. My tests
> indicate it successfully fixes the potential underflow if
> incoming.count is <= 0.
>
> Patch successfully compiles with 7.0.0-rc6 and has been tested on a VM
> (QEMU + VMM) with Arch.
>
> I executed a liveupdate with kexec which was successful. I then wrote
> a test tool to replicate the underflow conditions, and added a check
> to see if incoming.count is <= 0 after the if statement block.
>
> Having confirmed it works:
>
> Reviewed-by: Leo Timmins <leotimmins1974@gmail.com>
Thanks!
[...]
--
Regards,
Pratyush Yadav
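
For readers following the thread, the underflow that the tested patch addresses can be reproduced in plain C. This is a sketch under stated assumptions: struct demo_incoming and the demo_*() helpers are hypothetical and only mimic the ordering described in the commit message (an unsigned 64-bit count that stays at 0 until the FLB is retrieved). They are not the kernel structures, and locking is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the incoming FLB state. */
struct demo_incoming {
	uint64_t count;		/* stays 0 until retrieved */
	bool retrieved;
	bool finished;
};

/* Hypothetical retrieval step: loads the serialized user count. */
static void demo_retrieve(struct demo_incoming *in, uint64_t serialized)
{
	in->count = serialized;
	in->retrieved = true;
}

/* Old ordering: decrement before retrieval. With count == 0 the
 * unsigned decrement wraps to UINT64_MAX, so the !count branch is
 * never taken and finish never runs. */
static void demo_finish_old(struct demo_incoming *in)
{
	uint64_t count = --in->count;

	if (!count)
		in->finished = true;
}

/* Fixed ordering: make sure the count is initialized first, then
 * decrement and run the last-user cleanup when it reaches 0. */
static void demo_finish_fixed(struct demo_incoming *in, uint64_t serialized)
{
	uint64_t count;

	if (!in->retrieved)
		demo_retrieve(in, serialized);

	count = --in->count;
	if (!count)
		in->finished = true;
}
```

Starting from a zeroed state with one serialized user, the old ordering leaves count at UINT64_MAX with finished still false, whereas the fixed ordering ends with count at 0 and finished set.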
end of thread, other threads:[~2026-04-05 7:35 UTC | newest]
Thread overview: 13+ messages
2026-03-26 4:25 [PATCH v3 0/2] liveupdate: fix incoming error handling and teardown paths Leo Timmins
2026-03-26 4:25 ` [PATCH v3 1/2] liveupdate: propagate file deserialization failures Leo Timmins
2026-04-02 12:17 ` Pratyush Yadav
2026-03-26 4:25 ` [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish Leo Timmins
2026-03-26 14:50 ` Pasha Tatashin
2026-04-02 13:28 ` Pratyush Yadav
2026-04-02 18:15 ` Andrew Morton
2026-04-03 9:04 ` Pratyush Yadav
2026-04-03 17:26 ` Andrew Morton
[not found] ` <CA+uuG7Lnc94vTmZPEhbvQXgAzWJDL28Zf=QR=uAkmiWvoW+Uxw@mail.gmail.com>
2026-04-04 15:43 ` Leo Timmins
2026-04-05 7:35 ` Pratyush Yadav
2026-03-28 0:32 ` [PATCH v3 0/2] liveupdate: fix incoming error handling and teardown paths Andrew Morton
2026-03-28 1:47 ` Pasha Tatashin