From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1C06B2D592C for ; Thu, 2 Apr 2026 13:28:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775136517; cv=none; b=aqSN2DFEDF0H2T99TIagrugUGAV8L9ZSFyLr1kcyPVzdytAXxpIATyeYGlqJUkjHdO6GXEB4Pbo1QVTnXso2iI+bxfUWsK0rXrCWO6coQle0Z0zIsWY/mYYxit4baopkv0d6ca/rZrAh9KuWZpHQnqqxloDMSsZYkn+q7G/JYn8= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775136517; c=relaxed/simple; bh=3iF6VC6rbLhAbAs2chw8rnvKMhN2HIP9uL7ttq4LweI=; h=From:To:Cc:Subject:In-Reply-To:References:Date:Message-ID: MIME-Version:Content-Type; b=PktzWjpr+X5SD8cv50zMAjRAA9+6I271oo8EozMYyXV4d7LKdEfdxDS0mJY11hxTAWr6yKZ1rFqVMedigb8ixWjeEsN61PFTnNtBloSACWflVjhrh2yr+MmxPjPb/W31CD1Bg3p3qRMJAMzC9KRinM1PMURMnBE4kGYcZMlgmhY= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=SeHX38gb; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="SeHX38gb" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 66113C116C6; Thu, 2 Apr 2026 13:28:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1775136516; bh=3iF6VC6rbLhAbAs2chw8rnvKMhN2HIP9uL7ttq4LweI=; h=From:To:Cc:Subject:In-Reply-To:References:Date:From; b=SeHX38gbtZo/jtpGnSIWKNc0wOhHZA6X0mmyXhV+eme1aJ9BRccEdiAy+o/wh/z7r uX4bVyh2gNHocw69XoRleNM6Y56xPIoHW+bL1hnwB6z1VvMppbGr4qmgby44o/QIx+ Ao1oToa/I3Wdc/JbCRROH67sDndbvUS5y/3cp41J6CDS3Zx4QeSmHEvWUqImagq1rN J1i7nHTPMkDeuYI2zS7uxrNoNacgDF8udy1b2VCO312OjsgqOUeeSgisM8AwzNVIOc avTirtPcrvgLAXis63LWV94xeoJOoBXdvy7QxEHQwAS9KtRrM1uiPQHF0sxGTrApN9 NvUourOFgct6g== From: Pratyush Yadav To: Leo Timmins Cc: pasha.tatashin@soleen.com, rppt@kernel.org, linux-kernel@vger.kernel.org, pratyush@kernel.org, akpm@linux-foundation.org Subject: Re: [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish In-Reply-To: <20260326042546.8031-3-leotimmins1974@gmail.com> (Leo Timmins's message of "Thu, 26 Mar 2026 12:25:35 +0800") References: <20260326042546.8031-1-leotimmins1974@gmail.com> <20260326042546.8031-3-leotimmins1974@gmail.com> Date: Thu, 02 Apr 2026 13:28:33 +0000 Message-ID: <2vxzmrzlfq4e.fsf@kernel.org> User-Agent: Gnus/5.13 (Gnus v5.13) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain On Thu, Mar 26 2026, Leo Timmins wrote: > luo_flb_file_finish_one() decremented incoming.count before making sure > that the incoming FLB state had been materialized. If no earlier incoming > retrieval had populated that state, the first decrement ran from zero and > skipped the last-user finish path. > > Initialize the incoming FLB state before the first decrement so finish > uses the serialized refcount instead of an uninitialized value. This commit message makes it very hard to understand what the problem it fixes. It took me 20 minutes of reading this patch and looking at the FLB code to figure out what is going on. Here is what I'd suggest: The state of an incoming FLB object is initialized when it is first used. The initialization is done via luo_flb_retrieve_one(), which looks at all the incoming FLBs, matches the FLB to its serialized entry, and initializes the incoming data and count. luo_flb_file_finish_one() is called when finish is called for a file registered with this FLB. If no file handler has used the FLB by this point, the count stays un-initialized at 0. luo_flb_file_finish_one() then decrements this un-initialized count, leading to an underflow. This results in the FLB finish never being called since the count has underflowed to a very large value. Fix this by making sure the FLB is retrieved before using its count. > > Fixes: cab056f2aae7 ("liveupdate: luo_flb: introduce File-Lifecycle-Bound global state") > Reviewed-by: Pasha Tatashin > Signed-off-by: Leo Timmins > --- > kernel/liveupdate/luo_flb.c | 19 ++++++++++++++++++- > 1 file changed, 18 insertions(+), 1 deletion(-) > > diff --git a/kernel/liveupdate/luo_flb.c b/kernel/liveupdate/luo_flb.c > index f52e8114837e..855af655b09b 100644 > --- a/kernel/liveupdate/luo_flb.c > +++ b/kernel/liveupdate/luo_flb.c > @@ -192,10 +192,27 @@ static int luo_flb_retrieve_one(struct liveupdate_flb *flb) > static void luo_flb_file_finish_one(struct liveupdate_flb *flb) > { > struct luo_flb_private *private = luo_flb_get_private(flb); > + bool needs_retrieve = false; > u64 count; > > - scoped_guard(mutex, &private->incoming.lock) > + scoped_guard(mutex, &private->incoming.lock) { > + if (!private->incoming.count && !private->incoming.finished) > + needs_retrieve = true; > + } > + > + if (needs_retrieve) { > + int err = luo_flb_retrieve_one(flb); > + > + if (err) { > + pr_warn("Failed to retrieve FLB '%s' during finish: %pe\n", > + flb->compatible, ERR_PTR(err)); > + return; > + } > + } > + > + scoped_guard(mutex, &private->incoming.lock) { > count = --private->incoming.count; > + } This looks way too overcomplicated. We just need to move the retrieve call before we check the count. Pasha, side note: I think FLB suffers from the same problem I fixed for LUO files with f85b1c6af5bc ("liveupdate: luo_file: remember retrieve() status"). If retrieve fails, it doesn't remember the error code and will retry retrieve() on next usage, causing all sorts of problems. Pasha, another side note: I think incoming.private should be initialized when the FLB is registered, not when it is first used. This will make this simpler overall. This would need liveupdate_register_flb() to not call kzalloc(), but that can be done I think. Anyway, for this problem, how about the patch below (only compile-tested) addressing my comments. The diff looks bigger but the end result is a lot cleaner IMO. In effect it just moves the retrieve call outside the if (!count). Everything else is just cleaning up the locking situation. --- 8< --- >From 775565e72d9b851839d37088549f0fc247cac2e1 Mon Sep 17 00:00:00 2001 From: "Pratyush Yadav (Google)" Date: Thu, 2 Apr 2026 13:22:05 +0000 Subject: [PATCH] liveupdate: initialize incoming FLB state before finish The state of an incoming FLB object is initialized when it is first used. The initialization is done via luo_flb_retrieve_one(), which looks at all the incoming FLBs, matches the FLB to its serialized entry, and initializes the incoming data and count. luo_flb_file_finish_one() is called when finish is called for a file registered with this FLB. If no file handler has used the FLB by this point, the count stays un-initialized at 0. luo_flb_file_finish_one() then decrements this un-initialized count, leading to an underflow. This results in the FLB finish never being called since the count has underflowed to a very large value. Fix this by making sure the FLB is retrieved before using its count. Fixes: cab056f2aae7 ("liveupdate: luo_flb: introduce File-Lifecycle-Bound global state") Suggested-by: Leo Timmins Signed-off-by: Pratyush Yadav (Google) --- kernel/liveupdate/luo_flb.c | 32 +++++++++++++++----------------- 1 file changed, 15 insertions(+), 17 deletions(-) diff --git a/kernel/liveupdate/luo_flb.c b/kernel/liveupdate/luo_flb.c index f52e8114837e..be141620751e 100644 --- a/kernel/liveupdate/luo_flb.c +++ b/kernel/liveupdate/luo_flb.c @@ -194,28 +194,26 @@ static void luo_flb_file_finish_one(struct liveupdate_flb *flb) struct luo_flb_private *private = luo_flb_get_private(flb); u64 count; - scoped_guard(mutex, &private->incoming.lock) - count = --private->incoming.count; + if (!private->incoming.retrieved) { + int err = luo_flb_retrieve_one(flb); + if (WARN_ON(err)) + return; + } + + guard(mutex)(&private->incoming.lock); + + count = --private->incoming.count; if (!count) { struct liveupdate_flb_op_args args = {0}; - if (!private->incoming.retrieved) { - int err = luo_flb_retrieve_one(flb); + args.flb = flb; + args.obj = private->incoming.obj; + flb->ops->finish(&args); - if (WARN_ON(err)) - return; - } - - scoped_guard(mutex, &private->incoming.lock) { - args.flb = flb; - args.obj = private->incoming.obj; - flb->ops->finish(&args); - - private->incoming.data = 0; - private->incoming.obj = NULL; - private->incoming.finished = true; - } + private->incoming.data = 0; + private->incoming.obj = NULL; + private->incoming.finished = true; } } -- Regards, Pratyush Yadav