From: Pasha Tatashin <pasha.tatashin@soleen.com>
To: oskar@gerlicz.space
Cc: Mike Rapoport <rppt@kernel.org>, Baoquan He <bhe@redhat.com>,
    Pratyush Yadav <pratyush@kernel.org>,
    Andrew Morton <akpm@linux-foundation.org>,
    linux-kernel@vger.kernel.org, kexec@lists.infradead.org,
    linux-mm@kvack.org
Subject: Re: [PATCH v3 1/5] liveupdate: block outgoing session updates during reboot
Date: Mon, 23 Mar 2026 18:23:29 -0400
Message-ID: <CA+CK2bDN3bk5Y-LP9OD+6vPrr50Eyy4ALJvhK36ZFGEgj7yjLQ@mail.gmail.com>
In-Reply-To: <f76e6cbcf6752342ee940ed0dfeca7bd@gerlicz.space>

On Mon, Mar 23, 2026 at 4:54 PM <oskar@gerlicz.space> wrote:
>
> On 2026-03-23 20:00, Pasha Tatashin wrote:
> > On Sat, Mar 21, 2026 at 6:28 PM Pasha Tatashin
> > <pasha.tatashin@soleen.com> wrote:
> >>
> >> On Sat, Mar 21, 2026 at 10:38 AM Oskar Gerlicz Kowalczuk
> >> <oskar@gerlicz.space> wrote:
> >> >
> >> > kernel_kexec() serializes outgoing sessions before the reboot path
> >> > freezes tasks, so close() and session ioctls can still mutate a
> >> > session while handover state is being prepared. The original v2 code
> >> > also let incoming lookups keep a bare session pointer after dropping
> >> > the list lock.
> >> >
> >> > That leaves two correctness problems in the reboot path: outgoing state
> >> > can change after serialization starts, and incoming sessions can be
> >> > freed while another thread still holds a pointer to them.
> >> >
> >> > Add refcounted session lifetime management, track in-flight outgoing
> >> > close() paths with an atomic closing counter, and make serialization
> >> > wait for closing to drain before setting rebooting. Reject phase-invalid
> >> > ioctls, keep incoming release on a common cleanup path, and make the
> >> > release wait freezable without spinning.
> >> >
> >> > Fixes: fc5acd5c89fe ("liveupdate: block outgoing session updates during reboot")
> >> > Signed-off-by: Oskar Gerlicz Kowalczuk <oskar@gerlicz.space>
> >> > ---
> >> > kernel/liveupdate/luo_internal.h | 12 +-
> >> > kernel/liveupdate/luo_session.c | 236 +++++++++++++++++++++++++++----
> >> > 2 files changed, 221 insertions(+), 27 deletions(-)
> >>
> >> Hi Oskar,
> >>
> >> Thank you for sending this series and finding these bugs in LUO. I
> >> agree with Andrew that a cover letter would help to understand the
> >> summary of the overall effort.
> >>
> >> I have not reviewed the other patches yet, but for this patch, my
> >> understanding is that it solves two specific races during reboot()
> >> syscalls: session closure after serialization, and the addition of new
> >> sessions or preserving new files after serialization.
> >>
> >> Given that KHO is now stateless, and liveupdate_reboot() is
> >> specifically placed at the last point where we can still return an
> >> error to userspace, we should simply return an error if userspace is
> >> doing something unexpected.
> >>
> >> Instead of creating a new state machine, let's just reuse the file
> >> references and simply take them for each session at the beginning of
> >> serialization. This ensures that no session closes will happen later.
> >> For file preservation and session addition, we can block them by
> >> simply adding a new boolean.
> >>
> >> Please take a look at the two patches below and see if this approach
> >> would work. It is a much smaller change compared to the proposed state
> >> machine in this patch.
> >>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/tatashin/linux.git/log/?h=luo-reboot-sync/rfc/1
> >
> > Oskar, I made a few more changes to avoid returning an error if
> > get_file_active() fails. This prevents a race condition where the user
> > might call close(session_fd) right before calling reboot(). I
> > force-updated the above branch. Please let me know if you want to take
> > these changes and use them in the next version.
> >
> > Pasha
>
> Hi Pasha,
>
> thank you for taking the time to prototype this approach and for the
> detailed explanation, I really appreciate it.
>
> I agree that reusing file references and introducing a simple blocking
> mechanism makes the solution much smaller and easier to reason about
> compared to a dedicated state machine. Your patches definitely move
> things in a nice direction in terms of simplicity.
>
> While going through it, I was wondering if there might still be a couple
> of corner cases worth discussing. In particular, do you think a boolean
> gate is sufficient to cover in-flight operations that may have already
> passed the check before serialization starts? It seems like those paths
> could still potentially mutate session state during serialization.

I think it is robust: it works in conjunction with the session mutex.
If an operation has already passed the check, it already holds the
session mutex, and since serialization also takes this mutex, it will
see consistent data after pinning sessions via
luo_session_get_all_outgoing().

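
To illustrate the ordering argument, here is a minimal userspace sketch
using pthreads. It is not the LUO code: struct session, the rebooting
flag, session_preserve_file() and sessions_serialize() are hypothetical
names chosen for this example, and the pinning of sessions through file
references is not modeled at all. It only shows why checking the gate
while holding the per-session lock is enough.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a session; not the kernel struct. */
struct session {
	pthread_mutex_t lock;	/* plays the role of the session mutex */
	int nr_files;		/* state mutated by "preserve" operations */
	int serialized_files;	/* snapshot taken by the serializer, -1 if none */
};

/* Hypothetical gate: set once serialization has started. */
static atomic_bool rebooting;

static void session_init(struct session *s)
{
	pthread_mutex_init(&s->lock, NULL);
	s->nr_files = 0;
	s->serialized_files = -1;
}

/* Mutating operation: the gate is checked while holding the session lock. */
static int session_preserve_file(struct session *s)
{
	int ret = 0;

	pthread_mutex_lock(&s->lock);
	if (atomic_load(&rebooting))
		ret = -EBUSY;	/* serialization has started: refuse */
	else
		s->nr_files++;	/* passed the check while holding the lock */
	pthread_mutex_unlock(&s->lock);
	return ret;
}

/* Serializer: close the gate first, then snapshot each session under its lock. */
static void sessions_serialize(struct session *sessions, size_t n)
{
	assert(sessions != NULL || n == 0);

	atomic_store(&rebooting, true);
	for (size_t i = 0; i < n; i++) {
		/* Waits here for any operation that already passed the check. */
		pthread_mutex_lock(&sessions[i].lock);
		sessions[i].serialized_files = sessions[i].nr_files;
		pthread_mutex_unlock(&sessions[i].lock);
	}
}
```

In this sketch an operation that read the flag as false still holds the
lock, so the serializer blocks until it finishes and then snapshots the
updated state. An operation that takes the lock after the serializer
has released it sees the flag set and gets -EBUSY.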
>
> I was also thinking about the lifetime of incoming sessions (especially
> lookups holding pointers). Do you think file reference handling alone is
> enough there, or would we still need some explicit lifetime protection?

I'm not sure about that; I have not looked into those patches in your
series yet.

>
> I’m currently working on v4 and will take a closer look at your branch
> to see if we can combine both approaches in a way that keeps the
> solution simple while still covering these cases.

Thanks!
Pasha