From: Pasha Tatashin <pasha.tatashin@soleen.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Pratyush Yadav <pratyush@kernel.org>,
jasonmiu@google.com, graf@amazon.com, changyuanl@google.com,
rppt@kernel.org, dmatlack@google.com, rientjes@google.com,
corbet@lwn.net, rdunlap@infradead.org,
ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com,
ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org,
akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr,
mmaurer@google.com, roman.gushchin@linux.dev,
chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com,
jannh@google.com, vincent.guittot@linaro.org,
hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com,
joel.granados@kernel.org, rostedt@goodmis.org,
anna.schumaker@oracle.com, song@kernel.org,
zhangguopeng@kylinos.cn, linux@weissschuh.net,
linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
linux-mm@kvack.org, gregkh@linuxfoundation.org,
tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
rafael@kernel.org, dakr@kernel.org,
bartosz.golaszewski@linaro.org, cw00.choi@samsung.com,
myungjoo.ham@samsung.com, yesanishhere@gmail.com,
Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com,
aleksander.lobakin@intel.com, ira.weiny@intel.com,
andriy.shevchenko@linux.intel.com, leon@kernel.org,
lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org,
djeffery@redhat.com, stuart.w.hayes@gmail.com,
lennart@poettering.net, brauner@kernel.org,
linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org,
saeedm@nvidia.com, ajayachandra@nvidia.com, parav@nvidia.com,
leonro@nvidia.com, witu@nvidia.com
Subject: Re: [PATCH v3 00/30] Live Update Orchestrator
Date: Tue, 26 Aug 2025 15:02:13 +0000 [thread overview]
Message-ID: <CA+CK2bBrCd8t_BUeE-sVPGjsJwmtk3mCSVhTMGbseTi_Wk+4yQ@mail.gmail.com> (raw)
In-Reply-To: <20250826142406.GE1970008@nvidia.com>
On Tue, Aug 26, 2025 at 2:24 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Tue, Aug 26, 2025 at 01:54:31PM +0000, Pasha Tatashin wrote:
> > > > https://github.com/googleprodkernel/linux-liveupdate/tree/luo/v3
> > > >
> > > > Changelog from v2:
> > > > - Addressed comments from Mike Rapoport and Jason Gunthorpe
> > > > - Only one user agent (LiveupdateD) can open /dev/liveupdate
> > > > - With the above changes, sessions are not needed, and should be
> > > > maintained by the user-agent itself, so removed support for
> > > > sessions.
> > >
> > > If all the FDs are restored in the agent's context, this assigns all the
> > > resources to the agent. For example, if the agent restores a memfd, all
> > > the memory gets charged to the agent's cgroup, and the client gets none
> > > of it. This makes it impossible to do any kind of resource limits.
> > >
> > > This was one of the advantages of being able to pass around sessions
> > > instead of FDs. The agent can pass on the right session to the right
> > > client, and then the client does the restore, getting all the resources
> > > charged to it.
> > >
> > > If we don't allow this, I think we will make LUO/LiveupdateD unsuitable
> > > for many kinds of workloads. Do you have any ideas on how to do proper
> > > resource attribution with the current patches? If not, then perhaps we
> > > should reconsider this change?
> >
> > Hi Pratyush,
> >
> > That's an excellent point, and you're right that we must have a
> > solution for correct resource charging.
> >
> > I'd prefer to keep the session logic in the userspace agent (luod
> > https://tinyurl.com/luoddesign).
> >
> > For the charging problem, I believe there's a clear path forward with
> > the current ioctl-based API. The design of the ioctl commands (with a
> > size field in each struct) is intentionally extensible. In a follow-up
> > patch, we can extend the liveupdate_ioctl_fd_restore struct to include
> > a target pid field. The luod agent, would then be able to restore an
> > FD on behalf of a client and instruct the kernel to charge the
> > associated resources to that client's PID.
>
> This wasn't quite the idea though..
>
> The sessions sub FD were intended to be passed directly to other
> processes though unix sockets and fd passing so they could run their
> own ioctls in their own context for both save and restore. The ioctls
> available on the sessions should be specifically narrowed to be safe
> for this.
>
> I can understand not implementing session FDs in the first version,
> but when sessions FD are available they should work like this and
> solve the namespace/cgroup/etc issues.
>
> Passing some PID in an ioctl is not a great idea...
Hi Jason,
I'm trying to understand the drawbacks of the PID-based approach.
Could you elaborate on why passing a PID in the RESTORE_FD ioctl is
not a good idea?
From my perspective, luod would have a live, open socket to the client
process requesting the restore. It can use SO_PEERCRED to securely
identify the client's PID at that moment. The flow would be:
1. Client connects and resumes its session with luod.
2. Client requests to restore TOKEN_X.
3. luod verifies the client owns TOKEN_X for its session.
4. luod calls the RESTORE_FD ioctl, telling the kernel: "Please
restore TOKEN_X and charge the resources to PID Y (which I just
verified is on the other end of this socket)."
5. The kernel performs the action.
6. luod receives the new FD from the kernel and passes it back to the
client over the socket.
In this flow, the client isn't providing an arbitrary PID; the trusted
luod agent is providing the PID of a process it has an active
connection with.
The idea was to let luod handle the session/security story, and the
kernel handle the core preservation mechanism. Adding sessions to the
kernel, delegates the management and part of the security model into
the kernel. I am not sure if it is necessary, what can be cleanly
managed in userspace should stay in userspace.
Thanks,
Pasha
>
> Jason
next prev parent reply other threads:[~2025-08-26 15:02 UTC|newest]
Thread overview: 114+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-07 1:44 [PATCH v3 00/30] Live Update Orchestrator Pasha Tatashin
2025-08-07 1:44 ` [PATCH v3 01/30] kho: init new_physxa->phys_bits to fix lockdep Pasha Tatashin
2025-08-08 11:42 ` Pratyush Yadav
2025-08-08 11:52 ` Pratyush Yadav
2025-08-08 14:00 ` Pasha Tatashin
2025-08-08 19:06 ` Andrew Morton
2025-08-08 19:51 ` Pasha Tatashin
2025-08-08 20:19 ` Pasha Tatashin
2025-08-14 13:11 ` Jason Gunthorpe
2025-08-14 14:57 ` Pasha Tatashin
2025-08-07 1:44 ` [PATCH v3 02/30] kho: mm: Don't allow deferred struct page with KHO Pasha Tatashin
2025-08-08 11:47 ` Pratyush Yadav
2025-08-08 14:01 ` Pasha Tatashin
2025-08-07 1:44 ` [PATCH v3 03/30] kho: warn if KHO is disabled due to an error Pasha Tatashin
2025-08-08 11:48 ` Pratyush Yadav
2025-08-07 1:44 ` [PATCH v3 04/30] kho: allow to drive kho from within kernel Pasha Tatashin
2025-08-07 1:44 ` [PATCH v3 05/30] kho: make debugfs interface optional Pasha Tatashin
2025-08-07 1:44 ` [PATCH v3 06/30] kho: drop notifiers Pasha Tatashin
2025-08-07 1:44 ` [PATCH v3 07/30] kho: add interfaces to unpreserve folios and physical memory ranges Pasha Tatashin
2025-08-14 13:22 ` Jason Gunthorpe
2025-08-14 15:05 ` Pasha Tatashin
2025-08-14 17:01 ` Jason Gunthorpe
2025-08-15 9:12 ` Mike Rapoport
2025-08-18 13:55 ` Jason Gunthorpe
2025-08-07 1:44 ` [PATCH v3 08/30] kho: don't unpreserve memory during abort Pasha Tatashin
2025-08-14 13:30 ` Jason Gunthorpe
2025-08-07 1:44 ` [PATCH v3 09/30] liveupdate: kho: move to kernel/liveupdate Pasha Tatashin
2025-08-30 8:35 ` Mike Rapoport
2025-08-07 1:44 ` [PATCH v3 10/30] liveupdate: luo_core: luo_ioctl: Live Update Orchestrator Pasha Tatashin
2025-08-14 13:31 ` Jason Gunthorpe
2025-08-07 1:44 ` [PATCH v3 11/30] liveupdate: luo_core: integrate with KHO Pasha Tatashin
2025-08-07 1:44 ` [PATCH v3 12/30] liveupdate: luo_subsystems: add subsystem registration Pasha Tatashin
2025-08-07 1:44 ` [PATCH v3 13/30] liveupdate: luo_subsystems: implement subsystem callbacks Pasha Tatashin
2025-08-07 1:44 ` [PATCH v3 14/30] liveupdate: luo_files: add infrastructure for FDs Pasha Tatashin
2025-08-07 1:44 ` [PATCH v3 15/30] liveupdate: luo_files: implement file systems callbacks Pasha Tatashin
2025-08-07 1:44 ` [PATCH v3 16/30] liveupdate: luo_ioctl: add userpsace interface Pasha Tatashin
2025-08-14 13:49 ` Jason Gunthorpe
2025-08-07 1:44 ` [PATCH v3 17/30] liveupdate: luo_files: luo_ioctl: Unregister all FDs on device close Pasha Tatashin
2025-08-27 15:34 ` Pratyush Yadav
2025-08-07 1:44 ` [PATCH v3 18/30] liveupdate: luo_files: luo_ioctl: Add ioctls for per-file state management Pasha Tatashin
2025-08-14 14:02 ` Jason Gunthorpe
2025-08-07 1:44 ` [PATCH v3 19/30] liveupdate: luo_sysfs: add sysfs state monitoring Pasha Tatashin
2025-08-26 16:03 ` Jason Gunthorpe
2025-08-26 18:58 ` Pasha Tatashin
2025-08-07 1:44 ` [PATCH v3 20/30] reboot: call liveupdate_reboot() before kexec Pasha Tatashin
2025-08-07 1:44 ` [PATCH v3 21/30] kho: move kho debugfs directory to liveupdate Pasha Tatashin
2025-08-07 1:44 ` [PATCH v3 22/30] liveupdate: add selftests for subsystems un/registration Pasha Tatashin
2025-08-07 1:44 ` [PATCH v3 23/30] selftests/liveupdate: add subsystem/state tests Pasha Tatashin
2025-08-07 1:44 ` [PATCH v3 24/30] docs: add luo documentation Pasha Tatashin
2025-08-07 1:44 ` [PATCH v3 25/30] MAINTAINERS: add liveupdate entry Pasha Tatashin
2025-08-07 1:44 ` [PATCH v3 26/30] mm: shmem: use SHMEM_F_* flags instead of VM_* flags Pasha Tatashin
2025-08-11 23:11 ` Vipin Sharma
2025-08-13 12:42 ` Pratyush Yadav
2025-08-07 1:44 ` [PATCH v3 27/30] mm: shmem: allow freezing inode mapping Pasha Tatashin
2025-08-07 1:44 ` [PATCH v3 28/30] mm: shmem: export some functions to internal.h Pasha Tatashin
2025-08-07 1:44 ` [PATCH v3 29/30] luo: allow preserving memfd Pasha Tatashin
2025-08-08 20:22 ` Pasha Tatashin
2025-08-13 12:44 ` Pratyush Yadav
2025-08-13 6:34 ` Vipin Sharma
2025-08-13 7:09 ` Greg KH
2025-08-13 12:02 ` Pratyush Yadav
2025-08-13 12:14 ` Greg KH
2025-08-13 12:41 ` Jason Gunthorpe
2025-08-13 13:00 ` Greg KH
2025-08-13 13:37 ` Pratyush Yadav
2025-08-13 13:41 ` Pasha Tatashin
2025-08-13 13:53 ` Greg KH
2025-08-13 13:53 ` Greg KH
2025-08-13 20:03 ` Jason Gunthorpe
2025-08-13 13:31 ` Pratyush Yadav
2025-08-13 12:29 ` Pratyush Yadav
2025-08-13 13:49 ` Pasha Tatashin
2025-08-13 13:55 ` Pratyush Yadav
2025-08-26 16:20 ` Jason Gunthorpe
2025-08-27 15:03 ` Pratyush Yadav
2025-08-28 12:43 ` Jason Gunthorpe
2025-08-28 23:00 ` Chris Li
2025-09-01 17:10 ` Pratyush Yadav
2025-09-02 13:48 ` Jason Gunthorpe
2025-09-03 14:10 ` Pratyush Yadav
2025-09-03 15:01 ` Jason Gunthorpe
2025-08-28 7:14 ` Mike Rapoport
2025-08-29 18:47 ` Chris Li
2025-08-29 19:18 ` Chris Li
2025-09-02 13:41 ` Jason Gunthorpe
2025-09-03 12:01 ` Chris Li
2025-09-01 16:23 ` Mike Rapoport
2025-09-01 16:54 ` Pasha Tatashin
2025-09-01 17:21 ` Pratyush Yadav
2025-09-01 19:02 ` Pasha Tatashin
2025-09-02 11:38 ` Jason Gunthorpe
2025-09-03 15:59 ` Pasha Tatashin
2025-09-03 16:40 ` Jason Gunthorpe
2025-09-03 19:29 ` Mike Rapoport
2025-09-02 11:58 ` Mike Rapoport
2025-09-01 17:01 ` Pratyush Yadav
2025-09-02 11:44 ` Mike Rapoport
2025-09-03 14:17 ` Pratyush Yadav
2025-09-03 19:39 ` Mike Rapoport
2025-08-07 1:44 ` [PATCH v3 30/30] docs: add documentation for memfd preservation via LUO Pasha Tatashin
2025-08-08 12:07 ` [PATCH v3 00/30] Live Update Orchestrator David Hildenbrand
2025-08-08 12:24 ` Pratyush Yadav
2025-08-08 13:53 ` Pasha Tatashin
2025-08-08 13:52 ` Pasha Tatashin
2025-08-26 13:16 ` Pratyush Yadav
2025-08-26 13:54 ` Pasha Tatashin
2025-08-26 14:24 ` Jason Gunthorpe
2025-08-26 15:02 ` Pasha Tatashin [this message]
2025-08-26 15:13 ` Jason Gunthorpe
2025-08-26 16:10 ` Pasha Tatashin
2025-08-26 16:22 ` Jason Gunthorpe
2025-08-26 17:03 ` Pasha Tatashin
2025-08-26 17:08 ` Jason Gunthorpe
2025-08-27 14:01 ` Pratyush Yadav
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CA+CK2bBrCd8t_BUeE-sVPGjsJwmtk3mCSVhTMGbseTi_Wk+4yQ@mail.gmail.com \
--to=pasha.tatashin@soleen.com \
--cc=Jonathan.Cameron@huawei.com \
--cc=ajayachandra@nvidia.com \
--cc=akpm@linux-foundation.org \
--cc=aleksander.lobakin@intel.com \
--cc=aliceryhl@google.com \
--cc=andriy.shevchenko@linux.intel.com \
--cc=anna.schumaker@oracle.com \
--cc=axboe@kernel.dk \
--cc=bartosz.golaszewski@linaro.org \
--cc=bhelgaas@google.com \
--cc=bp@alien8.de \
--cc=brauner@kernel.org \
--cc=changyuanl@google.com \
--cc=chenridong@huawei.com \
--cc=corbet@lwn.net \
--cc=cw00.choi@samsung.com \
--cc=dakr@kernel.org \
--cc=dan.j.williams@intel.com \
--cc=dave.hansen@linux.intel.com \
--cc=david@redhat.com \
--cc=djeffery@redhat.com \
--cc=dmatlack@google.com \
--cc=graf@amazon.com \
--cc=gregkh@linuxfoundation.org \
--cc=hannes@cmpxchg.org \
--cc=hpa@zytor.com \
--cc=ilpo.jarvinen@linux.intel.com \
--cc=ira.weiny@intel.com \
--cc=jannh@google.com \
--cc=jasonmiu@google.com \
--cc=jgg@nvidia.com \
--cc=joel.granados@kernel.org \
--cc=kanie@linux.alibaba.com \
--cc=lennart@poettering.net \
--cc=leon@kernel.org \
--cc=leonro@nvidia.com \
--cc=linux-api@vger.kernel.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux@weissschuh.net \
--cc=lukas@wunner.de \
--cc=mark.rutland@arm.com \
--cc=masahiroy@kernel.org \
--cc=mingo@redhat.com \
--cc=mmaurer@google.com \
--cc=myungjoo.ham@samsung.com \
--cc=ojeda@kernel.org \
--cc=parav@nvidia.com \
--cc=pratyush@kernel.org \
--cc=quic_zijuhu@quicinc.com \
--cc=rafael@kernel.org \
--cc=rdunlap@infradead.org \
--cc=rientjes@google.com \
--cc=roman.gushchin@linux.dev \
--cc=rostedt@goodmis.org \
--cc=rppt@kernel.org \
--cc=saeedm@nvidia.com \
--cc=song@kernel.org \
--cc=stuart.w.hayes@gmail.com \
--cc=tglx@linutronix.de \
--cc=tj@kernel.org \
--cc=vincent.guittot@linaro.org \
--cc=wagi@kernel.org \
--cc=witu@nvidia.com \
--cc=x86@kernel.org \
--cc=yesanishhere@gmail.com \
--cc=yoann.congal@smile.fr \
--cc=zhangguopeng@kylinos.cn \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).