From: Andrei Vagin <avagin@google.com>
To: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
Cc: "Aleksandr Mikhalitsyn" <aleksandr.mikhalitsyn@canonical.com>,
"Cristian Rodríguez" <cristian@rodriguez.im>,
"Florian Weimer" <fweimer@redhat.com>,
libc-alpha@sourceware.org, "Jeff Xu" <jeffxu@google.com>,
"H . J . Lu" <hjl.tools@gmail.com>,
rppt@kernel.org, 0x7f454c46@gmail.com, criu@lists.linux.dev
Subject: Re: [PATCH v8 0/8] Add support for memory sealing
Date: Fri, 7 Feb 2025 00:47:31 +0000
Message-ID: <Z6VYI9JMIk1h_Jx4@google.com>
In-Reply-To: <b7d4a2c8-70da-4eaa-8905-4def3eacae37@linaro.org>

On Thu, Feb 06, 2025 at 04:47:32PM -0300, Adhemerval Zanella Netto wrote:
>
>
> On 06/02/25 15:03, Aleksandr Mikhalitsyn wrote:
> > On Thu, Feb 6, 2025 at 3:25 PM Adhemerval Zanella Netto
> > <adhemerval.zanella@linaro.org> wrote:
> >>
> >>
> >>
> >> On 06/02/25 06:15, Andrei Vagin wrote:
> >>> On Mon, Feb 03, 2025 at 11:11:56PM -0300, Cristian Rodríguez wrote:
> >>>> On Mon, Feb 3, 2025 at 4:40 PM Florian Weimer <fweimer@redhat.com> wrote:
> >>>>>
> >>>>> * Adhemerval Zanella Netto:
> >>>>>
> >>>>>>> CRIU needs to be able to unmap everything that was initially loaded by
> >>>>>>> the kernel and glibc. This will stop working if we use mseal for glibc
> >>>>>>> itself.
> >>>>>>
> >>>>>> So in this case the easiest way is to filter out mseal (with seccomp or
> >>>>>> something related) and disable sealing. I don't have an easy solution.
> >>>>>
> >>>>> Please test with CRIU and trace and find a way to make them work again
> >>>>> if they are broken.
> >>>>
> >>>> that is a kernel problem afaik..
> >>>
> >>> Could you please provide more details on why you think that is a
> >>> kernel issue?
> >>>
> >>> btw: this reminds me of another discussion about mseal on lkml:
> >>> https://lore.kernel.org/lkml/htdv44tqzi4jl2b7dwutsdwnh4tgrxq6xdvumi5wwu3hnh7sgw@tfwlal74ukx6/
> >>>
> >>>> .why libc has to care about this limitation ?
> >>>
> >>> CRIU has worked with glibc for many years... It's not just about CRIU;
> >>> other projects, such as gVisor and UML, are also likely to be affected.
> >>
> >> The current proposal is an opt-in feature, but also without a way to disable it
> >> (similar to how RELRO is enabled).
> >>
> >> I don't have much experience with how CRIU or gVisor work internally, but if
> >> either needs to change any metadata (munmap, mprotect) of the PT_LOAD ELF
> >> segments after startup, this basically defeats the whole idea of the memory
> >> sealing hardening.
> >>
> >> I don't see a way to support both semantics without some extra kernel support,
> >> where either you can mark some process with extra credentials to do the
> >> required VMA operations (like process_madvise, etc.) or disable sealing during
> >> the snapshot.
> >>
> >> The mseal usage idea was primarily for program loaders, similar to
> >> mimmutable on OpenBSD; but it seems that some programs also intend to
> >> use the syscall directly for some internal hardening (like Chrome). How
> >> would CRIU/gVisor handle such scenarios?
> >
> > Dear friends,
> >
> > I've quickly read a patchset [PATCH v8 0/8] Add support for memory
> > sealing (https://sourceware.org/pipermail/libc-alpha/2025-January/164361.html)
> > and noticed that on
> > https://sourceware.org/pipermail/libc-alpha/2025-January/164368.html
> > it's said:
> >> The GNU_PROPERTY_MEMORY_SEAL enforcement depends on whether the kernel
> >> supports the mseal syscall and how glibc is configured. On the default
> >> configuration that aims to support older kernel releases, the memory
> >> sealing attribute is taken as a hint. If glibc is configured with a
> >> minimum kernel of 6.10, where mseal is implied to be supported,
> >> sealing is enforced.
> >
> > => if I understand it right, it enables memory sealing by
> > default if the kernel supports it even without a linker flag, right?
> >
> > I don't really understand what "glibc is configured with a minimum
> > kernel of 6.10" means from the user perspective.
> > I'm not very familiar with glibc internals, so can somebody shed some
> > light on this, please?
>
> glibc has a minimum supported kernel version of 3.2; but some
> architectures override it (either because the ABI was added in newer
> versions, or due to some other reason).
>
> We also have an option where you can build glibc assuming it will
> always run on a specific kernel version (--enable-kernel=x.y.z). In
> previous releases we enforced this by checking the kernel version at loading
> time, but currently glibc only uses it to assume that certain syscalls are
> always present (so there is no need to use fallbacks or handle ENOSYS).
>
> So if you build glibc with --enable-kernel=6.10 it means that mseal
> is expected to be always usable, ENOSYS is not possible, and thus any
> syscall failure is expected to be an error (assuming that we are passing
> valid arguments).
>
> If --enable-kernel is not used, it means that glibc can run on a kernel
> without mseal, and thus memory sealing can not be applied (we still might
> enforce it, but I think since we do have a way to enforce with
> --enable-kernel there is no urgent need for it).
>
> In any case, memory sealing will be only applied in the presence
> of GNU_PROPERTY_MEMORY_SEAL.

But this flag is considered for a binary and each of its libraries
separately. If libc is built with GNU_PROPERTY_MEMORY_SEAL, every binary
that loads this libc will have sealed mappings, regardless of whether
the binary itself has the flag.

I compiled glibc with the patches and performed a simple experiment:
```
[root@bc2868439161 install]# cat test.c
int main() {
    return 0;
}
[root@bc2868439161 install]# gcc -Wl,-dynamic-linker,/mnt/glibc/install/lib/ld-linux-x86-64.so.2 -Wl,-z,nomemory-seal test.c
[root@bc2868439161 install]# strace -e mseal,openat,mmap ./a.out
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fda54b59000
mseal(0x7fda54b59000, 8192, 0) = 0
openat(AT_FDCWD, "/mnt/glibc/install/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
mmap(NULL, 2001, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fda54b58000
openat(AT_FDCWD, "/mnt/glibc/install/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
mmap(NULL, 1998928, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fda5496f000
mmap(0x7fda54ace000, 483328, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x15f000) = 0x7fda54ace000
mmap(0x7fda54b44000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1d5000) = 0x7fda54b44000
mmap(0x7fda54b4a000, 53328, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fda54b4a000
mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fda5496c000
mseal(0x7fda5496c000, 12288, 0) = 0
mseal(0x7fda5496f000, 1998928, 0) = 0
mseal(0x7fda54b61000, 163665, 0) = 0
mseal(0x7fda54b89000, 45544, 0) = 0
mseal(0x7fda54b95000, 13096, 0) = 0
+++ exited with 0 +++
```

The test binary was built without the GNU_PROPERTY_MEMORY_SEAL flag
(-z nomemory-seal), yet all glibc mappings have been sealed. The
initial mapping is sealed even before libc.so is loaded, likely because
ld.so also carries the GNU_PROPERTY_MEMORY_SEAL flag.

To operate, CRIU must be able to unmap all of its own mappings, which is
essential for restoring process address spaces. This means CRIU has to
be built so that its process has no sealed mappings at all.
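
To illustrate why a sealed mapping is a problem here, below is a minimal
sketch (my own illustration, not CRIU code). It seals one anonymous page
and then tries to unmap it. The function name is made up, and the
fallback syscall number 462 is an assumption for toolchains whose
headers do not define SYS_mseal yet.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: 462 is the mseal syscall number on x86_64 and in the
 * asm-generic table; older headers may not define SYS_mseal. */
#ifndef SYS_mseal
#define SYS_mseal 462
#endif

/*
 * Seal one anonymous page, then try to unmap it.
 * Returns 1 if munmap() was refused with EPERM, 0 if it was not,
 * and -1 if the page could not be mapped or sealed (for example
 * ENOSYS on a kernel without mseal).
 */
int unmap_refused_after_seal(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return -1;

    if (syscall(SYS_mseal, p, len, 0UL) != 0) {
        munmap(p, len); /* not sealed, so this cleanup still works */
        return -1;
    }

    /* The page is sealed now and stays mapped until the process exits. */
    if (munmap(p, len) == -1 && errno == EPERM)
        return 1;
    return 0;
}
```

On a kernel with mseal support I would expect this to return 1, that is,
munmap() fails with EPERM. A restorer or stub process would hit the same
error for every sealed range it inherits.
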
The same requirement applies to gVisor and UML, which both use stub
processes to manage guest address spaces. Basically, the main process
forks a new process, unmaps all existing mappings in the forked process,
and then populates it with guest mappings.
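
As a rough sketch of the first thing such a stub has to do, the code
below enumerates the mappings of the current process from
/proc/self/maps; every range it finds would need a munmap() call that
must succeed. The helper names are hypothetical, and this is not how
gVisor or UML actually implement it.

```c
#include <stdio.h>

/*
 * Parse the address range at the start of one /proc/<pid>/maps line,
 * e.g. "7fda5496f000-7fda54ace000 r-xp ...".
 * Returns 0 on success and -1 if the line does not start with a range.
 */
int parse_maps_range(const char *line, unsigned long long *start,
                     unsigned long long *end)
{
    if (sscanf(line, "%llx-%llx", start, end) != 2)
        return -1;
    return (*start < *end) ? 0 : -1;
}

/*
 * Count the mappings of the calling process. Each one is a range that a
 * stub process would have to munmap() before installing guest mappings.
 * Returns -1 if /proc/self/maps cannot be opened.
 */
long count_self_mappings(void)
{
    char line[4096];
    unsigned long long start, end;
    long n = 0;
    FILE *f = fopen("/proc/self/maps", "r");

    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f))
        if (parse_maps_range(line, &start, &end) == 0)
            n++;
    fclose(f);
    return n;
}
```

If even one of the counted ranges is sealed, the stub cannot be emptied
and the approach breaks down.
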
>
> >
> > I can't see how this can break the CRIU dump for us (I believe it
> > shouldn't but still worth checking), but for CRIU restore it's
> > definitely a problem
> > and reminds me of the rseq()&CRIU story we had a few years ago. My
> > current understanding is:
> >
> > *during CRIU restore*
> > 0. somehow disable mseal for CRIU binary itself, to make sure that
> > when CRIU does clone() we don't get any mappings sealed
> > 1. restore all memory mappings of the restorable process without
> > mseal() applied to them
> > 2. at the later criu restore stage go over them and apply mseal()
> >
> > I have a bad feeling that I still miss something, but even step 0 is a
> > problem right now if we go with the current approach from this
> > patch series, isn't it?
>
> I am not familiar with how CRIU snapshot/restore is done, and who is
> responsible for each step. Is the kernel involved in any dump step,
> meaning that you need either to start the process with some IPC, or is it
> just done in userland (with ptrace or other way to stop the process
> plus reading /proc/mem)?

It is done in userland. CRIU uses ptrace and /proc, and even injects a
small piece of binary code into the target process to collect all the
information required to restore the process to the same state later.
>
> And on restore, how is this accomplished?

The actual process is more complicated, but for a basic understanding it
involves the following steps: fork a new process; restore all mappings;
unmap all CRIU mappings; remap the restored mappings to the correct
addresses; and finally, resume the process.
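
The remap step could be sketched as follows, assuming mremap() with
MREMAP_FIXED is used to move a staged mapping to its final address. The
function names are hypothetical and the target address is a placeholder
mapping rather than one taken from an image; real CRIU code is
considerably more involved.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Move the mapping at `old` (len bytes) to exactly `target`.
 * MREMAP_FIXED unmaps whatever currently occupies the target range.
 * Returns the new address, or MAP_FAILED on error.
 */
void *move_mapping(void *old, size_t len, void *target)
{
    return mremap(old, len, len, MREMAP_MAYMOVE | MREMAP_FIXED, target);
}

/*
 * Stage one page at a kernel-chosen address, then move it to its
 * "final" address. Returns 0 if the data arrived intact at the target.
 */
int demo_restore_one_page(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    char *staged = mmap(NULL, len, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    /* Placeholder for the address recorded in a checkpoint image. */
    char *target = mmap(NULL, len, PROT_NONE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    char *moved;
    int ok;

    if (staged == MAP_FAILED || target == MAP_FAILED)
        return -1;

    staged[0] = 42; /* stands in for restored page contents */
    moved = move_mapping(staged, len, target);
    if (moved != target)
        return -1;

    ok = (moved[0] == 42);
    munmap(moved, len);
    return ok ? 0 : -1;
}
```

Because MREMAP_FIXED has to replace whatever occupies the target range,
as far as I understand this call also fails with EPERM when either range
is sealed.
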
Thanks,
Andrei