From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>,
jolsa@kernel.org, mhiramat@kernel.org, cgzones@googlemail.com,
brauner@kernel.org, linux-kernel@vger.kernel.org, arnd@arndb.de,
"Adhemerval Zanella Netto" <adhemerval.zanella@linaro.org>,
"Zack Weinberg" <zack@owlfolio.org>,
"Cristian Rodríguez" <cristian@rodriguez.im>,
"Florian Weimer" <fweimer@redhat.com>,
"Wilco Dijkstra" <Wilco.Dijkstra@arm.com>
Subject: Re: deconflicting new syscall numbers for 6.11
Date: Fri, 5 Jul 2024 21:14:03 -0400 [thread overview]
Message-ID: <ZoiaWz9mG9rb0QND@localhost.localdomain> (raw)
In-Reply-To: <CAHk-=wiGk+1eNy4Vk6QsEgM=Ru3jE40qrDwgq_CSKgqwLgMdRg@mail.gmail.com>
On 04-Jul-2024 10:21:34 AM, Linus Torvalds wrote:
> On Thu, 4 Jul 2024 at 10:10, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> >
> > The three of us all have new syscalls planned for 6.11. Arnd suggested
> > that we coordinate to deconflict, to make the merge easier.
>
> Nobody has explained to me what has changed since your last vdso
> getrandom, and I'm not planning on pulling it unless that fundamental
> flaw is fixed.
>
> Why is this _so_ critical that it needs a vdso?
>
> Why isn't user space just doing it itself?
>
> What's so magical about this all?
>
> This all seems entirely pointless to me still, because it's optimizing
> something that nobody seems to care about, adding new VM
> infrastructure, new magic system calls, yadda yadda.
>
> I was very sceptical last time, and absolutely _nothing_ has changed.
> Not a peep on why it's now suddenly so hugely important again.
>
> We don't add stuff "just because we can". We need to have a damn good
> reason for it. And I still don't see the reason, and I haven't seen
> anybody even trying to explain the reason.
[ Note: as I wrote down this email, I notice that you are heading
towards the same conclusions I'm reaching on other sub-threads of this
discussion. But I'm providing this feedback because it adds relevant
information based on earlier discussions with libc developers. ]
Earlier this year in March, I've jumped into the discussion on the
libc-alpha mailing list to understand the userspace RNG seeding
requirements better. The interesting bits that explain how the kernel
can play an important role start here:
https://sourceware.org/pipermail/libc-alpha/2024-March/155534.html
From an absolutely-not-security-expert perspective, here is how I see
the desiderata breakdown:
- There appears to be a need to make sure the random seed is not exposed
across fork, core dump and other similar scenarios. This can be
achieved by simply letting userspace use the appropriate madvise(2)
advices on a memory mapping created through mmap(2). I don't see why
there would be any need to create any RNG-centric ABI for this. If
new madvise(2) advices are needed, they can simply be added there.
- There appears to be interest in having a RNG faster than a system call
for various reasons I'm not familiar with. A vDSO appears to be one
way to do this. Another way would be to let userspace implement it
all, which raises the following question: what is the minimal state
known only by the kernel currently unknown from userspace ? This
brings the following point.
- Based on the libc-alpha discussion, I understand that the main thing
the kernel knows about which is unknown from userspace is a sort-of
generation counter, which tracks for instance the fact that the kernel
was migrated to a different VM, or suspended and then resumed, and
hence the current seed should be discarded and re-seeded entirely.
I suspect that is the _key_ information that is currently missing from
a purely userspace RNG perspective today. I hinted at extending the
rseq(2) ABI for that purpose: exposing a generation counter for the
RNG in a thread area shared between kernel and user-space. The
per-thread area is already there and the hard work of integrating it
with libc is mostly complete. Another alternative would be, as you
hint elsewhere in this thread
(https://lore.kernel.org/lkml/CAHk-=wgqD9h0Eb-n94ZEuK9SugnkczXvX497X=OdACVEhsw5xQ@mail.gmail.com/)
to create a vDSO to expose exactly this kind of generation counter.
Given this is not a thread-specific thing, it might be a better
approach that the rseq per-thread area.
So either I'm missing something important (please enlighten me), or we
could achieve all those end-goals with a small fraction of the ABI
complexity introduced by the vDSO as it is initially proposed.
I don't think that just because there happens to be bad userspace RNG
implementations out there we should give up on userspace and maintain
this all complexity in the kernel. This is just working around userspace
ecosystem issues by moving the implementation and maintainance burden
into the kernel.
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
next prev parent reply other threads:[~2024-07-06 1:14 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-04 17:10 deconflicting new syscall numbers for 6.11 Jason A. Donenfeld
2024-07-04 17:21 ` Linus Torvalds
2024-07-04 17:33 ` Linus Torvalds
2024-07-04 17:47 ` Linus Torvalds
2024-07-04 17:51 ` Jason A. Donenfeld
2024-07-04 17:46 ` Jason A. Donenfeld
2024-07-04 17:55 ` Linus Torvalds
2024-07-04 18:04 ` Jason A. Donenfeld
2024-07-04 18:18 ` Linus Torvalds
2024-07-04 18:35 ` Linus Torvalds
2024-07-04 18:46 ` Jason A. Donenfeld
2024-07-04 18:52 ` Linus Torvalds
2024-07-04 18:57 ` Jason A. Donenfeld
2024-07-04 19:19 ` Linus Torvalds
2024-07-04 21:07 ` Linus Torvalds
2024-07-04 21:44 ` Arnd Bergmann
2024-07-04 22:07 ` Linus Torvalds
2024-07-05 8:32 ` Arnd Bergmann
2024-07-05 16:59 ` Linus Torvalds
2024-07-05 16:18 ` Jason A. Donenfeld
2024-07-05 17:39 ` Linus Torvalds
2024-07-05 17:53 ` Jason A. Donenfeld
2024-07-05 18:08 ` Linus Torvalds
2024-07-05 18:56 ` Jason A. Donenfeld
2024-07-05 19:21 ` Linus Torvalds
2024-07-05 19:46 ` Linus Torvalds
2024-07-06 0:11 ` Jason A. Donenfeld
2024-07-06 2:10 ` Jason A. Donenfeld
2024-07-06 2:56 ` Linus Torvalds
2024-07-06 23:26 ` Jason A. Donenfeld
2024-07-07 16:56 ` Russell Haley
2024-07-04 18:36 ` Jason A. Donenfeld
2024-07-04 18:44 ` Willy Tarreau
2024-07-05 7:01 ` Matthias Urlichs
2024-07-06 1:14 ` Mathieu Desnoyers [this message]
2024-07-06 10:01 ` Florian Weimer
2024-07-06 14:34 ` Zack Weinberg
2024-07-06 15:30 ` Florian Weimer
2024-07-07 20:57 ` Adhemerval Zanella Netto
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZoiaWz9mG9rb0QND@localhost.localdomain \
--to=mathieu.desnoyers@efficios.com \
--cc=Jason@zx2c4.com \
--cc=Wilco.Dijkstra@arm.com \
--cc=adhemerval.zanella@linaro.org \
--cc=arnd@arndb.de \
--cc=brauner@kernel.org \
--cc=cgzones@googlemail.com \
--cc=cristian@rodriguez.im \
--cc=fweimer@redhat.com \
--cc=jolsa@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mhiramat@kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=zack@owlfolio.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox