Re: deconflicting new syscall numbers for 6.11

From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>,
	jolsa@kernel.org, mhiramat@kernel.org, cgzones@googlemail.com,
	brauner@kernel.org, linux-kernel@vger.kernel.org, arnd@arndb.de,
	"Adhemerval Zanella Netto" <adhemerval.zanella@linaro.org>,
	"Zack Weinberg" <zack@owlfolio.org>,
	"Cristian Rodríguez" <cristian@rodriguez.im>,
	"Florian Weimer" <fweimer@redhat.com>,
	"Wilco Dijkstra" <Wilco.Dijkstra@arm.com>
Subject: Re: deconflicting new syscall numbers for 6.11
Date: Fri, 5 Jul 2024 21:14:03 -0400
Message-ID: <ZoiaWz9mG9rb0QND@localhost.localdomain>
In-Reply-To: <CAHk-=wiGk+1eNy4Vk6QsEgM=Ru3jE40qrDwgq_CSKgqwLgMdRg@mail.gmail.com>

On 04-Jul-2024 10:21:34 AM, Linus Torvalds wrote:
> On Thu, 4 Jul 2024 at 10:10, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> >
> > The three of us all have new syscalls planned for 6.11. Arnd suggested
> > that we coordinate to deconflict, to make the merge easier.
> 
> Nobody has explained to me what has changed since your last vdso
> getrandom, and I'm not planning on pulling it unless that fundamental
> flaw is fixed.
> 
> Why is this _so_ critical that it needs a vdso?
> 
> Why isn't user space just doing it itself?
> 
> What's so magical about this all?
> 
> This all seems entirely pointless to me still, because it's optimizing
> something that nobody seems to care about, adding new VM
> infrastructure, new magic system calls, yadda yadda.
> 
> I was very sceptical last time, and absolutely _nothing_ has changed.
> Not a peep on why it's now suddenly so hugely important again.
> 
> We don't add stuff "just because we can". We need to have a damn good
> reason for it. And I still don't see the reason, and I haven't seen
> anybody even trying to explain the reason.

[ Note: as I was writing this email, I noticed that you are heading
  towards the same conclusions I'm reaching on other sub-threads of this
  discussion. I'm still providing this feedback because it adds relevant
  information based on earlier discussions with libc developers. ]

Earlier this year, in March, I jumped into the discussion on the
libc-alpha mailing list to better understand the userspace RNG seeding
requirements. The interesting bits that explain how the kernel
can play an important role start here:

https://sourceware.org/pipermail/libc-alpha/2024-March/155534.html

From an absolutely-not-security-expert perspective, here is how I see
the desiderata breakdown:

- There appears to be a need to make sure the random seed is not exposed
  across fork, core dump and other similar scenarios. This can be
  achieved by simply letting userspace apply the appropriate madvise(2)
  advice values to a memory mapping created through mmap(2). I don't see
  why there would be any need to create any RNG-centric ABI for this. If
  new madvise(2) advice values are needed, they can simply be added there.

- There appears to be interest in having an RNG faster than a system call
  for various reasons I'm not familiar with. A vDSO appears to be one
  way to do this. Another way would be to let userspace implement it
  all, which raises the following question: what is the minimal state
  known only to the kernel and currently unknown to userspace? This
  brings me to the following point.

- Based on the libc-alpha discussion, I understand that the main thing
  the kernel knows about which is unknown to userspace is a sort-of
  generation counter, which tracks for instance the fact that the kernel
  was migrated to a different VM, or suspended and then resumed, and
  hence that the current seed should be discarded and re-seeded entirely.
  I suspect that this is the _key_ information missing from a purely
  userspace RNG today. I hinted at extending the
  rseq(2) ABI for that purpose: exposing a generation counter for the
  RNG in a thread area shared between kernel and user-space. The
  per-thread area is already there and the hard work of integrating it
  with libc is mostly complete. Another alternative would be, as you
  hint elsewhere in this thread
  (https://lore.kernel.org/lkml/CAHk-=wgqD9h0Eb-n94ZEuK9SugnkczXvX497X=OdACVEhsw5xQ@mail.gmail.com/)
  to create a vDSO to expose exactly this kind of generation counter.
  Given this is not a thread-specific thing, it might be a better
  approach than the rseq per-thread area.
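
To make the first point above concrete, here is a minimal sketch of
what I have in mind, using only existing mmap(2)/madvise(2) advice
values (MADV_WIPEONFORK and MADV_DONTDUMP). The helper names
seed_area_alloc() and seed_area_free() are made up for illustration,
and this only covers the fork and core dump cases, not every scenario
a libc would care about:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (name is an assumption, not an existing API):
 * allocate memory for RNG seed state that is zero-filled in fork
 * children and left out of core dumps. Returns NULL on failure.
 */
static void *seed_area_alloc(size_t len)
{
	void *p;

	assert(len > 0);

	/* MADV_WIPEONFORK only applies to private anonymous mappings. */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

	/*
	 * MADV_WIPEONFORK: the child sees zero-filled pages after fork.
	 * MADV_DONTDUMP: the range is excluded from core dumps.
	 * Treat a failure of either as fatal for this area, so the
	 * caller never stores a seed in unprotected memory.
	 */
	if (madvise(p, len, MADV_WIPEONFORK) != 0 ||
	    madvise(p, len, MADV_DONTDUMP) != 0) {
		munmap(p, len);
		return NULL;
	}
	return p;
}

/* Release an area obtained from seed_area_alloc(). */
static int seed_area_free(void *p, size_t len)
{
	return munmap(p, len);
}
```

A caller would keep its seed in the returned area and treat an
all-zero area after fork as "must reseed before use".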

So either I'm missing something important (please enlighten me), or we
could achieve all those end-goals with a small fraction of the ABI
complexity introduced by the vDSO as it is initially proposed.
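
As a rough illustration of how little userspace would need, here is a
sketch of a userspace consumer of such a generation counter. Everything
about the counter is an assumption: no such ABI exists in what I
describe above, so it is simulated here with a plain atomic variable
(fake_kernel_rng_generation), and the struct and function names are
invented. To avoid hand-rolling a cipher, the sketch simply hands out
bytes from a small pool filled by getrandom(2), and throws the pool
away whenever the generation changes:

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/random.h>
#include <sys/types.h>

/*
 * ASSUMPTION: stand-in for a counter the kernel would expose (through
 * rseq or a vDSO data page) and bump on VM migration or resume.
 * Value 0 is reserved here to mean "never seeded".
 */
static _Atomic uint64_t fake_kernel_rng_generation = 1;

struct user_rng {
	uint64_t seen_generation;	/* generation the pool was filled at */
	size_t used;			/* bytes already handed out */
	unsigned char pool[256];	/* bytes obtained from getrandom(2) */
};

/* Refill the whole pool from the kernel and record the generation. */
static int user_rng_refill(struct user_rng *r, uint64_t gen)
{
	size_t off = 0;

	while (off < sizeof(r->pool)) {
		ssize_t n = getrandom(r->pool + off, sizeof(r->pool) - off, 0);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		off += (size_t)n;
	}
	r->used = 0;
	r->seen_generation = gen;
	return 0;
}

/* Copy len random bytes to out; returns 0 on success, -1 on error. */
static int user_rng_bytes(struct user_rng *r, void *out, size_t len)
{
	if (len > sizeof(r->pool))
		return -1;

	for (;;) {
		uint64_t gen = atomic_load_explicit(&fake_kernel_rng_generation,
						    memory_order_acquire);

		/* Stale generation or not enough bytes left: start over. */
		if (gen != r->seen_generation ||
		    len > sizeof(r->pool) - r->used) {
			if (user_rng_refill(r, gen) != 0)
				return -1;
		}

		memcpy(out, r->pool + r->used, len);
		/* Do not keep a copy of bytes that were handed out. */
		memset(r->pool + r->used, 0, len);
		r->used += len;

		/* If the generation moved while copying, retry. */
		if (atomic_load_explicit(&fake_kernel_rng_generation,
					 memory_order_acquire) == gen)
			return 0;
	}
}
```

A zero-initialized struct user_rng is valid: its seen_generation of 0
never matches the counter, so the first call always fills the pool.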

I don't think that just because there happen to be bad userspace RNG
implementations out there we should give up on userspace and maintain
all this complexity in the kernel. This is just working around userspace
ecosystem issues by moving the implementation and maintenance burden
into the kernel.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
