public inbox for linux-kernel@vger.kernel.org
From: Thomas Gleixner <tglx@linutronix.de>
To: Florian Weimer <fweimer@redhat.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	mathieu.desnoyers@efficios.com, peterz@infradead.org,
	boqun.feng@gmail.com, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, hpa@zytor.com,
	aruna.ramakrishna@oracle.com, elver@google.com,
	"Paul E. McKenney" <paulmck@kernel.org>,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	Jens Axboe <axboe@kernel.dk>
Subject: Re: [PATCH v7 3/4] rseq: Make rseq work with protection keys
Date: Wed, 26 Nov 2025 21:52:44 +0100
Message-ID: <873460h5yb.ffs@tglx>
In-Reply-To: <lhuy0ns3971.fsf@oldenburg.str.redhat.com>

On Wed, Nov 26 2025 at 20:06, Florian Weimer wrote:
> * Thomas Gleixner:
>> But like with signals just blindly enabling key0 and hope that it works
>> is not really a solution. Nothing prevents me from disabling RSEQ for
>> glibc. Then install my own RSEQ page and mprotect it. When that key
>> becomes disabled in PKRU and the code section is interrupted then exit
>> to user space will fault and die in exactly the same way as
>> today. That's progress...
>
> But does that matter?  If I mprotect the stack and a signal arrives,
> that results in a crash, too.  Some things just don't work.

They can be made to work when we have a dedicated permission setting for
signals, which can be used for rseq access too. And having explicit
signal permissions makes a lot of sense independent of the above absurd
use case, which I just used for illustration.

>> So we really need to sit down and actually define a proper programming
>> model first instead of trying to duct tape the current ill defined mess
>> forever.
>>
>> What do we have to take into account:
>>
>>    1) signals
>>
>>       Broken as we know already.
>>
>>       IMO, the proper solution is to provide a mechanism to register a
>>       set of permissions which are used for signal delivery. The
>>       resulting hardware value should expand the permission, but keep
>>       the current active ones enabled.
>>
>>       That can be kinda kept backwards compatible as the signal perms
>>       would default to PKEY0.
>
> I had validated at one point that this works (although the patch that
> enables internal pkeys usage in glibc did not exist back then).
>
>   pkeys: Support setting access rights for signal handlers
>   <https://lore.kernel.org/linux-mm/5fee976a-42d4-d469-7058-b78ad8897219@redhat.com/>

That looks about right and is what I had in mind. Seems I missed that back
in the day and that discussion unfortunately ran into a dead end :(

>>    2) rseq
>>
>>       The option of having a separate key which needs to be always
>>       enabled is definitely simple, but it wastes a key just for
>>       that. There are only 16 of them :(
>>
>>       If we solve the signal case with an explicit permission set, we
>>       can just reuse those signal permissions. They are maybe wider than
>>       what's required to access RSEQ, but the signal permissions have to
>>       include the TLS/RSEQ area to actually work.
>
> Would it address the use case for single-colored memory access?  Or
> would that still crash if the process gets descheduled while the access
> rights register is set to the restricted value?

It would just work the same way as signals. Assume

         signal_perms = [PK0=RW, PK1=R, PK2=RW]

         set_pkey(PK0..6=NONE, PK7=R)

         access()              <- can fault
                               <- or interrupt can happen

         set_pkey(normal)

So when the fault or interrupt results in a signal, and/or the return to
user space needs to access RSEQ, we have in signal delivery:

         cur = pkey_extend(signal_perms);

--> Perms are now [PK0=RW, PK1=R, PK2=RW, PK7=R]

         access_user_stack();
         ....
         // Return with the extended permissions to deliver the signal
         // Will be restored on sigreturn

and in rseq:

         cur = pkey_extend(signal_perms);

--> Perms are now [PK0=RW, PK1=R, PK2=RW, PK7=R]

         access_user_rseq();
         pkey_set(cur);

If the RSEQ access is nested in the signal delivery return then nothing
happens as the permissions are not changing because they are already
extended: A | A = A :).
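
The extend/restore sequence above can be modeled in a few lines of user-space
C. This is only a sketch under stated assumptions, not kernel code: the
two-bits-per-key encoding, the thread_perms variable standing in for the
per-thread permission register, and the helper pk() are made up for
illustration. pkey_extend() and pkey_set() merely mirror the names used in
the pseudocode.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: 16 keys, two permission bits per key. */
#define PK_NONE 0u
#define PK_R    1u
#define PK_W    2u
#define PK_RW   (PK_R | PK_W)

typedef uint32_t pkey_perms_t;

/* Build a permission word granting 'rights' on 'key'. */
static pkey_perms_t pk(unsigned key, unsigned rights)
{
	return (pkey_perms_t)rights << (2 * key);
}

/* Stand-in for the per-thread permission register. */
static pkey_perms_t thread_perms;

/* Widen the active permissions by 'extra' and return the previous value. */
static pkey_perms_t pkey_extend(pkey_perms_t extra)
{
	pkey_perms_t prev = thread_perms;

	thread_perms = prev | extra;
	return prev;
}

static void pkey_set(pkey_perms_t perms)
{
	thread_perms = perms;
}

/* Walks through the scenario from the mail with assertions at each step. */
static void pkey_model_selftest(void)
{
	const pkey_perms_t signal_perms =
		pk(0, PK_RW) | pk(1, PK_R) | pk(2, PK_RW);
	const pkey_perms_t restricted = pk(7, PK_R);
	const pkey_perms_t extended = signal_perms | restricted;

	pkey_set(restricted);		/* set_pkey(PK0..6=NONE, PK7=R) */

	/* Signal delivery: extend, keep PK7=R enabled. */
	pkey_perms_t sig_saved = pkey_extend(signal_perms);
	assert(thread_perms == extended);

	/* Nested rseq access: extending again changes nothing (A | A = A). */
	pkey_perms_t rseq_saved = pkey_extend(signal_perms);
	assert(thread_perms == extended);
	pkey_set(rseq_saved);
	assert(thread_perms == extended);

	/* sigreturn restores the restricted value. */
	pkey_set(sig_saved);
	assert(thread_perms == restricted);
}
```

Calling pkey_model_selftest() from a test program runs the whole sequence.
The point of the model is that extension is a plain union, so nesting the
rseq access inside signal delivery is harmless.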

The kernel does not care about the PKEY permissions when the user to
kernel transition is due to an interrupt/exception except for the signal
and rseq case.

In fact the above also works with my made up example. Just assume the
RSEQ page is protected by PK2. :)
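
To illustrate the made-up example in hardware terms, here is a sketch using
the x86 PKRU layout, where each key has an access-disable bit (bit 2*key)
and a write-disable bit (bit 2*key + 1). Because the register holds disable
bits, extending permissions becomes a bitwise AND. The helper names
(pkru_extend(), pkru_can_read() and so on) are hypothetical and exist only
for this example. Other architectures encode permissions differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PKRU_AD(key)	(UINT32_C(1) << (2 * (key)))	/* access disable */
#define PKRU_WD(key)	(UINT32_C(2) << (2 * (key)))	/* write disable */
#define PKRU_DENY_ALL	UINT32_C(0xffffffff)

/* A right is granted if either value grants it: AND the disable bits. */
static uint32_t pkru_extend(uint32_t cur, uint32_t extra)
{
	return cur & extra;
}

static uint32_t pkru_allow_read(uint32_t pkru, unsigned key)
{
	return pkru & ~PKRU_AD(key);
}

static uint32_t pkru_allow_rw(uint32_t pkru, unsigned key)
{
	return pkru & ~(PKRU_AD(key) | PKRU_WD(key));
}

static bool pkru_can_read(uint32_t pkru, unsigned key)
{
	return !(pkru & PKRU_AD(key));
}

static bool pkru_can_write(uint32_t pkru, unsigned key)
{
	return !(pkru & (PKRU_AD(key) | PKRU_WD(key)));
}

/* signal_perms = [PK0=RW, PK1=R, PK2=RW] */
static uint32_t example_signal_pkru(void)
{
	uint32_t v = PKRU_DENY_ALL;

	v = pkru_allow_rw(v, 0);
	v = pkru_allow_read(v, 1);
	v = pkru_allow_rw(v, 2);
	return v;
}

/* set_pkey(PK0..6=NONE, PK7=R) */
static uint32_t example_restricted_pkru(void)
{
	return pkru_allow_read(PKRU_DENY_ALL, 7);
}
```

With the RSEQ page tagged with PK2, pkru_can_write(example_restricted_pkru(), 2)
is false, which is the case that faults today. After
pkru_extend(example_restricted_pkru(), example_signal_pkru()) the same check
succeeds, while PK7 stays read-only.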

Syscalls are a different story as copy_to/from_user() obviously requires
the proper permissions and the kernel can rightfully expect that stack
and rseq are accessible, but that's not what we are debating here.

Thanks,

        tglx
