Re: [PATCH v7 3/4] rseq: Make rseq work with protection keys


From: Thomas Gleixner <tglx@linutronix.de>
To: Florian Weimer <fweimer@redhat.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	mathieu.desnoyers@efficios.com, peterz@infradead.org,
	boqun.feng@gmail.com, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, hpa@zytor.com,
	aruna.ramakrishna@oracle.com, elver@google.com,
	"Paul E. McKenney" <paulmck@kernel.org>,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	Jens Axboe <axboe@kernel.dk>
Subject: Re: [PATCH v7 3/4] rseq: Make rseq work with protection keys
Date: Wed, 26 Nov 2025 21:52:44 +0100
Message-ID: <873460h5yb.ffs@tglx>
In-Reply-To: <lhuy0ns3971.fsf@oldenburg.str.redhat.com>

On Wed, Nov 26 2025 at 20:06, Florian Weimer wrote:
> * Thomas Gleixner:
>> But like with signals just blindly enabling key0 and hope that it works
>> is not really a solution. Nothing prevents me from disabling RSEQ for
>> glibc. Then install my own RSEQ page and mprotect it. When that key
>> becomes disabled in PKRU and the code section is interrupted then exit
>> to user space will fault and die in exactly the same way as
>> today. That's progress...
>
> But does that matter?  If I mprotect the stack and a signal arrives,
> that results in a crash, too.  Some things just don't work.

They can be made to work when we have a dedicated permission setting for
signals, which can be used for rseq access too. And having explicit
signal permissions makes a lot of sense independent of the above absurd
use case, which I just used for illustration.

>> So we really need to sit down and actually define a proper programming
>> model first instead of trying to duct tape the current ill defined mess
>> forever.
>>
>> What do we have to take into account:
>>
>>    1) signals
>>
>>       Broken as we know already.
>>
>>       IMO, the proper solution is to provide a mechanism to register a
>>       set of permissions which are used for signal delivery. The
>>       resulting hardware value should expand the permission, but keep
>>       the current active ones enabled.
>>
>>       That can be kinda kept backwards compatible as the signal perms
>>       would default to PKEY0.
>
> I had validated at one point that this works (although the patch that
> enables internal pkeys usage in glibc did not exist back then).
>
>   pkeys: Support setting access rights for signal handlers
>   <https://lore.kernel.org/linux-mm/5fee976a-42d4-d469-7058-b78ad8897219@redhat.com/>

That looks about right and is what I had in mind. It seems I missed that
back in the day, and the discussion unfortunately ran into a dead end :(

>>    2) rseq
>>
>>       The option of having a separate key which needs to be always
>>       enabled is definitely simple, but it wastes a key just for
>>       that. There are only 16 of them :(
>>
>>       If we solve the signal case with an explicit permission set, we
>>       can just reuse those signal permissions. They are maybe wider than
>>       what's required to access RSEQ, but the signal permissions have to
>>       include the TLS/RSEQ area to actually work.
>
> Would it address the use case for single-colored memory access?  Or
> would that still crash if the process gets descheduled while the access
> rights register is set to the restricted value?

It would just work the same way as signals. Assume

         signal_perms = [PK0=RW, PK1=R, PK2=RW]

         set_pkey(PK0..6=NONE, PK7=R)

         access()              <- can fault
                               <- or interrupt can happen

         set_pkey(normal)

So when the fault or interrupt results in a signal and/or the return to
user space needs to access RSEQ we have in signal delivery:

         cur = pkey_extend(signal_perms);

--> Perms are now [PK0=RW, PK1=R, PK2=RW, PK7=R]         

         access_user_stack();
         ....
         // Return with the extended permissions to deliver the signal
         // Will be restored on sigreturn

and in rseq:

         cur = pkey_extend(signal_perms);

--> Perms are now [PK0=RW, PK1=R, PK2=RW, PK7=R]         

         access_user_rseq();
         pkey_set(cur);

If the RSEQ access is nested in the signal delivery return, then nothing
changes because the permissions are already extended: A | A = A :).
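
The permission extension above can be sketched as a small user-space
model. This is only an illustration, not kernel code: it assumes the x86
PKRU layout (two bits per key, access-disable and write-disable), it
assumes keys 8..15 are NONE because the example does not mention them,
and pkru_with(), pkru_normalize() and pkey_extend_model() are
hypothetical names. Since PKRU stores disable bits, the union of
permissions becomes a bitwise AND, once "access disabled" is normalized
to also mean "write disabled".

```c
#include <assert.h>
#include <stdint.h>

#define PKEY_AD		0x1u	/* access-disable bit of a key's 2-bit field */
#define PKEY_WD		0x2u	/* write-disable bit of a key's 2-bit field */
#define PKRU_ALL_NONE	0xFFFFFFFFu

enum perm { PERM_NONE, PERM_R, PERM_RW };

/* Hypothetical helper: set the permission of one key in a PKRU-style value. */
uint32_t pkru_with(uint32_t pkru, int key, enum perm p)
{
	uint32_t bits = (p == PERM_RW) ? 0 :
			(p == PERM_R) ? PKEY_WD : (PKEY_AD | PKEY_WD);

	pkru &= ~(0x3u << (2 * key));
	return pkru | (bits << (2 * key));
}

/* AD implies no write access either; set WD wherever AD is set. */
uint32_t pkru_normalize(uint32_t pkru)
{
	return pkru | ((pkru & 0x55555555u) << 1);
}

/* Model of pkey_extend(): keep current permissions, add the extra ones. */
uint32_t pkey_extend_model(uint32_t cur, uint32_t extra)
{
	return pkru_normalize(cur) & pkru_normalize(extra);
}

/* signal_perms = [PK0=RW, PK1=R, PK2=RW], everything else NONE (assumed) */
uint32_t example_signal_perms(void)
{
	uint32_t v = PKRU_ALL_NONE;

	v = pkru_with(v, 0, PERM_RW);
	v = pkru_with(v, 1, PERM_R);
	return pkru_with(v, 2, PERM_RW);
}

/* set_pkey(PK0..6=NONE, PK7=R) */
uint32_t example_restricted(void)
{
	return pkru_with(PKRU_ALL_NONE, 7, PERM_R);
}

/* Nested extension is a no-op: extending twice equals extending once. */
void check_nested_extend(void)
{
	uint32_t once = pkey_extend_model(example_restricted(),
					  example_signal_perms());

	assert(pkey_extend_model(once, example_signal_perms()) == once);
}
```

In this model, extending example_restricted() with
example_signal_perms() yields [PK0=RW, PK1=R, PK2=RW, PK7=R], and
check_nested_extend() exercises the A | A = A case.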

The kernel does not care about the PKEY permissions when the user to
kernel transition is due to an interrupt/exception except for the signal
and rseq case.

In fact, the above also works with my made-up example. Just assume the
RSEQ page is protected by PK2. :)
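
As a rough sketch of that case, assuming the x86 PKRU bit layout and
purely hypothetical helper names (pkru_can_write(), rseq_update_model()),
the rseq path can be modelled as: save the current value, extend it with
the signal permissions, check the write to the RSEQ page, and restore.
This is a user-space model for illustration, not the proposed kernel
implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PKEY_AD	0x1u	/* access-disable bit of a key's 2-bit field */
#define PKEY_WD	0x2u	/* write-disable bit of a key's 2-bit field */

/* A write is allowed only if neither disable bit is set for the key. */
bool pkru_can_write(uint32_t pkru, int key)
{
	return ((pkru >> (2 * key)) & (PKEY_AD | PKEY_WD)) == 0;
}

/* AD implies no write access either; set WD wherever AD is set. */
uint32_t pkru_normalize(uint32_t pkru)
{
	return pkru | ((pkru & 0x55555555u) << 1);
}

/*
 * Model of the rseq update on return to user space:
 *
 *	cur = pkey_extend(signal_perms);
 *	access_user_rseq();
 *	pkey_set(cur);
 *
 * Returns true if the write to a page tagged with rseq_key would succeed.
 */
bool rseq_update_model(uint32_t *pkru, uint32_t signal_perms, int rseq_key)
{
	uint32_t saved = *pkru;
	bool ok;

	assert(rseq_key >= 0 && rseq_key < 16);

	/* Extend: union of permissions is AND of the disable bits. */
	*pkru = pkru_normalize(saved) & pkru_normalize(signal_perms);
	ok = pkru_can_write(*pkru, rseq_key);

	/* Restore the restricted value the interrupted code was using. */
	*pkru = saved;
	return ok;
}
```

With signal_perms = [PK0=RW, PK1=R, PK2=RW] encoded as 0xFFFFFFC8 and
the restricted value 0xFFFFBFFF (only PK7=R), a direct write to a PK2
page is denied, while rseq_update_model() with rseq_key = 2 succeeds and
leaves the restricted value untouched afterwards.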

Syscalls are a different story, as copy_to/from_user() obviously require
the proper permissions and the kernel can rightfully expect that stack
and rseq are accessible, but that's not what we are debating here.

Thanks,

        tglx

Thread overview: 22+ messages
2025-05-21  8:47 [PATCH v7 0/4] rseq: Make rseq work with protection keys Dmitry Vyukov
2025-05-21  8:47 ` [PATCH v7 1/4] pkeys: add API to switch to permissive/zero pkey register Dmitry Vyukov
2025-05-21  8:47 ` [PATCH v7 2/4] x86/signal: Use write_permissive_pkey_val() helper Dmitry Vyukov
2025-05-21  8:47 ` [PATCH v7 3/4] rseq: Make rseq work with protection keys Dmitry Vyukov
2025-05-21  8:59   ` Dmitry Vyukov
2025-06-24  9:17     ` Dmitry Vyukov
2025-07-18  9:01       ` Dmitry Vyukov
2025-07-21 13:25         ` Mathieu Desnoyers
2025-07-21 17:41           ` Dave Hansen
2025-08-21 15:12             ` Dmitry Vyukov
2025-09-19 13:07               ` Dmitry Vyukov
2025-09-22 13:06                 ` Mathieu Desnoyers
2025-10-20 13:46   ` Kevin Brodsky
2025-11-26  0:45     ` Thomas Gleixner
2025-11-26  9:32       ` Florian Weimer
2025-11-26 17:56         ` Thomas Gleixner
2025-11-26 19:06           ` Florian Weimer
2025-11-26 20:52             ` Thomas Gleixner [this message]
2025-11-26 22:06               ` Florian Weimer
2025-11-27 14:38                 ` Thomas Gleixner
2025-12-02 19:19           ` Kevin Brodsky
2025-05-21  8:47 ` [PATCH v7 4/4] selftests/rseq: Add test for rseq+pkeys Dmitry Vyukov
