All of lore.kernel.org
 help / color / mirror / Atom feed
From: Catalin Marinas <catalin.marinas@arm.com>
To: Peter Collingbourne <pcc@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>,
	Kostya Serebryany <kcc@google.com>,
	Vincenzo Frascino <vincenzo.frascino@arm.com>,
	Dave Martin <Dave.Martin@arm.com>, Will Deacon <will@kernel.org>,
	Szabolcs Nagy <szabolcs.nagy@arm.com>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	Linux API <linux-api@vger.kernel.org>
Subject: Re: [PATCH v2] arm64: allow TCR_EL1.TBID0 to be configured
Date: Thu, 29 Jul 2021 18:44:09 +0100	[thread overview]
Message-ID: <20210729174409.GC31848@arm.com> (raw)
In-Reply-To: <CAMn1gO4zkXBdW-qzY+15Q=Pk7E-x3vR-PXmKRakyE4Wr87Xhdg@mail.gmail.com>

On Wed, Jul 28, 2021 at 04:50:07PM -0700, Peter Collingbourne wrote:
> On Wed, Jul 28, 2021 at 9:42 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Tue, Jul 27, 2021 at 03:00:10PM -0700, Peter Collingbourne wrote:
> > > On Tue, Jul 27, 2021 at 9:51 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
> > > > On Mon, Jun 21, 2021 at 10:12:04PM -0700, Peter Collingbourne wrote:
> > > > > Introduce a command line flag that controls whether TCR_EL1.TBID0
> > > > > is set at boot time. Since this is a change to the userspace ABI the
> > > > > option defaults to off for now, although it seems likely that we'll
> > > > > be able to change the default at some future point.
> > > > >
> > > > > Setting TCR_EL1.TBID0 increases the number of signature bits used by
> > > > > the pointer authentication instructions for instruction addresses by 8,
> > > > > which improves the security of pointer authentication, but it also has
> > > > > the consequence of changing the operation of the branch instructions
> > > > > so that they no longer ignore the top byte of the target address but
> > > > > instead fault if they are non-zero.
> > > >
> > > > I'm a bit uneasy about the ABI change and not so keen on constraining
> > > > the ABI through the kernel command line. Ideally we should make this an
> > > > opt-in per application (prctl()) but that has some aspects to address
> > >
> > > This doesn't necessarily need to be the end state, we can enhance this
> > > based on need. For example, we could choose to take this patch now and
> > > later implement the per-process opt-in where the default is controlled
> > > by the command line.
> >
> > What's the risk of an application becoming reliant on the new mode
> > (TBID0) and breaking with the old one? Probably very small but I haven't
> > figured out if it's zero. Depending on whether we have PAC or PAC2 (the
> > latter came with 8.6 but optional in 8.3) with TBID0, there are some
> > differences on how the PAC/AUT instructions work and the code generated
> > (XOR with the top bits).
> 
> I think it would be quite small. On Android, at least to begin with
> there would be a mixture of devices with different TBID0 settings
> (devices without PAC support and devices with older kernels would all
> have this disabled), so I think it would be difficult for an
> application to depend on it being enabled.
> 
> > > Or just implement the software-only per-process
> > > TBID0 almost-disablement which would be much simpler than doing it in
> > > hardware, if that is enough to satisfy future needs.
> >
> > I don't entirely follow this.
> 
> Sorry, I was referring to my earlier proposal for recovering from
> tagged PC in the kernel by clearing the tag bits:
> https://lore.kernel.org/linux-arm-kernel/CAMn1gO7+_JhTUw2gS6jnyRV+TCqprmpuCAfee3RCAhNjoVyy9w@mail.gmail.com/

Ah, sorry, I missed this one (or just paged it out entirely over the
holiday).

> > > Otherwise we risk adding "unused" complexity to the kernel that we can
> > > never remove due to API stability guarantees.
> >
> > We've had other debates over the years and, in general, if a kernel
> > change causes apps to break, we'd have to keep the original behaviour.
> > Are there any plans to fix the JITs tools you discovered?
> 
> Yes, we would definitely want to fix the JIT issue in the Android
> platform before rolling out a forward PAC ABI. This would be separate
> from fixing apps, which would need to opt into MTE (or address tagging
> via the target API level) anyway. But if it turns out that there are
> too many apps with these JITs that use MTE or address tagging, I think
> we would need to come back to the kernel to figure out some way to let
> these programs run.

OK, I guess we don't yet have a clear view on how many such apps are
affected.

> > Talking to Will about this he was wondering whether we could make TBID0
> > on by default and clear the tag in PC if we take a fault (on tagged PC),
> > restarting the context. PAC shouldn't be affected since we would only
> > branch to an authenticated (PAC code removed) pointer. If this works,
> > we'd only affect performance slightly on such apps but don't completely
> > break them.
> 
> Right, this sounds exactly like my earlier proposal.

Indeed. Something I haven't figured out yet is whether such handling in
the kernel would weaken PAC. For example, following a failed AUT (the
old style which does not trap), the resulting address would cause a
translation fault. Would the kernel clearing the top byte result in a
potentially valid address?

> > > > first: (a) this bit is permitted to be cached in the TLB so we'd need
> > > > some TLBI when setting it (and a clarification in the specs that it is
> > > > tagged by ASID/VMID, probably fine) and (b) we'd need to context-switch
> > > > TCR_EL1, with a small performance penalty (I don't think it's
> > > > significant but worth testing).
> > >
> > > So TLBI all of the CPUs on prctl() and context-switch TCR_EL1? I
> > > thought there would be DOS concerns with the first part of that?
> >
> > The DoS problem appears if we need to issue an IPI to all CPUs (like
> > stop_machine). The TLBI with broadcast handled in hardware should be OK
> > as it's targeted to a specific ASID. But this would have to be issued
> 
> I see -- I hadn't realised that this instruction is implemented as a
> broadcast. So we would just need to issue the instruction from any CPU
> and we should be good.

Well, as long as it's single-threaded when the prctl() is issued. If
there are multiple threads, we'd have to synchronise all the CPUs. Such
requirement is probably not a big deal anyway as it affects the return
addresses, so it would have to be done early.

> > before any app threads are started, otherwise we'd need to synchronise
> > TCR_EL1. Given that TBID0 toggling affects PAC, this can only be done
> 
> Right, so this would be different from everything currently in
> tagged_addr_ctrl because it would be per-process rather than
> per-thread. So if this were a true TBID0 control we may even want it
> as a separate prctl() since there are certainly use cases for changing
> the other bits of tagged_addr_ctrl while the other threads are
> running.

Since it's permitted to be cached in the TLB, switching it between
threads would require TLBI on each switch, so not feasible.

How early is the prctl(PR_TAGGED_ADDR_ENABLE) called? Anyway, it
probably makes more sense to have a separate control since TBID0 is not
necessarily linked to hwasan or MTE. We'd want more PAC bits even if
hwasan or MTE are not deployed.

> > safely very early in the application before return addresses get a PAC
> > code.
> 
> Right, so maybe the "almost-disablement" would work better since all
> of the signatures would then still be valid, and you could even have
> different settings for different threads if you wanted to (e.g. if you
> arranged to run legacy code only on a specific thread).

Yes, if (a) it doesn't weaken PAC and (b) there are no future plans to
use code TBI as trapping would make it slower.

I think we need to decide whether full TBI would be any useful going
forward. One comment from Szabolcs on the previous version was that
"data only tbi is a more complicated concept than plain tbi". Are there
any tooling plans to benefit from code TBI? Or any use-cases that would
be affected? For example, if we decide mmap() to return tagged pointers
(with a new flag), would JITs want to benefit?

If we need both data and code TBI to coexist, we should look into a
per-process control.

-- 
Catalin

WARNING: multiple messages have this Message-ID (diff)
From: Catalin Marinas <catalin.marinas@arm.com>
To: Peter Collingbourne <pcc@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>,
	Kostya Serebryany <kcc@google.com>,
	Vincenzo Frascino <vincenzo.frascino@arm.com>,
	Dave Martin <Dave.Martin@arm.com>, Will Deacon <will@kernel.org>,
	Szabolcs Nagy <szabolcs.nagy@arm.com>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	Linux API <linux-api@vger.kernel.org>
Subject: Re: [PATCH v2] arm64: allow TCR_EL1.TBID0 to be configured
Date: Thu, 29 Jul 2021 18:44:09 +0100	[thread overview]
Message-ID: <20210729174409.GC31848@arm.com> (raw)
In-Reply-To: <CAMn1gO4zkXBdW-qzY+15Q=Pk7E-x3vR-PXmKRakyE4Wr87Xhdg@mail.gmail.com>

On Wed, Jul 28, 2021 at 04:50:07PM -0700, Peter Collingbourne wrote:
> On Wed, Jul 28, 2021 at 9:42 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Tue, Jul 27, 2021 at 03:00:10PM -0700, Peter Collingbourne wrote:
> > > On Tue, Jul 27, 2021 at 9:51 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
> > > > On Mon, Jun 21, 2021 at 10:12:04PM -0700, Peter Collingbourne wrote:
> > > > > Introduce a command line flag that controls whether TCR_EL1.TBID0
> > > > > is set at boot time. Since this is a change to the userspace ABI the
> > > > > option defaults to off for now, although it seems likely that we'll
> > > > > be able to change the default at some future point.
> > > > >
> > > > > Setting TCR_EL1.TBID0 increases the number of signature bits used by
> > > > > the pointer authentication instructions for instruction addresses by 8,
> > > > > which improves the security of pointer authentication, but it also has
> > > > > the consequence of changing the operation of the branch instructions
> > > > > so that they no longer ignore the top byte of the target address but
> > > > > instead fault if they are non-zero.
> > > >
> > > > I'm a bit uneasy about the ABI change and not so keen on constraining
> > > > the ABI through the kernel command line. Ideally we should make this an
> > > > opt-in per application (prctl()) but that has some aspects to address
> > >
> > > This doesn't necessarily need to be the end state, we can enhance this
> > > based on need. For example, we could choose to take this patch now and
> > > later implement the per-process opt-in where the default is controlled
> > > by the command line.
> >
> > What's the risk of an application becoming reliant on the new mode
> > (TBID0) and breaking with the old one? Probably very small but I haven't
> > figured out if it's zero. Depending on whether we have PAC or PAC2 (the
> > latter came with 8.6 but optional in 8.3) with TBID0, there are some
> > differences on how the PAC/AUT instructions work and the code generated
> > (XOR with the top bits).
> 
> I think it would be quite small. On Android, at least to begin with
> there would be a mixture of devices with different TBID0 settings
> (devices without PAC support and devices with older kernels would all
> have this disabled), so I think it would be difficult for an
> application to depend on it being enabled.
> 
> > > Or just implement the software-only per-process
> > > TBID0 almost-disablement which would be much simpler than doing it in
> > > hardware, if that is enough to satisfy future needs.
> >
> > I don't entirely follow this.
> 
> Sorry, I was referring to my earlier proposal for recovering from
> tagged PC in the kernel by clearing the tag bits:
> https://lore.kernel.org/linux-arm-kernel/CAMn1gO7+_JhTUw2gS6jnyRV+TCqprmpuCAfee3RCAhNjoVyy9w@mail.gmail.com/

Ah, sorry, I missed this one (or just paged it out entirely over the
holiday).

> > > Otherwise we risk adding "unused" complexity to the kernel that we can
> > > never remove due to API stability guarantees.
> >
> > We've had other debates over the years and, in general, if a kernel
> > change causes apps to break, we'd have to keep the original behaviour.
> > Are there any plans to fix the JITs tools you discovered?
> 
> Yes, we would definitely want to fix the JIT issue in the Android
> platform before rolling out a forward PAC ABI. This would be separate
> from fixing apps, which would need to opt into MTE (or address tagging
> via the target API level) anyway. But if it turns out that there are
> too many apps with these JITs that use MTE or address tagging, I think
> we would need to come back to the kernel to figure out some way to let
> these programs run.

OK, I guess we don't yet have a clear view on how many such apps are
affected.

> > Talking to Will about this he was wondering whether we could make TBID0
> > on by default and clear the tag in PC if we take a fault (on tagged PC),
> > restarting the context. PAC shouldn't be affected since we would only
> > branch to an authenticated (PAC code removed) pointer. If this works,
> > we'd only affect performance slightly on such apps but don't completely
> > break them.
> 
> Right, this sounds exactly like my earlier proposal.

Indeed. Something I haven't figured out yet is whether such handling in
the kernel would weaken PAC. For example, following a failed AUT (the
old style which does not trap), the resulting address would cause a
translation fault. Would the kernel clearing the top byte result in a
potentially valid address?

> > > > first: (a) this bit is permitted to be cached in the TLB so we'd need
> > > > some TLBI when setting it (and a clarification in the specs that it is
> > > > tagged by ASID/VMID, probably fine) and (b) we'd need to context-switch
> > > > TCR_EL1, with a small performance penalty (I don't think it's
> > > > significant but worth testing).
> > >
> > > So TLBI all of the CPUs on prctl() and context-switch TCR_EL1? I
> > > thought there would be DOS concerns with the first part of that?
> >
> > The DoS problem appears if we need to issue an IPI to all CPUs (like
> > stop_machine). The TLBI with broadcast handled in hardware should be OK
> > as it's targeted to a specific ASID. But this would have to be issued
> 
> I see -- I hadn't realised that this instruction is implemented as a
> broadcast. So we would just need to issue the instruction from any CPU
> and we should be good.

Well, as long as it's single-threaded when the prctl() is issued. If
there are multiple threads, we'd have to synchronise all the CPUs. Such
requirement is probably not a big deal anyway as it affects the return
addresses, so it would have to be done early.

> > before any app threads are started, otherwise we'd need to synchronise
> > TCR_EL1. Given that TBID0 toggling affects PAC, this can only be done
> 
> Right, so this would be different from everything currently in
> tagged_addr_ctrl because it would be per-process rather than
> per-thread. So if this were a true TBID0 control we may even want it
> as a separate prctl() since there are certainly use cases for changing
> the other bits of tagged_addr_ctrl while the other threads are
> running.

Since it's permitted to be cached in the TLB, switching it between
threads would require TLBI on each switch, so not feasible.

How early is the prctl(PR_TAGGED_ADDR_ENABLE) called? Anyway, it
probably makes more sense to have a separate control since TBID0 is not
necessarily linked to hwasan or MTE. We'd want more PAC bits even if
hwasan or MTE are not deployed.

> > safely very early in the application before return addresses get a PAC
> > code.
> 
> Right, so maybe the "almost-disablement" would work better since all
> of the signatures would then still be valid, and you could even have
> different settings for different threads if you wanted to (e.g. if you
> arranged to run legacy code only on a specific thread).

Yes, if (a) it doesn't weaken PAC and (b) there are no future plans to
use code TBI as trapping would make it slower.

I think we need to decide whether full TBI would be any useful going
forward. One comment from Szabolcs on the previous version was that
"data only tbi is a more complicated concept than plain tbi". Are there
any tooling plans to benefit from code TBI? Or any use-cases that would
be affected? For example, if we decide mmap() to return tagged pointers
(with a new flag), would JITs want to benefit?

If we need both data and code TBI to coexist, we should look into a
per-process control.

-- 
Catalin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2021-07-29 17:44 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-22  5:12 [PATCH v2] arm64: allow TCR_EL1.TBID0 to be configured Peter Collingbourne
2021-06-22  5:12 ` Peter Collingbourne
2021-07-27 16:51 ` Catalin Marinas
2021-07-27 16:51   ` Catalin Marinas
2021-07-27 22:00   ` Peter Collingbourne
2021-07-27 22:00     ` Peter Collingbourne
2021-07-28 16:42     ` Catalin Marinas
2021-07-28 16:42       ` Catalin Marinas
2021-07-28 23:50       ` Peter Collingbourne
2021-07-28 23:50         ` Peter Collingbourne
2021-07-29 17:44         ` Catalin Marinas [this message]
2021-07-29 17:44           ` Catalin Marinas
2021-08-06  1:48           ` Peter Collingbourne
2021-08-06  1:48             ` Peter Collingbourne

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210729174409.GC31848@arm.com \
    --to=catalin.marinas@arm.com \
    --cc=Dave.Martin@arm.com \
    --cc=eugenis@google.com \
    --cc=kcc@google.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=pcc@google.com \
    --cc=szabolcs.nagy@arm.com \
    --cc=vincenzo.frascino@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.