From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.5 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DF2DBC433E7 for ; Tue, 1 Sep 2020 16:04:14 +0000 (UTC) Received: from merlin.infradead.org (merlin.infradead.org [205.233.59.134]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 9B99920866 for ; Tue, 1 Sep 2020 16:04:14 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="Ui7xkwmq" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9B99920866 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=merlin.20170209; h=Sender:Content-Transfer-Encoding: Content-Type:Cc:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:In-Reply-To:MIME-Version:References:Message-ID: Subject:To:From:Date:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=XjyY44rTpW+49/2r3Kq1DOdI46kTTh6Ab4jeflgutnk=; b=Ui7xkwmqhM65tIBi8/8cNXLyS duA1CdLGsFoBlLIIjsITlHIrzYxnEVFcrTXO6dQLmccp/fTQMEdw4In2DhK+Ulyh1bdNUiXU7FqtX NC/uNAyfgnIg4D3a64sm/Fwoqcqra8z2JwD2mihGIWjiKnyKr9D48S2ewaUGEWCMiNRrDgKRULqZu Eq9LXCL2OHJWp1ICv7anD3uTMStbTYXHg0dzO6QHmCd6loXwwb36oj/DOtEIScu5QEtDOQUcDlUMG jC1iiEJb0uJqgka0ZFYUFo5ysymX0DFz5xjgedjMZmM8ERr3i1gLkLUVG164ItmxpZtFg99O/8YRW 7I0vT3k/Q==; Received: from localhost ([::1] helo=merlin.infradead.org) by merlin.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1kD8k4-0004Yv-TS; Tue, 01 Sep 2020 16:02:40 +0000 Received: from foss.arm.com ([217.140.110.172]) by merlin.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1kD8jz-0004Wj-NY for linux-arm-kernel@lists.infradead.org; Tue, 01 Sep 2020 16:02:38 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id D88AB1045; Tue, 1 Sep 2020 09:02:34 -0700 (PDT) Received: from arm.com (usa-sjc-imap-foss1.foss.arm.com [10.121.207.14]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 9F1EF3F71F; Tue, 1 Sep 2020 09:02:33 -0700 (PDT) Date: Tue, 1 Sep 2020 17:02:31 +0100 From: Dave Martin To: Peter Collingbourne Subject: Re: [PATCH] arm64: add prctl(PR_PAC_SET_ENABLED_KEYS) Message-ID: <20200901160231.GB6642@arm.com> References: <20200801011152.39838-1-pcc@google.com> <20200819101809.GD6642@arm.com> <20200824144910.GO6642@arm.com> <20200826163726.GV6642@arm.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20200901_120235_932224_D91CF020 X-CRM114-Status: GOOD ( 67.52 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Andrey Konovalov , Kevin Brodsky , Kostya Serebryany , Evgenii Stepanov , Catalin Marinas , Vincenzo Frascino , Will Deacon , Linux ARM Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On Wed, Aug 26, 2020 at 07:55:51PM -0700, Peter Collingbourne wrote: > On Wed, Aug 26, 2020 at 9:37 AM Dave Martin wrote: > > > > On Mon, Aug 24, 2020 at 05:23:27PM -0700, Peter Collingbourne wrote: > > > On Mon, Aug 24, 2020 at 7:49 AM Dave Martin wrote: > > > > > > > > On Wed, Aug 19, 2020 at 02:25:45PM -0700, Peter Collingbourne wrote: > > > > > On Wed, Aug 19, 2020 at 3:18 AM Dave Martin wrote: > > > > > > > > > > > > On Fri, Jul 31, 2020 at 06:11:52PM -0700, Peter Collingbourne wrote: [...] > > > > > > > diff --git a/arch/arm64/include/asm/asm_pointer_auth.h b/arch/arm64/include/asm/asm_pointer_auth.h > > > > > > > index 52dead2a8640..d121fa5fed5f 100644 > > > > > > > --- a/arch/arm64/include/asm/asm_pointer_auth.h > > > > > > > +++ b/arch/arm64/include/asm/asm_pointer_auth.h > > > > > > > @@ -31,6 +31,14 @@ alternative_else_nop_endif > > > > > > > ldp \tmp2, \tmp3, [\tmp1, #PTRAUTH_USER_KEY_APDB] > > > > > > > msr_s SYS_APDBKEYLO_EL1, \tmp2 > > > > > > > msr_s SYS_APDBKEYHI_EL1, \tmp3 > > > > > > > + > > > > > > > + ldr \tmp2, [\tsk, #THREAD_SCTLR_ENXX_MASK] > > > > > > > + cbz \tmp2, .Laddr_auth_skip_\@ > > > > > > > > > > > > I wonder whether it would make sense to simple store the thread's base > > > > > > SCTLR value (containing the ENxx bits), rather than storing the ENxx > > > > > > bits separately. There may be reasons outside this snippet why this > > > > > > isn't such a good idea though -- I haven't gone digging so far. > > > > > > > > > > As far as I know (as I learned [1] from the MTE patch series), it can > > > > > be expensive to access SCTLR_EL1, therefore I arrange to only access > > > > > SCTLR_EL1 in the hopefully-uncommon case where a process has disabled > > > > > keys. Detecting the "process has disabled keys" case is quite simple > > > > > if we only store the disabled keys mask here, not so much if we store > > > > > the full value of SCTLR_EL1. > > > > > > > > My thought was that we would still only write SCTLR_EL1 if needed, but > > > > we would do the write-if-needed across the whole register in one go. > > > > This would be easier to extend if we have to twiddle additional > > > > SCTLR_EL1 bits in the future. If the key enable bits are the only > > > > affected bits for now then we could of course defer this factoring until > > > > later. (I vaguely remember a similar discussion in the past, but > > > > possibly it was about the pauth keys anyway, rather than something > > > > else.) > > > > > > We will also need a per-task SCTLR_EL1.TCF0 setting for MTE. We could > > > potentially combine the two, so that both > > > prctl(PR_PAC_SET_ENABLED_KEYS) and prctl(PR_SET_TAGGED_ADDR_CTRL) end > > > up affecting a common SCTLR_EL1 field in the task_struct, and we do: > > > > > > On kernel entry: > > > - read SCTLR_EL1 from task_struct (or from the system register if we > > > decide that's cheap enough) > > > - if EnIA clear, set it, and write the value to the system register. > > > > > > On kernel exit: > > > - read system register SCTLR_EL1 > > > - read SCTLR_EL1 from task_struct > > > - if they are different, write task's SCTLR_EL1 to the system register. > > > > > > But this would require at least an unconditional read of SCTLR_EL1 per > > > kernel exit. The comment that I referenced says that "accesses" to the > > > register are expensive. I was operating under the assumption that both > > > reads and writes are expensive, but if it's actually only writes that > > > are expensive, we may be able to get away with the above. > > > > The kernel is in full control over what's in SCTLR_EL1, so perhaps we > > could cache what we wrote percpu, if the overhead of reading it is > > problematic. > > Maybe, but that sounds like the kind of complexity that I'd like to avoid. Ack, this is more something to explore if the other approaches turn out to be more costly. > I went ahead and did some performance measurements using the following > program, which issues 10M invalid syscalls and exits. This should give > us something as close as possible to a measurement of a kernel entry > and exit on its own. > > .globl _start > _start: > movz x1, :abs_g1:10000000 > movk x1, :abs_g0_nc:10000000 > > .Lloop: > mov x8, #65536 // invalid > svc #0 > sub x1, x1, #1 > cbnz x1, .Lloop > > mov x0, #0 > mov x8, #0x5d // exit > svc #0 > > On a Pixel 4 (which according to Wikipedia has a CPU based on > Cortex-A76/A55) I measured the median of 1000 runs execution time at > 0.554511s, implying an entry/exit cost of 55.5ns. If I make this > modification to the kernel (Pixel 4 uses kernel version 4.14 so this > won't apply cleanly to modern kernels; this is in the same place as > the ptrauth_keys_install_user macro invocation in a modern kernel): > > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S > index 303447c765f7..b7c72d767f3e 100644 > --- a/arch/arm64/kernel/entry.S > +++ b/arch/arm64/kernel/entry.S > @@ -340,6 +340,7 @@ alternative_else_nop_endif > #endif > 3: > apply_ssbd 0, 5f, x0, x1 > + mrs x0, sctlr_el1 > 5: > .endif > > The median execution time goes to 0.565604s, implying a cost of 1.1ns > to read the system register. I also measured the cost of reading from > a percpu variable at kernel exit like so: > > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S > index 303447c765f7..c5a89785f650 100644 > --- a/arch/arm64/kernel/entry.S > +++ b/arch/arm64/kernel/entry.S > @@ -340,6 +340,7 @@ alternative_else_nop_endif > #endif > 3: > apply_ssbd 0, 5f, x0, x1 > + ldr_this_cpu x0, sctlr_el1_cache, x1 > 5: > .endif > > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c > index 0b6ca5479f1f..ac9c9a6179d4 100644 > --- a/arch/arm64/kernel/traps.c > +++ b/arch/arm64/kernel/traps.c > @@ -52,6 +52,8 @@ > #include > #include > > +DEFINE_PER_CPU(unsigned long, sctlr_el1_cache); > + > static const char *handler[]= { > "Synchronous Abort", > "IRQ", > > With that the median execution time goes to 0.600186s, implying a cost > of 4.5ns. So at least on existing hardware, it looks like reading > SCTLR_EL1 directly is probably our cheapest option if we're going to > read *something*. Therefore I'll look at implementing this with > unconditional reads from SCTLR_EL1 in v2. Thanks for the investigation! I'm not too surprised by this outcome. There is a possibility that reading SCTLR_EL1 sucks badly on some implementations, but on others it may be little more than a register rename. We should probably wait for evidence before panicking about it. So long as this SCTLR_EL1 switching is completely disabled via alternatives on hardware to which it isn't applicable, I expect that should be a reasonable compromise. [...] > > > > > > Do we need a way to query the enabled keys? > > > > > > > > > > > > We could either have a _GET_ prctl (the common approach), or have this > > > > > > prctl return the mask of enabled keys (avoids the extra prctl, but > > > > > > weirder). > > > > > > > > > > > > As above, we might > > > > > > > > > > > > Things like CRIU may need a GET in order to save/restore this setting > > > > > > properly. > > > > > > > > > > Maybe it makes sense for there to be a GET prctl() to support CRIU. > > > > > But it would need to be defined carefully because CRIU would > > > > > presumably need to know what value to pass as the "keys" argument when > > > > > it calls SET to restore the state. It can't just hardcode a value > > > > > because that may harm extensibility if new keys are added. > > > > > > > > > > If we require the kernel to start processes with all keys enabled > > > > > (including any keys that we may introduce in the future), then it > > > > > seems like if CRIU knows which keys were disabled, then it can restore > > > > > the state by issuing a syscall like this: > > > > > > > > > > prctl(PR_PAC_SET_ENABLED_KEYS, disabled_keys_mask, 0) > > > > > > > > > > which would imply that instead of providing PR_PAC_GET_ENABLED_KEYS we > > > > > instead provide PR_PAC_GET_DISABLED_KEYS, which CRIU would call when > > > > > saving the state to prepare the disabled_keys_mask argument passed > > > > > here. Then for consistency we can make the SET prctl() be > > > > > PR_PAC_SET_DISABLED_KEYS, and CRIU can then do: > > > > > > > > > > prctl(PR_PAC_SET_DISABLED_KEYS, disabled_keys_mask, disabled_keys_mask) > > > > > > > > > > Does that make sense? > > > > > > > > Possibly, though it's nicer if the GET returns the same value suppiled > > > > to the SET, rather than the complement of it. > > > > > > Maybe you misread my proposal? The part about the complement is > > > addressed by the sentence beginning "Then for consistency..." So we > > > would end up with PR_PAC_SET_DISABLED_KEYS and > > > PR_PAC_GET_DISABLED_KEYS, and they would both take/return a mask of > > > *disabled* keys. > > > > Oh, I see. Yes, while I prefer the two are consistent with each other, > > we can flip the sense of both here if desired. > > > > > > > > > If SET ignores set bits in arg3 that don't correspond to set bits in the > > > > mask arg2, then > > > > > > > > u64 current_keys = prctl(PR_PAC_GET_ENABLED_KEYS); > > > > > > > > prctl(PR_PAC_SET_ENABLED_KEYS, ~0UL, current_keys); > > > > > > > > should work. > > > > > > I'd prefer not to allow the SET prctl to accept all-ones as the first > > > argument, as this could lead to a scenario where sometimes people pass > > > all-ones as the first argument and sometimes they don't, which could > > > make it hard to add new keys in the future (since the addition of a > > > new key could cause existing callers to affect the new key > > > arbitrarily). Instead, the first argument should be required to > > > specify only known keys, in order to enforce that the caller is > > > explicit about which keys they intend to affect. > > > > That sounds fair. > > > > > > There's a final option, which is to expose this config through ptrace > > > > instead for save/restore purposes. From previous discussions with the > > > > CRIU folks, I recall that they are trying to move away from doing setup > > > > from within the new process where possible. > > > > > > > > There's no reason not to have both though -- there's precedent for that, > > > > such as for PR_SVE_{GET,SET}_VL and the NT_ARM_SVE regset. MTE may move > > > > in a similar direction too IIUC. > > > > > > > > Having a GET remains useful for in-process debugging and diagnostics, > > > > and it's extremely straightforward to add in the kernel. So from my > > > > side I'd vote to have it anyway... > > > > > > Okay. I see that the MTE series is adding NT_ARM_TAGGED_ADDR_CTRL as > > > well, and I suppose that I could add a > > > NT_ARM_PAC_(whatever)ABLED_KEYS. Since this wouldn't be used in normal > > > userspace applications (and it doesn't look like ptrace APIs such as > > > NT_ARM_PACA_KEYS will accommodate new keys being added so CRIU would > > > need to be made aware of new keys anyway), we might as well just have > > > all the APIs deal with the set of enabled keys. > > > > > > And if the ptrace is enough for CRIU, I'd mildly prefer not to add a > > > GET prctl, since it's just more uapi surface that needs to be > > > maintained forever. For in-process use cases it seems redundant with > > > e.g. issuing a pacia instruction and checking whether it is a no-op, > > > > Probing instructions to see whether they generate signals is a cardinal > > sin and we make huge efforts to ensure that userspace doesn't need to do > > that. As for the no-op check, I suppose that works, but it's nasty if > > the generic "check all the required featues" boilerplate on program > > entry relies on intrinsics or can't be written in C at all. This is > > just a way of discouraging people from checking at all IMHO. > > > > > > The implementation of that prctl would most likely be a one-line wrapper > > that calls the same internal function used to populate the regset for > > ptrace, so I think you might be worrying a bit too much about creating > > ABI here. This is more providing a second interface for same thing, > > because no single interface is usable in all situations. > > Okay. Since I don't feel too strongly about it, I'll add the GET prctl in v2. Thanks ---Dave _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel