Date: Mon, 13 May 2024 10:25:59 -0700
From: Deepak Gupta
To: Alexandre Ghiti
Cc: paul.walmsley@sifive.com, rick.p.edgecombe@intel.com, broonie@kernel.org,
 Szabolcs.Nagy@arm.com, kito.cheng@sifive.com, keescook@chromium.org,
 ajones@ventanamicro.com, conor.dooley@microchip.com, cleger@rivosinc.com,
 atishp@atishpatra.org, bjorn@rivosinc.com, alexghiti@rivosinc.com,
 samuel.holland@sifive.com, conor@kernel.org, linux-doc@vger.kernel.org,
 linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
 devicetree@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org,
 linux-kselftest@vger.kernel.org, corbet@lwn.net, palmer@dabbelt.com,
 aou@eecs.berkeley.edu, robh+dt@kernel.org, krzysztof.kozlowski+dt@linaro.org,
 oleg@redhat.com, akpm@linux-foundation.org, arnd@arndb.de,
 ebiederm@xmission.com, Liam.Howlett@oracle.com, vbabka@suse.cz,
 lstoakes@gmail.com, shuah@kernel.org, brauner@kernel.org,
 andy.chiu@sifive.com, jerry.shih@sifive.com, hankuan.chen@sifive.com,
 greentime.hu@sifive.com, evan@rivosinc.com, xiao.w.wang@intel.com,
 charlie@rivosinc.com, apatel@ventanamicro.com, mchitale@ventanamicro.com,
 dbarboza@ventanamicro.com, sameo@rivosinc.com, shikemeng@huaweicloud.com,
 willy@infradead.org, vincent.chen@sifive.com, guoren@kernel.org,
 samitolvanen@google.com, songshuaishuai@tinylab.org, gerg@kernel.org,
 heiko@sntech.de, bhe@redhat.com, jeeheng.sia@starfivetech.com,
 cyy@cyyself.name, maskray@google.com, ancientmodern4@gmail.com,
 mathis.salmen@matsal.de, cuiyunhui@bytedance.com, bgray@linux.ibm.com,
 mpe@ellerman.id.au, baruch@tkos.co.il, alx@kernel.org, david@redhat.com,
 catalin.marinas@arm.com, revest@chromium.org, josh@joshtriplett.org,
 shr@devkernel.io, deller@gmx.de, omosnace@redhat.com, ojeda@kernel.org,
 jhubbard@nvidia.com
Subject: Re: [PATCH v3 14/29] riscv/mm: Implement map_shadow_stack() syscall
References: <20240403234054.2020347-1-debug@rivosinc.com>
 <20240403234054.2020347-15-debug@rivosinc.com>

On Sun, May 12, 2024 at 06:50:18PM +0200, Alexandre Ghiti wrote:
>
>On 04/04/2024 01:35, Deepak Gupta wrote:
>>As discussed extensively in the changelog for the addition of this
>>syscall on x86 ("x86/shstk: Introduce
>>map_shadow_stack syscall") the
>>existing mmap() and madvise() syscalls do not map entirely well onto the
>>security requirements for shadow stack memory since they lead to windows
>>where memory is allocated but not yet protected or stacks which are not
>>properly and safely initialised. Instead a new syscall map_shadow_stack()
>>has been defined which allocates and initialises a shadow stack page.
>>
>>This patch implements this syscall for riscv. riscv doesn't require token
>>to be setup by kernel because user mode can do that by itself. However to
>>provide compatibility and portability with other architectures, user mode
>>can specify token set flag.
>>
>>Signed-off-by: Deepak Gupta
>>---
>> arch/riscv/kernel/Makefile | 2 +
>> arch/riscv/kernel/usercfi.c | 149 ++++++++++++++++++++++++++++++++
>> include/uapi/asm-generic/mman.h | 1 +
>> 3 files changed, 152 insertions(+)
>> create mode 100644 arch/riscv/kernel/usercfi.c
>>
>>diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
>>index 604d6bf7e476..3bec82f4e94c 100644
>>--- a/arch/riscv/kernel/Makefile
>>+++ b/arch/riscv/kernel/Makefile
>>@@ -107,3 +107,5 @@ obj-$(CONFIG_COMPAT) += compat_vdso/
>> obj-$(CONFIG_64BIT) += pi/
>> obj-$(CONFIG_ACPI) += acpi.o
>>+
>>+obj-$(CONFIG_RISCV_USER_CFI) += usercfi.o
>>diff --git a/arch/riscv/kernel/usercfi.c b/arch/riscv/kernel/usercfi.c
>>new file mode 100644
>>index 000000000000..c4ed0d4e33d6
>>--- /dev/null
>>+++ b/arch/riscv/kernel/usercfi.c
>>@@ -0,0 +1,149 @@
>>+// SPDX-License-Identifier: GPL-2.0
>>+/*
>>+ * Copyright (C) 2024 Rivos, Inc.
>>+ * Deepak Gupta
>>+ */
>>+
>>+#include
>>+#include
>>+#include
>>+#include
>>+#include
>>+#include
>>+#include
>>+#include
>>+#include
>>+#include
>>+#include
>>+#include
>>+
>>+#define SHSTK_ENTRY_SIZE sizeof(void *)
>>+
>>+/*
>>+ * Writes on shadow stack can either be `sspush` or `ssamoswap`. `sspush` can happen
>>+ * implicitly on current shadow stack pointed to by CSR_SSP.
>>+ * `ssamoswap` takes pointer to
>>+ * shadow stack. To keep it simple, we plan to use `ssamoswap` to perform writes on shadow
>>+ * stack.
>>+ */
>>+static noinline unsigned long amo_user_shstk(unsigned long *addr, unsigned long val)
>>+{
>>+	/*
>>+	 * Since shadow stack is supported only in 64bit configuration,
>>+	 * ssamoswap.d is used below. CONFIG_RISCV_USER_CFI is dependent
>>+	 * on 64BIT and compile of this file is dependent on CONFIG_RISCV_USER_CFI
>>+	 * In case ssamoswap faults, return -1.
>
>
>To me, this part of the comment is not needed.

Ok, will remove it.

>
>
>>+	 * Never expect -1 on shadow stack. Expect return addresses and zero
>
>
>In that case, should we BUG() instead?

The caller of `amo_user_shstk` (create_rstor_token) returns -EFAULT.
That will translate to a signal (SIGSEGV) being delivered to the user app,
or to the app being terminated.

>
>
>>+	 */
>>+	unsigned long swap = -1;
>>+
>>+	__enable_user_access();
>>+	asm goto(
>>+		".option push\n"
>>+		".option arch, +zicfiss\n"
>>+		"1: ssamoswap.d %[swap], %[val], %[addr]\n"
>>+		_ASM_EXTABLE(1b, %l[fault])
>>+		RISCV_ACQUIRE_BARRIER
>>+		".option pop\n"
>>+		: [swap] "=r" (swap), [addr] "+A" (*addr)
>>+		: [val] "r" (val)
>>+		: "memory"
>>+		: fault
>>+		);
>>+	__disable_user_access();
>>+	return swap;
>>+fault:
>>+	__disable_user_access();
>>+	return -1;
>>+}
>>+
>>+/*
>>+ * Create a restore token on the shadow stack. A token is always XLEN wide
>>+ * and aligned to XLEN.
>>+ */
>>+static int create_rstor_token(unsigned long ssp, unsigned long *token_addr)
>>+{
>>+	unsigned long addr;
>>+
>>+	/* Token must be aligned */
>>+	if (!IS_ALIGNED(ssp, SHSTK_ENTRY_SIZE))
>>+		return -EINVAL;
>>+
>>+	/* On RISC-V we're constructing token to be function of address itself */
>>+	addr = ssp - SHSTK_ENTRY_SIZE;
>>+
>>+	if (amo_user_shstk((unsigned long __user *)addr, (unsigned long) ssp) == -1)
>>+		return -EFAULT;
>>+
>>+	if (token_addr)
>>+		*token_addr = addr;
>>+
>>+	return 0;
>>+}
>>+
>>+static unsigned long allocate_shadow_stack(unsigned long addr, unsigned long size,
>>+				unsigned long token_offset,
>>+				bool set_tok)
>>+{
>>+	int flags = MAP_ANONYMOUS | MAP_PRIVATE;
>>+	struct mm_struct *mm = current->mm;
>>+	unsigned long populate, tok_loc = 0;
>>+
>>+	if (addr)
>>+		flags |= MAP_FIXED_NOREPLACE;
>>+
>>+	mmap_write_lock(mm);
>>+	addr = do_mmap(NULL, addr, size, PROT_READ, flags,
>
>
>Hmmm why do you map the shadow stack as PROT_READ here?

I believe it's redundant here. I followed what x86 did for their shadow
stack creation. The GCS (arm shadow stack) patches also do the same thing.
Collectively, we think that at some point in the future many of these flows
will become generic (arch agnostic).

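For reference, a minimal user-space sketch of the token layout that
create_rstor_token() (quoted above) builds. The names model_rstor_token and
ENTRY_SIZE are hypothetical, the shadow stack is modelled as an ordinary
array, and a plain store stands in for the ssamoswap.d write, so this only
illustrates the address arithmetic, not the kernel code path:

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for SHSTK_ENTRY_SIZE in the patch (64-bit only). */
#define ENTRY_SIZE sizeof(uint64_t)

/*
 * Model of create_rstor_token(): "stack" is a plain array standing in for
 * shadow stack memory whose first slot has the (assumed aligned) address
 * "base". The caller is assumed to pass base < ssp within the array.
 */
static int model_rstor_token(uint64_t *stack, uint64_t base, uint64_t ssp,
			     uint64_t *token_addr)
{
	uint64_t addr;

	/* Token must be XLEN aligned, as in the patch. */
	if (ssp % ENTRY_SIZE)
		return -EINVAL;

	/* The token lives in the slot just below ssp ... */
	addr = ssp - ENTRY_SIZE;

	/* ... and its value is ssp itself (plain store, not ssamoswap.d). */
	stack[(addr - base) / ENTRY_SIZE] = ssp;

	if (token_addr)
		*token_addr = addr;

	return 0;
}
```

With a 4-entry stack based at 0x1000 and ssp = 0x1020, the token is written
to the last slot (address 0x1018) and holds 0x1020.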
>
>
>>+			VM_SHADOW_STACK | VM_WRITE, 0, &populate, NULL);
>>+	mmap_write_unlock(mm);
>>+
>>+	if (!set_tok || IS_ERR_VALUE(addr))
>>+		goto out;
>>+
>>+	if (create_rstor_token(addr + token_offset, &tok_loc)) {
>>+		vm_munmap(addr, size);
>>+		return -EINVAL;
>>+	}
>>+
>>+	addr = tok_loc;
>>+
>>+out:
>>+	return addr;
>>+}
>>+
>>+SYSCALL_DEFINE3(map_shadow_stack, unsigned long, addr, unsigned long, size, unsigned int, flags)
>>+{
>>+	bool set_tok = flags & SHADOW_STACK_SET_TOKEN;
>>+	unsigned long aligned_size = 0;
>>+
>>+	if (!cpu_supports_shadow_stack())
>>+		return -EOPNOTSUPP;
>>+
>>+	/* Anything other than set token should result in invalid param */
>>+	if (flags & ~SHADOW_STACK_SET_TOKEN)
>>+		return -EINVAL;
>>+
>>+	/*
>>+	 * Unlike other architectures, on RISC-V, SSP pointer is held in CSR_SSP and is available
>>+	 * CSR in all modes. CSR accesses are performed using 12bit index programmed in instruction
>>+	 * itself. This provides static property on register programming and writes to CSR can't
>>+	 * be unintentional from programmer's perspective. As long as programmer has guarded areas
>>+	 * which perform writes to CSR_SSP properly, shadow stack pivoting is not possible. Since
>>+	 * CSR_SSP is writeable by user mode, it itself can setup a shadow stack token subsequent
>>+	 * to allocation. Although in order to provide portability with other architecture (because
>>+	 * `map_shadow_stack` is arch agnostic syscall), RISC-V will follow expectation of a token
>>+	 * flag in flags and if provided in flags, setup a token at the base.
>>+	 */
>>+
>>+	/* If there isn't space for a token */
>>+	if (set_tok && size < SHSTK_ENTRY_SIZE)
>>+		return -ENOSPC;
>>+
>>+	if (addr && (addr % PAGE_SIZE))
>
>
>I would use:
>
>if (addr && (addr & (PAGE_SIZE - 1)))

Noted.

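A small user-space sketch of why the two spellings of the alignment check
agree. MODEL_PAGE_SIZE and both helper names are hypothetical, and a 4 KiB
page is assumed; the equivalence only holds because the page size is a power
of two:

```c
#include <stdbool.h>

/* Assumed 4 KiB page; the mask form requires a power-of-two page size. */
#define MODEL_PAGE_SIZE 4096UL

/* Form used in the patch: remainder of a division by the page size. */
static bool misaligned_mod(unsigned long addr)
{
	return (addr % MODEL_PAGE_SIZE) != 0;
}

/* Form suggested in review: test the low bits below the page size. */
static bool misaligned_mask(unsigned long addr)
{
	return (addr & (MODEL_PAGE_SIZE - 1)) != 0;
}
```

Both return the same answer for every address; the mask form just makes the
power-of-two assumption explicit and avoids relying on the compiler to
reduce the modulo.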
>
>
>>+		return -EINVAL;
>>+
>>+	aligned_size = PAGE_ALIGN(size);
>>+	if (aligned_size < size)
>>+		return -EOVERFLOW;
>>+
>>+	return allocate_shadow_stack(addr, aligned_size, size, set_tok);
>>+}
>>diff --git a/include/uapi/asm-generic/mman.h b/include/uapi/asm-generic/mman.h
>>index 57e8195d0b53..0c0ac6214de6 100644
>>--- a/include/uapi/asm-generic/mman.h
>>+++ b/include/uapi/asm-generic/mman.h
>>@@ -19,4 +19,5 @@
>> #define MCL_FUTURE 2 /* lock all future mappings */
>> #define MCL_ONFAULT 4 /* lock all pages that are faulted in */
>>+#define SHADOW_STACK_SET_TOKEN (1ULL << 0) /* Set up a restore token in the shadow stack */
>> #endif /* __ASM_GENERIC_MMAN_H */
>
>
>Don't we need to advertise this new syscall to the man pages?

`map_shadow_stack` is already mainline as part of x86. I am assuming there is
a man page for this. I'll check to be sure and confirm here.

>
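The size checks in the syscall body quoted above can be sketched in user
space as follows. MODEL_PAGE_SIZE, MODEL_PAGE_ALIGN and model_check_size are
hypothetical names, a 4 KiB page is assumed, and the macro only mimics what
PAGE_ALIGN() does for a power-of-two page size:

```c
#include <errno.h>
#include <limits.h>

/* Assumed 4 KiB page; stand-ins for the kernel's PAGE_SIZE/PAGE_ALIGN(). */
#define MODEL_PAGE_SIZE 4096UL
#define MODEL_PAGE_ALIGN(x) \
	(((x) + MODEL_PAGE_SIZE - 1) & ~(MODEL_PAGE_SIZE - 1))

/*
 * Mirrors the order of checks in the quoted syscall: first make sure a
 * token fits when one is requested, then round the size up to a page and
 * reject sizes for which the rounding wraps around.
 */
static long model_check_size(unsigned long size, int set_tok,
			     unsigned long *aligned)
{
	unsigned long a;

	/* No room for even a single token entry. */
	if (set_tok && size < sizeof(void *))
		return -ENOSPC;

	/* Unsigned wraparound makes the rounded size smaller than size. */
	a = MODEL_PAGE_ALIGN(size);
	if (a < size)
		return -EOVERFLOW;

	*aligned = a;
	return 0;
}
```

For example, a request of 1 byte without a token rounds up to one page,
while a request of ULONG_MAX wraps to 0 when rounded and is rejected with
-EOVERFLOW.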