Subject: Re: [PATCH v2 19/19] arm64: mte: Add Memory Tagging Extension documentation
From: Kevin Brodsky
Date: Mon, 9 Mar 2020 14:30:12 +0000
To: Catalin Marinas, linux-arm-kernel@lists.infradead.org
Cc: Will Deacon, Vincenzo Frascino, Szabolcs Nagy, Richard Earnshaw,
    Andrey Konovalov, Peter Collingbourne, linux-mm@kvack.org,
    linux-arch@vger.kernel.org
In-Reply-To: <20200226180526.3272848-20-catalin.marinas@arm.com>

On 26/02/2020 18:05, Catalin Marinas wrote:
> From: Vincenzo Frascino
>
> Memory Tagging Extension (part of the ARMv8.5 Extensions) provides
> a mechanism to detect the sources of memory related errors which
> may be vulnerable to exploitation, including bounds violations,
> use-after-free, use-after-return, use-out-of-scope and use before
> initialization errors.
>
> Add Memory Tagging Extension documentation for the arm64 linux
> kernel support.
>
> Signed-off-by: Vincenzo Frascino
> Co-developed-by: Catalin Marinas
> Signed-off-by: Catalin Marinas
> ---
>
> Notes:
>     v2:
>     - Documented the uaccess kernel tag checking mode.
>     - Removed the BTI definitions from cpu-feature-registers.rst.
>     - Removed the paragraph stating that MTE depends on the tagged address
>       ABI (while the Kconfig entry does, there is no requirement for the
>       user to enable both).
>     - Changed the GCR_EL1.Exclude handling description following the change
>       in the prctl() interface (include vs exclude mask).
>     - Updated the example code.
>
>  Documentation/arm64/cpu-feature-registers.rst |   2 +
>  Documentation/arm64/elf_hwcaps.rst            |   5 +
>  Documentation/arm64/index.rst                 |   1 +
>  .../arm64/memory-tagging-extension.rst        | 228 ++++++++++++++++++
>  4 files changed, 236 insertions(+)
>  create mode 100644 Documentation/arm64/memory-tagging-extension.rst
>
> diff --git a/Documentation/arm64/cpu-feature-registers.rst b/Documentation/arm64/cpu-feature-registers.rst
> index 41937a8091aa..b5679fa85ad9 100644
> --- a/Documentation/arm64/cpu-feature-registers.rst
> +++ b/Documentation/arm64/cpu-feature-registers.rst
> @@ -174,6 +174,8 @@ infrastructure:
>       +------------------------------+---------+---------+
>       | Name                         |  bits   | visible |
>       +------------------------------+---------+---------+
> +     | MTE                          | [11-8]  |    y    |
> +     +------------------------------+---------+---------+
>       | SSBS                         | [7-4]   |    y    |
>       +------------------------------+---------+---------+
>
> diff --git a/Documentation/arm64/elf_hwcaps.rst b/Documentation/arm64/elf_hwcaps.rst
> index 7dfb97dfe416..ca7f90e99e3a 100644
> --- a/Documentation/arm64/elf_hwcaps.rst
> +++ b/Documentation/arm64/elf_hwcaps.rst
> @@ -236,6 +236,11 @@ HWCAP2_RNG
>
>      Functionality implied by ID_AA64ISAR0_EL1.RNDR == 0b0001.
>
> +HWCAP2_MTE
> +
> +    Functionality implied by ID_AA64PFR1_EL1.MTE == 0b0010, as described
> +    by Documentation/arm64/memory-tagging-extension.rst.
> +
>  4. Unused AT_HWCAP bits
>  -----------------------
>
> diff --git a/Documentation/arm64/index.rst b/Documentation/arm64/index.rst
> index 5c0c69dc58aa..82970c6d384f 100644
> --- a/Documentation/arm64/index.rst
> +++ b/Documentation/arm64/index.rst
> @@ -13,6 +13,7 @@ ARM64 Architecture
>      hugetlbpage
>      legacy_instructions
>      memory
> +    memory-tagging-extension
>      pointer-authentication
>      silicon-errata
>      sve
> diff --git a/Documentation/arm64/memory-tagging-extension.rst b/Documentation/arm64/memory-tagging-extension.rst
> new file mode 100644
> index 000000000000..00ac0e22d5e9
> --- /dev/null
> +++ b/Documentation/arm64/memory-tagging-extension.rst
> @@ -0,0 +1,228 @@
> +===============================================
> +Memory Tagging Extension (MTE) in AArch64 Linux
> +===============================================
> +
> +Authors: Vincenzo Frascino
> +         Catalin Marinas
> +
> +Date: 2020-02-25
> +
> +This document describes the provision of the Memory Tagging Extension
> +functionality in AArch64 Linux.
> +
> +Introduction
> +============
> +
> +ARMv8.5 based processors introduce the Memory Tagging Extension (MTE)
> +feature. MTE is built on top of the ARMv8.0 virtual address tagging TBI
> +(Top Byte Ignore) feature and allows software to access a 4-bit
> +allocation tag for each 16-byte granule in the physical address space.
> +Such memory range must be mapped with the Normal-Tagged memory
> +attribute. A logical tag is derived from bits 59-56 of the virtual
> +address used for the memory access. A CPU with MTE enabled will compare
> +the logical tag against the allocation tag and potentially raise an
> +exception on mismatch, subject to system registers configuration.
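
The bit layout described in the Introduction can be sketched in plain C. This is only an illustration of the arithmetic (logical tag in bits 59-56, one allocation tag per 16-byte granule); the macro and helper names below are hypothetical and are not part of the patch or of any kernel header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, chosen for this sketch only. */
#define MTE_TAG_SHIFT    56      /* logical tag occupies bits 59-56 */
#define MTE_TAG_MASK     0xfULL  /* 4-bit tag */
#define MTE_GRANULE_SIZE 16ULL   /* one allocation tag per 16 bytes */

/* Return the 4-bit logical tag carried in a virtual address. */
static inline unsigned int mte_logical_tag(uint64_t addr)
{
        return (unsigned int)((addr >> MTE_TAG_SHIFT) & MTE_TAG_MASK);
}

/* Return addr with its logical tag replaced by tag (0-15). */
static inline uint64_t mte_set_logical_tag(uint64_t addr, unsigned int tag)
{
        assert(tag <= MTE_TAG_MASK);
        addr &= ~(MTE_TAG_MASK << MTE_TAG_SHIFT);
        return addr | ((uint64_t)tag << MTE_TAG_SHIFT);
}

/* Round an address down to the start of its 16-byte tag granule. */
static inline uint64_t mte_granule_base(uint64_t addr)
{
        return addr & ~(MTE_GRANULE_SIZE - 1);
}
```

Two addresses whose `mte_granule_base()` values are equal share one allocation tag, which is why the example at the end of the document faults on `a[2]` (byte offset 16) but not on `a[1]` (byte offset 8).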
> +
> +Userspace Support
> +=================
> +
> +When ``CONFIG_ARM64_MTE`` is selected and Memory Tagging Extension is
> +supported by the hardware, the kernel advertises the feature to
> +userspace via ``HWCAP2_MTE``.
> +
> +PROT_MTE
> +--------
> +
> +To access the allocation tags, a user process must enable the Tagged
> +memory attribute on an address range using a new ``prot`` flag for
> +``mmap()`` and ``mprotect()``:
> +
> +``PROT_MTE`` - Pages allow access to the MTE allocation tags.
> +
> +The allocation tag is set to 0 when such pages are first mapped in the
> +user address space and preserved on copy-on-write. ``MAP_SHARED`` is
> +supported and the allocation tags can be shared between processes.
> +
> +**Note**: ``PROT_MTE`` is only supported on ``MAP_ANONYMOUS`` and
> +RAM-based file mappings (``tmpfs``, ``memfd``). Passing it to other
> +types of mapping will result in ``-EINVAL`` returned by these system
> +calls.
> +
> +**Note**: The ``PROT_MTE`` flag (and corresponding memory type) cannot
> +be cleared by ``mprotect()``.
> +
> +Tag Check Faults
> +----------------
> +
> +When ``PROT_MTE`` is enabled on an address range and a mismatch between
> +the logical and allocation tags occurs on access, there are three
> +configurable behaviours:
> +
> +- *Ignore* - This is the default mode. The CPU (and kernel) ignores the
> +  tag check fault.
> +
> +- *Synchronous* - The kernel raises a ``SIGSEGV`` synchronously, with
> +  ``.si_code = SEGV_MTESERR`` and ``.si_addr = <fault-address>``. The
> +  memory access is not performed.
> +
> +- *Asynchronous* - The kernel raises a ``SIGSEGV``, in the current
> +  thread, asynchronously following one or multiple tag check faults,
> +  with ``.si_code = SEGV_MTEAERR`` and ``.si_addr = 0``.
> +
> +**Note**: There are no *match-all* logical tags available for user
> +applications.
> +
> +The user can select the above modes, per thread, using the
> +``prctl(PR_SET_TAGGED_ADDR_CTRL, flags, 0, 0, 0)`` system call where
> +``flags`` contain one of the following values in the ``PR_MTE_TCF_MASK``
> +bit-field:
> +
> +- ``PR_MTE_TCF_NONE``  - *Ignore* tag check faults
> +- ``PR_MTE_TCF_SYNC``  - *Synchronous* tag check fault mode
> +- ``PR_MTE_TCF_ASYNC`` - *Asynchronous* tag check fault mode
> +
> +Tag checking can also be disabled for a user thread by setting the
> +``PSTATE.TCO`` bit with ``MSR TCO, #1``.
> +
> +**Note**: Signal handlers are always invoked with ``PSTATE.TCO = 0``,
> +irrespective of the interrupted context.
> +
> +**Note**: Kernel accesses to user memory (e.g. ``read()`` system call)
> +follow the same tag checking mode as set by the current thread.
> +
> +Excluding Tags in the ``IRG``, ``ADDG`` and ``SUBG`` instructions
> +-----------------------------------------------------------------
> +
> +The architecture allows excluding certain tags to be randomly generated
> +via the ``GCR_EL1.Exclude`` register bit-field. By default, Linux
> +excludes all tags other than 0. A user thread can enable specific tags
> +in the randomly generated set using the ``prctl(PR_SET_TAGGED_ADDR_CTRL,
> +flags, 0, 0, 0)`` system call where ``flags`` contains the tags bitmap
> +in the ``PR_MTE_TAG_MASK`` bit-field.
> +
> +**Note**: The hardware uses an exclude mask but the ``prctl()``
> +interface provides an include mask.

Maybe it's worth mentioning that a tag mask of 0x0, or equivalently an exclusion
mask of 0xFFFF, results in the generated tag always being 0. This is not very clear
even in the Arm ARM, where this is only specified in the pseudocode (I shall report
this internally).

Kevin

> +
> +Example of correct usage
> +========================
> +
> +*MTE Example code*
> +
> +.. code-block:: c
> +
> +    /*
> +     * To be compiled with -march=armv8.5-a+memtag
> +     */
> +    #include <errno.h>
> +    #include <stdio.h>
> +    #include <stdlib.h>
> +    #include <unistd.h>
> +    #include <sys/auxv.h>
> +    #include <sys/mman.h>
> +    #include <sys/prctl.h>
> +
> +    /*
> +     * From arch/arm64/include/uapi/asm/hwcap.h
> +     */
> +    #define HWCAP2_MTE              (1 << 18)
> +
> +    /*
> +     * From arch/arm64/include/uapi/asm/mman.h
> +     */
> +    #define PROT_MTE                 0x20
> +
> +    /*
> +     * From include/uapi/linux/prctl.h
> +     */
> +    #define PR_SET_TAGGED_ADDR_CTRL 55
> +    #define PR_GET_TAGGED_ADDR_CTRL 56
> +    # define PR_TAGGED_ADDR_ENABLE  (1UL << 0)
> +    # define PR_MTE_TCF_SHIFT       1
> +    # define PR_MTE_TCF_NONE        (0UL << PR_MTE_TCF_SHIFT)
> +    # define PR_MTE_TCF_SYNC        (1UL << PR_MTE_TCF_SHIFT)
> +    # define PR_MTE_TCF_ASYNC       (2UL << PR_MTE_TCF_SHIFT)
> +    # define PR_MTE_TCF_MASK        (3UL << PR_MTE_TCF_SHIFT)
> +    # define PR_MTE_TAG_SHIFT       3
> +    # define PR_MTE_TAG_MASK        (0xffffUL << PR_MTE_TAG_SHIFT)
> +
> +    /*
> +     * Insert a random logical tag into the given pointer.
> +     */
> +    #define insert_random_tag(ptr) ({                       \
> +            __u64 __val;                                    \
> +            asm("irg %0, %1" : "=r" (__val) : "r" (ptr));   \
> +            __val;                                          \
> +    })
> +
> +    /*
> +     * Set the allocation tag on the destination address.
> +     */
> +    #define set_tag(tagged_addr) do {                                      \
> +            asm volatile("stg %0, [%0]" : : "r" (tagged_addr) : "memory"); \
> +    } while (0)
> +
> +    int main()
> +    {
> +            unsigned long *a;
> +            unsigned long page_sz = getpagesize();
> +            unsigned long hwcap2 = getauxval(AT_HWCAP2);
> +
> +            /* check if MTE is present */
> +            if (!(hwcap2 & HWCAP2_MTE))
> +                    return -1;
> +
> +            /*
> +             * Enable the tagged address ABI, synchronous MTE tag check faults and
> +             * allow all non-zero tags in the randomly generated set.
> +             */
> +            if (prctl(PR_SET_TAGGED_ADDR_CTRL,
> +                      PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC | (0xfffe << PR_MTE_TAG_SHIFT),
> +                      0, 0, 0)) {
> +                    perror("prctl() failed");
> +                    return -1;
> +            }
> +
> +            a = mmap(0, page_sz, PROT_READ | PROT_WRITE,
> +                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> +            if (a == MAP_FAILED) {
> +                    perror("mmap() failed");
> +                    return -1;
> +            }
> +
> +            /*
> +             * Enable MTE on the above anonymous mmap. The flag could be passed
> +             * directly to mmap() and skip this step.
> +             */
> +            if (mprotect(a, page_sz, PROT_READ | PROT_WRITE | PROT_MTE)) {
> +                    perror("mprotect() failed");
> +                    return -1;
> +            }
> +
> +            /* access with the default tag (0) */
> +            a[0] = 1;
> +            a[1] = 2;
> +
> +            printf("a[0] = %lu a[1] = %lu\n", a[0], a[1]);
> +
> +            /* set the logical and allocation tags */
> +            a = (unsigned long *)insert_random_tag(a);
> +            set_tag(a);
> +
> +            printf("%p\n", a);
> +
> +            /* non-zero tag access */
> +            a[0] = 3;
> +            printf("a[0] = %lu a[1] = %lu\n", a[0], a[1]);
> +
> +            /*
> +             * If MTE is enabled correctly the next instruction will generate an
> +             * exception.
> +             */
> +            printf("Expecting SIGSEGV...\n");
> +            a[2] = 0xdead;
> +
> +            /* this should not be printed in the PR_MTE_TCF_SYNC mode */
> +            printf("...done\n");
> +
> +            return 0;
> +    }
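
The relationship between the `prctl()` include mask and the hardware exclude mask discussed in this thread can be sketched as follows. The `PR_*` values are copied from the example in the quoted patch; the two helper functions are hypothetical and only illustrate the bit arithmetic, under the assumption that the exclude mask is simply the 16-bit complement of the include mask.

```c
#include <assert.h>
#include <stdint.h>

/* Values as quoted in the patch from include/uapi/linux/prctl.h. */
#define PR_TAGGED_ADDR_ENABLE   (1UL << 0)
#define PR_MTE_TCF_SHIFT        1
#define PR_MTE_TCF_NONE         (0UL << PR_MTE_TCF_SHIFT)
#define PR_MTE_TCF_SYNC         (1UL << PR_MTE_TCF_SHIFT)
#define PR_MTE_TCF_ASYNC        (2UL << PR_MTE_TCF_SHIFT)
#define PR_MTE_TCF_MASK         (3UL << PR_MTE_TCF_SHIFT)
#define PR_MTE_TAG_SHIFT        3
#define PR_MTE_TAG_MASK         (0xffffUL << PR_MTE_TAG_SHIFT)

/*
 * Hypothetical helper: build the flags argument for
 * prctl(PR_SET_TAGGED_ADDR_CTRL, flags, 0, 0, 0) from a tag check fault
 * mode and a 16-bit include mask (bit n set means tag n may be generated).
 */
static inline unsigned long mte_ctrl_flags(unsigned long tcf_mode,
                                           uint16_t include_mask)
{
        assert((tcf_mode & ~PR_MTE_TCF_MASK) == 0);
        return PR_TAGGED_ADDR_ENABLE | tcf_mode |
               ((unsigned long)include_mask << PR_MTE_TAG_SHIFT);
}

/*
 * Hypothetical helper: the exclude mask corresponding to an include mask,
 * assuming a plain 16-bit complement.
 */
static inline uint16_t mte_exclude_mask(uint16_t include_mask)
{
        return (uint16_t)~include_mask;
}
```

With these helpers, an include mask of 0x0000 maps to an exclude mask of 0xFFFF, the case in which the generated tag is always 0, while the 0xfffe mask used in the example maps to an exclude mask of 0x0001, so only tag 0 is excluded.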