From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.6 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 317B7C43603 for ; Thu, 12 Dec 2019 15:17:19 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id BAC5A21655 for ; Thu, 12 Dec 2019 15:17:18 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=axtens.net header.i=@axtens.net header.b="YUuQUMd5" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org BAC5A21655 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=axtens.net Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 433C48E0009; Thu, 12 Dec 2019 10:17:18 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 3BC388E0001; Thu, 12 Dec 2019 10:17:18 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 237B18E0009; Thu, 12 Dec 2019 10:17:18 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0222.hostedemail.com [216.40.44.222]) by kanga.kvack.org (Postfix) with ESMTP id 07D8C8E0001 for ; Thu, 12 Dec 2019 10:17:18 -0500 (EST) Received: from smtpin27.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with SMTP id B41494DCA for ; Thu, 12 Dec 2019 15:17:17 +0000 (UTC) X-FDA: 76256842914.27.dolls53_79c3f17d6a037 X-HE-Tag: dolls53_79c3f17d6a037 X-Filterd-Recvd-Size: 29676 Received: from mail-pj1-f65.google.com (mail-pj1-f65.google.com [209.85.216.65]) by imf41.hostedemail.com (Postfix) with ESMTP for ; Thu, 12 Dec 2019 15:17:16 +0000 (UTC) Received: by mail-pj1-f65.google.com with SMTP id s35so1156236pjb.7 for ; Thu, 12 Dec 2019 07:17:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=axtens.net; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=tmJHdXlJjhLweSaxrfPq2cQDIdmOIjj5LpdPinOKbRU=; b=YUuQUMd5GbTljLXnTeHmoTU3iVgzq9q1QGBfSi+MrjZtsq/OTKKjd5n5994+gAl45v gxLzF94sa2Qd9NYaV3wo3BFuVseVjUEUbinOJSKLZr/gYYBNM5JO56tf7Eywgg2/mapJ Tb4y+OUQfvWz2oH62f4M2EZSt4gnH9FHCeDo0= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=tmJHdXlJjhLweSaxrfPq2cQDIdmOIjj5LpdPinOKbRU=; b=XpSZQ8x2e9KQEuJvhFTsva8D6jMMetyua0Mz1bynYC5b3itgP7gYgDNFSjlJtVHWGs vDFX0VJQnQJz6Nn9HHSlJ96sBUoWofLYn1ofxKaF20Er9B7O2SWFbPEapvGhQNgGgwvK mPeVtZMhicLtwIDtfaFBz6wb3suxok2ECwYWgJ473pSQOlWOfAp5buq3aGLp8PH5TMRz MFc2vRI0YkzEp+qlY4lUVUhCJEiecXWnoTCHU0WNoFDDZvE+VJ5Orgp2kRdCiZ/A/AQO u1Z6bFdKrKylp5AI9BAOt+82UO9/r4mYCeCBuumDEz+YQAbRGIpNWm8LB6p/cE4wHotC M8SA== X-Gm-Message-State: APjAAAXb/Qfq5BDy6TpENPWPUDfCyrveUkNJAUUG+cd7NVhneYo9KmHw YZG2np0u1YrtkY/s2OCqCwvT0A== X-Google-Smtp-Source: APXvYqysfSeYFyBskapHtBnD5mgPIx/O4v11PcWc7As6jiLRyaolLZ5Kq09iRWpcb9zUKs8IbHQ4TQ== X-Received: by 2002:a17:90a:8c02:: with SMTP id a2mr10442822pjo.79.1576163835041; Thu, 12 Dec 2019 07:17:15 -0800 (PST) Received: from localhost (2001-44b8-1113-6700-b116-2689-a4a9-76f8.static.ipv6.internode.on.net. [2001:44b8:1113:6700:b116:2689:a4a9:76f8]) by smtp.gmail.com with ESMTPSA id r68sm8359089pfr.78.2019.12.12.07.17.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 12 Dec 2019 07:17:13 -0800 (PST) From: Daniel Axtens To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, kasan-dev@googlegroups.com, christophe.leroy@c-s.fr, aneesh.kumar@linux.ibm.com, bsingharora@gmail.com Cc: Daniel Axtens , Michael Ellerman Subject: [PATCH v3 3/3] powerpc: Book3S 64-bit "heavyweight" KASAN support Date: Fri, 13 Dec 2019 02:16:56 +1100 Message-Id: <20191212151656.26151-4-dja@axtens.net> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20191212151656.26151-1-dja@axtens.net> References: <20191212151656.26151-1-dja@axtens.net> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: KASAN support on Book3S is a bit tricky to get right: - It would be good to support inline instrumentation so as to be able to catch stack issues that cannot be caught with outline mode. - Inline instrumentation requires a fixed offset. - Book3S runs code in real mode after booting. Most notably a lot of KVM runs in real mode, and it would be good to be able to instrument it. - Because code runs in real mode after boot, the offset has to point to valid memory both in and out of real mode. [For those not immersed in ppc64, in real mode, the top nibble or 2 bi= ts (depending on radix/hash mmu) of the address is ignored. The linear mapping is placed at 0xc000000000000000. This means that a pointer to part of the linear mapping will work both in real mode, where it will = be interpreted as a physical address of the form 0x000..., and out of rea= l mode, where it will go via the linear mapping.] One approach is just to give up on inline instrumentation. This way all checks can be delayed until after everything set is up correctly, and the address-to-shadow calculations can be overridden. However, the features a= nd speed boost provided by inline instrumentation are worth trying to do better. If _at compile time_ it is known how much contiguous physical memory a system has, the top 1/8th of the first block of physical memory can be se= t aside for the shadow. This is a big hammer and comes with 3 big consequences: - there's no nice way to handle physically discontiguous memory, so only the first physical memory block can be used. - kernels will simply fail to boot on machines with less memory than specified when compiling. - kernels running on machines with more memory than specified when compiling will simply ignore the extra memory. Implement and document KASAN this way. The current implementation is Radi= x only. Despite the limitations, it can still find bugs, e.g. http://patchwork.ozlabs.org/patch/1103775/ At the moment, this physical memory limit must be set _even for outline mode_. This may be changed in a later series - a different implementation could be added for outline mode that dynamically allocates shadow at a fixed offset. For example, see https://patchwork.ozlabs.org/patch/795211/ Suggested-by: Michael Ellerman Cc: Balbir Singh # ppc64 out-of-line radix versio= n Cc: Christophe Leroy # ppc32 version Signed-off-by: Daniel Axtens --- Changes since v2: - Address feedback from Christophe around cleanups and docs. - Address feedback from Balbir: at this point I don't have a good soluti= on for the issues you identify around the limitations of the inline imple= mentation but I think that it's worth trying to get the stack instrumentation su= pport. I'm happy to have an alternative and more flexible outline mode - I ha= d envisoned this would be called 'lightweight' mode as it imposes fewer = restrictions. I've linked to your implementation. I think it's best to add it in a f= ollow-up series. - Made the default PHYS_MEM_SIZE_FOR_KASAN value 1024MB. I think most pe= ople have guests with at least that much memory in the Radix 64s case so it's a = much saner default - it means that if you just turn on KASAN without readin= g the docs you're much more likely to have a bootable kernel, which you will= never have if the value is set to zero! I'm happy to bikeshed the value if w= e want. Changes since v1: - Landed kasan vmalloc support upstream - Lots of feedback from Christophe. Changes since the rfc: - Boots real and virtual hardware, kvm works. - disabled reporting when we're checking the stack for exception frames. The behaviour isn't wrong, just incompatible with KASAN. - Documentation! - Dropped old module stuff in favour of KASAN_VMALLOC. The bugs with ftrace and kuap were due to kernel bloat pushing prom_init calls to be done via the plt. Because we did not have a relocatable kernel, and they are done very early, this caused everything to explode. Compile with CONFIG_RELOCATABLE! --- Documentation/dev-tools/kasan.rst | 8 +- Documentation/powerpc/kasan.txt | 112 +++++++++++++++++- arch/powerpc/Kconfig | 3 + arch/powerpc/Kconfig.debug | 21 ++++ arch/powerpc/Makefile | 11 ++ arch/powerpc/include/asm/book3s/64/hash.h | 4 + arch/powerpc/include/asm/book3s/64/pgtable.h | 7 ++ arch/powerpc/include/asm/book3s/64/radix.h | 5 + arch/powerpc/include/asm/kasan.h | 21 +++- arch/powerpc/kernel/process.c | 8 ++ arch/powerpc/kernel/prom.c | 64 +++++++++- arch/powerpc/mm/kasan/Makefile | 3 +- .../mm/kasan/{kasan_init_32.c =3D> init_32.c} | 0 arch/powerpc/mm/kasan/init_book3s_64.c | 72 +++++++++++ 14 files changed, 330 insertions(+), 9 deletions(-) rename arch/powerpc/mm/kasan/{kasan_init_32.c =3D> init_32.c} (100%) create mode 100644 arch/powerpc/mm/kasan/init_book3s_64.c diff --git a/Documentation/dev-tools/kasan.rst b/Documentation/dev-tools/= kasan.rst index 4af2b5d2c9b4..d99dc580bc11 100644 --- a/Documentation/dev-tools/kasan.rst +++ b/Documentation/dev-tools/kasan.rst @@ -22,8 +22,9 @@ global variables yet. Tag-based KASAN is only supported in Clang and requires version 7.0.0 or= later. =20 Currently generic KASAN is supported for the x86_64, arm64, xtensa and s= 390 -architectures. It is also supported on 32-bit powerpc kernels. Tag-based= KASAN -is supported only on arm64. +architectures. It is also supported on powerpc, for 32-bit kernels, and = for +64-bit kernels running under the Radix MMU. Tag-based KASAN is supported= only +on arm64. =20 Usage ----- @@ -256,7 +257,8 @@ CONFIG_KASAN_VMALLOC ~~~~~~~~~~~~~~~~~~~~ =20 With ``CONFIG_KASAN_VMALLOC``, KASAN can cover vmalloc space at the -cost of greater memory usage. Currently this is only supported on x86. +cost of greater memory usage. Currently this is optional on x86, and +required on 64-bit powerpc. =20 This works by hooking into vmalloc and vmap, and dynamically allocating real shadow memory to back the mappings. diff --git a/Documentation/powerpc/kasan.txt b/Documentation/powerpc/kasa= n.txt index a85ce2ff8244..f134a91600ad 100644 --- a/Documentation/powerpc/kasan.txt +++ b/Documentation/powerpc/kasan.txt @@ -1,4 +1,4 @@ -KASAN is supported on powerpc on 32-bit only. +KASAN is supported on powerpc on 32-bit and Radix 64-bit only. =20 32 bit support =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D @@ -10,3 +10,113 @@ fixmap area and occupies one eighth of the total kern= el virtual memory space. =20 Instrumentation of the vmalloc area is not currently supported, but modu= les are. + +64 bit support +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +Currently, only the radix MMU is supported. There have been versions for= Book3E +processors floating around on the mailing list, but nothing has been mer= ged. + +KASAN support on Book3S is a bit tricky to get right: + + - It would be good to support inline instrumentation so as to be able t= o catch + stack issues that cannot be caught with outline mode. + + - Inline instrumentation requires a fixed offset. + + - Book3S runs code in real mode after booting. Most notably a lot of KV= M runs + in real mode, and it would be good to be able to instrument it. + + - Because code runs in real mode after boot, the offset has to point to + valid memory both in and out of real mode. + +One approach is just to give up on inline instrumentation. This way all = checks +can be delayed until after everything set is up correctly, and the +address-to-shadow calculations can be overridden. However, the features = and +speed boost provided by inline instrumentation are worth trying to do be= tter. + +If _at compile time_ it is known how much contiguous physical memory a s= ystem +has, the top 1/8th of the first block of physical memory can be set asid= e for +the shadow. This is a big hammer and comes with 3 big consequences: + + - there's no nice way to handle physically discontiguous memory, so onl= y the + first physical memory block can be used. + + - kernels will simply fail to boot on machines with less memory than sp= ecified + when compiling. + + - kernels running on machines with more memory than specified when comp= iling + will simply ignore the extra memory. + +At the moment, this physical memory limit must be set _even for outline = mode_. +This may be changed in a future version - a different implementation cou= ld be +added for outline mode that dynamically allocates shadow at a fixed offs= et. +For example, see https://patchwork.ozlabs.org/patch/795211/ + +This value is configured in CONFIG_PHYS_MEM_SIZE_FOR_KASAN. + +Tips +---- + + - Compile with CONFIG_RELOCATABLE. + + In development, boot hangs were observed when building with ftrace an= d KUAP + on. These ended up being due to kernel bloat pushing prom_init calls = to be + done via the PLT. Because the kernel was not relocatable, and the cal= ls are + done very early, this caused execution to jump off into somewhere + invalid. Enabling relocation fixes this. + +NUMA/discontiguous physical memory +---------------------------------- + +Currently the code cannot really deal with discontiguous physical memory= . Only +physical memory that is contiguous from physical address zero can be use= d. The +size of that memory, not total memory, must be specified when configurin= g the +kernel. + +Discontiguous memory can occur on machines with memory spread across mul= tiple +nodes. For example, on a Talos II with 64GB of RAM: + + - 32GB runs from 0x0 to 0x0000_0008_0000_0000, + - then there's a gap, + - then the final 32GB runs from 0x0000_2000_0000_0000 to 0x0000_2008_00= 00_0000 + +This can create _significant_ issues: + + - If the machine is treated as having 64GB of _contiguous_ RAM, the + instrumentation would assume that it ran from 0x0 to + 0x0000_0010_0000_0000. The last 1/8th - 0x0000_000e_0000_0000 to + 0x0000_0010_0000_0000 would be reserved as the shadow region. But whe= n the + kernel tried to access any of that, it would be trying to access page= s that + are not physically present. + + - If the shadow region size is based on the top address, then the shado= w + region would be 0x2008_0000_0000 / 8 =3D 0x0401_0000_0000 bytes =3D 4= 100 GB of + memory, clearly more than the 64GB of RAM physically present. + +Therefore, the code currently is restricted to dealing with memory in th= e node +starting at 0x0. For this system, that's 32GB. If a contiguous physical = memory +size greater than the size of the first contiguous region of memory is +specified, the system will be unable to boot or even print an error mess= age. + +The layout of a system's memory can be observed in the messages that the= Radix +MMU prints on boot. The Talos II discussed earlier has: + +radix-mmu: Mapped 0x0000000000000000-0x0000000040000000 with 1.00 GiB pa= ges (exec) +radix-mmu: Mapped 0x0000000040000000-0x0000000800000000 with 1.00 GiB pa= ges +radix-mmu: Mapped 0x0000200000000000-0x0000200800000000 with 1.00 GiB pa= ges + +As discussed, this system would be configured for 32768 MB. + +Another system prints: + +radix-mmu: Mapped 0x0000000000000000-0x0000000040000000 with 1.00 GiB pa= ges (exec) +radix-mmu: Mapped 0x0000000040000000-0x0000002000000000 with 1.00 GiB pa= ges +radix-mmu: Mapped 0x0000200000000000-0x0000202000000000 with 1.00 GiB pa= ges + +This machine has more memory: 0x0000_0040_0000_0000 total, but only +0x0000_0020_0000_0000 is physically contiguous from zero, so it would be +configured for 131072 MB of physically contiguous memory. + +This restriction currently also affects outline mode, but this could be +changed in future if an alternative outline implementation is added. diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 6987b0832e5f..2561446e85a8 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -173,6 +173,9 @@ config PPC select HAVE_ARCH_HUGE_VMAP if PPC_BOOK3S_64 && PPC_RADIX_MMU select HAVE_ARCH_JUMP_LABEL select HAVE_ARCH_KASAN if PPC32 + select HAVE_ARCH_KASAN if PPC_BOOK3S_64 && PPC_RADIX_MMU + select HAVE_ARCH_KASAN_VMALLOC if PPC_BOOK3S_64 && PPC_RADIX_MMU + select KASAN_VMALLOC if KASAN && PPC_BOOK3S_64 select HAVE_ARCH_KGDB select HAVE_ARCH_MMAP_RND_BITS select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug index 4e1d39847462..5c454f8fa24b 100644 --- a/arch/powerpc/Kconfig.debug +++ b/arch/powerpc/Kconfig.debug @@ -394,6 +394,27 @@ config PPC_FAST_ENDIAN_SWITCH help If you're unsure what this is, say N. =20 +config PHYS_MEM_SIZE_FOR_KASAN + int "Contiguous physical memory size for KASAN (MB)" if KASAN && PPC_BO= OK3S_64 + default 1024 + help + + To get inline instrumentation support for KASAN on 64-bit Book3S + machines, you need to know how much contiguous physical memory your + system has. A shadow offset will be calculated based on this figure, + which will be compiled in to the kernel. KASAN will use this offset + to access its shadow region, which is used to verify memory accesses. + + If you attempt to boot on a system with less memory than you specify + here, your system will fail to boot very early in the process. If you + boot on a system with more memory than you specify, the extra memory + will wasted - it will be reserved and not used. + + For systems with discontiguous blocks of physical memory, specify the + size of the block starting at 0x0. You can determine this by looking + at the memory layout info printed to dmesg by the radix MMU code + early in boot. See Documentation/powerpc/kasan.txt. + config KASAN_SHADOW_OFFSET hex depends on KASAN diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile index f35730548e42..eff693527462 100644 --- a/arch/powerpc/Makefile +++ b/arch/powerpc/Makefile @@ -230,6 +230,17 @@ ifdef CONFIG_476FPE_ERR46 -T $(srctree)/arch/powerpc/platforms/44x/ppc476_modules.lds endif =20 +ifdef CONFIG_PPC_BOOK3S_64 +# The KASAN shadow offset is such that linear map (0xc000...) is shadowe= d by +# the last 8th of linearly mapped physical memory. This way, if the code= uses +# 0xc addresses throughout, accesses work both in in real mode (where th= e top +# 2 bits are ignored) and outside of real mode. +# +# 0xc000000000000000 >> 3 =3D 0xa800000000000000 =3D 1210567579837189324= 8 +KASAN_SHADOW_OFFSET =3D $(shell echo 7 \* 1024 \* 1024 \* $(CONFIG_PHYS_= MEM_SIZE_FOR_KASAN) / 8 + 12105675798371893248 | bc) +KBUILD_CFLAGS +=3D -DKASAN_SHADOW_OFFSET=3D$(KASAN_SHADOW_OFFSET)UL +endif + # No AltiVec or VSX instructions when building kernel KBUILD_CFLAGS +=3D $(call cc-option,-mno-altivec) KBUILD_CFLAGS +=3D $(call cc-option,-mno-vsx) diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/inc= lude/asm/book3s/64/hash.h index 2781ebf6add4..fce329b8452e 100644 --- a/arch/powerpc/include/asm/book3s/64/hash.h +++ b/arch/powerpc/include/asm/book3s/64/hash.h @@ -18,6 +18,10 @@ #include #endif =20 +#define H_PTRS_PER_PTE (1 << H_PTE_INDEX_SIZE) +#define H_PTRS_PER_PMD (1 << H_PMD_INDEX_SIZE) +#define H_PTRS_PER_PUD (1 << H_PUD_INDEX_SIZE) + /* Bits to set in a PMD/PUD/PGD entry valid bit*/ #define HASH_PMD_VAL_BITS (0x8000000000000000UL) #define HASH_PUD_VAL_BITS (0x8000000000000000UL) diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/= include/asm/book3s/64/pgtable.h index b01624e5c467..209817235a44 100644 --- a/arch/powerpc/include/asm/book3s/64/pgtable.h +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h @@ -231,6 +231,13 @@ extern unsigned long __pmd_frag_size_shift; #define PTRS_PER_PUD (1 << PUD_INDEX_SIZE) #define PTRS_PER_PGD (1 << PGD_INDEX_SIZE) =20 +#define MAX_PTRS_PER_PTE ((H_PTRS_PER_PTE > R_PTRS_PER_PTE) ? \ + H_PTRS_PER_PTE : R_PTRS_PER_PTE) +#define MAX_PTRS_PER_PMD ((H_PTRS_PER_PMD > R_PTRS_PER_PMD) ? \ + H_PTRS_PER_PMD : R_PTRS_PER_PMD) +#define MAX_PTRS_PER_PUD ((H_PTRS_PER_PUD > R_PTRS_PER_PUD) ? \ + H_PTRS_PER_PUD : R_PTRS_PER_PUD) + /* PMD_SHIFT determines what a second-level page table entry can map */ #define PMD_SHIFT (PAGE_SHIFT + PTE_INDEX_SIZE) #define PMD_SIZE (1UL << PMD_SHIFT) diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/in= clude/asm/book3s/64/radix.h index d97db3ad9aae..4f826259de71 100644 --- a/arch/powerpc/include/asm/book3s/64/radix.h +++ b/arch/powerpc/include/asm/book3s/64/radix.h @@ -35,6 +35,11 @@ #define RADIX_PMD_SHIFT (PAGE_SHIFT + RADIX_PTE_INDEX_SIZE) #define RADIX_PUD_SHIFT (RADIX_PMD_SHIFT + RADIX_PMD_INDEX_SIZE) #define RADIX_PGD_SHIFT (RADIX_PUD_SHIFT + RADIX_PUD_INDEX_SIZE) + +#define R_PTRS_PER_PTE (1 << RADIX_PTE_INDEX_SIZE) +#define R_PTRS_PER_PMD (1 << RADIX_PMD_INDEX_SIZE) +#define R_PTRS_PER_PUD (1 << RADIX_PUD_INDEX_SIZE) + /* * Size of EA range mapped by our pagetables. */ diff --git a/arch/powerpc/include/asm/kasan.h b/arch/powerpc/include/asm/= kasan.h index 296e51c2f066..f18268cbdc33 100644 --- a/arch/powerpc/include/asm/kasan.h +++ b/arch/powerpc/include/asm/kasan.h @@ -2,6 +2,9 @@ #ifndef __ASM_KASAN_H #define __ASM_KASAN_H =20 +#include +#include + #ifdef CONFIG_KASAN #define _GLOBAL_KASAN(fn) _GLOBAL(__##fn) #define _GLOBAL_TOC_KASAN(fn) _GLOBAL_TOC(__##fn) @@ -14,13 +17,19 @@ =20 #ifndef __ASSEMBLY__ =20 -#include +#ifdef CONFIG_KASAN +void kasan_init(void); +#else +static inline void kasan_init(void) { } +#endif =20 #define KASAN_SHADOW_SCALE_SHIFT 3 =20 #define KASAN_SHADOW_START (KASAN_SHADOW_OFFSET + \ (PAGE_OFFSET >> KASAN_SHADOW_SCALE_SHIFT)) =20 +#ifdef CONFIG_PPC32 + #define KASAN_SHADOW_OFFSET ASM_CONST(CONFIG_KASAN_SHADOW_OFFSET) =20 #define KASAN_SHADOW_END 0UL @@ -30,11 +39,17 @@ #ifdef CONFIG_KASAN void kasan_early_init(void); void kasan_mmu_init(void); -void kasan_init(void); #else -static inline void kasan_init(void) { } static inline void kasan_mmu_init(void) { } #endif +#endif + +#ifdef CONFIG_PPC_BOOK3S_64 + +#define KASAN_SHADOW_SIZE ((u64)CONFIG_PHYS_MEM_SIZE_FOR_KASAN * \ + 1024 * 1024 * 1 / 8) + +#endif /* CONFIG_PPC_BOOK3S_64 */ =20 #endif /* __ASSEMBLY */ #endif diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.= c index 4df94b6e2f32..c60ff299f39b 100644 --- a/arch/powerpc/kernel/process.c +++ b/arch/powerpc/kernel/process.c @@ -2081,7 +2081,14 @@ void show_stack(struct task_struct *tsk, unsigned = long *stack) /* * See if this is an exception frame. * We look for the "regshere" marker in the current frame. + * + * KASAN may complain about this. If it is an exception frame, + * we won't have unpoisoned the stack in asm when we set the + * exception marker. If it's not an exception frame, who knows + * how things are laid out - the shadow could be in any state + * at all. Just disable KASAN reporting for now. */ + kasan_disable_current(); if (validate_sp(sp, tsk, STACK_INT_FRAME_SIZE) && stack[STACK_FRAME_MARKER] =3D=3D STACK_FRAME_REGS_MARKER) { struct pt_regs *regs =3D (struct pt_regs *) @@ -2091,6 +2098,7 @@ void show_stack(struct task_struct *tsk, unsigned l= ong *stack) regs->trap, (void *)regs->nip, (void *)lr); firstframe =3D 1; } + kasan_enable_current(); =20 sp =3D newsp; } while (count++ < kstack_depth_to_print); diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c index 6620f37abe73..d994c7c39c8d 100644 --- a/arch/powerpc/kernel/prom.c +++ b/arch/powerpc/kernel/prom.c @@ -72,6 +72,7 @@ unsigned long tce_alloc_start, tce_alloc_end; u64 ppc64_rma_size; #endif static phys_addr_t first_memblock_size; +static phys_addr_t top_phys_addr; static int __initdata boot_cpu_count; =20 static int __init early_parse_mem(char *p) @@ -449,6 +450,26 @@ static bool validate_mem_limit(u64 base, u64 *size) { u64 max_mem =3D 1UL << (MAX_PHYSMEM_BITS); =20 + /* + * To handle the NUMA/discontiguous memory case, don't allow a block + * to be added if it falls completely beyond the configured physical + * memory. Print an informational message. + * + * Frustratingly we also see this with qemu - it seems to split the + * specified memory into a number of smaller blocks. If this happens + * under qemu, it probably represents misconfiguration. So we want + * the message to be noticeable, but not shouty. + * + * See Documentation/powerpc/kasan.txt + */ + if (IS_ENABLED(CONFIG_KASAN) && + (base >=3D ((u64)CONFIG_PHYS_MEM_SIZE_FOR_KASAN << 20))) { + pr_warn("KASAN: not adding memory block at %llx (size %llx)\n" + "This could be due to discontiguous memory or kernel misconfiguration= .", + base, *size); + return false; + } + if (base >=3D max_mem) return false; if ((base + *size) > max_mem) @@ -572,8 +593,11 @@ void __init early_init_dt_add_memory_arch(u64 base, = u64 size) =20 /* Add the chunk to the MEMBLOCK list */ if (add_mem_to_memblock) { - if (validate_mem_limit(base, &size)) + if (validate_mem_limit(base, &size)) { memblock_add(base, size); + if (base + size > top_phys_addr) + top_phys_addr =3D base + size; + } } } =20 @@ -613,6 +637,8 @@ static void __init early_reserve_mem_dt(void) static void __init early_reserve_mem(void) { __be64 *reserve_map; + phys_addr_t kasan_shadow_start; + phys_addr_t kasan_memory_size; =20 reserve_map =3D (__be64 *)(((unsigned long)initial_boot_params) + fdt_off_mem_rsvmap(initial_boot_params)); @@ -651,6 +677,42 @@ static void __init early_reserve_mem(void) return; } #endif + + if (IS_ENABLED(CONFIG_KASAN) && IS_ENABLED(CONFIG_PPC_BOOK3S_64)) { + kasan_memory_size =3D + ((phys_addr_t)CONFIG_PHYS_MEM_SIZE_FOR_KASAN << 20); + + if (top_phys_addr < kasan_memory_size) { + /* + * We are doomed. We shouldn't even be able to get this + * far, but we do in qemu. If we continue and turn + * relocations on, we'll take fatal page faults for + * memory that's not physically present. Instead, + * panic() here: it will be saved to __log_buf even if + * it doesn't get printed to the console. + */ + panic("Tried to book a KASAN kernel configured for %u MB with only %l= lu MB! Aborting.", + CONFIG_PHYS_MEM_SIZE_FOR_KASAN, + (u64)(top_phys_addr >> 20)); + } else if (top_phys_addr > kasan_memory_size) { + /* print a biiiig warning in hopes people notice */ + pr_err("=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D\n" + "Physical memory exceeds compiled-in maximum!\n" + "This kernel was compiled for KASAN with %u MB physical memory.\n" + "The physical memory detected is at least %llu MB.\n" + "Memory above the compiled limit will not be used!\n" + "=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D\n", + CONFIG_PHYS_MEM_SIZE_FOR_KASAN, + (u64)(top_phys_addr >> 20)); + } + + kasan_shadow_start =3D _ALIGN_DOWN(kasan_memory_size * 7 / 8, + PAGE_SIZE); + DBG("reserving %llx -> %llx for KASAN", + kasan_shadow_start, top_phys_addr); + memblock_reserve(kasan_shadow_start, + top_phys_addr - kasan_shadow_start); + } } =20 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM diff --git a/arch/powerpc/mm/kasan/Makefile b/arch/powerpc/mm/kasan/Makef= ile index 6577897673dd..f02b15c78e4d 100644 --- a/arch/powerpc/mm/kasan/Makefile +++ b/arch/powerpc/mm/kasan/Makefile @@ -2,4 +2,5 @@ =20 KASAN_SANITIZE :=3D n =20 -obj-$(CONFIG_PPC32) +=3D kasan_init_32.o +obj-$(CONFIG_PPC32) +=3D init_32.o +obj-$(CONFIG_PPC_BOOK3S_64) +=3D init_book3s_64.o diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c b/arch/powerpc/mm/kasa= n/init_32.c similarity index 100% rename from arch/powerpc/mm/kasan/kasan_init_32.c rename to arch/powerpc/mm/kasan/init_32.c diff --git a/arch/powerpc/mm/kasan/init_book3s_64.c b/arch/powerpc/mm/kas= an/init_book3s_64.c new file mode 100644 index 000000000000..f961e96be136 --- /dev/null +++ b/arch/powerpc/mm/kasan/init_book3s_64.c @@ -0,0 +1,72 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * KASAN for 64-bit Book3S powerpc + * + * Copyright (C) 2019 IBM Corporation + * Author: Daniel Axtens + */ + +#define DISABLE_BRANCH_PROFILING + +#include +#include +#include +#include + +void __init kasan_init(void) +{ + int i; + void *k_start =3D kasan_mem_to_shadow((void *)RADIX_KERN_VIRT_START); + void *k_end =3D kasan_mem_to_shadow((void *)RADIX_VMEMMAP_END); + + pte_t pte =3D __pte(__pa(kasan_early_shadow_page) | + pgprot_val(PAGE_KERNEL) | _PAGE_PTE); + + if (!early_radix_enabled()) + panic("KASAN requires radix!"); + + for (i =3D 0; i < PTRS_PER_PTE; i++) + __set_pte_at(&init_mm, (unsigned long)kasan_early_shadow_page, + &kasan_early_shadow_pte[i], pte, 0); + + for (i =3D 0; i < PTRS_PER_PMD; i++) + pmd_populate_kernel(&init_mm, &kasan_early_shadow_pmd[i], + kasan_early_shadow_pte); + + for (i =3D 0; i < PTRS_PER_PUD; i++) + pud_populate(&init_mm, &kasan_early_shadow_pud[i], + kasan_early_shadow_pmd); + + memset(kasan_mem_to_shadow((void *)PAGE_OFFSET), KASAN_SHADOW_INIT, + KASAN_SHADOW_SIZE); + + kasan_populate_early_shadow( + kasan_mem_to_shadow((void *)RADIX_KERN_VIRT_START), + kasan_mem_to_shadow((void *)RADIX_VMALLOC_START)); + + /* leave a hole here for vmalloc */ + + kasan_populate_early_shadow( + kasan_mem_to_shadow((void *)RADIX_VMALLOC_END), + kasan_mem_to_shadow((void *)RADIX_VMEMMAP_END)); + + flush_tlb_kernel_range((unsigned long)k_start, (unsigned long)k_end); + + /* mark early shadow region as RO and wipe */ + pte =3D __pte(__pa(kasan_early_shadow_page) | + pgprot_val(PAGE_KERNEL_RO) | _PAGE_PTE); + for (i =3D 0; i < PTRS_PER_PTE; i++) + __set_pte_at(&init_mm, (unsigned long)kasan_early_shadow_page, + &kasan_early_shadow_pte[i], pte, 0); + + /* + * clear_page relies on some cache info that hasn't been set up yet. + * It ends up looping ~forever and blows up other data. + * Use memset instead. + */ + memset(kasan_early_shadow_page, 0, PAGE_SIZE); + + /* Enable error messages */ + init_task.kasan_depth =3D 0; + pr_info("KASAN init done (64-bit Book3S heavyweight mode)\n"); +} --=20 2.20.1