From: Steve Capper
To: Ard Biesheuvel
Subject: Re: [PATCH 0/9] 52-bit kernel + user VAs
Date: Thu, 28 Feb 2019 10:35:57 +0000
Message-ID: <20190228103547.GA27721@capper-debian.cambridge.arm.com>
References: <20190218170245.14915-1-steve.capper@arm.com>
 <20190219124825.GH8501@fuggles.cambridge.arm.com>
 <20190219130138.GI8501@fuggles.cambridge.arm.com>
 <20190219135640.GA15458@capper-debian.cambridge.arm.com>
 <20190226173007.GA1553@capper-debian.cambridge.arm.com>
Cc: "crecklin@redhat.com", Marc Zyngier, Catalin Marinas, Will Deacon, nd, linux-arm-kernel

On Tue, Feb 26, 2019 at 09:17:49PM +0100, Ard Biesheuvel wrote:
> On Tue, 26 Feb 2019 at 18:30, Steve Capper wrote:
> >
> > On Tue, Feb 19, 2019 at 05:18:18PM +0100, Ard Biesheuvel wrote:
> > > On Tue, 19 Feb 2019 at 14:56, Steve Capper wrote:
> > > >
> > > > On Tue, Feb 19, 2019 at 02:15:26PM +0100, Ard Biesheuvel wrote:
> > > > > On Tue, 19 Feb 2019 at 14:01, Will Deacon wrote:
> > > > > >
> > > > > > On Tue, Feb 19, 2019 at 01:51:51PM +0100, Ard Biesheuvel wrote:
> > > > > > > On Tue, 19 Feb 2019 at 13:48, Will Deacon wrote:
> > > > > > > >
> > > > > > > > On Tue, Feb 19, 2019 at 01:13:32PM +0100, Ard Biesheuvel wrote:
> > > > > > > > > On Mon, 18 Feb 2019 at 18:05, Steve Capper wrote:
> > > > > > > > > >
> > > > > > > > > > This patch series adds support for 52-bit kernel VAs using some of the
> > > > > > > > > > machinery already introduced by the 52-bit userspace VA code in 5.0.
> > > > > > > > > >
> > > > > > > > > > As 52-bit virtual address support is an optional hardware feature,
> > > > > > > > > > software support for 52-bit kernel VAs needs to be deduced at early boot
> > > > > > > > > > time. If HW support is not available, the kernel falls back to 48-bit.
> > > > > > > > > >
> > > > > > > > > > A significant proportion of this series focuses on "de-constifying"
> > > > > > > > > > VA_BITS related constants.
> > > > > > > > > >
> > > > > > > > > > In order to allow for a KASAN shadow that changes size at boot time, one
> > > > > > > > > > must fix the KASAN_SHADOW_END for both 48 & 52-bit VAs and "grow" the
> > > > > > > > > > start address. Also, it is highly desirable to maintain the same
> > > > > > > > > > function addresses in the kernel .text between VA sizes. Both of these
> > > > > > > > > > requirements necessitate us to flip the kernel address space halves s.t.
> > > > > > > > > > the direct linear map occupies the lower addresses.
> > > > > > > > > >
> > > > > > > > > > One obvious omission is 52-bit kernel VA + 48-bit userspace VA which I
> > > > > > > > > > can add with some more #ifdef'ery if needed.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Hi Steve,
> > > > > > > > >
> > > > > > > > > Apologies if I am bringing up things that have been addressed
> > > > > > > > > internally already. We discussed the 52-bit kernel VA work at
> > > > > > > > > plumber's at some point, and IIUC, KASAN is the complicating factor
> > > > > > > > > when it comes to having compile time constants for VA_BITS_MIN,
> > > > > > > > > VA_BITS_MAX and PAGE_OFFSET, right?
> > > > > > > > >
> > > > > > > > > To clarify what I mean, please refer to the diagram below, which
> > > > > > > > > describes a hybrid 48/52 kernel VA arrangement that does not rely on
> > > > > > > > > runtime variable quantities.
> > > > > > > > > (VA_BITS_MIN == 48, VA_BITS_MAX == 52)
> > > > > > > > >
> > > > > > > > > +--------------------- (~0) ---------------------+
> > > > > > > > > |                                                |
> > > > > > > > > |             PCI IO / fixmap spaces             |
> > > > > > > > > |                                                |
> > > > > > > > > +------------------------------------------------+
> > > > > > > > > |                                                |
> > > > > > > > > |              kernel/vmalloc space              |
> > > > > > > > > |                                                |
> > > > > > > > > +------------------------------------------------+
> > > > > > > > > |                                                |
> > > > > > > > > |                  module space                  |
> > > > > > > > > |                                                |
> > > > > > > > > +------------------------------------------------+
> > > > > > > > > |                                                |
> > > > > > > > > |                   BPF space                    |
> > > > > > > > > |                                                |
> > > > > > > > > +------------------------------------------------+
> > > > > > > > > |                                                |
> > > > > > > > > |                                                |
> > > > > > > > > |   vmemmap space (size based on VA_BITS_MAX)    |
> > > > > > > > > |                                                |
> > > > > > > > > |                                                |
> > > > > > > > > +-- linear/vmalloc split based on VA_BITS_MIN -- +
> > > > > > > > > |                                                |
> > > > > > > > > |   linear mapping (48 bit addressable region)   |
> > > > > > > > > |                                                |
> > > > > > > > > +------------------------------------------------+
> > > > > > > > > |                                                |
> > > > > > > > > |   linear mapping (52 bit addressable region)   |
> > > > > > > > > |                                                |
> > > > > > > > > +------ PAGE_OFFSET based on VA_BITS_MAX --------+
> > > > > > > > >
> > > > > > > > > Since KASAN is what is preventing this, would it be acceptable for
> > > > > > > > > KASAN to only be supported when you use a true 48 bit or a true 52 bit
> > > > > > > > > configuration, and disable it for the 48/52 hybrid configuration?
> > > > > > > > >
> > > > > > > > > Just thinking out loud (and in ASCII art :-))
> > > > > > > >
> > > > > > > > TBH, if we end up having support for 52-bit kernel VA, I'd be inclined to
> > > > > > > > drop the 48/52 configuration altogether.
> > > > > > > > But Catalin's on holiday at the
> > > > > > > > moment, and may have a different opinion ;)
> > > > > > > >
> > > > > > > >
> > > > > > > But that implies that you cannot have an image that supports 52-bit
> > > > > > > kernel VAs but can still boot on hardware that does not implement
> > > > > > > support for it. If that is acceptable, then none of this hoop jumping
> > > > > > > that Steve is doing in these patches is necessary to begin with,
> > > > > > > right?
> > > > > >
> > > > > > Sorry, I misunderstood what you meant by a "48/52 hybrid configuration". I
> > > > > > thought you were referring to the configuration where userspace is 52-bit
> > > > > > and the kernel is 48-bit, which is something I think we can drop if we gain
> > > > > > support for 52-bit kernel.
> > > > > >
> > > > > > Now that I understand what you mean, I think disabling KASAN would be fine
> > > > > > as long as it's a runtime thing and the kernel continues to work in every
> > > > > > other respect.
> > > > > >
> > > > >
> > > > > No, it would be a limitation of the 52-bit config which also supports
> > > > > 48-bit-VA-only-h/w that the address space is laid out in such a way
> > > > > that there is simply no room for the KASAN shadow region, since it
> > > > > would have to live in the 48-bit addressable area, but be big enough
> > > > > to cover 52 bits of VA, which is impossible.
> > > > >
> > > > > For the vmemmap space, we could live with sizing it statically to
> > > > > cover a 52-bit VA linear region, but the KASAN shadow region is simply
> > > > > too big.
> > > > >
> > > > > So if KASAN support in that configuration is a requirement, then I
> > > > > agree with Steve's approach, but it does imply that quite a number of
> > > > > formerly compile-time constants now get turned into runtime variables.
> > > > >
> > > > > Steve, do you have any idea what the impact of that is?
> > > >
> > > > Hi Guys,
> > > >
> > > > The KASAN region only really necessitates two things: 1) that we think
> > > > about the end address of the region (which is invariant) rather than the
> > > > start address; and that 2) we flip the kernel VA space. IIUC both these
> > > > changes have a negligible perf impact.
> > > >
> > > > As for VA_BITS_ACTUAL, we need this in a few places: KVM mapping
> > > > support, and the big one phys_to/from_virt. For phys_to/from_virt the
> > > > logic is changed s.t. we use a variable lookup for translation but this
> > > > is folded into a new variable physvirt_offset (before the patch we used
> > > > a single variable read too).
> > > >
> > > > Again IIUC there should be a minimal perf impact (unless one tries to do
> > > > cat /sys/kernel/debug/kernel_page_tables with KASAN enabled - but that
> > > > can be optimised later).
> > > >
> > > > I didn't have the patience for ASCII art ;-), but I have a picture of
> > > > what I think it looks like here:
> > > > https://s3.amazonaws.com/connect.linaro.org/yvr18/presentations/yvr18-119.pdf
> > > > What I've tried to do is have most parts of the kernel VA space
> > > > invariant between 48/52 bits. If it's helpful I can type this up into a
> > > > document/commit log message?
> > > >
> > > > For this series I have tried to introduce VA_BITS_MIN in its own patch
> > > > and also VA_BITS_ACTUAL into its own patch to make it easier to follow.
> > > >
> >
> > Hi Ard,
> >
> > Apologies for my late reply, I had been staring at this for a while.
> >
> > >
> > > OK, perhaps I am just rephrasing what you essentially implemented
> > > already, but let me try to explain a bit better what I mean:
> > >
> > > - we flip the VA space in the way you suggest
> > > - we limit the size of the top half of the address space to 47 bits
> > > - KASAN region grows downwards from (~0) << 47
> > > - we define PAGE_OFFSET as (~0) << 52, regardless of whether the h/w
> > >   supports LVA or not
> > > - however, we tweak the phys/virt translation so that memory appears
> > >   in the 48-bit addressable part of the linear region on non-LVA
> > >   hardware
> > >
> > > The latter basically means that the KASAN shadow region will intersect
> > > the linear region, but whether we map memory or shadow pages there
> > > depends on the h/w config at runtime.
> > >
> > > The heart of the matter is probably the different placement of the
> > > memory inside the linear region, depending on whether the h/w is LVA
> > > capable or not, which is also reflected in your physvirt_offset. I am
> > > just trying to figure out why we need VA_BITS_ACTUAL to be a runtime
> > > variable.
> > >
> > Currently the direct linear map between configurations does not overlap,
> > we have:
> >
> > FFF00000_00000000 - Direct linear map start (52-bit)
> > FFF80000_00000000 - Direct linear map end (52-bit)
> > FFFF0000_00000000 - Direct linear map start (48-bit)
> > FFFF8000_00000000 - Direct linear map end (48-bit)
> >
> > We *can* make PAGE_OFFSET a constant for both 48/52 bit VA_BITS, if we
> > offset it. vmemmap can then be adjusted on early boot to ensure that
> > everything points to the right place. However we will get overlap for
> > 52-bit configurations between KASAN and the direct linear map.
> >
> > The question is: are we okay with quite a large overlap?
> >
> > The KASAN region begins on 0xFFFDA000_00000000 for 52-bit.
> > If we wish to
> > employ a "full" 47-bit direct linear map on 48-bit systems we need a
> > PAGE_OFFSET of 0xFFF78000_00000000 in order to make the direct linear
> > map end addresses "match up" between 48/52 bit configurations.
> >
> > This doesn't leave us with a lot of room for 52-bit configurations
> > though, if KASAN is enabled.
> >
> OK, so with actual numbers, what I had in mind was
>
> FFF00000_00000000 start of 52-bit addressable linear region | PAGE_OFFSET
>
> FFFD8000_00000000 start of KASAN shadow region | KASAN_SHADOW_OFFSET
>
> FFFF0000_00000000 start of 48-bit addressable linear region
>
> FFFF6000_00000000 start of used KASAN shadow region (48-bit VA)
> (KASAN_SHADOW_OFFSET + F0000_00000000 >> 3)
>
> FFFF8000_00000000 start of vmemmap area - end of KASAN shadow region
>
> FFFF8200_00000000 end of vmemmap area - start of bpf/module/etc area
>
>
> The trick is that the full (52 - 3) bits KASAN shadow space overlaps
> with the 48-bit linear region, but since you don't need KASAN shadow
> pages for memory that does not exist, the region FFFF0000_00000000 -
> FFFF6000_00000000 can be used for mapping the memory in case the h/w
> is 48-bit only.
>
> So in this case, PAGE_OFFSET and KASAN_SHADOW_OFFSET remain compile
> time constants, and as long as we don't attempt to map anything
> outside of the 48-bit addressable area on h/w that does not support
> it, the fact that those quantities are outside the 48-bit range does
> not really matter.

Thanks Ard, I'll elaborate more on what I'm worrying about :-).

The 48/52 bit linear regions above do not overlap and this creates the
following issue. To go from a struct page * to a linear address we do the
following:

  lva = (page - VMEMMAP_START) * PAGE_SIZE / sizeof(struct page) + PAGE_OFFSET

(Before my series) all the constants are fixed at compile time and thus
translation is very quick.

My understanding is that you would like PAGE_OFFSET to be constant to
preserve the optimised nature of this transform?
(if not, please shout :-) )

The problem is that a 52-bit PAGE_OFFSET = 0xFFF00000_00000000 will never
be able to give us an lva within a 48-bit addressable range. At best we
will get an lva of FFF80000_00000000.

We can get around this by adding a variable to the above transform, but
this is essentially what my series does by making PAGE_OFFSET variable.

Cheers,
--
Steve
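
For anyone who wants to check the arithmetic in the struct page to lva
transform, here is a rough userspace C sketch of it (not kernel code). It
assumes 64K pages and a 64-byte struct page, which are my assumptions and
happen to make the vmemmap bounds from the quoted table
(FFFF8000_00000000 to FFFF8200_00000000) cover exactly the 52-bit linear
region. The helper names are hypothetical, and "page" is treated as a plain
byte address rather than a pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative values only; the page and struct page sizes are assumptions. */
#define PAGE_SIZE_BYTES   UINT64_C(0x10000)            /* 64K pages (assumed) */
#define STRUCT_PAGE_SIZE  UINT64_C(64)                 /* sizeof(struct page) (assumed) */
#define VMEMMAP_START     UINT64_C(0xFFFF800000000000) /* from the quoted table */
#define VMEMMAP_END       UINT64_C(0xFFFF820000000000) /* from the quoted table */
#define PAGE_OFFSET_52    UINT64_C(0xFFF0000000000000) /* compile-time 52-bit PAGE_OFFSET */
#define LINEAR_START_48   UINT64_C(0xFFFF000000000000) /* 48-bit linear map start */

/* struct page address -> linear address, with the offset passed in explicitly. */
uint64_t page_to_lva(uint64_t page, uint64_t page_offset)
{
	return (page - VMEMMAP_START) * PAGE_SIZE_BYTES / STRUCT_PAGE_SIZE
		+ page_offset;
}

/* Highest lva reachable from the last struct page slot with a constant offset. */
uint64_t highest_lva_const_offset(void)
{
	return page_to_lva(VMEMMAP_END - STRUCT_PAGE_SIZE, PAGE_OFFSET_52);
}

/* Print both cases: constant 52-bit offset vs. an offset chosen at boot. */
void show_layout(void)
{
	uint64_t top = highest_lva_const_offset();

	/* The constant offset never reaches the 48-bit addressable region. */
	assert(top < LINEAR_START_48);
	printf("constant offset, highest lva: %016llx\n",
	       (unsigned long long)top);

	/* A runtime offset (hypothetical value for 48-bit h/w) does reach it. */
	printf("runtime offset, first lva:    %016llx\n",
	       (unsigned long long)page_to_lva(VMEMMAP_START, LINEAR_START_48));
}
```

Under those assumptions the last page slot translates to just below
FFF80000_00000000, so nothing inside vmemmap can land at or above
FFFF0000_00000000 unless the offset (or the vmemmap base) becomes a runtime
quantity. Calling show_layout() from a small main() prints both cases.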