Date: Tue, 5 Mar 2024 15:30:58 +0000
From: Quentin Perret
To: Christoph Hellwig, Will Deacon, Chris Goldsworthy, Android KVM,
 Patrick Daly, Alex Elder, Srinivas Kandagatla, Murali Nalajal,
 Trilok Soni, Srivatsa Vaddagiri, Carl van Schaik, Philip Derrin,
 Prakruthi Deepak Heragu, Jonathan Corbet, Rob Herring,
 Krzysztof Kozlowski, Conor Dooley, Catalin Marinas, Konrad Dybcio,
 Bjorn Andersson, Dmitry Baryshkov, Fuad Tabba, Sean Christopherson,
 Andrew Morton, linux-arm-msm@vger.kernel.org,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-mm@kvack.org
Subject: Re: Re: Re: [PATCH v17 19/35] arch/mm: Export direct {un,}map functions
References: <20240222-gunyah-v17-0-1e9da6763d38@quicinc.com>
 <20240222-gunyah-v17-19-1e9da6763d38@quicinc.com>
 <20240223071006483-0800.eberman@hu-eberman-lv.qualcomm.com>
 <20240304094828133-0800.eberman@hu-eberman-lv.qualcomm.com>
In-Reply-To: <20240304094828133-0800.eberman@hu-eberman-lv.qualcomm.com>

On Monday 04 Mar 2024 at 15:37:41 (-0800), Elliot Berman wrote:
> On Mon, Mar 04, 2024 at 01:10:48PM +0000, Quentin Perret wrote:
> > On Friday 23 Feb 2024 at 16:37:23 (-0800), Elliot Berman wrote:
> > > On Thu, Feb 22, 2024 at 11:09:40PM -0800, Christoph Hellwig wrote:
> > > > On Thu, Feb 22, 2024 at 03:16:42PM -0800, Elliot Berman wrote:
> > > > > Firmware and hypervisor drivers can donate system heap memory to their
> > > > > respective firmware/hypervisor entities. Those drivers should unmap the
> > > > > pages from the kernel's logical map before doing so.
> > > > >
> > > > > Export can_set_direct_map, set_direct_map_invalid_noflush, and
> > > > > set_direct_map_default_noflush.
> > > >
> > > > Err, no they should not. And not using such super low-level interfaces
> > > > from modular code.
> > >
> > > Hi Christoph,
> > >
> > > We've observed a few times that Linux can unintentionally access a page
> > > we've unmapped from host's stage 2 page table via an unaligned load from
> > > an adjacent page. The stage 2 is managed by Gunyah. There are a few
> > > scenarios where even though we allocate and own a page from buddy,
> > > someone else could try to access the page without going through the
> > > hypervisor driver. One such instance we know about is
> > > load_unaligned_zeropad() via pathlookup_at() [1].
> > >
> > > load_unaligned_zeropad() could be called near the end of a page. If the
> > > next page isn't mapped by the kernel in the stage one page tables, then
> > > the access to the unmapped page from load_unaligned_zeropad() will
> > > land in __do_kernel_fault(), call fixup_exception(), and fill the
> > > remainder of the load with zeroes. If the page in question is mapped in
> > > stage 1 but was unmapped from stage 2, then the access lands back in
> > > Linux in do_sea(), leading to a panic().
> > >
> > > Our preference would be to add fixup_exception() to S2 PTW errors for
> > > two reasons:
> > > 1. It's cheaper to do performance-wise: we've already manipulated S2
> > >    page table and prevent intentional access to the page because
> > >    pKVM/Gunyah drivers know that access to the page has been lost.
> > > 2. Page-granular S1 mappings only happen on arm64 with rodata=full.
> > >
> > > In an off-list discussion with the Android pkvm folks, their preference
> > > was to have the pages unmapped from stage 1. I've gone with that
> > > approach to get started but welcome discussion on the best approach.
> > >
> > > The Android (downstream) implementation of arm64 pkvm is currently
> > > implementing a hack where s2 ptw faults are given back to the host as s1
> > > ptw faults (i.e. __do_kernel_fault() gets called and not do_sea()) --
> > > allowing the kernel to fix up the exception.
> > >
> > > arm64 pKVM will also face this issue when implementing guest_memfd or
> > > when donating more memory to the hyp for s2 page tables, etc. As far as
> > > I can tell, this isn't an issue for arm64 pKVM today because memory
> > > isn't being dynamically donated to the hypervisor.
> >
> > FWIW pKVM already donates memory dynamically to the hypervisor, to store
> > e.g. guest VM metadata and page-tables, and we've never seen that
> > problem as far as I can recall.
> >
> > A key difference is that pKVM injects a data abort back into the kernel
> > in case of a stage-2 fault, so the whole EXTABLE trick/hack in
> > load_unaligned_zeropad() should work fine out of the box.
> >
> > As discussed offline, Gunyah injecting an SEA into the kernel is
> > questionable, but I understand that the architecture is a bit lacking in
> > this department, and that's probably the next best thing.
> >
> > Could the Gunyah driver allocate from a CMA region instead? That would
> > surely simplify unmapping from EL1 stage-1 (similar to how drivers
> > usually donate memory to TZ).
>
> In my opinion, CMA is overly restrictive because we'd have to define the
> region up front and we don't know how much memory the virtual machines
> the user will want to launch.

I was thinking of using CMA to allocate pages needed to store guest
metadata and such at EL2, but not to back the actual guest pages
themselves. That still means overallocating somehow, but that should
hopefully be much smaller and be less of a problem?

For the actual guest pages, the gunyah variant of guestmem will have to
unmap the pages from the direct map itself, but I'd be personally happy
with making that part non-modular to avoid the issue Christoph and
others have raised.

Thanks,
Quentin