Date: Mon, 4 Mar 2024 13:10:48 +0000
From: Quentin Perret
To: Christoph Hellwig, Will Deacon, Chris Goldsworthy, Android KVM,
 Patrick Daly, Alex Elder, Srinivas Kandagatla, Murali Nalajal,
 Trilok Soni, Srivatsa Vaddagiri, Carl van Schaik, Philip Derrin,
 Prakruthi Deepak Heragu, Jonathan Corbet, Rob Herring,
 Krzysztof Kozlowski, Conor Dooley, Catalin Marinas, Konrad Dybcio,
 Bjorn Andersson, Dmitry Baryshkov, Fuad Tabba, Sean Christopherson,
 Andrew Morton, linux-arm-msm@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, devicetree@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org
Subject: Re: Re: [PATCH v17 19/35] arch/mm: Export direct {un,}map functions
References: <20240222-gunyah-v17-0-1e9da6763d38@quicinc.com>
 <20240222-gunyah-v17-19-1e9da6763d38@quicinc.com>
 <20240223071006483-0800.eberman@hu-eberman-lv.qualcomm.com>
In-Reply-To: <20240223071006483-0800.eberman@hu-eberman-lv.qualcomm.com>

On Friday 23 Feb 2024 at 16:37:23 (-0800), Elliot Berman wrote:
> On Thu, Feb 22, 2024 at 11:09:40PM -0800, Christoph Hellwig wrote:
> > On Thu, Feb 22, 2024 at 03:16:42PM -0800, Elliot Berman wrote:
> > > Firmware and hypervisor drivers can donate system heap memory to their
> > > respective firmware/hypervisor entities. Those drivers should unmap the
> > > pages from the kernel's logical map before doing so.
> > >
> > > Export can_set_direct_map, set_direct_map_invalid_noflush, and
> > > set_direct_map_default_noflush.
> >
> > Err, not they should not. And not using such super low-level interfaces
> > from modular code.
>
> Hi Cristoph,
>
> We've observed a few times that Linux can unintentionally access a page
> we've unmapped from host's stage 2 page table via an unaligned load from
> an adjacent page. The stage 2 is managed by Gunyah. There are few
> scenarios where even though we allocate and own a page from buddy,
> someone else could try to access the page without going through the
> hypervisor driver. One such instance we know about is
> load_unaligned_zeropad() via pathlookup_at() [1].
>
> load_unaligned_zeropad() could be called near the end of a page. If the
> next page isn't mapped by the kernel in the stage one page tables, then
> the access from to the unmapped page from load_unaligned_zeropad() will
> land in __do_kernel_fault(), call fixup_exception(), and fill the
> remainder of the load with zeroes. If the page in question is mapped in
> stage 1 but was unmapped from stage 2, then the access lands back in
> Linux in do_sea(), leading to a panic().
>
> Our preference would be to add fixup_exception() to S2 PTW errors for
> two reasons:
> 1. It's cheaper to do performance wise: we've already manipulated S2
>    page table and prevent intentional access to the page because
>    pKVM/Gunyah drivers know that access to the page has been lost.
> 2. Page-granular S1 mappings only happen on arm64 with rodata=full.
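
For anyone following along, here is a minimal user-space sketch of the
two fault paths described above. It is a model under stated assumptions,
not kernel code: model_fault_path() and model_load_zeropad() are
hypothetical names made up for illustration, and the real
load_unaligned_zeropad() relies on an exception-table fixup rather than
on an explicit byte count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical outcomes for a load that spills into the next page. */
enum model_outcome {
	MODEL_NO_FAULT,		/* next page mapped at both stages */
	MODEL_ZERO_PADDED,	/* stage-1 fault, fixed up via the exception table */
	MODEL_PANIC,		/* stage-2 fault reported as an SEA, no fixup */
};

/* Fault path as described in the quoted text above (the Gunyah case). */
static enum model_outcome model_fault_path(bool s1_mapped, bool s2_mapped)
{
	if (!s1_mapped)
		return MODEL_ZERO_PADDED;
	if (!s2_mapped)
		return MODEL_PANIC;
	return MODEL_NO_FAULT;
}

/*
 * Value an 8-byte load at p would yield when only the first `readable`
 * bytes are accessible and the remainder is filled with zeroes.
 */
static uint64_t model_load_zeropad(const unsigned char *p, size_t readable)
{
	uint64_t word = 0;	/* inaccessible bytes stay zero */

	assert(readable <= sizeof(word));
	memcpy(&word, p, readable);	/* copy only the accessible prefix */
	return word;
}
```

In this model, a load starting three bytes before the end of a page would
call model_load_zeropad(p, 3), and model_fault_path(true, false) is the
case that currently ends in panic().
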
>
> In an off-list discussion with the Android pkvm folks, their preference
> was to have the pages unmapped from stage 1. I've gone with that
> approach to get started but welcome discussion on the best approach.
>
> The Android (downstream) implementation of arm64 pkvm is currently
> implementing a hack where s2 ptw faults are given back to the host as s1
> ptw faults (i.e. __do_kernel_fault() gets called and not do_sea()) --
> allowing the kernel to fixup the exception.
>
> arm64 pKVM will also face this issue when implementing guest_memfd or
> when donating more memory to the hyp for s2 page tables, etc. As far as
> I can tell, this isn't an issue for arm64 pKVM today because memory
> isn't being dynamically donated to the hypervisor.

FWIW pKVM already donates memory dynamically to the hypervisor, to store
e.g. guest VM metadata and page-tables, and we've never seen that
problem as far as I can recall.

A key difference is that pKVM injects a data abort back into the kernel
in case of a stage-2 fault, so the whole EXTABLE trick/hack in
load_unaligned_zeropad() should work fine out of the box.

As discussed offline, Gunyah injecting an SEA into the kernel is
questionable, but I understand that the architecture is a bit lacking in
this department, and that's probably the next best thing.

Could the Gunyah driver allocate from a CMA region instead? That would
surely simplify unmapping from EL1 stage-1 (similar to how drivers
usually donate memory to TZ).

Thanks,
Quentin