From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id ECE7EC4167B for ; Mon, 11 Dec 2023 18:18:29 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 61C316B0193; Mon, 11 Dec 2023 13:18:29 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 5CC316B0194; Mon, 11 Dec 2023 13:18:29 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4E21C6B0195; Mon, 11 Dec 2023 13:18:29 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 3EDB06B0193 for ; Mon, 11 Dec 2023 13:18:29 -0500 (EST) Received: from smtpin06.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id E9EBB1207E4 for ; Mon, 11 Dec 2023 18:18:28 +0000 (UTC) X-FDA: 81555347496.06.1EF45AD Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf16.hostedemail.com (Postfix) with ESMTP id 7466A180014 for ; Mon, 11 Dec 2023 18:18:26 +0000 (UTC) Authentication-Results: imf16.hostedemail.com; dkim=none; dmarc=fail reason="SPF not aligned (relaxed), No valid DKIM" header.from=arm.com (policy=none); spf=pass (imf16.hostedemail.com: domain of cmarinas@kernel.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=cmarinas@kernel.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1702318707; a=rsa-sha256; cv=none; b=yY9NmJ7agRuwad2os5+DB3w7700oaC1j3NA6SBCewyKo0GJIhBwbyXsUBJWXq65qUQRMFw QXH8u9iwKd+FJVZNfQJD9kDfAjJgBbjp371VJCkQztctu+WwVQ99vPT8E+PP1Rsih5Ac1p OLoJIo/J3pvN8hHOtwtxSbJMzwW8z+A= ARC-Authentication-Results: i=1; imf16.hostedemail.com; dkim=none; dmarc=fail reason="SPF not aligned (relaxed), No valid DKIM" header.from=arm.com (policy=none); spf=pass (imf16.hostedemail.com: domain of cmarinas@kernel.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=cmarinas@kernel.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1702318707; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=+5B8qG2T7SW2v0hqWdVbcatPSqMDpbtCY9DO73WaLOE=; b=TvQ8uX0Fq61hB8l7LizBIHFvYGeh35bD0gUhojY0pSqEfqAxkg+ltRbmsndsupmX5BLFGE CJm5rjJeKj7ce1dTUdsLjBdfhL2DRLhCTOVv8u5M3dSP3c9iXXSStDVnlDUivzY/599T91 w8YMhxE/6yb0ucEMnadST5VKr9socio= Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by sin.source.kernel.org (Postfix) with ESMTP id 1DC12CE13CF; Mon, 11 Dec 2023 18:18:23 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7733EC433C7; Mon, 11 Dec 2023 18:18:19 +0000 (UTC) Date: Mon, 11 Dec 2023 18:18:17 +0000 From: Catalin Marinas To: Joey Gouly Cc: linux-arm-kernel@lists.infradead.org, akpm@linux-foundation.org, aneesh.kumar@linux.ibm.com, broonie@kernel.org, dave.hansen@linux.intel.com, maz@kernel.org, oliver.upton@linux.dev, shuah@kernel.org, will@kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, James Morse , Suzuki K Poulose , Zenghui Yu Subject: Re: [PATCH v3 12/25] arm64: handle PKEY/POE faults Message-ID: References: <20231124163510.1835740-1-joey.gouly@arm.com> <20231124163510.1835740-13-joey.gouly@arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20231124163510.1835740-13-joey.gouly@arm.com> X-Rspamd-Server: rspam08 X-Rspamd-Queue-Id: 7466A180014 X-Stat-Signature: dumij1xehy9thosmsdnnit44db4cdg1o X-Rspam-User: X-HE-Tag: 1702318706-62703 X-HE-Meta: U2FsdGVkX1+vBLRRQ0wcmwvPFvK5zjuBYyzQUDR2qPjEBP9cTdb60uVuxB7nRfii0mHsxkCFpfy/fy1Lp7LSGZOnMl/6fg+VGv9az3anLFMMp7xcWZgNhR6YYpdv7puNyjZy+y2j6MvmR/V2y6alpFpduaYsiCYrCXpfR7dKrRfrtGZqZxw4hR+vMMZPzo1S/lKET+tQAF1xQ27XZqMnj1eir3X4+X5dPZ+5biCbCVUSn7q602sF6XQM9mwAUKG4G2TINCgvptkb+hpM9FMIIkEoRIJkKTWLidBa0Hp9BYfGrMQYpR1gOSxmlKbvS08Mr7khwMVKhM6Sa9AJ+QoOOLUAQ7tMI9PLaUobK7ozGZ46qWpnjasp2+o8hN6z+rSVzGxstwp3zz8vF3MY+mRmip31wcn9eOcPDn+QSKDh3OEmH2pPvyT8KDjQfzMCE3n8QvIdiEVy67vDo5r071LLmlktW3yVImRfGhUJFe8ONRePaOLIBgGsErewnAmzLHh+rOwR81B5RB7KS7+H70cnIHRb1is4tLShyR8+LZ8hLeddGTfWnMl/jUDXI8+SfEuaijCJCfeb9Gaxf+jrnzOjkzp53Crum7MXzworUSNcvDUanE5DQtfNUT4x2VkgQCvrGLuk8iHc/oh9dSyoKqyExvh4n28g6skPEsPZxQnnwkbXYzM0vx7WKbwM0XX6pBfOOVsDkNk9CkvxC0F4MLgpyuT8kwfZ2+5RnFFDkWpLBIVzGhfkOGHUnH6YOBxa+9PLq9SnDE5PMEm2gKW2w17BXcS4tuig5SpoG/vGLf9dLGRd46sdhoFfzNE9c6LDux8/N31QSv57CtfYuojXRw4727DcQUPm+kjUhwk5cgHd5tTlAcPxX+ghlNSQvD3AN+OCc7cfgcAj1RDzuatoHhG+/iXnoCR/6ZGbmvoOzq03umbT3WSbKMTtapZZe39jpPvHE6gx0oqqs+FeKT5Qe2K GPjLCOga Hr73ioGozxosiZ/LpfMgxMkKpsx4iXpGKFGU9m30kf9PMXOl9BhbgwkZVPAix2xBfZLlnuvuMy/rOgfj3SSOkYOICP6ApX51L0RYNqP5HmkuAaqnyWM1qXuxW9XbNX6FQbm98BFFxG265vGRNhwVW8RicjbrI3JdzqT5qHf156UbRnzI8KbaDQtPRZEAH18eIfw6miaCGAY2JT5Pd1Zu6JHhYPHzNxhlWGelW+xqzysxuwOREwzjXJ3qmPrkV6vhWm1ykr2PndOcx8OLmovKoQ58RaOQYWkOHWOQu76KwhZzOaFjq3LIjN320UHPjwH0OWnT5 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Fri, Nov 24, 2023 at 04:34:57PM +0000, Joey Gouly wrote: > @@ -497,6 +498,23 @@ static void do_bad_area(unsigned long far, unsigned long esr, > #define VM_FAULT_BADMAP ((__force vm_fault_t)0x010000) > #define VM_FAULT_BADACCESS ((__force vm_fault_t)0x020000) > > +static bool fault_from_pkey(unsigned long esr, struct vm_area_struct *vma, > + unsigned int mm_flags) > +{ > + unsigned long iss2 = ESR_ELx_ISS2(esr); > + > + if (!arch_pkeys_enabled()) > + return false; > + > + if (iss2 & ESR_ELx_Overlay) > + return true; > + > + return !arch_vma_access_permitted(vma, > + mm_flags & FAULT_FLAG_WRITE, > + mm_flags & FAULT_FLAG_INSTRUCTION, > + mm_flags & FAULT_FLAG_REMOTE); > +} Do we actually need this additional arch_vma_access_permitted() check? The ESR should tell us if it was a POE fault. Permission overlay faults have priority over the base permission faults, so we'd not need to fall back to this additional checks. Well, see below, we could do something slightly smarter here. I can see x86 and powerpc have similar checks (though at a different point under the mmap lock) but I'm not familiar with their exception model, exception priorities. > + > static vm_fault_t __do_page_fault(struct mm_struct *mm, > struct vm_area_struct *vma, unsigned long addr, > unsigned int mm_flags, unsigned long vm_flags, > @@ -688,9 +706,29 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr, > * Something tried to access memory that isn't in our memory > * map. > */ > - arm64_force_sig_fault(SIGSEGV, > - fault == VM_FAULT_BADACCESS ? SEGV_ACCERR : SEGV_MAPERR, > - far, inf->name); > + int fault_kind; > + /* > + * The pkey value that we return to userspace can be different > + * from the pkey that caused the fault. > + * > + * 1. T1 : mprotect_key(foo, PAGE_SIZE, pkey=4); > + * 2. T1 : set AMR to deny access to pkey=4, touches, page > + * 3. T1 : faults... > + * 4. T2: mprotect_key(foo, PAGE_SIZE, pkey=5); > + * 5. T1 : enters fault handler, takes mmap_lock, etc... > + * 6. T1 : reaches here, sees vma_pkey(vma)=5, when we really > + * faulted on a pte with its pkey=4. > + */ > + int pkey = vma_pkey(vma); Other than the vma_pkey() race, I'm more worried about the vma completely disappearing, e.g. munmap() in another thread. We end up dereferencing a free pointer here. We need to do this under the mmap lock. Since we need to do this check under the mmap lock, we could add an additional check to see if the pkey fault we got was a racy and just restart the instruction if it no longer faults according to por_el0_allows_pkey(). However, the code below is too late in the fault handling to be able to do much other than SIGSEGV. > + > + if (fault_from_pkey(esr, vma, mm_flags)) > + fault_kind = SEGV_PKUERR; > + else > + fault_kind = fault == VM_FAULT_BADACCESS ? SEGV_ACCERR : SEGV_MAPERR; > + > + arm64_force_sig_fault_pkey(SIGSEGV, > + fault_kind, > + far, inf->name, pkey); > } > > return 0; -- Catalin