Message-ID: <807640e2-3a79-4b65-a7f9-4b47f2a39f23@arm.com>
Date: Thu, 20 Jun 2024 11:26:07 +0100
Subject: Re: [PATCH v1] arm64: mm: Permit PTE SW bits to change in live mappings
From: Ryan Roberts
To: Peter Xu
Cc: Catalin Marinas, Will Deacon, Mark Rutland, linux-arm-kernel@lists.infradead.org
References: <20240619121859.4153966-1-ryan.roberts@arm.com> <3a42e195-9392-442f-aba7-fdd2c186b98f@arm.com>

On 19/06/2024 20:04, Peter Xu wrote:
> On Wed, Jun 19, 2024 at 04:58:32PM +0100, Ryan Roberts wrote:
>> The code in question is:
>>
>>     if (userfaultfd_pte_wp(vma, ptep_get(vmf->pte))) {
>>         if (!userfaultfd_wp_async(vma)) {
>>             pte_unmap_unlock(vmf->pte, vmf->ptl);
>>             return handle_userfault(vmf, VM_UFFD_WP);
>>         }
>>
>>         /*
>>          * Nothing needed (cache flush, TLB invalidations,
>>          * etc.) because we're only removing the uffd-wp bit,
>>          * which is completely invisible to the user.
>>          */
>>         pte = pte_clear_uffd_wp(ptep_get(vmf->pte));
>>
>>         set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
>>         /*
>>          * Update this to be prepared for following up CoW
>>          * handling
>>          */
>>         vmf->orig_pte = pte;
>>     }
>>
>> Perhaps we should consider a change to the following style as a cleanup?
>>
>>     old_pte = ptep_modify_prot_start(vma, addr, pte);
>>     ptent = pte_clear_uffd_wp(old_pte);
>>     ptep_modify_prot_commit(vma, addr, pte, old_pte, ptent);
>
> You're probably right that at least the access bit seems racy to be set
> here, so we may have risk of losing that when a race happened against HW.
> Dirty bit shouldn't be a concern in this case due to missing W bit, iiuc.
>
> IMO it's a matter of whether we'd like to "make access bit 100% accurate"
> when the race happened, while paying that off with an always slower generic
> path. Looks cleaner indeed but maybe not very beneficial in reality.
>
>>
>> Regardless, this patch is still a correct and valuable change; arm64 arch
>> doesn't care if SW bits are modified in valid mappings so we shouldn't be
>> checking for it.
>
> Agreed. Let's keep this discussion separate from the original patch if
> that already fixes stuff.
>
>>
>>>
>>>>
>>>>     /* creating or taking down mappings is always safe */
>>>>     if (!pte_valid(__pte(old)) || !pte_valid(__pte(new)))
>>>> --
>>>> 2.43.0
>>>>
>>>
>>> When looking at this function I found this and caught my attention too:
>>>
>>>     /* live contiguous mappings may not be manipulated at all */
>>>     if ((old | new) & PTE_CONT)
>>>         return false;
>>>
>>> I'm now wondering how cont-ptes work with uffd-wp now for arm64, from
>>> either hugetlb or mTHP pov. This check may be relevant here as a start.
>>
>> When transitioning a block of ptes between cont and non-cont, we transition the
>> block through invalid with tlb invalidation. See contpte_convert().
>>
>>>
>>> The other thing is since x86 doesn't have cont-ptes yet, uffd-wp didn't
>>> consider that, and there may be things overlooked at least from my side.
>>> E.g., consider wr-protect one cont-pte huge pages on hugetlb:
>>>
>>> static inline pte_t huge_pte_mkuffd_wp(pte_t pte)
>>> {
>>>     return huge_pte_wrprotect(pte_mkuffd_wp(pte));
>>> }
>>>
>>> I think it means so far it won't touch the rest cont-ptes but the 1st. Not
>>> sure whether it'll work if write happens on the rest.
>>
>> I'm not completely sure I follow your point. I think this should work correctly.
>> The arm64 huge_pte code knows what size (and level) the huge pte is and spreads
>> the passed in pte across all the HW ptes.
>
> What I was considering is about wr-protect a 64K cont-pte entry in arm64:
>
>   UFFDIO_WRITEPROTECT -> hugetlb_change_protection() -> huge_pte_mkuffd_wp()
>
> What I'm expecting is huge_pte_mkuffd_wp() would wr-protect all ptes,

Yes, I think this works as expected. huge_pte_mkuffd_wp() is just modifying the
bits in a pte passed on the stack and returns the modified pte. The pgtable is
not touched at this point. The magic happens in set_huge_pte_at() (called by
huge_ptep_modify_prot_commit() in your case), which knows how the abstract "huge
pte" maps to real PMDs or PTEs in the pgtable and applies the passed in pte
value appropriately to all real pmds/ptes (adjusting the pfn as required in the
process).

> but
> looks not right now. I'm not sure if the HW is able to identify "the whole
> 64K is wr-protected" in this case, rather than "only the 1st pte is
> wr-protected", as IIUC current "pte" points to only the 1st pte entry.

I believe this works as you intended; there is no bug as far as I can see.

>
> Thanks,
>
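
To make the set_huge_pte_at() point above concrete, here is a toy userspace
sketch of the two-step pattern: a pure value operation on a pte, followed by a
commit that spreads that value across every entry of a contiguous block. It is
not the kernel implementation. All toy_* names, the bit positions and the
16-entry block size are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names or bit layouts are the kernel's. */
#define TOY_PFN_SHIFT    12
#define TOY_CONT_PTES    16                 /* e.g. 16 x 4K pages = 64K */
#define TOY_PTE_WRITE    (1ULL << 0)
#define TOY_PTE_UFFD_WP  (1ULL << 1)        /* stands in for a SW bit */
#define TOY_PTE_CONT     (1ULL << 2)
#define TOY_FLAGS_MASK   ((1ULL << TOY_PFN_SHIFT) - 1)

typedef uint64_t toy_pte_t;

static toy_pte_t toy_mk_pte(uint64_t pfn, uint64_t flags)
{
    return (pfn << TOY_PFN_SHIFT) | (flags & TOY_FLAGS_MASK);
}

static uint64_t toy_pte_pfn(toy_pte_t pte)
{
    return pte >> TOY_PFN_SHIFT;
}

/*
 * Step 1: pure value transformation, in the spirit of huge_pte_mkuffd_wp().
 * It sets the uffd-wp bit and clears write permission on the value passed in.
 * No page table memory is read or written here.
 */
static toy_pte_t toy_huge_pte_mkuffd_wp(toy_pte_t pte)
{
    return (pte | TOY_PTE_UFFD_WP) & ~TOY_PTE_WRITE;
}

/*
 * Step 2: commit, in the spirit of set_huge_pte_at(). The single abstract
 * "huge pte" value is applied to every entry in the block, keeping the
 * flags identical and advancing the pfn by one page per entry.
 */
static void toy_set_huge_pte(toy_pte_t *table, size_t ncontig, toy_pte_t pte)
{
    uint64_t flags = pte & TOY_FLAGS_MASK;
    uint64_t pfn = toy_pte_pfn(pte);

    assert(table != NULL && ncontig > 0);

    for (size_t i = 0; i < ncontig; i++)
        table[i] = toy_mk_pte(pfn + i, flags);
}
```

In this model, calling toy_huge_pte_mkuffd_wp() on the first entry's value
leaves the table unchanged; only the following toy_set_huge_pte() call makes
all 16 entries write-protected, each with its own pfn.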