linux-arm-kernel.lists.infradead.org archive mirror
From: Peter Xu <peterx@redhat.com>
To: Ryan Roberts <ryan.roberts@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v1] arm64: mm: Permit PTE SW bits to change in live mappings
Date: Wed, 19 Jun 2024 15:04:41 -0400
Message-ID: <ZnMryd6bTYJpUvoa@x1n>
In-Reply-To: <3a42e195-9392-442f-aba7-fdd2c186b98f@arm.com>

On Wed, Jun 19, 2024 at 04:58:32PM +0100, Ryan Roberts wrote:
> The code in question is:
> 
> 	if (userfaultfd_pte_wp(vma, ptep_get(vmf->pte))) {
> 		if (!userfaultfd_wp_async(vma)) {
> 			pte_unmap_unlock(vmf->pte, vmf->ptl);
> 			return handle_userfault(vmf, VM_UFFD_WP);
> 		}
> 
> 		/*
> 		 * Nothing needed (cache flush, TLB invalidations,
> 		 * etc.) because we're only removing the uffd-wp bit,
> 		 * which is completely invisible to the user.
> 		 */
> 		pte = pte_clear_uffd_wp(ptep_get(vmf->pte));
> 
> 		set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
> 		/*
> 		 * Update this to be prepared for following up CoW
> 		 * handling
> 		 */
> 		vmf->orig_pte = pte;
> 	}
> 
> Perhaps we should consider a change to the following style as a cleanup?
> 
> 	old_pte = ptep_modify_prot_start(vma, addr, pte);
> 	ptent = pte_clear_uffd_wp(old_pte);
> 	ptep_modify_prot_commit(vma, addr, pte, old_pte, ptent);

You're probably right that at least the access bit looks racy here, so we
risk losing it if we race against a HW update.  The dirty bit shouldn't be
a concern in this case because the W bit is missing, iiuc.

IMO it's a matter of whether we'd like to "make the access bit 100%
accurate" when the race happens, at the cost of an always-slower generic
path.  It does look cleaner, but may not be very beneficial in practice.
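
To make the window concrete, here is a user-space toy model of the two
styles, not kernel code.  The MODEL_* bits, model_hw_access() and both
clear_wp_*() helpers are made-up names for illustration, and the bit
positions are not the arm64 layout.  The "start/commit" variant only
assumes that the start step reads and clears the entry in one atomic
operation, which is roughly what ptep_modify_prot_start() is meant to
provide.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical bit layout for this model only. */
#define MODEL_PTE_VALID   (UINT64_C(1) << 0)
#define MODEL_PTE_AF      (UINT64_C(1) << 1)  /* access flag, set by "HW" */
#define MODEL_PTE_UFFD_WP (UINT64_C(1) << 2)  /* SW bit cleared by the fault path */

/* Model of a HW walker: it can only set AF on a valid entry. */
static void model_hw_access(_Atomic uint64_t *ptep)
{
	uint64_t old = atomic_load(ptep);

	while (old & MODEL_PTE_VALID) {
		if (atomic_compare_exchange_weak(ptep, &old, old | MODEL_PTE_AF))
			break;
	}
}

/* ptep_get() + set_pte_at() style: separate read and write. */
static uint64_t clear_wp_plain(_Atomic uint64_t *ptep)
{
	uint64_t pte = atomic_load(ptep);              /* read */

	model_hw_access(ptep);                         /* HW sets AF in the window */
	atomic_store(ptep, pte & ~MODEL_PTE_UFFD_WP);  /* stale write drops AF */
	return atomic_load(ptep);
}

/* start/commit style: the read and the invalidate are one atomic step. */
static uint64_t clear_wp_start_commit(_Atomic uint64_t *ptep)
{
	uint64_t old;

	model_hw_access(ptep);             /* latest point HW can still set AF */
	old = atomic_exchange(ptep, 0);    /* start: old carries any AF update */
	assert(old & MODEL_PTE_VALID);     /* model only handles live entries */
	model_hw_access(ptep);             /* entry invalid: nothing to lose */
	atomic_store(ptep, old & ~MODEL_PTE_UFFD_WP);  /* commit */
	return atomic_load(ptep);
}
```

Starting from VALID | UFFD_WP, the plain variant ends with only VALID set
(AF lost), while the start/commit variant ends with VALID | AF.  The
extra atomic exchange is the "always slower" cost mentioned above.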

> 
> Regardless, this patch is still a correct and valuable change; arm64 arch
> doesn't care if SW bits are modified in valid mappings so we shouldn't be
> checking for it.

Agreed.  Let's keep this discussion separate from the original patch if
that already fixes the problem.

> 
> > 
> >>
> >>  	/* creating or taking down mappings is always safe */
> >>  	if (!pte_valid(__pte(old)) || !pte_valid(__pte(new)))
> >> --
> >> 2.43.0
> >>
> > 
> > When looking at this function I found this and caught my attention too:
> > 
> > 	/* live contiguous mappings may not be manipulated at all */
> > 	if ((old | new) & PTE_CONT)
> > 		return false;
> > 
> > I'm now wondering how cont-ptes work with uffd-wp now for arm64, from
> > either hugetlb or mTHP pov.  This check may be relevant here as a start.
> 
> When transitioning a block of ptes between cont and non-cont, we transition the
> block through invalid with tlb invalidation. See contpte_convert().
> 
> > 
> > The other thing is since x86 doesn't have cont-ptes yet, uffd-wp didn't
> > consider that, and there may be things overlooked at least from my side.
> > E.g., consider wr-protect one cont-pte huge pages on hugetlb:
> > 
> > static inline pte_t huge_pte_mkuffd_wp(pte_t pte)
> > {
> > 	return huge_pte_wrprotect(pte_mkuffd_wp(pte));
> > }
> > 
> > I think it means so far it won't touch the rest cont-ptes but the 1st.  Not
> > sure whether it'll work if write happens on the rest.
> 
> I'm not completely sure I follow your point. I think this should work correctly.
> The arm64 huge_pte code knows what size (and level) the huge pte is and spreads
> the passed in pte across all the HW ptes.

What I had in mind is wr-protecting a 64K cont-pte entry on arm64:

  UFFDIO_WRITEPROTECT -> hugetlb_change_protection() -> huge_pte_mkuffd_wp()

I'd expect huge_pte_mkuffd_wp() to wr-protect all the ptes, but that
doesn't look to be the case right now.  I'm not sure whether the HW is
able to identify "the whole 64K is wr-protected" in this case, rather than
"only the 1st pte is wr-protected", as IIUC the current "pte" points to
only the 1st pte entry.
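
For reference, a user-space toy model of what "spreading" one logical
pte value across a 64K contiguous range would look like.  This is only a
sketch of the behaviour being discussed, not the arm64 implementation:
every MODEL_* constant and model_*() helper is a made-up name, the bit
positions are arbitrary, and it assumes 4K base pages so that 16 entries
make up 64K.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: flags in the low bits, pfn from bit 12 upwards. */
#define MODEL_CONT_PTES   16                  /* 16 x 4K = 64K */
#define MODEL_PFN_SHIFT   12
#define MODEL_PTE_WRITE   (UINT64_C(1) << 0)
#define MODEL_PTE_UFFD_WP (UINT64_C(1) << 1)

/* Same shape as huge_pte_mkuffd_wp(): mark uffd-wp, then drop write. */
static uint64_t model_mkuffd_wp(uint64_t pte)
{
	return (pte | MODEL_PTE_UFFD_WP) & ~MODEL_PTE_WRITE;
}

/*
 * Write one logical huge pte into ncontig HW entries, keeping the flags
 * identical and advancing the pfn by one page per entry.
 */
static void model_set_huge_pte(uint64_t *ptep, uint64_t pte, size_t ncontig)
{
	assert(ncontig <= MODEL_CONT_PTES);
	for (size_t i = 0; i < ncontig; i++)
		ptep[i] = pte + ((uint64_t)i << MODEL_PFN_SHIFT);
}

/* Count the entries that are uffd-wp marked and not writable. */
static size_t model_count_wp(const uint64_t *ptep, size_t ncontig)
{
	size_t n = 0;

	for (size_t i = 0; i < ncontig; i++) {
		if ((ptep[i] & MODEL_PTE_UFFD_WP) && !(ptep[i] & MODEL_PTE_WRITE))
			n++;
	}
	return n;
}
```

In this model, passing model_mkuffd_wp(tbl[0]) back through
model_set_huge_pte() leaves all 16 entries wr-protected.  If only tbl[0]
were rewritten instead, model_count_wp() would return 1, which is the
case I'm worried about.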

Thanks,

-- 
Peter Xu



Thread overview: 7+ messages
2024-06-19 12:18 [PATCH v1] arm64: mm: Permit PTE SW bits to change in live mappings Ryan Roberts
2024-06-19 14:54 ` Peter Xu
2024-06-19 15:58   ` Ryan Roberts
2024-06-19 19:04     ` Peter Xu [this message]
2024-06-20 10:26       ` Ryan Roberts
2024-06-20 13:31         ` Peter Xu
2024-06-19 15:13 ` Will Deacon
