All of lore.kernel.org
 help / color / mirror / Atom feed
From: steve.capper@arm.com (Steve Capper)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH] arm64: hugetlb: Avoid unnecessary clearing in huge_ptep_set_access_flags
Date: Wed, 19 Sep 2018 18:44:37 +0100	[thread overview]
Message-ID: <20180919174437.1866-1-steve.capper@arm.com> (raw)

For contiguous hugetlb, huge_ptep_set_access_flags performs a
get_clear_flush (which then flushes the TLBs) even when no change of ptes
is necessary.

Unfortunately, this behaviour can lead to back-to-back page faults being
generated when running with multiple threads that access the same
contiguous huge page.

Thread 1                     |  Thread 2
-----------------------------+------------------------------
hugetlb_fault                |
huge_ptep_set_access_flags   |
  -> invalidate pte range    | hugetlb_fault
continue processing          | wait for hugetlb fault mutex
release mutex and return     | huge_ptep_set_access_flags
                             |   -> invalidate pte range
hugetlb_fault
...

This patch changes huge_ptep_set_access_flags s.t. we first read the
contiguous range of ptes (whilst preserving dirty information); the pte
range is only then invalidated where necessary and this prevents further
spurious page faults.

Fixes: d8bdcff28764 ("arm64: hugetlb: Add break-before-make logic for contiguous entries")
Reported-by: Lei Zhang <zhang.lei@jp.fujitsu.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>

---

I was unable to test this on any hardware as I'm away from the office.

Can you please test this Lei Zhang?

Cheers,
--
Steve
---
 arch/arm64/mm/hugetlbpage.c | 36 ++++++++++++++++++++++++++++++++----
 1 file changed, 32 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 192b3ba07075..76d229eb6ba1 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -131,6 +131,27 @@ static pte_t get_clear_flush(struct mm_struct *mm,
 	return orig_pte;
 }
 
+static pte_t get_contig_pte(pte_t *ptep, unsigned long pgsize,
+		unsigned long ncontig)
+{
+	unsigned long i;
+	pte_t orig_pte = huge_ptep_get(ptep);
+
+	for (i = 0; i < ncontig; i++, ptep++) {
+		pte_t pte = huge_ptep_get(ptep);
+
+		/*
+		 * If HW_AFDBM is enabled, then the HW could turn on
+		 * the dirty bit for any page in the set, so check
+		 * them all.  All hugetlb entries are already young.
+		 */
+		if (pte_dirty(pte))
+			orig_pte = pte_mkdirty(orig_pte);
+	}
+
+	return orig_pte;
+}
+
 /*
  * Changing some bits of contiguous entries requires us to follow a
  * Break-Before-Make approach, breaking the whole contiguous set
@@ -324,7 +345,7 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 			       unsigned long addr, pte_t *ptep,
 			       pte_t pte, int dirty)
 {
-	int ncontig, i, changed = 0;
+	int ncontig, i;
 	size_t pgsize = 0;
 	unsigned long pfn = pte_pfn(pte), dpfn;
 	pgprot_t hugeprot;
@@ -336,9 +357,16 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 	ncontig = find_num_contig(vma->vm_mm, addr, ptep, &pgsize);
 	dpfn = pgsize >> PAGE_SHIFT;
 
+	orig_pte = get_contig_pte(ptep, pgsize, ncontig);
+	if (pte_same(orig_pte, pte))
+		return 0;
+
+	/*
+	 * we need to get our orig_pte again as HW DBM may have happened since
+	 * above. get_clear_flush will ultimately cmpxchg with 0 to ensure
+	 * that we can't lose any dirty information.
+	 */
 	orig_pte = get_clear_flush(vma->vm_mm, addr, ptep, pgsize, ncontig);
-	if (!pte_same(orig_pte, pte))
-		changed = 1;
 
 	/* Make sure we don't lose the dirty state */
 	if (pte_dirty(orig_pte))
@@ -348,7 +376,7 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 	for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
 		set_pte_at(vma->vm_mm, addr, ptep, pfn_pte(pfn, hugeprot));
 
-	return changed;
+	return 1;
 }
 
 void huge_ptep_set_wrprotect(struct mm_struct *mm,
-- 
2.11.0

             reply	other threads:[~2018-09-19 17:44 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-19 17:44 Steve Capper [this message]
2018-09-20  9:39 ` [PATCH] arm64: hugetlb: Avoid unnecessary clearing in huge_ptep_set_access_flags Will Deacon
2018-09-20 16:34   ` Steve Capper
2018-09-20 16:40   ` Steve Capper
2018-11-06 13:16 ` Zhang, Lei
2018-11-06 17:14   ` Will Deacon
2018-12-13  0:25     ` Zhang, Lei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180919174437.1866-1-steve.capper@arm.com \
    --to=steve.capper@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.