public inbox for kvm@vger.kernel.org
From: Claudio Imbrenda <imbrenda@linux.ibm.com>
To: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org,
	borntraeger@de.ibm.com, frankja@linux.ibm.com, nsg@linux.ibm.com,
	nrb@linux.ibm.com, seiden@linux.ibm.com, gra@linux.ibm.com,
	schlameuss@linux.ibm.com, hca@linux.ibm.com, svens@linux.ibm.com,
	agordeev@linux.ibm.com, gor@linux.ibm.com, david@redhat.com,
	gerald.schaefer@linux.ibm.com
Subject: [PATCH v5 04/23] KVM: s390: Add gmap_helper_set_unused()
Date: Mon, 24 Nov 2025 12:55:35 +0100
Message-ID: <20251124115554.27049-5-imbrenda@linux.ibm.com>
In-Reply-To: <20251124115554.27049-1-imbrenda@linux.ibm.com>

Add gmap_helper_try_set_pte_unused() to mark userspace ptes as unused.

Core mm code will use that information to discard unused pages instead
of attempting to swap them.

Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Nico Boehr <nrb@linux.ibm.com>
Tested-by: Nico Boehr <nrb@linux.ibm.com>
Acked-by: Christoph Schlameuss <schlameuss@linux.ibm.com>
---
 arch/s390/include/asm/gmap_helpers.h |  1 +
 arch/s390/mm/gmap_helpers.c          | 79 ++++++++++++++++++++++++++++
 2 files changed, 80 insertions(+)

diff --git a/arch/s390/include/asm/gmap_helpers.h b/arch/s390/include/asm/gmap_helpers.h
index 5356446a61c4..2d3ae421077e 100644
--- a/arch/s390/include/asm/gmap_helpers.h
+++ b/arch/s390/include/asm/gmap_helpers.h
@@ -11,5 +11,6 @@
 void gmap_helper_zap_one_page(struct mm_struct *mm, unsigned long vmaddr);
 void gmap_helper_discard(struct mm_struct *mm, unsigned long vmaddr, unsigned long end);
 int gmap_helper_disable_cow_sharing(void);
+void gmap_helper_try_set_pte_unused(struct mm_struct *mm, unsigned long vmaddr);
 
 #endif /* _ASM_S390_GMAP_HELPERS_H */
diff --git a/arch/s390/mm/gmap_helpers.c b/arch/s390/mm/gmap_helpers.c
index e14a63119e30..dca783859a73 100644
--- a/arch/s390/mm/gmap_helpers.c
+++ b/arch/s390/mm/gmap_helpers.c
@@ -124,6 +124,85 @@ void gmap_helper_discard(struct mm_struct *mm, unsigned long vmaddr, unsigned lo
 }
 EXPORT_SYMBOL_GPL(gmap_helper_discard);
 
+/**
+ * gmap_helper_try_set_pte_unused() - mark a pte entry as unused
+ * @mm: the mm
+ * @vmaddr: the userspace address whose pte is to be marked
+ *
+ * Mark the pte corresponding to the given address as unused. This will cause
+ * core mm code to just drop this page instead of swapping it.
+ *
+ * This function needs to be called with interrupts disabled (for example
+ * while holding a spinlock), or while holding the mmap lock. Normally this
+ * function is called as a result of an unmap operation, and thus KVM common
+ * code will already hold kvm->mmu_lock in write mode.
+ *
+ * Context: Needs to be called while holding the mmap lock or with interrupts
+ *          disabled.
+ */
+void gmap_helper_try_set_pte_unused(struct mm_struct *mm, unsigned long vmaddr)
+{
+	pmd_t *pmdp, pmd, pmdval;
+	pud_t *pudp, pud;
+	p4d_t *p4dp, p4d;
+	pgd_t *pgdp, pgd;
+	spinlock_t *ptl;	/* Lock for the host (userspace) page table */
+	pte_t *ptep;
+
+	pgdp = pgd_offset(mm, vmaddr);
+	pgd = pgdp_get(pgdp);
+	if (pgd_none(pgd) || !pgd_present(pgd))
+		return;
+
+	p4dp = p4d_offset(pgdp, vmaddr);
+	p4d = p4dp_get(p4dp);
+	if (p4d_none(p4d) || !p4d_present(p4d))
+		return;
+
+	pudp = pud_offset(p4dp, vmaddr);
+	pud = pudp_get(pudp);
+	if (pud_none(pud) || pud_leaf(pud) || !pud_present(pud))
+		return;
+
+	pmdp = pmd_offset(pudp, vmaddr);
+	pmd = pmdp_get_lockless(pmdp);
+	if (pmd_none(pmd) || pmd_leaf(pmd) || !pmd_present(pmd))
+		return;
+
+	ptep = pte_offset_map_rw_nolock(mm, pmdp, vmaddr, &pmdval, &ptl);
+	if (!ptep)
+		return;
+
+	/*
+	 * Several paths exist that take the ptl lock and then call the
+	 * mmu_notifier, which takes the mmu_lock. The unmap path, instead,
+	 * takes the mmu_lock in write mode first, and then potentially
+	 * calls this function, which takes the ptl lock. This can lead to a
+	 * deadlock.
+	 * The unused page mechanism is only an optimization: if the
+	 * _PAGE_UNUSED bit is not set, the unused page is swapped as normal
+	 * instead of being discarded.
+	 * If the lock is contended the bit is not set and the deadlock is
+	 * avoided.
+	 */
+	if (spin_trylock(ptl)) {
+		/*
+		 * Make sure the pte we are touching is still the correct
+		 * one. In theory this check should not be needed, but
+		 * better safe than sorry.
+		 * Disabling interrupts or holding the mmap lock is enough to
+		 * guarantee that no concurrent updates to the page tables
+		 * are possible.
+		 */
+		if (likely(pmd_same(pmdval, pmdp_get_lockless(pmdp))))
+			__atomic64_or(_PAGE_UNUSED, (long *)ptep);
+		spin_unlock(ptl);
+	}
+
+	pte_unmap(ptep);
+}
+EXPORT_SYMBOL_GPL(gmap_helper_try_set_pte_unused);
+
 static int find_zeropage_pte_entry(pte_t *pte, unsigned long addr,
 				   unsigned long end, struct mm_walk *walk)
 {
-- 
2.51.1
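
The lock-ordering argument in the comment above spin_trylock() can be
illustrated outside the kernel. What follows is only a minimal userspace
sketch, assuming POSIX threads and C11 atomics. The names struct entry,
ENTRY_UNUSED and try_mark_unused() are hypothetical stand-ins for the pte,
_PAGE_UNUSED and the helper added by this patch; none of them is kernel API,
and the flag value is arbitrary.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit; stands in for _PAGE_UNUSED, value is arbitrary. */
#define ENTRY_UNUSED 0x080UL

/* Hypothetical stand-in for one page table entry and its lock. */
struct entry {
	pthread_mutex_t lock;	/* plays the role of the ptl */
	_Atomic uint64_t val;	/* plays the role of the pte value */
};

/*
 * Best-effort hint: set ENTRY_UNUSED only if the lock can be taken
 * without waiting. The caller may already hold another lock that is
 * normally taken *after* this one, so blocking here could deadlock.
 * Skipping the hint is always safe because it is only an optimization.
 *
 * Returns true if the bit was set, false if the hint was skipped.
 */
static bool try_mark_unused(struct entry *e)
{
	if (pthread_mutex_trylock(&e->lock) != 0)
		return false;	/* contended: give up instead of blocking */

	/* Atomic OR, so concurrent updates to other bits are not lost. */
	atomic_fetch_or(&e->val, ENTRY_UNUSED);

	pthread_mutex_unlock(&e->lock);
	return true;
}
```

If the lock is already held when try_mark_unused() is called, the function
returns false and leaves the value untouched; once the lock is free, a second
call sets the bit. This mirrors the trade-off described in the patch: a missed
hint costs a regular swap-out, while a blocking lock could cost a deadlock.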

