From: Claudio Imbrenda <imbrenda@linux.ibm.com>
To: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org,
borntraeger@de.ibm.com, frankja@linux.ibm.com, nsg@linux.ibm.com,
nrb@linux.ibm.com, seiden@linux.ibm.com, gra@linux.ibm.com,
schlameuss@linux.ibm.com, hca@linux.ibm.com, svens@linux.ibm.com,
agordeev@linux.ibm.com, gor@linux.ibm.com, david@redhat.com,
gerald.schaefer@linux.ibm.com
Subject: [PATCH v5 04/23] KVM: s390: Add gmap_helper_set_unused()
Date: Mon, 24 Nov 2025 12:55:35 +0100
Message-ID: <20251124115554.27049-5-imbrenda@linux.ibm.com>
In-Reply-To: <20251124115554.27049-1-imbrenda@linux.ibm.com>
Add gmap_helper_try_set_pte_unused() to mark userspace ptes as unused.
Core mm code will use that information to discard unused pages instead
of attempting to swap them.
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Nico Boehr <nrb@linux.ibm.com>
Tested-by: Nico Boehr <nrb@linux.ibm.com>
Acked-by: Christoph Schlameuss <schlameuss@linux.ibm.com>
---
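The locking comment in gmap_helper_try_set_pte_unused() below describes a
best-effort pattern: when the lock is contended, skip the hint rather than
wait and risk a lock-order inversion. The following is a minimal userspace
sketch of that pattern only, not the kernel code. It assumes a pthread mutex
in place of the ptl and C11 atomics in place of __atomic64_or(); struct
demo_table, DEMO_ENTRY_UNUSED and demo_try_set_unused() are hypothetical
names invented for this illustration, and the flag value is arbitrary.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for _PAGE_UNUSED; the value here is arbitrary. */
#define DEMO_ENTRY_UNUSED UINT64_C(0x80)

/* Hypothetical stand-in for one table entry and the lock protecting it. */
struct demo_table {
	pthread_mutex_t lock;
	_Atomic uint64_t entry;
};

/*
 * Best-effort hint: set the flag only if the lock can be taken without
 * waiting. Returns true if the flag was set, false if the lock was
 * contended and the hint was skipped.
 */
static bool demo_try_set_unused(struct demo_table *t)
{
	assert(t != NULL);

	/* Never block here: blocking is what could close a lock cycle. */
	if (pthread_mutex_trylock(&t->lock) != 0)
		return false;

	/* Atomic OR, so the other bits of the entry are left untouched. */
	atomic_fetch_or(&t->entry, DEMO_ENTRY_UNUSED);
	pthread_mutex_unlock(&t->lock);
	return true;
}
```

A caller can ignore the return value: a skipped hint only means the entry
keeps its previous state, which mirrors the patch's statement that the page
is then swapped as normal instead of being discarded.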
arch/s390/include/asm/gmap_helpers.h | 1 +
arch/s390/mm/gmap_helpers.c | 79 ++++++++++++++++++++++++++++
2 files changed, 80 insertions(+)
diff --git a/arch/s390/include/asm/gmap_helpers.h b/arch/s390/include/asm/gmap_helpers.h
index 5356446a61c4..2d3ae421077e 100644
--- a/arch/s390/include/asm/gmap_helpers.h
+++ b/arch/s390/include/asm/gmap_helpers.h
@@ -11,5 +11,6 @@
void gmap_helper_zap_one_page(struct mm_struct *mm, unsigned long vmaddr);
void gmap_helper_discard(struct mm_struct *mm, unsigned long vmaddr, unsigned long end);
int gmap_helper_disable_cow_sharing(void);
+void gmap_helper_try_set_pte_unused(struct mm_struct *mm, unsigned long vmaddr);
#endif /* _ASM_S390_GMAP_HELPERS_H */
diff --git a/arch/s390/mm/gmap_helpers.c b/arch/s390/mm/gmap_helpers.c
index e14a63119e30..dca783859a73 100644
--- a/arch/s390/mm/gmap_helpers.c
+++ b/arch/s390/mm/gmap_helpers.c
@@ -124,6 +124,85 @@ void gmap_helper_discard(struct mm_struct *mm, unsigned long vmaddr, unsigned lo
}
EXPORT_SYMBOL_GPL(gmap_helper_discard);
+/**
+ * gmap_helper_try_set_pte_unused() - mark a pte entry as unused
+ * @mm: the mm
+ * @vmaddr: the userspace address whose pte is to be marked
+ *
+ * Mark the pte corresponding to the given address as unused. This will cause
+ * core mm code to just drop this page instead of swapping it.
+ *
+ * This function needs to be called with interrupts disabled (for example
+ * while holding a spinlock), or while holding the mmap lock. Normally this
+ * function is called as a result of an unmap operation, and thus KVM common
+ * code will already hold kvm->mmu_lock in write mode.
+ *
+ * Context: Needs to be called while holding the mmap lock or with interrupts
+ * disabled.
+ */
+void gmap_helper_try_set_pte_unused(struct mm_struct *mm, unsigned long vmaddr)
+{
+ pmd_t *pmdp, pmd, pmdval;
+ pud_t *pudp, pud;
+ p4d_t *p4dp, p4d;
+ pgd_t *pgdp, pgd;
+ spinlock_t *ptl; /* Lock for the host (userspace) page table */
+ pte_t *ptep;
+
+ pgdp = pgd_offset(mm, vmaddr);
+ pgd = pgdp_get(pgdp);
+ if (pgd_none(pgd) || !pgd_present(pgd))
+ return;
+
+ p4dp = p4d_offset(pgdp, vmaddr);
+ p4d = p4dp_get(p4dp);
+ if (p4d_none(p4d) || !p4d_present(p4d))
+ return;
+
+ pudp = pud_offset(p4dp, vmaddr);
+ pud = pudp_get(pudp);
+ if (pud_none(pud) || pud_leaf(pud) || !pud_present(pud))
+ return;
+
+ pmdp = pmd_offset(pudp, vmaddr);
+ pmd = pmdp_get_lockless(pmdp);
+ if (pmd_none(pmd) || pmd_leaf(pmd) || !pmd_present(pmd))
+ return;
+
+ ptep = pte_offset_map_rw_nolock(mm, pmdp, vmaddr, &pmdval, &ptl);
+ if (!ptep)
+ return;
+
+ /*
+ * Several paths exist that take the ptl lock and then call the
+ * mmu_notifier, which takes the mmu_lock. The unmap path, instead,
+ * takes the mmu_lock in write mode first, and then potentially
+ * calls this function, which takes the ptl lock. This can lead to a
+ * deadlock.
+ * The unused page mechanism is only an optimization: if the
+ * _PAGE_UNUSED bit is not set, the unused page is swapped as normal
+ * instead of being discarded.
+ * If the lock is contended, the bit is not set and the deadlock is
+ * avoided.
+ */
+ if (spin_trylock(ptl)) {
+ /*
+ * Make sure the pte we are touching is still the correct
+ * one. In theory this check should not be needed, but
+ * better safe than sorry.
+ * Disabling interrupts or holding the mmap lock is enough to
+ * guarantee that no concurrent updates to the page tables
+ * are possible.
+ */
+ if (likely(pmd_same(pmdval, pmdp_get_lockless(pmdp))))
+ __atomic64_or(_PAGE_UNUSED, (long *)ptep);
+ spin_unlock(ptl);
+ }
+
+ pte_unmap(ptep);
+}
+EXPORT_SYMBOL_GPL(gmap_helper_try_set_pte_unused);
+
static int find_zeropage_pte_entry(pte_t *pte, unsigned long addr,
unsigned long end, struct mm_walk *walk)
{
--
2.51.1
Thread overview: 31+ messages
2025-11-24 11:55 [PATCH v5 00/23] KVM: s390: gmap rewrite, the real deal Claudio Imbrenda
2025-11-24 11:55 ` [PATCH v5 01/23] KVM: s390: Refactor pgste lock and unlock functions Claudio Imbrenda
2025-11-24 11:55 ` [PATCH v5 02/23] KVM: s390: add P bit in table entry bitfields, move union vaddress Claudio Imbrenda
2025-11-24 11:55 ` [PATCH v5 03/23] s390: Move sske_frame() to a header Claudio Imbrenda
2025-11-24 11:55 ` [PATCH v5 04/23] KVM: s390: Add gmap_helper_set_unused() Claudio Imbrenda [this message]
2025-11-24 11:55 ` [PATCH v5 05/23] KVM: s390: Enable KVM_GENERIC_MMU_NOTIFIER Claudio Imbrenda
2025-11-24 11:55 ` [PATCH v5 06/23] KVM: s390: Rename some functions in gaccess.c Claudio Imbrenda
2025-11-24 11:55 ` [PATCH v5 07/23] KVM: s390: KVM-specific bitfields and helper functions Claudio Imbrenda
2025-11-24 11:55 ` [PATCH v5 08/23] KVM: s390: KVM page table management functions: allocation Claudio Imbrenda
2025-11-24 12:27 ` Janosch Frank
2025-11-24 12:41 ` Claudio Imbrenda
2025-11-24 13:01 ` Janosch Frank
2025-11-24 11:55 ` [PATCH v5 09/23] KVM: s390: KVM page table management functions: clear and replace Claudio Imbrenda
2025-11-24 11:55 ` [PATCH v5 10/23] KVM: s390: KVM page table management functions: walks Claudio Imbrenda
2025-11-24 11:55 ` [PATCH v5 11/23] KVM: s390: KVM page table management functions: storage keys Claudio Imbrenda
2025-11-24 11:55 ` [PATCH v5 12/23] KVM: s390: KVM page table management functions: lifecycle management Claudio Imbrenda
2025-11-24 11:55 ` [PATCH v5 13/23] KVM: s390: KVM page table management functions: CMMA Claudio Imbrenda
2025-11-24 11:55 ` [PATCH v5 14/23] KVM: s390: New gmap code Claudio Imbrenda
2025-11-24 11:55 ` [PATCH v5 15/23] KVM: s390: Add helper functions for fault handling Claudio Imbrenda
2025-11-24 11:55 ` [PATCH v5 16/23] KVM: s390: Add some helper functions needed for vSIE Claudio Imbrenda
2025-11-24 11:55 ` [PATCH v5 17/23] KVM: s390: Stop using CONFIG_PGSTE Claudio Imbrenda
2025-11-24 11:55 ` [PATCH v5 18/23] KVM: s390: Storage key functions refactoring Claudio Imbrenda
2025-11-24 11:55 ` [PATCH v5 19/23] KVM: s390: Switch to new gmap Claudio Imbrenda
2025-11-24 11:55 ` [PATCH v5 20/23] KVM: s390: Remove gmap from s390/mm Claudio Imbrenda
2025-11-24 11:55 ` [PATCH v5 21/23] KVM: S390: Remove PGSTE code from linux/s390 mm Claudio Imbrenda
2025-11-25 19:24 ` Heiko Carstens
2025-11-26 8:38 ` Heiko Carstens
2025-11-26 8:47 ` Claudio Imbrenda
2025-11-24 11:55 ` [PATCH v5 22/23] KVM: s390: Enable 1M pages for gmap Claudio Imbrenda
2025-11-24 17:35 ` Christian Borntraeger
2025-11-24 11:55 ` [PATCH v5 23/23] KVM: s390: Storage key manipulation IOCTL Claudio Imbrenda