From: Heiko Carstens <hca@linux.ibm.com>
To: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	linux-s390@vger.kernel.org, borntraeger@de.ibm.com,
	frankja@linux.ibm.com, david@kernel.org, seiden@linux.ibm.com,
	nrb@linux.ibm.com, schlameuss@linux.ibm.com, gra@linux.ibm.com,
	gerald.schaefer@linux.ibm.com, gor@linux.ibm.com,
	agordeev@linux.ibm.com, svens@linux.ibm.com
Subject: Re: [PATCH v2 1/1] s390/mm: Fix handling of _PAGE_UNUSED pte bit
Date: Mon, 15 Jun 2026 11:43:00 +0200
Message-ID: <20260615094300.31370D7a-hca@linux.ibm.com>
In-Reply-To: <20260615091741.76724-2-imbrenda@linux.ibm.com>

On Mon, Jun 15, 2026 at 11:17:41AM +0200, Claudio Imbrenda wrote:
> The _PAGE_UNUSED softbit should not really be lying around. Its sole
> purpose is to signal to try_to_unmap_one() and try_to_migrate_one()
> that the page can be discarded instead of being moved / swapped.
> 
> KVM has no way to know why a page is being unmapped, so it sets the bit
> on userspace ptes corresponding to unused guest pages every time they
> get unmapped. KVM has no reasonable way to clear the bit once the page
> is in use again.
> 
> Without appropriate cleanup, the _PAGE_UNUSED bit will linger around
> and cause guest corruption when a used page is instead thrown out.
> 
> While set_ptes() checks and clears the bit, ptep_xchg_direct(),
> ptep_xchg_lazy(), and ptep_modify_prot_commit() did not. This led to
> used pages being thrown out as if they were unused, causing guest
> corruption.
> 
> This patch fixes the issue by introducing the missing checks in the
> above functions.
> 
> Also fix gmap_helper_try_set_pte_unused() to only set the bit if the
> pte is present; the _PAGE_UNUSED bit is only defined for present ptes
> and thus should not be set for non-present ptes.
> 
> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> Fixes: c98175b7917f ("KVM: s390: Add gmap_helper_set_unused()")
> ---
>  arch/s390/mm/gmap_helpers.c | 4 ++--
>  arch/s390/mm/pgtable.c      | 6 ++++++
>  2 files changed, 8 insertions(+), 2 deletions(-)

...

> diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
> index 4acd8b140c4b..2acc79383e7d 100644
> --- a/arch/s390/mm/pgtable.c
> +++ b/arch/s390/mm/pgtable.c
> @@ -122,6 +122,8 @@ pte_t ptep_xchg_direct(struct mm_struct *mm, unsigned long addr,
>  
>  	preempt_disable();
>  	old = ptep_flush_direct(mm, addr, ptep, 1);
> +	if (pte_present(new))
> +		new = clear_pte_bit(new, __pgprot(_PAGE_UNUSED));
>  	set_pte(ptep, new);
>  	preempt_enable();
>  	return old;
> @@ -160,6 +162,8 @@ pte_t ptep_xchg_lazy(struct mm_struct *mm, unsigned long addr,
>  
>  	preempt_disable();
>  	old = ptep_flush_lazy(mm, addr, ptep, 1);
> +	if (pte_present(new))
> +		new = clear_pte_bit(new, __pgprot(_PAGE_UNUSED));
>  	set_pte(ptep, new);
>  	preempt_enable();
>  	return old;
> @@ -175,6 +179,8 @@ pte_t ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr,
>  void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
>  			     pte_t *ptep, pte_t old_pte, pte_t pte)
>  {
> +	if (pte_present(pte))
> +		pte = clear_pte_bit(pte, __pgprot(_PAGE_UNUSED));
>  	set_pte(ptep, pte);

Can't we move the logic from set_ptes() to set_pte() instead? The above
approach reminds me of the open-coded removal of the no-exec bit we had in
many places, which became a maintenance mess until it was rewritten.

The compiler _might_ even be clever enough to move the removal of the bit
outside the loop within set_ptes().
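
For illustration only, the suggestion above can be sketched as a simplified
userspace model. This is not the s390 code: the pte_t layout, the PTE_* bit
values, pte_clear_unused() and the flush-less ptep_xchg() are hypothetical
stand-ins. The point is only that once set_pte() strips the soft bit for
present ptes, every caller inherits the behaviour without an open-coded check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model; NOT the real s390 definitions. */
typedef struct { uint64_t val; } pte_t;

#define PTE_PRESENT	0x001ULL	/* illustrative bit value only */
#define PTE_UNUSED	0x080ULL	/* illustrative bit value only */
#define PTE_PFN_SHIFT	12

static inline bool pte_present(pte_t pte)
{
	return (pte.val & PTE_PRESENT) != 0;
}

/* Hypothetical helper standing in for clear_pte_bit(..., _PAGE_UNUSED). */
static inline pte_t pte_clear_unused(pte_t pte)
{
	pte.val &= ~PTE_UNUSED;
	return pte;
}

/* The single place where the soft bit is stripped from present ptes. */
static inline void set_pte(pte_t *ptep, pte_t pte)
{
	if (pte_present(pte))
		pte = pte_clear_unused(pte);
	*ptep = pte;
}

/* Callers no longer need their own check; they just go through set_pte(). */
static inline void set_ptes(pte_t *ptep, pte_t pte, unsigned int nr)
{
	for (unsigned int i = 0; i < nr; i++) {
		set_pte(&ptep[i], pte);
		pte.val += 1ULL << PTE_PFN_SHIFT;	/* advance to next pfn */
	}
}

/* Stand-in for ptep_xchg_direct()/ptep_xchg_lazy(), minus the TLB flush. */
static inline pte_t ptep_xchg(pte_t *ptep, pte_t new)
{
	pte_t old = *ptep;

	set_pte(ptep, new);
	return old;
}
```

In this model a non-present pte is stored unchanged, since the bit is only
defined for present ptes. Whether the compiler actually hoists the check out
of the set_ptes() loop is not something the sketch demonstrates.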

Thread overview: 8+ messages
2026-06-15  9:17 [PATCH v2 0/1] s390/mm: Fix handling of _PAGE_UNUSED pte bit Claudio Imbrenda
2026-06-15  9:17 ` [PATCH v2 1/1] " Claudio Imbrenda
2026-06-15  9:43   ` Heiko Carstens [this message]
2026-06-15 10:31     ` Claudio Imbrenda
2026-06-15 11:50       ` Heiko Carstens
2026-06-15 12:09         ` Gerald Schaefer
2026-06-15 11:51   ` sashiko-bot
2026-06-15 16:03   ` Alexander Gordeev
