Kernel KVM virtualization development
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: zhanghao <76824143@qq.com>
Cc: kvm@vger.kernel.org
Subject: Re: [PATCH] KVM: guest_memfd: Fix ABBA deadlock in error_remove_folio
Date: Thu, 11 Jun 2026 10:32:00 +0000	[thread overview]
Message-ID: <20260611103200.E71701F00893@smtp.kernel.org> (raw)
In-Reply-To: <tencent_4B051B84CDCC0D1964EC00337AA32E40DC07@qq.com>

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:

Pre-existing issues:
- [High] kvm_gmem_error_folio incorrectly returns the internal enum MF_DELAYED instead of 0 on success, causing memory_failure() to treat successful recovery as a failure.
- [High] Leaving the poisoned folio in the page cache causes memory_failure() to fail due to an unexpected extra reference during me_pagecache_clean().
--

commit f0c468f1d22ef5955d11ed4dde621b8eef0c2808
Author: Hao Zhang <zhanghao1@kylinos.cn>

KVM: guest_memfd: Fix ABBA deadlock in error_remove_folio

This commit modifies kvm_gmem_error_folio() to use a trylock when acquiring
the mapping's invalidate_lock. This prevents an ABBA deadlock with guest_memfd
punch-hole operations, returning -EBUSY instead if the lock is contended.

> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index 69c9d6d546b28..9417be3049cf0 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -499,7 +499,16 @@ static int kvm_gmem_error_folio(struct address_space *mapping, struct folio *fol
>  {
>  	pgoff_t start, end;
>  
> -	filemap_invalidate_lock_shared(mapping);
> +	/*
> +	 * memory_failure() holds mf_mutex globally.  We must not block
> +	 * on filemap_invalidate_lock here, as it can be held exclusive
> +	 * by kvm_gmem_fallocate() (MADV_REMOVE/FALLOC_FL_PUNCH_HOLE
> +	 * path), creating an ABBA deadlock with the poisoned folio lock.
> +	 * If the invalidation lock is contended, fail the recovery instead
> +	 * of reporting MF_DELAYED without zapping KVM mappings.
> +	 */
> +	if (!filemap_invalidate_trylock_shared(mapping))
> +		return -EBUSY;
>  
>  	start = folio->index;
>  	end = start + folio_nr_pages(folio);

[Severity: High]
This is a pre-existing issue, but does leaving the poisoned folio in the page
cache cause memory_failure() to fail?

In kvm_gmem_error_folio(), guest_memfd intentionally does not truncate the
poisoned folio from the page cache so that KVM can gracefully handle guest
access to the poisoned page. However, in
mm/memory-failure.c:me_pagecache_clean(), the code checks for extra references
after truncate_error_folio():

	extra_pins = shmem_mapping(mapping);
	...
	if (has_extra_refcount(ps, p, extra_pins))
		ret = MF_FAILED;

Because guest_memfd is not shmem, extra_pins is false. has_extra_refcount()
then sees the lingering page cache reference on the folio as an unexpected
user and returns true. Does this guarantee that me_pagecache_clean() will
always return MF_FAILED for guest_memfd, leading to the VM being hard-killed
by the MCE handler instead of KVM gracefully recovering?

[Severity: High]
This isn't a bug introduced by this patch, but should this callback return 0
on success instead of MF_DELAYED?

The VFS address_space_operations->error_remove_folio callback contract
requires returning 0 on success or a standard negative error code on failure.
However, kvm_gmem_error_folio() returns the internal memory-failure enum
MF_DELAYED (2).

The caller truncate_error_folio() in mm/memory-failure.c checks this:

	if (err != 0)
		pr_info("%#lx: Failed to punch page: %d\n", pfn, err);

Since it mistakenly treats the positive enum value as an error, will this
log "Failed to punch page: 2" and return MF_FAILED, entirely defeating the
graceful recovery feature?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/tencent_4B051B84CDCC0D1964EC00337AA32E40DC07@qq.com?part=1

  reply	other threads:[~2026-06-11 10:32 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-11  8:18 [PATCH] KVM: guest_memfd: Fix ABBA deadlock in error_remove_folio zhanghao
2026-06-11 10:32 ` sashiko-bot [this message]
2026-06-11 14:18 ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260611103200.E71701F00893@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=76824143@qq.com \
    --cc=kvm@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox