All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marc Zyngier <maz@kernel.org>
To: Gavin Shan <gshan@redhat.com>
Cc: kvmarm@lists.linux.dev, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, pbonzini@redhat.com,
	seanjc@google.com, oliver.upton@linux.dev, hshuai@redhat.com,
	zhenyzha@redhat.com, shan.gavin@gmail.com
Subject: Re: [PATCH] KVM: Avoid illegal stage2 mapping on invalid memory slot
Date: Thu, 08 Jun 2023 15:31:09 +0100	[thread overview]
Message-ID: <87bkhqnhzm.wl-maz@kernel.org> (raw)
In-Reply-To: <20230608090348.414990-1-gshan@redhat.com>

Hi Gavin,

On Thu, 08 Jun 2023 10:03:48 +0100,
Gavin Shan <gshan@redhat.com> wrote:
> 
> We run into guest hang in edk2 firmware when KSM is kept as running
> on the host. The edk2 firmware is waiting for status 0x80 from QEMU's
> pflash device (TYPE_PFLASH_CFI01) during the operation for sector
> erasing or buffered write. The status is returned by reading the
> memory region of the pflash device and the read request should
> have been forwarded to QEMU and emulated by it. Unfortunately, the
> read request is covered by an illegal stage2 mapping when the guest
> hang issue occurs. The read request is completed with QEMU bypassed and
> wrong status is fetched.
> 
> The illegal stage2 mapping is populated due to same page mering by
> KSM at (C) even the associated memory slot has been marked as invalid
> at (B).
> 
>   CPU-A                    CPU-B
>   -----                    -----
>                            ioctl(kvm_fd, KVM_SET_USER_MEMORY_REGION)
>                            kvm_vm_ioctl_set_memory_region
>                            kvm_set_memory_region
>                            __kvm_set_memory_region
>                            kvm_set_memslot(kvm, old, NULL, KVM_MR_DELETE)
>                              kvm_invalidate_memslot
>                                kvm_copy_memslot
>                                kvm_replace_memslot
>                                kvm_swap_active_memslots        (A)
>                                kvm_arch_flush_shadow_memslot   (B)
>   same page merging by KSM
>   kvm_mmu_notifier_change_pte
>   kvm_handle_hva_range
>   __kvm_handle_hva_range       (C)
> 
> Fix the issue by skipping the invalid memory slot at (C) to avoid the
> illegal stage2 mapping. Without the illegal stage2 mapping, the read
> request for the pflash's status is forwarded to QEMU and emulated by
> it. The correct pflash's status can be returned from QEMU to break
> the infinite wait in edk2 firmware.

Huh, nice one :-(.

> 
> Cc: stable@vger.kernel.org # v5.13+
> Fixes: 3039bcc74498 ("KVM: Move x86's MMU notifier memslot walkers to generic code")
> Reported-by: Shuai Hu <hshuai@redhat.com>
> Reported-by: Zhenyu Zhang <zhenyzha@redhat.com>
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
>  virt/kvm/kvm_main.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 479802a892d4..7f81a3a209b6 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -598,6 +598,9 @@ static __always_inline int __kvm_handle_hva_range(struct kvm *kvm,
>  			unsigned long hva_start, hva_end;
>  
>  			slot = container_of(node, struct kvm_memory_slot, hva_node[slots->node_idx]);
> +			if (slot->flags & KVM_MEMSLOT_INVALID)
> +				continue;
> +
>  			hva_start = max(range->start, slot->userspace_addr);
>  			hva_end = min(range->end, slot->userspace_addr +
>  						  (slot->npages << PAGE_SHIFT));

I don't immediately see what makes it safer. If we're not holding one
of slots_{,arch_}lock in the notifier, we can still race against the
update, can't we?  I don't think holding the srcu lock helps us here.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

  reply	other threads:[~2023-06-08 14:31 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-08  9:03 [PATCH] KVM: Avoid illegal stage2 mapping on invalid memory slot Gavin Shan
2023-06-08 14:31 ` Marc Zyngier [this message]
2023-06-08 23:17   ` Gavin Shan
2023-06-09  8:06     ` Marc Zyngier
2023-06-09  9:02       ` Gavin Shan
2023-06-12 14:41 ` Sean Christopherson
2023-06-13  1:06   ` Gavin Shan
2023-06-13 14:58     ` Sean Christopherson
2023-06-14  0:01       ` Gavin Shan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87bkhqnhzm.wl-maz@kernel.org \
    --to=maz@kernel.org \
    --cc=gshan@redhat.com \
    --cc=hshuai@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oliver.upton@linux.dev \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=shan.gavin@gmail.com \
    --cc=zhenyzha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.