linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Marc Zyngier <maz@kernel.org>
To: Kunkun Jiang <jiangkunkun@huawei.com>
Cc: Oliver Upton <oliver.upton@linux.dev>,
	James Morse <james.morse@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Zenghui Yu <yuzenghui@huawei.com>,
	<linux-arm-kernel@lists.infradead.org>, <kvmarm@lists.linux.dev>,
	<wanghaibin.wang@huawei.com>, <tangnianyao@huawei.com>,
	<wangzhou1@hisilicon.com>
Subject: Re: [bug report] KVM: arm64: vgic-v4: Occasionally issue VMOVP to an unmapped VPE on GICv4.1
Date: Mon, 30 Sep 2024 15:09:48 +0100	[thread overview]
Message-ID: <87r091utar.wl-maz@kernel.org> (raw)
In-Reply-To: <5ea32d67-f429-acfb-c6d4-06c7860146e2@huawei.com>

On Mon, 30 Sep 2024 07:25:28 +0100,
Kunkun Jiang <jiangkunkun@huawei.com> wrote:
> 
> Hi Marc,
> 
> On 2024/9/29 18:07, Marc Zyngier wrote:
> > On Sun, 29 Sep 2024 08:18:41 +0100,
> > Kunkun Jiang <jiangkunkun@huawei.com> wrote:
> >> 
> >> Hi all,
> >> 
> >> I found a problem with occasionally issuing VMOVP to an unmapped VPE
> >> on GICv4.1. In my test environment, operating an unmapped VPE will
> >> generate RAS, so I found this problem. The detailed analysis is as
> >> follows.
> >> 
> >> The vgic_v4_teardown() will be executed when VM is destroyed to free
> >> the GICv4 data structures. The code is as follows:
> >>> /**
> >>>   * vgic_v4_teardown - Free the GICv4 data structures
> >>>   * @kvm:        Pointer to the VM being destroyed
> >>>   */
> >>> void vgic_v4_teardown(struct kvm *kvm)
> >>> {
> >>>          struct its_vm *its_vm = &kvm->arch.vgic.its_vm;
> >>>          int i;
> >>> 
> >>>          lockdep_assert_held(&kvm->arch.config_lock);
> >>> 
> >>>          if (!its_vm->vpes)
> >>>                  return;
> >>> 
> >>>          for (i = 0; i < its_vm->nr_vpes; i++) {
> >>>                  struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, i);
> >>>                  int irq = its_vm->vpes[i]->irq;
> >>> 
> >>>                  irq_clear_status_flags(irq, DB_IRQ_FLAGS);
> >>>                  free_irq(irq, vcpu);
> >>>          }
> >>> 
> >>>          its_free_vcpu_irqs(its_vm);
> >>>          kfree(its_vm->vpes);
> >>>          its_vm->nr_vpes = 0;
> >>>          its_vm->vpes = NULL;
> >>> }
> >> 
> >> [1] In irq_clear_status_flags(irq, DB_IRQ_FLAGS), the status flags of
> >> a doorbell are cleared. DB_IRQ_FLAGS contains IRQ_NO_BALANCING. So
> >> after this,the irqbalance.service can schedule the doorbell.
> >> [2] In free_irq(), the VPE is unmaped.
> >> [3] In its_free_vcpu_irqs(its_vm), unregister_irq_proc() is called to
> >> delete the contents in /proc/irq/xx/ of the doorbell.
> >> 
> >> For VPEs in large-scale VM, there is a centain time window between [2]
> >> and [3]. The irqbalance.service got a chance to schedule the
> >> doorbell. Therefore, the VMOVP is issued to an unmapped VPE.
> >> 
> >> I tried not clearing IRQ_NO_BALANCING and the problem was solved. But
> >> it's not clear if there's any other problem with doing so.
> > 
> > I don't think that's a good idea, because whoever request the same
> > interrupt number again for a different purpose will have the flag set
> > and will experience odd behaviours.
> > 
> > I'd rather fix it for good, given that we have all the necessary
> > tracking in place already. Something like the patch below, as usual
> > untested.
> 
> After testing, the patch below fixes my problem.

Thanks. But it also introduces a regression on GICv4.0, which doesn't
use vmapp_count. So the fix is slightly more involved.

I'll try to post a more complete patch later this week.

	M.

-- 
Without deviation from the norm, progress is not possible.


  reply	other threads:[~2024-09-30 14:22 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-29  7:18 [bug report] KVM: arm64: vgic-v4: Occasionally issue VMOVP to an unmapped VPE on GICv4.1 Kunkun Jiang
2024-09-29 10:07 ` Marc Zyngier
2024-09-30  6:25   ` Kunkun Jiang
2024-09-30 14:09     ` Marc Zyngier [this message]
2024-09-30  5:18 ` Ganapatrao Kulkarni
2024-09-30  6:33   ` Kunkun Jiang
2024-09-30 14:04   ` Marc Zyngier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87r091utar.wl-maz@kernel.org \
    --to=maz@kernel.org \
    --cc=james.morse@arm.com \
    --cc=jiangkunkun@huawei.com \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=oliver.upton@linux.dev \
    --cc=suzuki.poulose@arm.com \
    --cc=tangnianyao@huawei.com \
    --cc=wanghaibin.wang@huawei.com \
    --cc=wangzhou1@hisilicon.com \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).