From: Marc Zyngier <maz@kernel.org>
To: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: kvm@vger.kernel.org, linux-pci@vger.kernel.org,
	Cornelia Huck <cohuck@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	kvmarm@lists.cs.columbia.edu,
	linux-arm-kernel@lists.infradead.org
Subject: Re: Question on guest enable msi fail when using GICv4/4.1
Date: Fri, 07 May 2021 12:02:57 +0100
Message-ID: <878s4qq00u.wl-maz@kernel.org>
In-Reply-To: <cf870bcf-1173-a70b-2b55-4209abcbcbc3@hisilicon.com>

On Fri, 07 May 2021 10:58:23 +0100,
Shaokun Zhang <zhangshaokun@hisilicon.com> wrote:
> 
> Hi Marc,
> 
> Thanks for your quick reply.
> 
> On 2021/5/7 17:03, Marc Zyngier wrote:
> > On Fri, 07 May 2021 06:57:04 +0100,
> > Shaokun Zhang <zhangshaokun@hisilicon.com> wrote:
> >>
> >> [This letter comes from Nianyao Tang]
> >>
> >> Hi,
> >>
> >> When using GICv4/4.1 with the MSI capability, if the guest VF driver
> >> requests 3 vectors and enables MSI, the guest gets stuck.
> > 
> > Stuck how?
> 
> The guest serial console no longer responds and the guest network goes down.
> 
> > 
> >> QEMU gets the number of interrupts from the Multiple Message Capable field
> >> set by the guest. This field is rounded up to a power of 2 (if a function
> >> requires 3 vectors, the field is initialized to 2).
> > 
> > So I guess this is a MultiMSI device with 4 vectors, right?
> > 
> 
> Yes, it can support a maximum of 32 MSI interrupts, and the VF driver only uses 3.
> 
> >> However, the guest driver only sends 3 MAPI commands to the vITS, so only 3
> >> ITE entries are recorded in the host. VFIO initializes the MSI interrupts
> >> using the count of 4 provided by QEMU. When it reaches the 4th MSI, which
> >> has no ITE in the vITS, __connect of the producer and consumer fails in
> >> irq_bypass_register_producer because find_ite fails, and the guest is not
> >> resumed.
> > 
> > Let me rephrase this to check that I understand it:
> > - The device has 4 vectors
> > - The guest only creates mappings for 3 of them
> > - VFIO calls kvm_vgic_v4_set_forwarding() for each vector
> > - KVM doesn't have a mapping for the 4th vector and returns an error
> > - VFIO disables this 4th vector
> > 
> > Is that correct? If yes, I don't understand why that impacts the guest
> > at all. From what I can see, vfio_msi_set_vector_signal() just prints
> > a message on the console and carries on.
> > 
> 
> function calls:
> --> vfio_msi_set_vector_signal
>    --> irq_bypass_register_producer
>       -->__connect
> 
> In __connect, add_producer ends up calling kvm_vgic_v4_set_forwarding,
> which fails to find a mapping for the 4th vector. When add_producer fails,
> __connect does not call cons->start, which is what would call
> kvm_arch_irq_bypass_start and then kvm_arm_resume_guest.
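
As a side note on the vector counts in the quoted report: the MSI
capability encodes the number of vectors as a power-of-two exponent, which
is why a driver asking for 3 vectors ends up with 4 being set up, one of
them without a guest mapping. A minimal sketch of that arithmetic in plain
user-space C follows; msi_log2_roundup() and msi_vectors_set_up() are
hypothetical helper names, not kernel or QEMU functions.

```c
#include <assert.h>

/* Hypothetical helper: smallest exponent e such that (1 << e) >= nvec. */
static unsigned int msi_log2_roundup(unsigned int nvec)
{
	unsigned int e = 0;

	assert(nvec >= 1 && nvec <= 32); /* MSI allows at most 32 vectors */
	while ((1u << e) < nvec)
		e++;
	return e;
}

/* Hypothetical helper: number of vectors implied by the encoded exponent. */
static unsigned int msi_vectors_set_up(unsigned int nvec)
{
	return 1u << msi_log2_roundup(nvec);
}
```

With nvec = 3 the encoded exponent is 2, so 4 vectors are set up and one
of them has no ITE behind it.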

[+Eric, who wrote the irq_bypass infrastructure.]

Ah, so the guest is actually paused, not in a livelock situation
(which is how I interpreted "stuck").

I think we should handle this case gracefully, as there should be no
expectation that the guest will be using this interrupt. Given that
VFIO seems to be pretty unfazed when a producer fails, I'm tempted to
do the same thing and restart the guest.

Also, __disconnect doesn't care about errors, so why should __connect
have this odd behaviour?

Can you please try this? It is completely untested (and I think the
del_consumer call is odd, which is why I've also dropped it).

Eric, what do you think?

Thanks,

	M.

diff --git a/virt/lib/irqbypass.c b/virt/lib/irqbypass.c
index c9bb3957f58a..7e1865e15668 100644
--- a/virt/lib/irqbypass.c
+++ b/virt/lib/irqbypass.c
@@ -40,21 +40,14 @@ static int __connect(struct irq_bypass_producer *prod,
 	if (prod->add_consumer)
 		ret = prod->add_consumer(prod, cons);
 
-	if (ret)
-		goto err_add_consumer;
-
-	ret = cons->add_producer(cons, prod);
-	if (ret)
-		goto err_add_producer;
+	if (!ret)
+		ret = cons->add_producer(cons, prod);
 
 	if (cons->start)
 		cons->start(cons);
 	if (prod->start)
 		prod->start(prod);
-err_add_producer:
-	if (prod->del_consumer)
-		prod->del_consumer(prod, cons);
-err_add_consumer:
+
 	return ret;
 }
 

-- 
Without deviation from the norm, progress is not possible.
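
For readers following the patch above, here is a simplified user-space
model of the two control flows. It is only a sketch: struct consumer, its
fields and the connect_old()/connect_new() names are made up for
illustration, the producer callbacks are left out, and it assumes (as the
report implies) that the consumer is stopped before add_producer is
attempted.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for struct irq_bypass_consumer, not the kernel type. */
struct consumer {
	int running;          /* 1 while the guest runs, 0 while it is paused */
	int add_producer_ret; /* what the mocked add_producer callback returns */
};

static void cons_stop(struct consumer *c)  { c->running = 0; }
static void cons_start(struct consumer *c) { c->running = 1; }
static int cons_add_producer(struct consumer *c) { return c->add_producer_ret; }

/* Flow before the patch: a failing add_producer jumps over cons_start(). */
static int connect_old(struct consumer *c)
{
	int ret;

	assert(c != NULL);
	cons_stop(c);
	ret = cons_add_producer(c);
	if (ret)
		goto err_add_producer;
	cons_start(c);
err_add_producer:
	return ret;
}

/* Flow with the patch: the error is still returned, but the consumer restarts. */
static int connect_new(struct consumer *c)
{
	int ret;

	assert(c != NULL);
	cons_stop(c);
	ret = cons_add_producer(c);
	cons_start(c);
	return ret;
}
```

In this model, a consumer whose add_producer fails stays paused after
connect_old() but is running again after connect_new(), while both
variants hand the error code back to the caller.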
