* [Qemu-devel] [PATCH] msix: fix interrupt aggregation problem at the passthrough of NVMe SSD
From: Zhuangyanying @ 2019-04-09 14:14 UTC
  To: mst, marcel.apfelbaum; +Cc: qemu-devel, arei.gonglei, Zhuang Yanying

From: Zhuang Yanying <ann.zhuangyanying@huawei.com>

Recently I tested NVMe SSD passthrough performance and found in /proc/interrupts that the
interrupts were all aggregated on vcpu0 (or on the first vcpu of each NUMA node) once the
GuestOS was upgraded to sles12sp3 (or redhat7.6). Yet /proc/irq/X/smp_affinity_list shows
the interrupts spread out, e.g. 0-10, 11-21, and so on. The problem cannot be fixed with
"echo X > /proc/irq/X/smp_affinity_list", because the NVMe driver requests its interrupts
through pci_alloc_irq_vectors(), so they carry the IRQD_AFFINITY_MANAGED flag.
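For reference, a minimal sketch (not the actual nvme driver code; the function and name
strings are made up) of how a driver ends up with such managed vectors:

#include <linux/pci.h>
#include <linux/interrupt.h>

static irqreturn_t example_irq_handler(int irq, void *data)
{
    return IRQ_HANDLED;
}

/* Hypothetical helper: one managed MSI-X vector per queue. */
static int example_setup_irqs(struct pci_dev *pdev, unsigned int nr_queues)
{
    int nvec, i, ret;

    /*
     * PCI_IRQ_AFFINITY makes the core spread the vectors across the
     * online CPUs and marks them IRQD_AFFINITY_MANAGED, which is why
     * writing /proc/irq/X/smp_affinity_list is rejected later.
     */
    nvec = pci_alloc_irq_vectors(pdev, 1, nr_queues,
                                 PCI_IRQ_MSIX | PCI_IRQ_AFFINITY);
    if (nvec < 0)
        return nvec;

    for (i = 0; i < nvec; i++) {
        ret = request_irq(pci_irq_vector(pdev, i), example_irq_handler,
                          0, "example-queue", pdev);
        if (ret)
            return ret;
    }
    return nvec;
}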

The sles12sp3 GuestOS backported "automatic interrupt affinity for MSI/MSI-X capable devices",
but __setup_irq() was not changed accordingly: it still runs irq_startup() first and
setup_affinity() afterwards, i.e. the affinity message is written while the interrupt is
already unmasked. On bare metal that still works, but qemu does not trigger an msix update,
so the affinity configuration fails.
By contrast, when the affinity is set through /proc/irq/X/smp_affinity_list, the change is
applied in apic_ack_edge(): the new cpumask is stored in pending_mask and written in a
mask -> __pci_write_msi_msg() -> unmask sequence, so the ordering is guaranteed and the
configuration takes effect.
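The difference can be illustrated with a small standalone toy model (plain C, neither qemu
nor kernel code; all names are made up) of a device model that only reacts to mask-bit
transitions:

#include <stdbool.h>
#include <stdio.h>

static bool vector_masked;

/* Stand-in for msix_handle_mask_update() with the guard in place. */
static void handle_mask_update(bool was_masked)
{
    if (vector_masked == was_masked) {
        return;                     /* no transition -> nothing forwarded */
    }
    printf("notifier fired, route updated (masked=%d)\n", vector_masked);
}

static void write_vector_ctrl(bool mask)
{
    bool was_masked = vector_masked;
    vector_masked = mask;
    handle_mask_update(was_masked);
}

static void write_msg(void)
{
    bool was_masked = vector_masked;
    printf("guest writes new address/data\n");
    handle_mask_update(was_masked);  /* mask state unchanged */
}

int main(void)
{
    puts("-- smp_affinity path: mask, write, unmask --");
    write_vector_ctrl(true);
    write_msg();
    write_vector_ctrl(false);        /* transition -> notifier fires */

    puts("-- affected __setup_irq() path: write while unmasked --");
    write_msg();                     /* no transition -> silently ignored */
    return 0;
}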

The linux-master GuestOS includes "genirq/cpuhotplug: Enforce affinity setting on startup
of managed irqs", which ensures that for managed interrupts the affinity is applied first
and __irq_startup() runs afterwards, so the configuration succeeds there.
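My rough understanding of that mainline change, paraphrased rather than quoted from
kernel/irq/chip.c:

/* Paraphrased, not the literal upstream code: for a managed interrupt,
 * irq_startup() now writes the affinity before starting the vector, so
 * the MSI-X message is updated while the vector is still masked and the
 * subsequent unmask is a mask transition that qemu can observe. */
const struct cpumask *aff = irq_data_get_affinity_mask(&desc->irq_data);

if (irqd_affinity_is_managed(&desc->irq_data)) {
    irq_do_set_affinity(&desc->irq_data, aff, false);
    return __irq_startup(desc);
}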

It looks like sles12sp3 (up to sles15sp1, linux-4.12.x) and redhat7.6 (3.10.0-957.10.1)
have not backported that patch yet. Could the "if (is_masked == was_masked) return;"
check be removed from qemu? What is the reason for this check?

Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com>
---
 hw/pci/msix.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/hw/pci/msix.c b/hw/pci/msix.c
index 4e33641..e1ff533 100644
--- a/hw/pci/msix.c
+++ b/hw/pci/msix.c
@@ -119,10 +119,6 @@ static void msix_handle_mask_update(PCIDevice *dev, int vector, bool was_masked)
 {
     bool is_masked = msix_is_masked(dev, vector);
 
-    if (is_masked == was_masked) {
-        return;
-    }
-
     msix_fire_vector_notifier(dev, vector, is_masked);
 
     if (!is_masked && msix_is_pending(dev, vector)) {
-- 
1.8.3.1
