All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/2] hw/nvme: add irqfd support
@ 2022-08-25 20:14 Klaus Jensen
  2022-08-25 20:14 ` [PATCH v3 1/2] hw/nvme: support irq(de)assertion with eventfd Klaus Jensen
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Klaus Jensen @ 2022-08-25 20:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: Keith Busch, Klaus Jensen, qemu-block, Jinhao Fan, stefanha,
	Klaus Jensen

From: Klaus Jensen <k.jensen@samsung.com>

This is a re-spin of Jinhao's irqfd support series that fixes msix
vector masking/unmasking to work correctly.

I kept being bugged out about that msi route not getting updated, so I
hit the code into submission with a hammer.

I finally noticed the core issue:

  1. The vector notifiers was never set because msix is not enabled at
     the point where nvme_init_pci() is called.

     Move this call to nvme_start_ctrl().

Since the unmask callback was suddenly getting called now, another fix
was needed:

  2. Call kvm_irqchip_add_irqfd_notifier_gsi() in the unmask handler -
     not in nvme_init_irq_notifier(). The vectors may potentially be
     masked/unmasked and shall cause a pair of add_irqfd and
     remove_irqfd calls. Removing it from nvme_init_irq_notifier() makes
     sure we do not try to double add.

Now it does what it is supposed to; no hacks required :)

Jinhao Fan (2):
  hw/nvme: support irq(de)assertion with eventfd
  hw/nvme: use KVM irqfd when available

 hw/nvme/ctrl.c       | 261 ++++++++++++++++++++++++++++++++++++++++---
 hw/nvme/nvme.h       |   6 +
 hw/nvme/trace-events |   3 +
 3 files changed, 253 insertions(+), 17 deletions(-)

-- 
2.37.2



^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v3 1/2] hw/nvme: support irq(de)assertion with eventfd
  2022-08-25 20:14 [PATCH v3 0/2] hw/nvme: add irqfd support Klaus Jensen
@ 2022-08-25 20:14 ` Klaus Jensen
  2022-08-25 20:14 ` [PATCH v3 2/2] hw/nvme: use KVM irqfd when available Klaus Jensen
  2022-08-26  2:03 ` [PATCH v3 0/2] hw/nvme: add irqfd support Jinhao Fan
  2 siblings, 0 replies; 5+ messages in thread
From: Klaus Jensen @ 2022-08-25 20:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: Keith Busch, Klaus Jensen, qemu-block, Jinhao Fan, stefanha,
	Klaus Jensen

From: Jinhao Fan <fanjinhao21s@ict.ac.cn>

When the new option 'irq-eventfd' is turned on, the IO emulation code
signals an eventfd when it want to (de)assert an irq. The main loop
eventfd handler does the actual irq (de)assertion.  This paves the way
for iothread support since QEMU's interrupt emulation is not thread
safe.

Asserting and deasseting irq with eventfd has some performance
implications. For small queue depth it increases request latency but
for large queue depth it effectively coalesces irqs.

Comparision (KIOPS):

QD            1   4  16  64
QEMU         38 123 210 329
irq-eventfd  32 106 240 364

Signed-off-by: Jinhao Fan <fanjinhao21s@ict.ac.cn>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
---
 hw/nvme/ctrl.c | 120 ++++++++++++++++++++++++++++++++++++++++++-------
 hw/nvme/nvme.h |   3 ++
 2 files changed, 106 insertions(+), 17 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 87aeba056499..51792f395597 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -526,34 +526,57 @@ static void nvme_irq_check(NvmeCtrl *n)
     }
 }
 
+static void nvme_irq_do_assert(NvmeCtrl *n, NvmeCQueue *cq)
+{
+    if (msix_enabled(&(n->parent_obj))) {
+        trace_pci_nvme_irq_msix(cq->vector);
+        msix_notify(&(n->parent_obj), cq->vector);
+    } else {
+        trace_pci_nvme_irq_pin();
+        assert(cq->vector < 32);
+        n->irq_status |= 1 << cq->vector;
+        nvme_irq_check(n);
+    }
+}
+
 static void nvme_irq_assert(NvmeCtrl *n, NvmeCQueue *cq)
 {
     if (cq->irq_enabled) {
-        if (msix_enabled(&(n->parent_obj))) {
-            trace_pci_nvme_irq_msix(cq->vector);
-            msix_notify(&(n->parent_obj), cq->vector);
+        if (cq->assert_notifier.initialized) {
+            event_notifier_set(&cq->assert_notifier);
         } else {
-            trace_pci_nvme_irq_pin();
-            assert(cq->vector < 32);
-            n->irq_status |= 1 << cq->vector;
-            nvme_irq_check(n);
+            nvme_irq_do_assert(n, cq);
         }
     } else {
         trace_pci_nvme_irq_masked();
     }
 }
 
+static void nvme_irq_do_deassert(NvmeCtrl *n, NvmeCQueue *cq)
+{
+    if (msix_enabled(&(n->parent_obj))) {
+        return;
+    } else {
+        assert(cq->vector < 32);
+        if (!n->cq_pending) {
+            n->irq_status &= ~(1 << cq->vector);
+        }
+        nvme_irq_check(n);
+    }
+}
+
 static void nvme_irq_deassert(NvmeCtrl *n, NvmeCQueue *cq)
 {
     if (cq->irq_enabled) {
-        if (msix_enabled(&(n->parent_obj))) {
-            return;
+        if (cq->deassert_notifier.initialized) {
+            /*
+             * The deassert notifier will only be initilized when MSI-X is NOT
+             * in use. Therefore no need to worry about extra eventfd syscall
+             * for pin-based interrupts.
+             */
+            event_notifier_set(&cq->deassert_notifier);
         } else {
-            assert(cq->vector < 32);
-            if (!n->cq_pending) {
-                n->irq_status &= ~(1 << cq->vector);
-            }
-            nvme_irq_check(n);
+            nvme_irq_do_deassert(n, cq);
         }
     }
 }
@@ -1338,6 +1361,50 @@ static void nvme_update_cq_head(NvmeCQueue *cq)
     trace_pci_nvme_shadow_doorbell_cq(cq->cqid, cq->head);
 }
 
+static void nvme_assert_notifier_read(EventNotifier *e)
+{
+    NvmeCQueue *cq = container_of(e, NvmeCQueue, assert_notifier);
+    if (event_notifier_test_and_clear(e)) {
+        nvme_irq_do_assert(cq->ctrl, cq);
+    }
+}
+
+static void nvme_deassert_notifier_read(EventNotifier *e)
+{
+    NvmeCQueue *cq = container_of(e, NvmeCQueue, deassert_notifier);
+    if (event_notifier_test_and_clear(e)) {
+        nvme_irq_do_deassert(cq->ctrl, cq);
+    }
+}
+
+static void nvme_init_irq_notifier(NvmeCtrl *n, NvmeCQueue *cq)
+{
+    int ret;
+
+    ret = event_notifier_init(&cq->assert_notifier, 0);
+    if (ret < 0) {
+        return;
+    }
+
+    event_notifier_set_handler(&cq->assert_notifier,
+                                nvme_assert_notifier_read);
+
+    if (!msix_enabled(&n->parent_obj)) {
+        ret = event_notifier_init(&cq->deassert_notifier, 0);
+        if (ret < 0) {
+            event_notifier_set_handler(&cq->assert_notifier, NULL);
+            event_notifier_cleanup(&cq->assert_notifier);
+
+            return;
+        }
+
+        event_notifier_set_handler(&cq->deassert_notifier,
+                                   nvme_deassert_notifier_read);
+    }
+
+    return;
+}
+
 static void nvme_post_cqes(void *opaque)
 {
     NvmeCQueue *cq = opaque;
@@ -1377,8 +1444,10 @@ static void nvme_post_cqes(void *opaque)
         QTAILQ_INSERT_TAIL(&sq->req_list, req, entry);
     }
     if (cq->tail != cq->head) {
-        if (cq->irq_enabled && !pending) {
-            n->cq_pending++;
+        if (cq->irq_enabled) {
+            if (!pending) {
+                n->cq_pending++;
+            }
         }
 
         nvme_irq_assert(n, cq);
@@ -4705,6 +4774,14 @@ static void nvme_free_cq(NvmeCQueue *cq, NvmeCtrl *n)
         event_notifier_set_handler(&cq->notifier, NULL);
         event_notifier_cleanup(&cq->notifier);
     }
+    if (cq->assert_notifier.initialized) {
+        event_notifier_set_handler(&cq->assert_notifier, NULL);
+        event_notifier_cleanup(&cq->assert_notifier);
+    }
+    if (cq->deassert_notifier.initialized) {
+        event_notifier_set_handler(&cq->deassert_notifier, NULL);
+        event_notifier_cleanup(&cq->deassert_notifier);
+    }
     if (msix_enabled(&n->parent_obj)) {
         msix_vector_unuse(&n->parent_obj, cq->vector);
     }
@@ -4734,7 +4811,7 @@ static uint16_t nvme_del_cq(NvmeCtrl *n, NvmeRequest *req)
         n->cq_pending--;
     }
 
-    nvme_irq_deassert(n, cq);
+    nvme_irq_do_deassert(n, cq);
     trace_pci_nvme_del_cq(qid);
     nvme_free_cq(cq, n);
     return NVME_SUCCESS;
@@ -4772,6 +4849,14 @@ static void nvme_init_cq(NvmeCQueue *cq, NvmeCtrl *n, uint64_t dma_addr,
     }
     n->cq[cqid] = cq;
     cq->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, nvme_post_cqes, cq);
+
+    /*
+     * Only enable irq eventfd for IO queues since we always emulate admin
+     * queue in main loop thread.
+     */
+    if (cqid && n->params.irq_eventfd) {
+        nvme_init_irq_notifier(n, cq);
+    }
 }
 
 static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeRequest *req)
@@ -7671,6 +7756,7 @@ static Property nvme_props[] = {
     DEFINE_PROP_BOOL("use-intel-id", NvmeCtrl, params.use_intel_id, false),
     DEFINE_PROP_BOOL("legacy-cmb", NvmeCtrl, params.legacy_cmb, false),
     DEFINE_PROP_BOOL("ioeventfd", NvmeCtrl, params.ioeventfd, false),
+    DEFINE_PROP_BOOL("irq-eventfd", NvmeCtrl, params.irq_eventfd, false),
     DEFINE_PROP_UINT8("zoned.zasl", NvmeCtrl, params.zasl, 0),
     DEFINE_PROP_BOOL("zoned.auto_transition", NvmeCtrl,
                      params.auto_transition_zones, true),
diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
index 79f5c281c223..4850d3e9653a 100644
--- a/hw/nvme/nvme.h
+++ b/hw/nvme/nvme.h
@@ -398,6 +398,8 @@ typedef struct NvmeCQueue {
     uint64_t    ei_addr;
     QEMUTimer   *timer;
     EventNotifier notifier;
+    EventNotifier assert_notifier;
+    EventNotifier deassert_notifier;
     bool        ioeventfd_enabled;
     QTAILQ_HEAD(, NvmeSQueue) sq_list;
     QTAILQ_HEAD(, NvmeRequest) req_list;
@@ -422,6 +424,7 @@ typedef struct NvmeParams {
     bool     auto_transition_zones;
     bool     legacy_cmb;
     bool     ioeventfd;
+    bool     irq_eventfd;
     uint8_t  sriov_max_vfs;
     uint16_t sriov_vq_flexible;
     uint16_t sriov_vi_flexible;
-- 
2.37.2



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v3 2/2] hw/nvme: use KVM irqfd when available
  2022-08-25 20:14 [PATCH v3 0/2] hw/nvme: add irqfd support Klaus Jensen
  2022-08-25 20:14 ` [PATCH v3 1/2] hw/nvme: support irq(de)assertion with eventfd Klaus Jensen
@ 2022-08-25 20:14 ` Klaus Jensen
  2022-08-26  2:03 ` [PATCH v3 0/2] hw/nvme: add irqfd support Jinhao Fan
  2 siblings, 0 replies; 5+ messages in thread
From: Klaus Jensen @ 2022-08-25 20:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: Keith Busch, Klaus Jensen, qemu-block, Jinhao Fan, stefanha,
	Klaus Jensen

From: Jinhao Fan <fanjinhao21s@ict.ac.cn>

Use KVM's irqfd to send interrupts when possible. This approach is
thread safe. Moreover, it does not have the inter-thread communication
overhead of plain event notifiers since handler callback are called
in the same system call as irqfd write.

Signed-off-by: Jinhao Fan <fanjinhao21s@ict.ac.cn>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
---
 hw/nvme/ctrl.c       | 145 ++++++++++++++++++++++++++++++++++++++++++-
 hw/nvme/nvme.h       |   3 +
 hw/nvme/trace-events |   3 +
 3 files changed, 149 insertions(+), 2 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 51792f395597..396f3f0cddbd 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -192,6 +192,7 @@
 #include "qapi/error.h"
 #include "qapi/visitor.h"
 #include "sysemu/sysemu.h"
+#include "sysemu/kvm.h"
 #include "sysemu/block-backend.h"
 #include "sysemu/hostmem.h"
 #include "hw/pci/msix.h"
@@ -1377,8 +1378,115 @@ static void nvme_deassert_notifier_read(EventNotifier *e)
     }
 }
 
+static int nvme_kvm_vector_use(NvmeCtrl *n, NvmeCQueue *cq, uint32_t vector)
+{
+    KVMRouteChange c = kvm_irqchip_begin_route_changes(kvm_state);
+    int ret;
+
+    ret = kvm_irqchip_add_msi_route(&c, vector, &n->parent_obj);
+    if (ret < 0) {
+        return ret;
+    }
+
+    kvm_irqchip_commit_route_changes(&c);
+
+    cq->virq = ret;
+
+    return 0;
+}
+
+static int nvme_kvm_vector_unmask(PCIDevice *pci_dev, unsigned vector,
+                                  MSIMessage msg)
+{
+    NvmeCtrl *n = NVME(pci_dev);
+    int ret;
+
+    trace_pci_nvme_irq_unmask(vector, msg.address, msg.data);
+
+    for (uint32_t i = 1; i <= n->params.max_ioqpairs; i++) {
+        NvmeCQueue *cq = n->cq[i];
+
+        if (!cq) {
+            continue;
+        }
+
+        if (cq->vector == vector) {
+            if (cq->msg.data != msg.data || cq->msg.address != msg.address) {
+                ret = kvm_irqchip_update_msi_route(kvm_state, cq->virq, msg,
+                                                   pci_dev);
+                if (ret < 0) {
+                    return ret;
+                }
+
+                kvm_irqchip_commit_routes(kvm_state);
+
+                cq->msg = msg;
+            }
+
+            ret = kvm_irqchip_add_irqfd_notifier_gsi(kvm_state,
+                                                     &cq->assert_notifier,
+                                                     NULL, cq->virq);
+            if (ret < 0) {
+                return ret;
+            }
+        }
+    }
+
+    return 0;
+}
+
+static void nvme_kvm_vector_mask(PCIDevice *pci_dev, unsigned vector)
+{
+    NvmeCtrl *n = NVME(pci_dev);
+
+    trace_pci_nvme_irq_mask(vector);
+
+    for (uint32_t i = 1; i <= n->params.max_ioqpairs; i++) {
+        NvmeCQueue *cq = n->cq[i];
+
+        if (!cq) {
+            continue;
+        }
+
+        if (cq->vector == vector) {
+            kvm_irqchip_remove_irqfd_notifier_gsi(kvm_state,
+                                                  &cq->assert_notifier,
+                                                  cq->virq);
+        }
+    }
+}
+
+static void nvme_kvm_vector_poll(PCIDevice *pci_dev, unsigned int vector_start,
+                                 unsigned int vector_end)
+{
+    NvmeCtrl *n = NVME(pci_dev);
+
+    trace_pci_nvme_irq_poll(vector_start, vector_end);
+
+    for (uint32_t i = 1; i <= n->params.max_ioqpairs; i++) {
+        NvmeCQueue *cq = n->cq[i];
+
+        if (!cq) {
+            continue;
+        }
+
+        if (!msix_is_masked(pci_dev, cq->vector)) {
+            continue;
+        }
+
+        if (cq->vector >= vector_start && cq->vector <= vector_end) {
+            if (event_notifier_test_and_clear(&cq->assert_notifier)) {
+                msix_set_pending(pci_dev, i);
+            }
+        }
+    }
+}
+
+
 static void nvme_init_irq_notifier(NvmeCtrl *n, NvmeCQueue *cq)
 {
+    bool with_irqfd = msix_enabled(&n->parent_obj) &&
+                      kvm_msi_via_irqfd_enabled();
     int ret;
 
     ret = event_notifier_init(&cq->assert_notifier, 0);
@@ -1386,12 +1494,27 @@ static void nvme_init_irq_notifier(NvmeCtrl *n, NvmeCQueue *cq)
         return;
     }
 
-    event_notifier_set_handler(&cq->assert_notifier,
-                                nvme_assert_notifier_read);
+    if (with_irqfd) {
+        ret = nvme_kvm_vector_use(n, cq, cq->vector);
+        if (ret < 0) {
+            event_notifier_cleanup(&cq->assert_notifier);
+
+            return;
+        }
+    } else {
+        event_notifier_set_handler(&cq->assert_notifier,
+                                   nvme_assert_notifier_read);
+    }
 
     if (!msix_enabled(&n->parent_obj)) {
         ret = event_notifier_init(&cq->deassert_notifier, 0);
         if (ret < 0) {
+            if (with_irqfd) {
+                kvm_irqchip_remove_irqfd_notifier_gsi(kvm_state,
+                                                      &cq->assert_notifier,
+                                                      cq->virq);
+            }
+
             event_notifier_set_handler(&cq->assert_notifier, NULL);
             event_notifier_cleanup(&cq->assert_notifier);
 
@@ -4764,6 +4887,8 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
 
 static void nvme_free_cq(NvmeCQueue *cq, NvmeCtrl *n)
 {
+    bool with_irqfd = msix_enabled(&n->parent_obj) &&
+                      kvm_msi_via_irqfd_enabled();
     uint16_t offset = (cq->cqid << 3) + (1 << 2);
 
     n->cq[cq->cqid] = NULL;
@@ -4775,6 +4900,12 @@ static void nvme_free_cq(NvmeCQueue *cq, NvmeCtrl *n)
         event_notifier_cleanup(&cq->notifier);
     }
     if (cq->assert_notifier.initialized) {
+        if (with_irqfd) {
+            kvm_irqchip_remove_irqfd_notifier_gsi(kvm_state,
+                                                  &cq->assert_notifier,
+                                                  cq->virq);
+            kvm_irqchip_release_virq(kvm_state, cq->virq);
+        }
         event_notifier_set_handler(&cq->assert_notifier, NULL);
         event_notifier_cleanup(&cq->assert_notifier);
     }
@@ -6528,6 +6659,9 @@ static int nvme_start_ctrl(NvmeCtrl *n)
     uint32_t page_size = 1 << page_bits;
     NvmeSecCtrlEntry *sctrl = nvme_sctrl(n);
 
+    bool with_irqfd = msix_enabled(&n->parent_obj) &&
+                      kvm_msi_via_irqfd_enabled();
+
     if (pci_is_vf(&n->parent_obj) && !sctrl->scs) {
         trace_pci_nvme_err_startfail_virt_state(le16_to_cpu(sctrl->nvi),
                                                 le16_to_cpu(sctrl->nvq),
@@ -6617,6 +6751,12 @@ static int nvme_start_ctrl(NvmeCtrl *n)
 
     nvme_select_iocs(n);
 
+    if (with_irqfd) {
+        return msix_set_vector_notifiers(PCI_DEVICE(n), nvme_kvm_vector_unmask,
+                                         nvme_kvm_vector_mask,
+                                         nvme_kvm_vector_poll);
+    }
+
     return 0;
 }
 
@@ -7734,6 +7874,7 @@ static void nvme_exit(PCIDevice *pci_dev)
         pcie_sriov_pf_exit(pci_dev);
     }
 
+    msix_unset_vector_notifiers(pci_dev);
     msix_uninit(pci_dev, &n->bar0, &n->bar0);
     memory_region_del_subregion(&n->bar0, &n->iomem);
 }
diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
index 4850d3e9653a..b0b986b02426 100644
--- a/hw/nvme/nvme.h
+++ b/hw/nvme/nvme.h
@@ -20,6 +20,7 @@
 
 #include "qemu/uuid.h"
 #include "hw/pci/pci.h"
+#include "hw/pci/msi.h"
 #include "hw/block/block.h"
 
 #include "block/nvme.h"
@@ -396,10 +397,12 @@ typedef struct NvmeCQueue {
     uint64_t    dma_addr;
     uint64_t    db_addr;
     uint64_t    ei_addr;
+    int         virq;
     QEMUTimer   *timer;
     EventNotifier notifier;
     EventNotifier assert_notifier;
     EventNotifier deassert_notifier;
+    MSIMessage  msg;
     bool        ioeventfd_enabled;
     QTAILQ_HEAD(, NvmeSQueue) sq_list;
     QTAILQ_HEAD(, NvmeRequest) req_list;
diff --git a/hw/nvme/trace-events b/hw/nvme/trace-events
index fccb79f48973..b11fcf4a651d 100644
--- a/hw/nvme/trace-events
+++ b/hw/nvme/trace-events
@@ -2,6 +2,9 @@
 pci_nvme_irq_msix(uint32_t vector) "raising MSI-X IRQ vector %u"
 pci_nvme_irq_pin(void) "pulsing IRQ pin"
 pci_nvme_irq_masked(void) "IRQ is masked"
+pci_nvme_irq_mask(uint32_t vector) "IRQ %u gets masked"
+pci_nvme_irq_unmask(uint32_t vector, uint64_t addr, uint32_t data) "IRQ %u gets unmasked, addr=0x%"PRIx64" data=0x%"PRIu32""
+pci_nvme_irq_poll(uint32_t vector_start, uint32_t vector_end) "IRQ poll, start=0x%"PRIu32" end=0x%"PRIu32""
 pci_nvme_dma_read(uint64_t prp1, uint64_t prp2) "DMA read, prp1=0x%"PRIx64" prp2=0x%"PRIx64""
 pci_nvme_dbbuf_config(uint64_t dbs_addr, uint64_t eis_addr) "dbs_addr=0x%"PRIx64" eis_addr=0x%"PRIx64""
 pci_nvme_map_addr(uint64_t addr, uint64_t len) "addr 0x%"PRIx64" len %"PRIu64""
-- 
2.37.2



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v3 0/2] hw/nvme: add irqfd support
  2022-08-25 20:14 [PATCH v3 0/2] hw/nvme: add irqfd support Klaus Jensen
  2022-08-25 20:14 ` [PATCH v3 1/2] hw/nvme: support irq(de)assertion with eventfd Klaus Jensen
  2022-08-25 20:14 ` [PATCH v3 2/2] hw/nvme: use KVM irqfd when available Klaus Jensen
@ 2022-08-26  2:03 ` Jinhao Fan
  2022-08-26  6:50   ` Klaus Jensen
  2 siblings, 1 reply; 5+ messages in thread
From: Jinhao Fan @ 2022-08-26  2:03 UTC (permalink / raw)
  To: Klaus Jensen; +Cc: qemu-devel, Keith Busch, qemu-block, stefanha, Klaus Jensen

at 4:14 AM, Klaus Jensen <its@irrelevant.dk> wrote:

> From: Klaus Jensen <k.jensen@samsung.com>
> 
> This is a re-spin of Jinhao's irqfd support series that fixes msix
> vector masking/unmasking to work correctly.
> 
> I kept being bugged out about that msi route not getting updated, so I
> hit the code into submission with a hammer.
> 
> I finally noticed the core issue:
> 
>  1. The vector notifiers was never set because msix is not enabled at
>     the point where nvme_init_pci() is called.
> 
>     Move this call to nvme_start_ctrl().
> 
> Since the unmask callback was suddenly getting called now, another fix
> was needed:
> 
>  2. Call kvm_irqchip_add_irqfd_notifier_gsi() in the unmask handler -
>     not in nvme_init_irq_notifier(). The vectors may potentially be
>     masked/unmasked and shall cause a pair of add_irqfd and
>     remove_irqfd calls. Removing it from nvme_init_irq_notifier() makes
>     sure we do not try to double add.
> 
> Now it does what it is supposed to; no hacks required :)
> 
> Jinhao Fan (2):
>  hw/nvme: support irq(de)assertion with eventfd
>  hw/nvme: use KVM irqfd when available
> 
> hw/nvme/ctrl.c       | 261 ++++++++++++++++++++++++++++++++++++++++---
> hw/nvme/nvme.h       |   6 +
> hw/nvme/trace-events |   3 +
> 3 files changed, 253 insertions(+), 17 deletions(-)
> 
> -- 
> 2.37.2

This patch series works well on my machine without a vIOMMU. 

Just to confirm my understanding, the workflow of updating msi route is:
Host driver update MSI msg and addr -> unmask callback get called ->
kvm_irqchip_update_msi_route() does the work. Right?




^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v3 0/2] hw/nvme: add irqfd support
  2022-08-26  2:03 ` [PATCH v3 0/2] hw/nvme: add irqfd support Jinhao Fan
@ 2022-08-26  6:50   ` Klaus Jensen
  0 siblings, 0 replies; 5+ messages in thread
From: Klaus Jensen @ 2022-08-26  6:50 UTC (permalink / raw)
  To: Jinhao Fan; +Cc: qemu-devel, Keith Busch, qemu-block, stefanha, Klaus Jensen

[-- Attachment #1: Type: text/plain, Size: 1785 bytes --]

On Aug 26 10:03, Jinhao Fan wrote:
> at 4:14 AM, Klaus Jensen <its@irrelevant.dk> wrote:
> 
> > From: Klaus Jensen <k.jensen@samsung.com>
> > 
> > This is a re-spin of Jinhao's irqfd support series that fixes msix
> > vector masking/unmasking to work correctly.
> > 
> > I kept being bugged out about that msi route not getting updated, so I
> > hit the code into submission with a hammer.
> > 
> > I finally noticed the core issue:
> > 
> >  1. The vector notifiers was never set because msix is not enabled at
> >     the point where nvme_init_pci() is called.
> > 
> >     Move this call to nvme_start_ctrl().
> > 
> > Since the unmask callback was suddenly getting called now, another fix
> > was needed:
> > 
> >  2. Call kvm_irqchip_add_irqfd_notifier_gsi() in the unmask handler -
> >     not in nvme_init_irq_notifier(). The vectors may potentially be
> >     masked/unmasked and shall cause a pair of add_irqfd and
> >     remove_irqfd calls. Removing it from nvme_init_irq_notifier() makes
> >     sure we do not try to double add.
> > 
> > Now it does what it is supposed to; no hacks required :)
> > 
> > Jinhao Fan (2):
> >  hw/nvme: support irq(de)assertion with eventfd
> >  hw/nvme: use KVM irqfd when available
> > 
> > hw/nvme/ctrl.c       | 261 ++++++++++++++++++++++++++++++++++++++++---
> > hw/nvme/nvme.h       |   6 +
> > hw/nvme/trace-events |   3 +
> > 3 files changed, 253 insertions(+), 17 deletions(-)
> > 
> > -- 
> > 2.37.2
> 
> This patch series works well on my machine without a vIOMMU. 
> 
> Just to confirm my understanding, the workflow of updating msi route is:
> Host driver update MSI msg and addr -> unmask callback get called ->
> kvm_irqchip_update_msi_route() does the work. Right?
> 

Yes :)

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-08-26  6:54 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-08-25 20:14 [PATCH v3 0/2] hw/nvme: add irqfd support Klaus Jensen
2022-08-25 20:14 ` [PATCH v3 1/2] hw/nvme: support irq(de)assertion with eventfd Klaus Jensen
2022-08-25 20:14 ` [PATCH v3 2/2] hw/nvme: use KVM irqfd when available Klaus Jensen
2022-08-26  2:03 ` [PATCH v3 0/2] hw/nvme: add irqfd support Jinhao Fan
2022-08-26  6:50   ` Klaus Jensen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.