qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/9] hw/e1000e|igb: interrupts and qtests fixes
@ 2025-01-17 17:02 Nicholas Piggin
  2025-01-17 17:02 ` [PATCH 1/9] qtest/e1000e|igb: Clear interrupt-cause and msix pending bits after irq Nicholas Piggin
                   ` (8 more replies)
  0 siblings, 9 replies; 17+ messages in thread
From: Nicholas Piggin @ 2025-01-17 17:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Nicholas Piggin, Dmitry Fleytman, Akihiko Odaki, Jason Wang,
	Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini

This is a re-work of the patch series here

https://lore.kernel.org/qemu-devel/20250115150112.346497-1-npiggin@gmail.com/

Patch 1 is taken from patch 4 there, and adjusted with Odaki san's
suggestion to clear the msix pending bit with the PBACLR register.
That makes all the other patches unnecessary, which is great. This
patch should be good by itself.

However after doing more testing, I found interrupt throttling in
the devices was raising supurious interrupts and went too far down
that rabbit hole. Consider patches 2-9 to be RFC to address some of
these issues and not required for patch 1.

Thanks,
Nick

Nicholas Piggin (9):
  qtest/e1000e|igb: Clear interrupt-cause and msix pending bits after
    irq
  net/e1000e: Permit disabling interrupt throttling
  qtest/e1000e|igb: assert irqs are clear before triggering an irq
  net/igb: Fix interrupt throttling interval calculation
  net/igb: Fix EITR LLI and counter fields
  net/e1000e|igb: Fix interrupt throttling logic
  qtest/e1000e|igb: Test interrupt throttling in multiple_transfers test
  net/e1000e: Fix xITR minimum value
  hw/net/e1000e|igb: Remove xitr_guest_value logic

 hw/net/igb_core.h           |   2 -
 hw/net/igb_regs.h           |  11 +++-
 tests/qtest/libqos/e1000e.h |   1 +
 hw/net/e1000e_core.c        | 107 ++++++++++++++++++++++++++----------
 hw/net/igb.c                |   2 +-
 hw/net/igb_core.c           |  81 +++++++++++++++++++++------
 tests/qtest/e1000e-test.c   |  19 ++++++-
 tests/qtest/igb-test.c      |  18 ++++++
 tests/qtest/libqos/e1000e.c |   9 ++-
 9 files changed, 197 insertions(+), 53 deletions(-)

-- 
2.45.2



^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/9] qtest/e1000e|igb: Clear interrupt-cause and msix pending bits after irq
  2025-01-17 17:02 [PATCH 0/9] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
@ 2025-01-17 17:02 ` Nicholas Piggin
  2025-01-17 17:02 ` [PATCH 2/9] net/e1000e: Permit disabling interrupt throttling Nicholas Piggin
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Nicholas Piggin @ 2025-01-17 17:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Nicholas Piggin, Dmitry Fleytman, Akihiko Odaki, Jason Wang,
	Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini

The e1000e and igb tests do not clear the ICR/EICR cause bits (or
set auto-clear) on seeing queue interrupts, which inhibits the
triggering of a new interrupt. The msix pending bit which is used
to test for the interrupt is also not cleared (the vector is masked).

Fix this by clearing the ICR/EICR cause bits, and the msix pending
bit using the PBACLR device register.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 tests/qtest/e1000e-test.c | 9 ++++++++-
 tests/qtest/igb-test.c    | 8 ++++++++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/tests/qtest/e1000e-test.c b/tests/qtest/e1000e-test.c
index de9738fdb74..746d26cfb67 100644
--- a/tests/qtest/e1000e-test.c
+++ b/tests/qtest/e1000e-test.c
@@ -66,6 +66,10 @@ static void e1000e_send_verify(QE1000E *d, int *test_sockets, QGuestAllocator *a
 
     /* Wait for TX WB interrupt */
     e1000e_wait_isr(d, E1000E_TX0_MSG_ID);
+    /* Read ICR to make it ready for next interrupt, assert TXQ0 cause */
+    g_assert(e1000e_macreg_read(d, E1000_ICR) & E1000_ICR_TXQ0);
+    /* Write PBACLR to clear the MSIX pending bit */
+    e1000e_macreg_write(d, E1000_PBACLR, (1 << E1000E_TX0_MSG_ID));
 
     /* Check DD bit */
     g_assert_cmphex(le32_to_cpu(descr.upper.data) & E1000_TXD_STAT_DD, ==,
@@ -117,7 +121,10 @@ static void e1000e_receive_verify(QE1000E *d, int *test_sockets, QGuestAllocator
 
     /* Wait for TX WB interrupt */
     e1000e_wait_isr(d, E1000E_RX0_MSG_ID);
-
+    /* Read ICR to make it ready for next interrupt, assert RXQ0 cause */
+    g_assert(e1000e_macreg_read(d, E1000_ICR) & E1000_ICR_RXQ0);
+    /* Write PBACLR to clear the MSIX pending bit */
+    e1000e_macreg_write(d, E1000_PBACLR, (1 << E1000E_RX0_MSG_ID));
     /* Check DD bit */
     g_assert_cmphex(le32_to_cpu(descr.wb.upper.status_error) &
         E1000_RXD_STAT_DD, ==, E1000_RXD_STAT_DD);
diff --git a/tests/qtest/igb-test.c b/tests/qtest/igb-test.c
index 3d397ea6973..cf8b4131cf2 100644
--- a/tests/qtest/igb-test.c
+++ b/tests/qtest/igb-test.c
@@ -69,6 +69,10 @@ static void igb_send_verify(QE1000E *d, int *test_sockets, QGuestAllocator *allo
 
     /* Wait for TX WB interrupt */
     e1000e_wait_isr(d, E1000E_TX0_MSG_ID);
+    /* Read EICR which clears it ready for next interrupt, assert TXQ0 cause */
+    g_assert(e1000e_macreg_read(d, E1000_EICR) & (1 << E1000E_TX0_MSG_ID));
+    /* Write PBACLR to clear the MSIX pending bit */
+    e1000e_macreg_write(d, E1000_PBACLR, (1 << E1000E_TX0_MSG_ID));
 
     /* Check DD bit */
     g_assert_cmphex(le32_to_cpu(descr.wb.status) & E1000_TXD_STAT_DD, ==,
@@ -120,6 +124,10 @@ static void igb_receive_verify(QE1000E *d, int *test_sockets, QGuestAllocator *a
 
     /* Wait for TX WB interrupt */
     e1000e_wait_isr(d, E1000E_RX0_MSG_ID);
+    /* Read EICR which clears it ready for next interrupt, assert RXQ0 cause */
+    g_assert(e1000e_macreg_read(d, E1000_EICR) & (1 << E1000E_RX0_MSG_ID));
+    /* Write PBACLR to clear the MSIX pending bit */
+    e1000e_macreg_write(d, E1000_PBACLR, (1 << E1000E_RX0_MSG_ID));
 
     /* Check DD bit */
     g_assert_cmphex(le32_to_cpu(descr.wb.upper.status_error) &
-- 
2.45.2



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/9] net/e1000e: Permit disabling interrupt throttling
  2025-01-17 17:02 [PATCH 0/9] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
  2025-01-17 17:02 ` [PATCH 1/9] qtest/e1000e|igb: Clear interrupt-cause and msix pending bits after irq Nicholas Piggin
@ 2025-01-17 17:02 ` Nicholas Piggin
  2025-01-17 17:02 ` [PATCH 3/9] qtest/e1000e|igb: assert irqs are clear before triggering an irq Nicholas Piggin
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Nicholas Piggin @ 2025-01-17 17:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Nicholas Piggin, Dmitry Fleytman, Akihiko Odaki, Jason Wang,
	Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini

The spec explicitly permits xITR register interval field to have a value
of zero to disable throttling. The e1000e model already allows for this
in the throttling logic.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/net/e1000e_core.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 24138587905..e32955d244b 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -2783,7 +2783,11 @@ e1000e_set_itr(E1000ECore *core, int index, uint32_t val)
     trace_e1000e_irq_itr_set(val);
 
     core->itr_guest_value = interval;
-    core->mac[index] = MAX(interval, E1000E_MIN_XITR);
+    if (interval == 0) {
+        core->mac[index] = 0;
+    } else {
+        core->mac[index] = MAX(interval, E1000E_MIN_XITR);
+    }
 }
 
 static void
@@ -2795,7 +2799,11 @@ e1000e_set_eitr(E1000ECore *core, int index, uint32_t val)
     trace_e1000e_irq_eitr_set(eitr_num, val);
 
     core->eitr_guest_value[eitr_num] = interval;
-    core->mac[index] = MAX(interval, E1000E_MIN_XITR);
+    if (interval == 0) {
+        core->mac[index] = 0;
+    } else {
+        core->mac[index] = MAX(interval, E1000E_MIN_XITR);
+    }
 }
 
 static void
-- 
2.45.2



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/9] qtest/e1000e|igb: assert irqs are clear before triggering an irq
  2025-01-17 17:02 [PATCH 0/9] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
  2025-01-17 17:02 ` [PATCH 1/9] qtest/e1000e|igb: Clear interrupt-cause and msix pending bits after irq Nicholas Piggin
  2025-01-17 17:02 ` [PATCH 2/9] net/e1000e: Permit disabling interrupt throttling Nicholas Piggin
@ 2025-01-17 17:02 ` Nicholas Piggin
  2025-01-18  8:14   ` Akihiko Odaki
  2025-01-19  9:22   ` Yan Vugenfirer
  2025-01-17 17:03 ` [PATCH 4/9] net/igb: Fix interrupt throttling interval calculation Nicholas Piggin
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 17+ messages in thread
From: Nicholas Piggin @ 2025-01-17 17:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Nicholas Piggin, Dmitry Fleytman, Akihiko Odaki, Jason Wang,
	Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini

Assert there is no existing irq raised that would lead to a false
positive interrupt test.

e1000e has to disable interrupt throttling for this test, because
it can cause delayed superfluous interrupts which trip the assertions.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 tests/qtest/libqos/e1000e.h |  1 +
 tests/qtest/e1000e-test.c   | 10 ++++++++++
 tests/qtest/igb-test.c      |  6 ++++++
 tests/qtest/libqos/e1000e.c |  9 ++++++++-
 4 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/tests/qtest/libqos/e1000e.h b/tests/qtest/libqos/e1000e.h
index 30643c80949..7154aec0339 100644
--- a/tests/qtest/libqos/e1000e.h
+++ b/tests/qtest/libqos/e1000e.h
@@ -54,6 +54,7 @@ static inline uint32_t e1000e_macreg_read(QE1000E *d, uint32_t reg)
     return qpci_io_readl(&d_pci->pci_dev, d_pci->mac_regs, reg);
 }
 
+bool e1000e_seen_isr(QE1000E *d, uint16_t msg_id);
 void e1000e_wait_isr(QE1000E *d, uint16_t msg_id);
 void e1000e_tx_ring_push(QE1000E *d, void *descr);
 void e1000e_rx_ring_push(QE1000E *d, void *descr);
diff --git a/tests/qtest/e1000e-test.c b/tests/qtest/e1000e-test.c
index 746d26cfb67..9ab81ecff5b 100644
--- a/tests/qtest/e1000e-test.c
+++ b/tests/qtest/e1000e-test.c
@@ -61,6 +61,9 @@ static void e1000e_send_verify(QE1000E *d, int *test_sockets, QGuestAllocator *a
                                    E1000_TXD_DTYP_D   |
                                    sizeof(buffer));
 
+    /* Ensure the interrupt has not been taken already */
+    g_assert(!e1000e_seen_isr(d, E1000E_TX0_MSG_ID));
+
     /* Put descriptor to the ring */
     e1000e_tx_ring_push(d, &descr);
 
@@ -105,6 +108,9 @@ static void e1000e_receive_verify(QE1000E *d, int *test_sockets, QGuestAllocator
     char buffer[64];
     int ret;
 
+    /* Ensure the interrupt has not been taken already */
+    g_assert(!e1000e_seen_isr(d, E1000E_RX0_MSG_ID));
+
     /* Send a dummy packet to device's socket*/
     ret = iov_send(test_sockets[0], iov, 2, 0, sizeof(len) + sizeof(packet));
     g_assert_cmpint(ret, == , sizeof(packet) + sizeof(len));
@@ -188,6 +194,10 @@ static void test_e1000e_multiple_transfers(void *obj, void *data,
         return;
     }
 
+    /* Clear EITR because buggy QEMU throttle timer causes superfluous irqs */
+    e1000e_macreg_write(d, E1000_EITR + E1000E_RX0_MSG_ID * 4, 0);
+    e1000e_macreg_write(d, E1000_EITR + E1000E_TX0_MSG_ID * 4, 0);
+
     for (i = 0; i < iterations; i++) {
         e1000e_send_verify(d, data, alloc);
         e1000e_receive_verify(d, data, alloc);
diff --git a/tests/qtest/igb-test.c b/tests/qtest/igb-test.c
index cf8b4131cf2..1bbb4bea4c1 100644
--- a/tests/qtest/igb-test.c
+++ b/tests/qtest/igb-test.c
@@ -64,6 +64,9 @@ static void igb_send_verify(QE1000E *d, int *test_sockets, QGuestAllocator *allo
                                           E1000_TXD_DTYP_D   |
                                           sizeof(buffer));
 
+    /* Ensure the interrupt has not been taken already */
+    g_assert(!e1000e_seen_isr(d, E1000E_TX0_MSG_ID));
+
     /* Put descriptor to the ring */
     e1000e_tx_ring_push(d, &descr);
 
@@ -119,6 +122,9 @@ static void igb_receive_verify(QE1000E *d, int *test_sockets, QGuestAllocator *a
     memset(&descr, 0, sizeof(descr));
     descr.read.pkt_addr = cpu_to_le64(data);
 
+    /* Ensure the interrupt has not been taken already */
+    g_assert(!e1000e_seen_isr(d, E1000E_RX0_MSG_ID));
+
     /* Put descriptor to the ring */
     e1000e_rx_ring_push(d, &descr);
 
diff --git a/tests/qtest/libqos/e1000e.c b/tests/qtest/libqos/e1000e.c
index 925654c7fd4..4e4c387b0bf 100644
--- a/tests/qtest/libqos/e1000e.c
+++ b/tests/qtest/libqos/e1000e.c
@@ -77,13 +77,20 @@ static void e1000e_foreach_callback(QPCIDevice *dev, int devfn, void *data)
     g_free(dev);
 }
 
+bool e1000e_seen_isr(QE1000E *d, uint16_t msg_id)
+{
+    QE1000E_PCI *d_pci = container_of(d, QE1000E_PCI, e1000e);
+
+    return qpci_msix_pending(&d_pci->pci_dev, msg_id);
+}
+
 void e1000e_wait_isr(QE1000E *d, uint16_t msg_id)
 {
     QE1000E_PCI *d_pci = container_of(d, QE1000E_PCI, e1000e);
     guint64 end_time = g_get_monotonic_time() + 5 * G_TIME_SPAN_SECOND;
 
     do {
-        if (qpci_msix_pending(&d_pci->pci_dev, msg_id)) {
+        if (e1000e_seen_isr(d, msg_id)) {
             return;
         }
         qtest_clock_step(d_pci->pci_dev.bus->qts, 10000);
-- 
2.45.2



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 4/9] net/igb: Fix interrupt throttling interval calculation
  2025-01-17 17:02 [PATCH 0/9] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
                   ` (2 preceding siblings ...)
  2025-01-17 17:02 ` [PATCH 3/9] qtest/e1000e|igb: assert irqs are clear before triggering an irq Nicholas Piggin
@ 2025-01-17 17:03 ` Nicholas Piggin
  2025-01-18  8:22   ` Akihiko Odaki
  2025-01-17 17:03 ` [PATCH 5/9] net/igb: Fix EITR LLI and counter fields Nicholas Piggin
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Nicholas Piggin @ 2025-01-17 17:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Nicholas Piggin, Dmitry Fleytman, Akihiko Odaki, Jason Wang,
	Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini

IGB throttling granularity is 1us, and interval field is in bits 2..14
of the EITRx registers.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/net/igb_regs.h | 3 +++
 hw/net/igb_core.c | 7 ++++---
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/hw/net/igb_regs.h b/hw/net/igb_regs.h
index 4dc4c31da27..1ed5ee5039a 100644
--- a/hw/net/igb_regs.h
+++ b/hw/net/igb_regs.h
@@ -146,6 +146,9 @@ union e1000_adv_rx_desc {
 #define IGB_82576_VF_DEV_ID        0x10CA
 #define IGB_I350_VF_DEV_ID         0x1520
 
+/* Delay increments in nanoseconds for interrupt throttling registers */
+#define IGB_INTR_THROTTLING_NS_RES (1000)
+
 /* VLAN info */
 #define IGB_TX_FLAGS_VLAN_MASK     0xffff0000
 #define IGB_TX_FLAGS_VLAN_SHIFT    16
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 39e3ce1c8fe..94f9785749a 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -142,8 +142,9 @@ static void igb_msix_notify(IGBCore *core, unsigned int cause)
 static inline void
 igb_intrmgr_rearm_timer(IGBIntrDelayTimer *timer)
 {
-    int64_t delay_ns = (int64_t) timer->core->mac[timer->delay_reg] *
-                                 timer->delay_resolution_ns;
+    int64_t delay_ns =
+            (int64_t)((timer->core->mac[timer->delay_reg] & 0x7FFC) >> 2) *
+                     timer->delay_resolution_ns;
 
     trace_e1000e_irq_rearm_timer(timer->delay_reg << 2, delay_ns);
 
@@ -180,7 +181,7 @@ igb_intrmgr_initialize_all_timers(IGBCore *core, bool create)
     for (i = 0; i < IGB_INTR_NUM; i++) {
         core->eitr[i].core = core;
         core->eitr[i].delay_reg = EITR0 + i;
-        core->eitr[i].delay_resolution_ns = E1000_INTR_DELAY_NS_RES;
+        core->eitr[i].delay_resolution_ns = IGB_INTR_THROTTLING_NS_RES;
     }
 
     if (!create) {
-- 
2.45.2



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 5/9] net/igb: Fix EITR LLI and counter fields
  2025-01-17 17:02 [PATCH 0/9] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
                   ` (3 preceding siblings ...)
  2025-01-17 17:03 ` [PATCH 4/9] net/igb: Fix interrupt throttling interval calculation Nicholas Piggin
@ 2025-01-17 17:03 ` Nicholas Piggin
  2025-01-18  8:37   ` Akihiko Odaki
  2025-01-17 17:03 ` [PATCH 6/9] net/e1000e|igb: Fix interrupt throttling logic Nicholas Piggin
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Nicholas Piggin @ 2025-01-17 17:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Nicholas Piggin, Dmitry Fleytman, Akihiko Odaki, Jason Wang,
	Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini

IGB EITR registers have counter fields which reflect the current ITR
and LLI counter values, as well as a bit to enable LLI moderation,
and a bit to write the register without modifying the counter fields.

Implement the ITR counter and log an unimp message if software tries
to enable LLI moderation. Fix the implementation of the counter ignore
bit.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/net/igb_regs.h |  8 +++++--
 hw/net/igb_core.c | 53 ++++++++++++++++++++++++++++++++++++++++-------
 2 files changed, 51 insertions(+), 10 deletions(-)

diff --git a/hw/net/igb_regs.h b/hw/net/igb_regs.h
index 1ed5ee5039a..b612248264a 100644
--- a/hw/net/igb_regs.h
+++ b/hw/net/igb_regs.h
@@ -321,8 +321,12 @@ union e1000_adv_rx_desc {
 #define E1000_EICR_TX_QUEUE3    0x00000800 /* Tx Queue 3 Interrupt */
 #define E1000_EICR_OTHER        0x80000000 /* Interrupt Cause Active */
 
-/* Extended Interrupt Cause Set */
-/* E1000_EITR_CNT_IGNR is only for 82576 and newer */
+/* Extended Interrupt Throttle */
+/* These are only for 82576 and newer */
+#define E1000_EITR_INTERVAL     0x00007FFC
+#define E1000_EITR_LLI_EN       0x00008000
+#define E1000_EITR_LLI_CNT      0x001F0000
+#define E1000_EITR_ITR_CNT      0x7FE00000
 #define E1000_EITR_CNT_IGNR     0x80000000 /* Don't reset counters on write */
 
 #define E1000_TSYNCTXCTL_VALID    0x00000001 /* tx timestamp valid */
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 94f9785749a..cdebc917d2e 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -140,12 +140,8 @@ static void igb_msix_notify(IGBCore *core, unsigned int cause)
 }
 
 static inline void
-igb_intrmgr_rearm_timer(IGBIntrDelayTimer *timer)
+igb_intrmgr_arm_timer(IGBIntrDelayTimer *timer, int64_t delay_ns)
 {
-    int64_t delay_ns =
-            (int64_t)((timer->core->mac[timer->delay_reg] & 0x7FFC) >> 2) *
-                     timer->delay_resolution_ns;
-
     trace_e1000e_irq_rearm_timer(timer->delay_reg << 2, delay_ns);
 
     timer_mod(timer->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + delay_ns);
@@ -153,6 +149,16 @@ igb_intrmgr_rearm_timer(IGBIntrDelayTimer *timer)
     timer->running = true;
 }
 
+static inline void
+igb_intrmgr_rearm_timer(IGBIntrDelayTimer *timer)
+{
+    uint32_t interval = (timer->core->mac[timer->delay_reg] &
+                         E1000_EITR_INTERVAL) >> 2;
+    int64_t delay_ns = (int64_t)interval * timer->delay_resolution_ns;
+
+    igb_intrmgr_arm_timer(timer, delay_ns);
+}
+
 static void
 igb_intmgr_timer_resume(IGBIntrDelayTimer *timer)
 {
@@ -2881,7 +2887,22 @@ igb_mac_swsm_read(IGBCore *core, int index)
 static uint32_t
 igb_mac_eitr_read(IGBCore *core, int index)
 {
-    return core->eitr_guest_value[index - EITR0];
+    uint32_t eitr_num = index - EITR0;
+    uint32_t val = core->mac[eitr_num];
+    IGBIntrDelayTimer *timer = &core->eitr[eitr_num];
+
+    if (timer->running) { /* ITR is enabled, get ITR counter */
+        int64_t remains = timer_expire_time_ns(timer->timer) -
+                          qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
+        if (remains > 0) {
+            /* CNT is the most significant 10 of 12 bits */
+            uint32_t cnt;
+            cnt = remains / timer->delay_resolution_ns;
+            val |= ((cnt >> 2) << 21) & E1000_EITR_ITR_CNT;
+        }
+    }
+
+    return val;
 }
 
 static uint32_t igb_mac_vfmailbox_read(IGBCore *core, int index)
@@ -3047,8 +3068,24 @@ igb_set_eitr(IGBCore *core, int index, uint32_t val)
 
     trace_igb_irq_eitr_set(eitr_num, val);
 
-    core->eitr_guest_value[eitr_num] = val & ~E1000_EITR_CNT_IGNR;
-    core->mac[index] = val & 0x7FFE;
+    if (val & E1000_EITR_LLI_EN) {
+        qemu_log_mask(LOG_UNIMP, "%s: LLI moderation not supported, ignoring\n",
+                                 __func__);
+    }
+
+    if (!(val & E1000_EITR_CNT_IGNR)) {
+        IGBIntrDelayTimer *timer = &core->eitr[eitr_num];
+        uint32_t itr_cnt = (val & E1000_EITR_ITR_CNT) >> 21;
+        /* CNT is the most significant 10 of 12 bits */
+        uint64_t ns = (itr_cnt << 2) * timer->delay_resolution_ns;
+
+        igb_intrmgr_arm_timer(timer, ns);
+    }
+
+    val &= E1000_EITR_INTERVAL | E1000_EITR_LLI_EN;
+
+    core->mac[index] = val;
+    core->eitr_guest_value[eitr_num] = val;
 }
 
 static void
-- 
2.45.2



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 6/9] net/e1000e|igb: Fix interrupt throttling logic
  2025-01-17 17:02 [PATCH 0/9] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
                   ` (4 preceding siblings ...)
  2025-01-17 17:03 ` [PATCH 5/9] net/igb: Fix EITR LLI and counter fields Nicholas Piggin
@ 2025-01-17 17:03 ` Nicholas Piggin
  2025-01-18  9:50   ` Akihiko Odaki
  2025-01-17 17:03 ` [PATCH 7/9] qtest/e1000e|igb: Test interrupt throttling in multiple_transfers test Nicholas Piggin
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Nicholas Piggin @ 2025-01-17 17:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Nicholas Piggin, Dmitry Fleytman, Akihiko Odaki, Jason Wang,
	Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini

Interrupt throttling is broken in several ways:
- Timer expiry sends an interrupt even if there is no cause.
- Timer expiry that results in an interrupt does not re-arm
  the timer so an interrupt can appear immediately after the
  timer expiry interrupt.
- Interrupt auto-clear should not clear cause if an interrupt
  is delayed by throttling.

Fix these by skipping the auto-clear logic if an interrupt is
delayed, and when the throttle timer expires check the cause
bits corresponding to the msix vector before sending an irq,
and send it using the functions that run the throttling state
machine and perform auto-clear.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/net/e1000e_core.c | 60 +++++++++++++++++++++++++++++++++++++++-----
 hw/net/igb_core.c    | 28 +++++++++++++--------
 2 files changed, 72 insertions(+), 16 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index e32955d244b..c5be20bcbbe 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -168,16 +168,63 @@ e1000e_intrmgr_on_throttling_timer(void *opaque)
     }
 }
 
+/* Find causes from IVAR vectors and only interrupt if causes are set */
+static uint32_t find_msix_causes(E1000ECore *core, int vec)
+{
+    uint32_t causes = 0;
+    uint32_t int_cfg;
+
+    int_cfg = E1000_IVAR_RXQ0(core->mac[IVAR]);
+    if (E1000_IVAR_ENTRY_VALID(int_cfg) &&
+        E1000_IVAR_ENTRY_VEC(int_cfg) == vec) {
+        causes |= E1000_ICR_RXQ0;
+    }
+
+    int_cfg = E1000_IVAR_RXQ1(core->mac[IVAR]);
+    if (E1000_IVAR_ENTRY_VALID(int_cfg) &&
+        E1000_IVAR_ENTRY_VEC(int_cfg) == vec) {
+        causes |= E1000_ICR_RXQ1;
+    }
+
+    int_cfg = E1000_IVAR_TXQ0(core->mac[IVAR]);
+    if (E1000_IVAR_ENTRY_VALID(int_cfg) &&
+        E1000_IVAR_ENTRY_VEC(int_cfg) == vec) {
+        causes |= E1000_ICR_TXQ0;
+    }
+
+    int_cfg = E1000_IVAR_TXQ1(core->mac[IVAR]);
+    if (E1000_IVAR_ENTRY_VALID(int_cfg) &&
+        E1000_IVAR_ENTRY_VEC(int_cfg) == vec) {
+        causes |= E1000_ICR_TXQ1;
+    }
+
+    int_cfg = E1000_IVAR_OTHER(core->mac[IVAR]);
+    if (E1000_IVAR_ENTRY_VALID(int_cfg) &&
+        E1000_IVAR_ENTRY_VEC(int_cfg) == vec) {
+        causes |= E1000_ICR_OTHER;
+    }
+
+    return causes;
+}
+
+static void
+e1000e_msix_notify(E1000ECore *core, uint32_t causes);
+
 static void
 e1000e_intrmgr_on_msix_throttling_timer(void *opaque)
 {
     E1000IntrDelayTimer *timer = opaque;
-    int idx = timer - &timer->core->eitr[0];
+    E1000ECore *core = timer->core;
+    int idx = timer - &core->eitr[0];
+    uint32_t causes;
 
     timer->running = false;
 
-    trace_e1000e_irq_msix_notify_postponed_vec(idx);
-    msix_notify(timer->core->owner, idx);
+    causes = find_msix_causes(core, idx);
+    if (core->mac[IMS] & core->mac[ICR] & causes) {
+        trace_e1000e_irq_msix_notify_postponed_vec(idx);
+        e1000e_msix_notify(core, causes);
+    }
 }
 
 static void
@@ -1982,10 +2029,11 @@ e1000e_msix_notify_one(E1000ECore *core, uint32_t cause, uint32_t int_cfg)
     if (E1000_IVAR_ENTRY_VALID(int_cfg)) {
         uint32_t vec = E1000_IVAR_ENTRY_VEC(int_cfg);
         if (vec < E1000E_MSIX_VEC_NUM) {
-            if (!e1000e_eitr_should_postpone(core, vec)) {
-                trace_e1000e_irq_msix_notify_vec(vec);
-                msix_notify(core->owner, vec);
+            if (e1000e_eitr_should_postpone(core, vec)) {
+                return;
             }
+            trace_e1000e_irq_msix_notify_vec(vec);
+            msix_notify(core->owner, vec);
         } else {
             trace_e1000e_wrn_msix_vec_wrong(cause, int_cfg);
         }
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index cdebc917d2e..dad32be54fd 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -168,16 +168,7 @@ igb_intmgr_timer_resume(IGBIntrDelayTimer *timer)
 }
 
 static void
-igb_intrmgr_on_msix_throttling_timer(void *opaque)
-{
-    IGBIntrDelayTimer *timer = opaque;
-    int idx = timer - &timer->core->eitr[0];
-
-    timer->running = false;
-
-    trace_e1000e_irq_msix_notify_postponed_vec(idx);
-    igb_msix_notify(timer->core, idx);
-}
+igb_intrmgr_on_msix_throttling_timer(void *opaque);
 
 static void
 igb_intrmgr_initialize_all_timers(IGBCore *core, bool create)
@@ -2279,6 +2270,23 @@ static void igb_send_msix(IGBCore *core, uint32_t causes)
     }
 }
 
+static void
+igb_intrmgr_on_msix_throttling_timer(void *opaque)
+{
+    IGBIntrDelayTimer *timer = opaque;
+    IGBCore *core = timer->core;
+    int vector = timer - &core->eitr[0];
+    uint32_t causes;
+
+    timer->running = false;
+
+    causes = core->mac[EICR] & core->mac[EIMS];
+    if ((causes & BIT(vector)) && !igb_eitr_should_postpone(core, vector)) {
+        trace_e1000e_irq_msix_notify_postponed_vec(vector);
+        igb_msix_notify(core, vector);
+    }
+}
+
 static inline void
 igb_fix_icr_asserted(IGBCore *core)
 {
-- 
2.45.2



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 7/9] qtest/e1000e|igb: Test interrupt throttling in multiple_transfers test
  2025-01-17 17:02 [PATCH 0/9] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
                   ` (5 preceding siblings ...)
  2025-01-17 17:03 ` [PATCH 6/9] net/e1000e|igb: Fix interrupt throttling logic Nicholas Piggin
@ 2025-01-17 17:03 ` Nicholas Piggin
  2025-01-17 17:03 ` [PATCH 8/9] net/e1000e: Fix xITR minimum value Nicholas Piggin
  2025-01-17 17:03 ` [PATCH 9/9] hw/net/e1000e|igb: Remove xitr_guest_value logic Nicholas Piggin
  8 siblings, 0 replies; 17+ messages in thread
From: Nicholas Piggin @ 2025-01-17 17:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Nicholas Piggin, Dmitry Fleytman, Akihiko Odaki, Jason Wang,
	Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini

Enable interrupt throtling on one of the two queue interrupts used
in the multiple_transfers test, to improve coverage.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 tests/qtest/e1000e-test.c | 4 ++--
 tests/qtest/igb-test.c    | 4 ++++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/tests/qtest/e1000e-test.c b/tests/qtest/e1000e-test.c
index 9ab81ecff5b..5e391095f32 100644
--- a/tests/qtest/e1000e-test.c
+++ b/tests/qtest/e1000e-test.c
@@ -194,8 +194,8 @@ static void test_e1000e_multiple_transfers(void *obj, void *data,
         return;
     }
 
-    /* Clear EITR because buggy QEMU throttle timer causes superfluous irqs */
-    e1000e_macreg_write(d, E1000_EITR + E1000E_RX0_MSG_ID * 4, 0);
+    /* Use EITR for one irq and disable it for the other, for testing */
+    e1000e_macreg_write(d, E1000_EITR + E1000E_RX0_MSG_ID * 4, 500);
     e1000e_macreg_write(d, E1000_EITR + E1000E_TX0_MSG_ID * 4, 0);
 
     for (i = 0; i < iterations; i++) {
diff --git a/tests/qtest/igb-test.c b/tests/qtest/igb-test.c
index 1bbb4bea4c1..0a9c5162019 100644
--- a/tests/qtest/igb-test.c
+++ b/tests/qtest/igb-test.c
@@ -198,6 +198,10 @@ static void test_igb_multiple_transfers(void *obj, void *data,
         return;
     }
 
+    /* Use EITR for one irq and disable it for the other, for testing */
+    e1000e_macreg_write(d, E1000_EITR(E1000E_RX0_MSG_ID), 0);
+    e1000e_macreg_write(d, E1000_EITR(E1000E_TX0_MSG_ID), 125 << 2);
+
     for (i = 0; i < iterations; i++) {
         igb_send_verify(d, data, alloc);
         igb_receive_verify(d, data, alloc);
-- 
2.45.2



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 8/9] net/e1000e: Fix xITR minimum value
  2025-01-17 17:02 [PATCH 0/9] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
                   ` (6 preceding siblings ...)
  2025-01-17 17:03 ` [PATCH 7/9] qtest/e1000e|igb: Test interrupt throttling in multiple_transfers test Nicholas Piggin
@ 2025-01-17 17:03 ` Nicholas Piggin
  2025-01-18  7:50   ` Akihiko Odaki
  2025-01-17 17:03 ` [PATCH 9/9] hw/net/e1000e|igb: Remove xitr_guest_value logic Nicholas Piggin
  8 siblings, 1 reply; 17+ messages in thread
From: Nicholas Piggin @ 2025-01-17 17:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Nicholas Piggin, Dmitry Fleytman, Akihiko Odaki, Jason Wang,
	Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini

The ITR minimum value may be a mis-reading or ambiguity in the spec.
Section 10.2.4.2 says the maximum observable interrupt rate should never
exceed 7813, but that is in context of example of the interval being
programmed to 500. On the other hand 7.4.4 does say ITR rules permit
no more than that rate.

There is no minimum value specified, and zero is explicitly allowed and
disables throttling logic (which is already supported behaviour in the
throttling code of the models). This seems to fall outside ITR rules, so
should not cause any limit.

Spec 7.4.4 also says that EITR registers should be initialised to zero.

Remove the minimum value from the ITR and EITR registers, and set ITR
default to 500.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/net/e1000e_core.c | 24 ++++++++++--------------
 1 file changed, 10 insertions(+), 14 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index c5be20bcbbe..34bb5f8096b 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -51,8 +51,13 @@
 
 #include "trace.h"
 
-/* No more then 7813 interrupts per second according to spec 10.2.4.2 */
-#define E1000E_MIN_XITR     (500)
+/*
+ * A suggested range for ITR is 651-5580, according to spec 10.2.4.2, but
+ * QEMU has traditionally set 500 here and spec 7.4.4 ITR rules says the
+ * max observable interrupts from the adapter should be 7813/s (corresponding
+ * to 500).
+ */
+#define E1000E_DEFAULT_ITR (500)
 
 #define E1000E_MAX_TX_FRAGS (64)
 
@@ -2831,11 +2836,7 @@ e1000e_set_itr(E1000ECore *core, int index, uint32_t val)
     trace_e1000e_irq_itr_set(val);
 
     core->itr_guest_value = interval;
-    if (interval == 0) {
-        core->mac[index] = 0;
-    } else {
-        core->mac[index] = MAX(interval, E1000E_MIN_XITR);
-    }
+    core->mac[index] = interval;
 }
 
 static void
@@ -2847,11 +2848,7 @@ e1000e_set_eitr(E1000ECore *core, int index, uint32_t val)
     trace_e1000e_irq_eitr_set(eitr_num, val);
 
     core->eitr_guest_value[eitr_num] = interval;
-    if (interval == 0) {
-        core->mac[index] = 0;
-    } else {
-        core->mac[index] = MAX(interval, E1000E_MIN_XITR);
-    }
+    core->mac[index] = interval;
 }
 
 static void
@@ -3500,8 +3497,7 @@ static const uint32_t e1000e_mac_reg_init[] = {
     [FACTPS]        = E1000_FACTPS_LAN0_ON | 0x20000000,
     [SWSM]          = 1,
     [RXCSUM]        = E1000_RXCSUM_IPOFLD | E1000_RXCSUM_TUOFLD,
-    [ITR]           = E1000E_MIN_XITR,
-    [EITR...EITR + E1000E_MSIX_VEC_NUM - 1] = E1000E_MIN_XITR,
+    [ITR]           = E1000E_DEFAULT_ITR,
 };
 
 static void e1000e_reset(E1000ECore *core, bool sw)
-- 
2.45.2



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 9/9] hw/net/e1000e|igb: Remove xitr_guest_value logic
  2025-01-17 17:02 [PATCH 0/9] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
                   ` (7 preceding siblings ...)
  2025-01-17 17:03 ` [PATCH 8/9] net/e1000e: Fix xITR minimum value Nicholas Piggin
@ 2025-01-17 17:03 ` Nicholas Piggin
  8 siblings, 0 replies; 17+ messages in thread
From: Nicholas Piggin @ 2025-01-17 17:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Nicholas Piggin, Dmitry Fleytman, Akihiko Odaki, Jason Wang,
	Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini

The guest value xITR logic is not required now that the write functions
store necessary data to be read back, and internal users mask and shift
fields they need as they go.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/net/igb_core.h    |  2 --
 hw/net/e1000e_core.c | 31 +++++++++++++++----------------
 hw/net/igb.c         |  2 +-
 hw/net/igb_core.c    |  1 -
 4 files changed, 16 insertions(+), 20 deletions(-)

diff --git a/hw/net/igb_core.h b/hw/net/igb_core.h
index d70b54e318f..3578ab79034 100644
--- a/hw/net/igb_core.h
+++ b/hw/net/igb_core.h
@@ -90,8 +90,6 @@ struct IGBCore {
 
     IGBIntrDelayTimer eitr[IGB_INTR_NUM];
 
-    uint32_t eitr_guest_value[IGB_INTR_NUM];
-
     uint8_t permanent_mac[ETH_ALEN];
 
     NICState *owner_nic;
diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 34bb5f8096b..e9043a0080f 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -2606,18 +2606,6 @@ e1000e_mac_swsm_read(E1000ECore *core, int index)
     return val;
 }
 
-static uint32_t
-e1000e_mac_itr_read(E1000ECore *core, int index)
-{
-    return core->itr_guest_value;
-}
-
-static uint32_t
-e1000e_mac_eitr_read(E1000ECore *core, int index)
-{
-    return core->eitr_guest_value[index - EITR];
-}
-
 static uint32_t
 e1000e_mac_icr_read(E1000ECore *core, int index)
 {
@@ -2835,7 +2823,6 @@ e1000e_set_itr(E1000ECore *core, int index, uint32_t val)
 
     trace_e1000e_irq_itr_set(val);
 
-    core->itr_guest_value = interval;
     core->mac[index] = interval;
 }
 
@@ -2847,7 +2834,6 @@ e1000e_set_eitr(E1000ECore *core, int index, uint32_t val)
 
     trace_e1000e_irq_eitr_set(eitr_num, val);
 
-    core->eitr_guest_value[eitr_num] = interval;
     core->mac[index] = interval;
 }
 
@@ -3072,6 +3058,7 @@ static const readops e1000e_macreg_readops[] = {
     e1000e_getreg(GSCN_1),
     e1000e_getreg(FCAL),
     e1000e_getreg(FLSWCNT),
+    e1000e_getreg(ITR),
 
     [TOTH]    = e1000e_mac_read_clr8,
     [GOTCH]   = e1000e_mac_read_clr8,
@@ -3105,7 +3092,6 @@ static const readops e1000e_macreg_readops[] = {
     [MPRC]    = e1000e_mac_read_clr4,
     [BPTC]    = e1000e_mac_read_clr4,
     [TSCTC]   = e1000e_mac_read_clr4,
-    [ITR]     = e1000e_mac_itr_read,
     [CTRL]    = e1000e_get_ctrl,
     [TARC1]   = e1000e_get_tarc,
     [SWSM]    = e1000e_mac_swsm_read,
@@ -3130,7 +3116,7 @@ static const readops e1000e_macreg_readops[] = {
     [RETA ... RETA + 31]   = e1000e_mac_readreg,
     [RSSRK ... RSSRK + 31] = e1000e_mac_readreg,
     [MAVTV0 ... MAVTV3]    = e1000e_mac_readreg,
-    [EITR...EITR + E1000E_MSIX_VEC_NUM - 1] = e1000e_mac_eitr_read
+    [EITR...EITR + E1000E_MSIX_VEC_NUM - 1] = e1000e_mac_readreg,
 };
 enum { E1000E_NREADOPS = ARRAY_SIZE(e1000e_macreg_readops) };
 
@@ -3560,13 +3546,26 @@ void e1000e_core_pre_save(E1000ECore *core)
             core->tx[i].skip_cp = true;
         }
     }
+
+    /* back compat */
+    core->itr_guest_value = core->mac[ITR];
+    for (i = 0; i < E1000E_MSIX_VEC_NUM; i++) {
+        core->eitr_guest_value[i] = core->mac[EITR + i];
+    }
 }
 
 int
 e1000e_core_post_load(E1000ECore *core)
 {
+    int i;
     NetClientState *nc = qemu_get_queue(core->owner_nic);
 
+    /* back compat */
+    core->mac[ITR] = core->itr_guest_value;
+    for (i = 0; i < E1000E_MSIX_VEC_NUM; i++) {
+        core->mac[EITR + i] = core->eitr_guest_value[i];
+    }
+
     /*
      * nc.link_down can't be migrated, so infer link_down according
      * to link status bit in core.mac[STATUS].
diff --git a/hw/net/igb.c b/hw/net/igb.c
index 4d93ce629f9..b7539267b7a 100644
--- a/hw/net/igb.c
+++ b/hw/net/igb.c
@@ -580,7 +580,7 @@ static const VMStateDescription igb_vmstate = {
         VMSTATE_IGB_INTR_DELAY_TIMER_ARRAY(core.eitr, IGBState,
                                            IGB_INTR_NUM),
 
-        VMSTATE_UINT32_ARRAY(core.eitr_guest_value, IGBState, IGB_INTR_NUM),
+        VMSTATE_UNUSED(sizeof(uint32_t) * IGB_INTR_NUM),
 
         VMSTATE_STRUCT_ARRAY(core.tx, IGBState, IGB_NUM_QUEUES, 0,
                              igb_vmstate_tx, struct igb_tx),
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index dad32be54fd..62f319d73e8 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -3093,7 +3093,6 @@ igb_set_eitr(IGBCore *core, int index, uint32_t val)
     val &= E1000_EITR_INTERVAL | E1000_EITR_LLI_EN;
 
     core->mac[index] = val;
-    core->eitr_guest_value[eitr_num] = val;
 }
 
 static void
-- 
2.45.2



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 8/9] net/e1000e: Fix xITR minimum value
  2025-01-17 17:03 ` [PATCH 8/9] net/e1000e: Fix xITR minimum value Nicholas Piggin
@ 2025-01-18  7:50   ` Akihiko Odaki
  0 siblings, 0 replies; 17+ messages in thread
From: Akihiko Odaki @ 2025-01-18  7:50 UTC (permalink / raw)
  To: Nicholas Piggin, qemu-devel
  Cc: Dmitry Fleytman, Jason Wang, Sriram Yagnaraman, Fabiano Rosas,
	Laurent Vivier, Paolo Bonzini

On 2025/01/18 2:03, Nicholas Piggin wrote:
> The ITR minimum value may be a mis-reading or ambiguity in the spec.
> Section 10.2.4.2 says the maximum observable interrupt rate should never
> exceed 7813, but that is in context of example of the interval being
> programmed to 500. On the other hand 7.4.4 does say ITR rules permit
> no more than that rate.
> 
> There is no minimum value specified, and zero is explicitly allowed and
> disables throttling logic (which is already supported behaviour in the
> throttling code of the models). This seems to fall outside ITR rules, so
> should not cause any limit.
> 
> Spec 7.4.4 also says that EITR registers should be initialised to zero.
> 
> Remove the minimum value from the ITR and EITR registers, and set ITR
> default to 500.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Please squash this change into "[PATCH 2/9] net/e1000e: Permit disabling 
interrupt throttling".

> ---
>   hw/net/e1000e_core.c | 24 ++++++++++--------------
>   1 file changed, 10 insertions(+), 14 deletions(-)
> 
> diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
> index c5be20bcbbe..34bb5f8096b 100644
> --- a/hw/net/e1000e_core.c
> +++ b/hw/net/e1000e_core.c
> @@ -51,8 +51,13 @@
>   
>   #include "trace.h"
>   
> -/* No more then 7813 interrupts per second according to spec 10.2.4.2 */
> -#define E1000E_MIN_XITR     (500)
> +/*
> + * A suggested range for ITR is 651-5580, according to spec 10.2.4.2, but
> + * QEMU has traditionally set 500 here and spec 7.4.4 ITR rules says the
> + * max observable interrupts from the adapter should be 7813/s (corresponding
> + * to 500).
> + */
> +#define E1000E_DEFAULT_ITR (500)
>   
>   #define E1000E_MAX_TX_FRAGS (64)
>   
> @@ -2831,11 +2836,7 @@ e1000e_set_itr(E1000ECore *core, int index, uint32_t val)
>       trace_e1000e_irq_itr_set(val);
>   
>       core->itr_guest_value = interval;
> -    if (interval == 0) {
> -        core->mac[index] = 0;
> -    } else {
> -        core->mac[index] = MAX(interval, E1000E_MIN_XITR);
> -    }
> +    core->mac[index] = interval;
>   }
>   
>   static void
> @@ -2847,11 +2848,7 @@ e1000e_set_eitr(E1000ECore *core, int index, uint32_t val)
>       trace_e1000e_irq_eitr_set(eitr_num, val);
>   
>       core->eitr_guest_value[eitr_num] = interval;
> -    if (interval == 0) {
> -        core->mac[index] = 0;
> -    } else {
> -        core->mac[index] = MAX(interval, E1000E_MIN_XITR);
> -    }
> +    core->mac[index] = interval;
>   }
>   
>   static void
> @@ -3500,8 +3497,7 @@ static const uint32_t e1000e_mac_reg_init[] = {
>       [FACTPS]        = E1000_FACTPS_LAN0_ON | 0x20000000,
>       [SWSM]          = 1,
>       [RXCSUM]        = E1000_RXCSUM_IPOFLD | E1000_RXCSUM_TUOFLD,
> -    [ITR]           = E1000E_MIN_XITR,
> -    [EITR...EITR + E1000E_MSIX_VEC_NUM - 1] = E1000E_MIN_XITR,
> +    [ITR]           = E1000E_DEFAULT_ITR,
>   };
>   
>   static void e1000e_reset(E1000ECore *core, bool sw)



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/9] qtest/e1000e|igb: assert irqs are clear before triggering an irq
  2025-01-17 17:02 ` [PATCH 3/9] qtest/e1000e|igb: assert irqs are clear before triggering an irq Nicholas Piggin
@ 2025-01-18  8:14   ` Akihiko Odaki
  2025-01-19  9:22   ` Yan Vugenfirer
  1 sibling, 0 replies; 17+ messages in thread
From: Akihiko Odaki @ 2025-01-18  8:14 UTC (permalink / raw)
  To: Nicholas Piggin, qemu-devel
  Cc: Dmitry Fleytman, Jason Wang, Sriram Yagnaraman, Fabiano Rosas,
	Laurent Vivier, Paolo Bonzini

On 2025/01/18 2:02, Nicholas Piggin wrote:
> Assert there is no existing irq raised that would lead to a false
> positive interrupt test.
> 
> e1000e has to disable interrupt throttling for this test, because
> it can cause delayed superfluous interrupts which trip the assertions.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   tests/qtest/libqos/e1000e.h |  1 +
>   tests/qtest/e1000e-test.c   | 10 ++++++++++
>   tests/qtest/igb-test.c      |  6 ++++++
>   tests/qtest/libqos/e1000e.c |  9 ++++++++-
>   4 files changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/qtest/libqos/e1000e.h b/tests/qtest/libqos/e1000e.h
> index 30643c80949..7154aec0339 100644
> --- a/tests/qtest/libqos/e1000e.h
> +++ b/tests/qtest/libqos/e1000e.h
> @@ -54,6 +54,7 @@ static inline uint32_t e1000e_macreg_read(QE1000E *d, uint32_t reg)
>       return qpci_io_readl(&d_pci->pci_dev, d_pci->mac_regs, reg);
>   }
>   
> +bool e1000e_seen_isr(QE1000E *d, uint16_t msg_id);

I would rather name it e1000e_pending_isr.
We may have seen an interrupt in the past but it may no longer be pending.

>   void e1000e_wait_isr(QE1000E *d, uint16_t msg_id);
>   void e1000e_tx_ring_push(QE1000E *d, void *descr);
>   void e1000e_rx_ring_push(QE1000E *d, void *descr);
> diff --git a/tests/qtest/e1000e-test.c b/tests/qtest/e1000e-test.c
> index 746d26cfb67..9ab81ecff5b 100644
> --- a/tests/qtest/e1000e-test.c
> +++ b/tests/qtest/e1000e-test.c
> @@ -61,6 +61,9 @@ static void e1000e_send_verify(QE1000E *d, int *test_sockets, QGuestAllocator *a
>                                      E1000_TXD_DTYP_D   |
>                                      sizeof(buffer));
>   
> +    /* Ensure the interrupt has not been taken already */
> +    g_assert(!e1000e_seen_isr(d, E1000E_TX0_MSG_ID));
> +
>       /* Put descriptor to the ring */
>       e1000e_tx_ring_push(d, &descr);
>   
> @@ -105,6 +108,9 @@ static void e1000e_receive_verify(QE1000E *d, int *test_sockets, QGuestAllocator
>       char buffer[64];
>       int ret;
>   
> +    /* Ensure the interrupt has not been taken already */
> +    g_assert(!e1000e_seen_isr(d, E1000E_RX0_MSG_ID));
> +
>       /* Send a dummy packet to device's socket*/
>       ret = iov_send(test_sockets[0], iov, 2, 0, sizeof(len) + sizeof(packet));
>       g_assert_cmpint(ret, == , sizeof(packet) + sizeof(len));
> @@ -188,6 +194,10 @@ static void test_e1000e_multiple_transfers(void *obj, void *data,
>           return;
>       }
>   
> +    /* Clear EITR because buggy QEMU throttle timer causes superfluous irqs */
> +    e1000e_macreg_write(d, E1000_EITR + E1000E_RX0_MSG_ID * 4, 0);
> +    e1000e_macreg_write(d, E1000_EITR + E1000E_TX0_MSG_ID * 4, 0);
> +
>       for (i = 0; i < iterations; i++) {
>           e1000e_send_verify(d, data, alloc);
>           e1000e_receive_verify(d, data, alloc);
> diff --git a/tests/qtest/igb-test.c b/tests/qtest/igb-test.c
> index cf8b4131cf2..1bbb4bea4c1 100644
> --- a/tests/qtest/igb-test.c
> +++ b/tests/qtest/igb-test.c
> @@ -64,6 +64,9 @@ static void igb_send_verify(QE1000E *d, int *test_sockets, QGuestAllocator *allo
>                                             E1000_TXD_DTYP_D   |
>                                             sizeof(buffer));
>   
> +    /* Ensure the interrupt has not been taken already */
> +    g_assert(!e1000e_seen_isr(d, E1000E_TX0_MSG_ID));
> +
>       /* Put descriptor to the ring */
>       e1000e_tx_ring_push(d, &descr);
>   
> @@ -119,6 +122,9 @@ static void igb_receive_verify(QE1000E *d, int *test_sockets, QGuestAllocator *a
>       memset(&descr, 0, sizeof(descr));
>       descr.read.pkt_addr = cpu_to_le64(data);
>   
> +    /* Ensure the interrupt has not been taken already */
> +    g_assert(!e1000e_seen_isr(d, E1000E_RX0_MSG_ID));
> +
>       /* Put descriptor to the ring */
>       e1000e_rx_ring_push(d, &descr);
>   
> diff --git a/tests/qtest/libqos/e1000e.c b/tests/qtest/libqos/e1000e.c
> index 925654c7fd4..4e4c387b0bf 100644
> --- a/tests/qtest/libqos/e1000e.c
> +++ b/tests/qtest/libqos/e1000e.c
> @@ -77,13 +77,20 @@ static void e1000e_foreach_callback(QPCIDevice *dev, int devfn, void *data)
>       g_free(dev);
>   }
>   
> +bool e1000e_seen_isr(QE1000E *d, uint16_t msg_id)
> +{
> +    QE1000E_PCI *d_pci = container_of(d, QE1000E_PCI, e1000e);
> +
> +    return qpci_msix_pending(&d_pci->pci_dev, msg_id);
> +}
> +
>   void e1000e_wait_isr(QE1000E *d, uint16_t msg_id)
>   {
>       QE1000E_PCI *d_pci = container_of(d, QE1000E_PCI, e1000e);
>       guint64 end_time = g_get_monotonic_time() + 5 * G_TIME_SPAN_SECOND;
>   
>       do {
> -        if (qpci_msix_pending(&d_pci->pci_dev, msg_id)) {
> +        if (e1000e_seen_isr(d, msg_id)) {
>               return;
>           }
>           qtest_clock_step(d_pci->pci_dev.bus->qts, 10000);



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/9] net/igb: Fix interrupt throttling interval calculation
  2025-01-17 17:03 ` [PATCH 4/9] net/igb: Fix interrupt throttling interval calculation Nicholas Piggin
@ 2025-01-18  8:22   ` Akihiko Odaki
  0 siblings, 0 replies; 17+ messages in thread
From: Akihiko Odaki @ 2025-01-18  8:22 UTC (permalink / raw)
  To: Nicholas Piggin, qemu-devel
  Cc: Dmitry Fleytman, Jason Wang, Sriram Yagnaraman, Fabiano Rosas,
	Laurent Vivier, Paolo Bonzini

On 2025/01/18 2:03, Nicholas Piggin wrote:
> IGB throttling granularity is 1us, and interval field is in bits 2..14
> of the EITRx registers.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Please add Fixes: as described in docs/devel/submitting-a-patch.rst

> ---
>   hw/net/igb_regs.h | 3 +++
>   hw/net/igb_core.c | 7 ++++---
>   2 files changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/net/igb_regs.h b/hw/net/igb_regs.h
> index 4dc4c31da27..1ed5ee5039a 100644
> --- a/hw/net/igb_regs.h
> +++ b/hw/net/igb_regs.h
> @@ -146,6 +146,9 @@ union e1000_adv_rx_desc {
>   #define IGB_82576_VF_DEV_ID        0x10CA
>   #define IGB_I350_VF_DEV_ID         0x1520
>   
> +/* Delay increments in nanoseconds for interrupt throttling registers */
> +#define IGB_INTR_THROTTLING_NS_RES (1000)
> +
>   /* VLAN info */
>   #define IGB_TX_FLAGS_VLAN_MASK     0xffff0000
>   #define IGB_TX_FLAGS_VLAN_SHIFT    16
> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
> index 39e3ce1c8fe..94f9785749a 100644
> --- a/hw/net/igb_core.c
> +++ b/hw/net/igb_core.c
> @@ -142,8 +142,9 @@ static void igb_msix_notify(IGBCore *core, unsigned int cause)
>   static inline void
>   igb_intrmgr_rearm_timer(IGBIntrDelayTimer *timer)
>   {
> -    int64_t delay_ns = (int64_t) timer->core->mac[timer->delay_reg] *
> -                                 timer->delay_resolution_ns;
> +    int64_t delay_ns =
> +            (int64_t)((timer->core->mac[timer->delay_reg] & 0x7FFC) >> 2) *
> +                     timer->delay_resolution_ns;
>   
>       trace_e1000e_irq_rearm_timer(timer->delay_reg << 2, delay_ns);
>   
> @@ -180,7 +181,7 @@ igb_intrmgr_initialize_all_timers(IGBCore *core, bool create)
>       for (i = 0; i < IGB_INTR_NUM; i++) {
>           core->eitr[i].core = core;
>           core->eitr[i].delay_reg = EITR0 + i;
> -        core->eitr[i].delay_resolution_ns = E1000_INTR_DELAY_NS_RES;
> +        core->eitr[i].delay_resolution_ns = IGB_INTR_THROTTLING_NS_RES;
>       }
>   
>       if (!create) {



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 5/9] net/igb: Fix EITR LLI and counter fields
  2025-01-17 17:03 ` [PATCH 5/9] net/igb: Fix EITR LLI and counter fields Nicholas Piggin
@ 2025-01-18  8:37   ` Akihiko Odaki
  0 siblings, 0 replies; 17+ messages in thread
From: Akihiko Odaki @ 2025-01-18  8:37 UTC (permalink / raw)
  To: Nicholas Piggin, qemu-devel
  Cc: Dmitry Fleytman, Jason Wang, Sriram Yagnaraman, Fabiano Rosas,
	Laurent Vivier, Paolo Bonzini

On 2025/01/18 2:03, Nicholas Piggin wrote:
> IGB EITR registers have counter fields which reflect the current ITR
> and LLI counter values, as well as a bit to enable LLI moderation,
> and a bit to write the register without modifying the counter fields.
> 
> Implement the ITR counter and log an unimp message if software tries
> to enable LLI moderation. Fix the implementation of the counter ignore
> bit.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   hw/net/igb_regs.h |  8 +++++--
>   hw/net/igb_core.c | 53 ++++++++++++++++++++++++++++++++++++++++-------
>   2 files changed, 51 insertions(+), 10 deletions(-)
> 
> diff --git a/hw/net/igb_regs.h b/hw/net/igb_regs.h
> index 1ed5ee5039a..b612248264a 100644
> --- a/hw/net/igb_regs.h
> +++ b/hw/net/igb_regs.h
> @@ -321,8 +321,12 @@ union e1000_adv_rx_desc {
>   #define E1000_EICR_TX_QUEUE3    0x00000800 /* Tx Queue 3 Interrupt */
>   #define E1000_EICR_OTHER        0x80000000 /* Interrupt Cause Active */
>   
> -/* Extended Interrupt Cause Set */
> -/* E1000_EITR_CNT_IGNR is only for 82576 and newer */
> +/* Extended Interrupt Throttle */
> +/* These are only for 82576 and newer */
> +#define E1000_EITR_INTERVAL     0x00007FFC
> +#define E1000_EITR_LLI_EN       0x00008000
> +#define E1000_EITR_LLI_CNT      0x001F0000
> +#define E1000_EITR_ITR_CNT      0x7FE00000
>   #define E1000_EITR_CNT_IGNR     0x80000000 /* Don't reset counters on write */
>   
>   #define E1000_TSYNCTXCTL_VALID    0x00000001 /* tx timestamp valid */
> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
> index 94f9785749a..cdebc917d2e 100644
> --- a/hw/net/igb_core.c
> +++ b/hw/net/igb_core.c
> @@ -140,12 +140,8 @@ static void igb_msix_notify(IGBCore *core, unsigned int cause)
>   }
>   
>   static inline void
> -igb_intrmgr_rearm_timer(IGBIntrDelayTimer *timer)
> +igb_intrmgr_arm_timer(IGBIntrDelayTimer *timer, int64_t delay_ns)
>   {
> -    int64_t delay_ns =
> -            (int64_t)((timer->core->mac[timer->delay_reg] & 0x7FFC) >> 2) *
> -                     timer->delay_resolution_ns;
> -
>       trace_e1000e_irq_rearm_timer(timer->delay_reg << 2, delay_ns);
>   
>       timer_mod(timer->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + delay_ns);
> @@ -153,6 +149,16 @@ igb_intrmgr_rearm_timer(IGBIntrDelayTimer *timer)
>       timer->running = true;
>   }
>   
> +static inline void
> +igb_intrmgr_rearm_timer(IGBIntrDelayTimer *timer)
> +{
> +    uint32_t interval = (timer->core->mac[timer->delay_reg] &
> +                         E1000_EITR_INTERVAL) >> 2;
> +    int64_t delay_ns = (int64_t)interval * timer->delay_resolution_ns;
> +
> +    igb_intrmgr_arm_timer(timer, delay_ns);
> +}
> +
>   static void
>   igb_intmgr_timer_resume(IGBIntrDelayTimer *timer)
>   {
> @@ -2881,7 +2887,22 @@ igb_mac_swsm_read(IGBCore *core, int index)
>   static uint32_t
>   igb_mac_eitr_read(IGBCore *core, int index)
>   {
> -    return core->eitr_guest_value[index - EITR0];
> +    uint32_t eitr_num = index - EITR0;
> +    uint32_t val = core->mac[eitr_num];
> +    IGBIntrDelayTimer *timer = &core->eitr[eitr_num];
> +
> +    if (timer->running) { /* ITR is enabled, get ITR counter */

Saying "ITR is enabled" here is a bit confusing. timer->running may be 
false even if ITR is enabled because the timer is already expired.

> +        int64_t remains = timer_expire_time_ns(timer->timer) -
> +                          qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
> +        if (remains > 0) {
> +            /* CNT is the most significant 10 of 12 bits */
> +            uint32_t cnt;
> +            cnt = remains / timer->delay_resolution_ns;
> +            val |= ((cnt >> 2) << 21) & E1000_EITR_ITR_CNT;
> +        }
> +    }
> +
> +    return val;
>   }
>   
>   static uint32_t igb_mac_vfmailbox_read(IGBCore *core, int index)
> @@ -3047,8 +3068,24 @@ igb_set_eitr(IGBCore *core, int index, uint32_t val)
>   
>       trace_igb_irq_eitr_set(eitr_num, val);
>   
> -    core->eitr_guest_value[eitr_num] = val & ~E1000_EITR_CNT_IGNR;
> -    core->mac[index] = val & 0x7FFE;
> +    if (val & E1000_EITR_LLI_EN) {
> +        qemu_log_mask(LOG_UNIMP, "%s: LLI moderation not supported, ignoring\n",
> +                                 __func__);
> +    }
> +
> +    if (!(val & E1000_EITR_CNT_IGNR)) {
> +        IGBIntrDelayTimer *timer = &core->eitr[eitr_num];
> +        uint32_t itr_cnt = (val & E1000_EITR_ITR_CNT) >> 21;
> +        /* CNT is the most significant 10 of 12 bits */
> +        uint64_t ns = (itr_cnt << 2) * timer->delay_resolution_ns;
> +
> +        igb_intrmgr_arm_timer(timer, ns);
> +    }
> +
> +    val &= E1000_EITR_INTERVAL | E1000_EITR_LLI_EN;
> +
> +    core->mac[index] = val;
> +    core->eitr_guest_value[eitr_num] = val;
>   }
>   
>   static void



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 6/9] net/e1000e|igb: Fix interrupt throttling logic
  2025-01-17 17:03 ` [PATCH 6/9] net/e1000e|igb: Fix interrupt throttling logic Nicholas Piggin
@ 2025-01-18  9:50   ` Akihiko Odaki
  0 siblings, 0 replies; 17+ messages in thread
From: Akihiko Odaki @ 2025-01-18  9:50 UTC (permalink / raw)
  To: Nicholas Piggin, qemu-devel
  Cc: Dmitry Fleytman, Jason Wang, Sriram Yagnaraman, Fabiano Rosas,
	Laurent Vivier, Paolo Bonzini

On 2025/01/18 2:03, Nicholas Piggin wrote:
> Interrupt throttling is broken in several ways:
> - Timer expiry sends an interrupt even if there is no cause.
> - Timer expiry that results in an interrupt does not re-arm
>    the timer so an interrupt can appear immediately after the
>    timer expiry interrupt.
> - Interrupt auto-clear should not clear cause if an interrupt
>    is delayed by throttling.
> 
> Fix these by skipping the auto-clear logic if an interrupt is
> delayed, and when the throttle timer expires check the cause
> bits corresponding to the msix vector before sending an irq,
> and send it using the functions that run the throttling state
> machine and perform auto-clear.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   hw/net/e1000e_core.c | 60 +++++++++++++++++++++++++++++++++++++++-----
>   hw/net/igb_core.c    | 28 +++++++++++++--------
>   2 files changed, 72 insertions(+), 16 deletions(-)
> 
> diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
> index e32955d244b..c5be20bcbbe 100644
> --- a/hw/net/e1000e_core.c
> +++ b/hw/net/e1000e_core.c
> @@ -168,16 +168,63 @@ e1000e_intrmgr_on_throttling_timer(void *opaque)
>       }
>   }
>   
> +/* Find causes from IVAR vectors and only interrupt if causes are set */

This comment is misplaced as find_msix_causes() is just to find causes 
and will not cause an interrupt by its own.

The function name is descriptive enough and the comment is a bit 
redundant so I suggest simply removing it.

> +static uint32_t find_msix_causes(E1000ECore *core, int vec)
 > +{> +    uint32_t causes = 0;
> +    uint32_t int_cfg;
> +
> +    int_cfg = E1000_IVAR_RXQ0(core->mac[IVAR]);
> +    if (E1000_IVAR_ENTRY_VALID(int_cfg) &&
> +        E1000_IVAR_ENTRY_VEC(int_cfg) == vec) {
> +        causes |= E1000_ICR_RXQ0;
> +    }
> +
> +    int_cfg = E1000_IVAR_RXQ1(core->mac[IVAR]);
> +    if (E1000_IVAR_ENTRY_VALID(int_cfg) &&
> +        E1000_IVAR_ENTRY_VEC(int_cfg) == vec) {
> +        causes |= E1000_ICR_RXQ1;
> +    }
> +
> +    int_cfg = E1000_IVAR_TXQ0(core->mac[IVAR]);
> +    if (E1000_IVAR_ENTRY_VALID(int_cfg) &&
> +        E1000_IVAR_ENTRY_VEC(int_cfg) == vec) {
> +        causes |= E1000_ICR_TXQ0;
> +    }
> +
> +    int_cfg = E1000_IVAR_TXQ1(core->mac[IVAR]);
> +    if (E1000_IVAR_ENTRY_VALID(int_cfg) &&
> +        E1000_IVAR_ENTRY_VEC(int_cfg) == vec) {
> +        causes |= E1000_ICR_TXQ1;
> +    }
> +
> +    int_cfg = E1000_IVAR_OTHER(core->mac[IVAR]);
> +    if (E1000_IVAR_ENTRY_VALID(int_cfg) &&
> +        E1000_IVAR_ENTRY_VEC(int_cfg) == vec) {
> +        causes |= E1000_ICR_OTHER;
> +    }
> +
> +    return causes;
> +}
> +
> +static void
> +e1000e_msix_notify(E1000ECore *core, uint32_t causes);
> +
>   static void
>   e1000e_intrmgr_on_msix_throttling_timer(void *opaque)
>   {
>       E1000IntrDelayTimer *timer = opaque;
> -    int idx = timer - &timer->core->eitr[0];
> +    E1000ECore *core = timer->core;
> +    int idx = timer - &core->eitr[0];
> +    uint32_t causes;
>   
>       timer->running = false;
>   
> -    trace_e1000e_irq_msix_notify_postponed_vec(idx);
> -    msix_notify(timer->core->owner, idx);
> +    causes = find_msix_causes(core, idx);
> +    if (core->mac[IMS] & core->mac[ICR] & causes) {

To raise only pending interrupts, I think you should do:
causes = core->mac[IMS] & core->mac[ICR] & find_msix_causes(core, idx);
if (causes) {

> +        trace_e1000e_irq_msix_notify_postponed_vec(idx);
> +        e1000e_msix_notify(core, causes);
> +    }
>   }
>   
>   static void
> @@ -1982,10 +2029,11 @@ e1000e_msix_notify_one(E1000ECore *core, uint32_t cause, uint32_t int_cfg)
>       if (E1000_IVAR_ENTRY_VALID(int_cfg)) {
>           uint32_t vec = E1000_IVAR_ENTRY_VEC(int_cfg);
>           if (vec < E1000E_MSIX_VEC_NUM) {
> -            if (!e1000e_eitr_should_postpone(core, vec)) {
> -                trace_e1000e_irq_msix_notify_vec(vec);
> -                msix_notify(core->owner, vec);
> +            if (e1000e_eitr_should_postpone(core, vec)) {
> +                return;
>               }
> +            trace_e1000e_irq_msix_notify_vec(vec);
> +            msix_notify(core->owner, vec);
>           } else {
>               trace_e1000e_wrn_msix_vec_wrong(cause, int_cfg);
>           }
> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
> index cdebc917d2e..dad32be54fd 100644
> --- a/hw/net/igb_core.c
> +++ b/hw/net/igb_core.c
> @@ -168,16 +168,7 @@ igb_intmgr_timer_resume(IGBIntrDelayTimer *timer)
>   }
>   
>   static void
> -igb_intrmgr_on_msix_throttling_timer(void *opaque)
> -{
> -    IGBIntrDelayTimer *timer = opaque;
> -    int idx = timer - &timer->core->eitr[0];
> -
> -    timer->running = false;
> -
> -    trace_e1000e_irq_msix_notify_postponed_vec(idx);
> -    igb_msix_notify(timer->core, idx);
> -}
> +igb_intrmgr_on_msix_throttling_timer(void *opaque);
>   
>   static void
>   igb_intrmgr_initialize_all_timers(IGBCore *core, bool create)
> @@ -2279,6 +2270,23 @@ static void igb_send_msix(IGBCore *core, uint32_t causes)
>       }
>   }
>   
> +static void
> +igb_intrmgr_on_msix_throttling_timer(void *opaque)
> +{
> +    IGBIntrDelayTimer *timer = opaque;
> +    IGBCore *core = timer->core;
> +    int vector = timer - &core->eitr[0];
> +    uint32_t causes;
> +
> +    timer->running = false;
> +
> +    causes = core->mac[EICR] & core->mac[EIMS];
> +    if ((causes & BIT(vector)) && !igb_eitr_should_postpone(core, vector)) {

Why does it check for igb_eitr_should_postpone() while 
e1000e_intrmgr_on_msix_throttling_timer() doesn't?

Please remove it unless it is necessary; it implies 
igb_eitr_should_postpone() can be true in this function and is 
potentially misleading.

> +        trace_e1000e_irq_msix_notify_postponed_vec(vector);
> +        igb_msix_notify(core, vector);
> +    }
> +}
> +
>   static inline void
>   igb_fix_icr_asserted(IGBCore *core)
>   {



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/9] qtest/e1000e|igb: assert irqs are clear before triggering an irq
  2025-01-17 17:02 ` [PATCH 3/9] qtest/e1000e|igb: assert irqs are clear before triggering an irq Nicholas Piggin
  2025-01-18  8:14   ` Akihiko Odaki
@ 2025-01-19  9:22   ` Yan Vugenfirer
  2025-01-21  4:45     ` Nicholas Piggin
  1 sibling, 1 reply; 17+ messages in thread
From: Yan Vugenfirer @ 2025-01-19  9:22 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: qemu-devel, Dmitry Fleytman, Akihiko Odaki, Jason Wang,
	Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini

[-- Attachment #1: Type: text/plain, Size: 4915 bytes --]

On Fri, Jan 17, 2025 at 7:05 PM Nicholas Piggin <npiggin@gmail.com> wrote:

> Assert there is no existing irq raised that would lead to a false
> positive interrupt test.
>
> e1000e has to disable interrupt throttling for this test, because
> it can cause delayed superfluous interrupts which trip the assertions.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  tests/qtest/libqos/e1000e.h |  1 +
>  tests/qtest/e1000e-test.c   | 10 ++++++++++
>  tests/qtest/igb-test.c      |  6 ++++++
>  tests/qtest/libqos/e1000e.c |  9 ++++++++-
>  4 files changed, 25 insertions(+), 1 deletion(-)
>
> diff --git a/tests/qtest/libqos/e1000e.h b/tests/qtest/libqos/e1000e.h
> index 30643c80949..7154aec0339 100644
> --- a/tests/qtest/libqos/e1000e.h
> +++ b/tests/qtest/libqos/e1000e.h
> @@ -54,6 +54,7 @@ static inline uint32_t e1000e_macreg_read(QE1000E *d,
> uint32_t reg)
>      return qpci_io_readl(&d_pci->pci_dev, d_pci->mac_regs, reg);
>  }
>
> +bool e1000e_seen_isr(QE1000E *d, uint16_t msg_id);
>  void e1000e_wait_isr(QE1000E *d, uint16_t msg_id);
>  void e1000e_tx_ring_push(QE1000E *d, void *descr);
>  void e1000e_rx_ring_push(QE1000E *d, void *descr);
> diff --git a/tests/qtest/e1000e-test.c b/tests/qtest/e1000e-test.c
> index 746d26cfb67..9ab81ecff5b 100644
> --- a/tests/qtest/e1000e-test.c
> +++ b/tests/qtest/e1000e-test.c
> @@ -61,6 +61,9 @@ static void e1000e_send_verify(QE1000E *d, int
> *test_sockets, QGuestAllocator *a
>                                     E1000_TXD_DTYP_D   |
>                                     sizeof(buffer));
>
> +    /* Ensure the interrupt has not been taken already */
> +    g_assert(!e1000e_seen_isr(d, E1000E_TX0_MSG_ID));
> +
>      /* Put descriptor to the ring */
>      e1000e_tx_ring_push(d, &descr);
>
> @@ -105,6 +108,9 @@ static void e1000e_receive_verify(QE1000E *d, int
> *test_sockets, QGuestAllocator
>      char buffer[64];
>      int ret;
>
> +    /* Ensure the interrupt has not been taken already */
> +    g_assert(!e1000e_seen_isr(d, E1000E_RX0_MSG_ID));
> +
>
I don't think potentially crashing the guest is the right solution.
I suggest maybe substituting with logging.


>      /* Send a dummy packet to device's socket*/
>      ret = iov_send(test_sockets[0], iov, 2, 0, sizeof(len) +
> sizeof(packet));
>      g_assert_cmpint(ret, == , sizeof(packet) + sizeof(len));
> @@ -188,6 +194,10 @@ static void test_e1000e_multiple_transfers(void *obj,
> void *data,
>          return;
>      }
>
> +    /* Clear EITR because buggy QEMU throttle timer causes superfluous
> irqs */
> +    e1000e_macreg_write(d, E1000_EITR + E1000E_RX0_MSG_ID * 4, 0);
> +    e1000e_macreg_write(d, E1000_EITR + E1000E_TX0_MSG_ID * 4, 0);
> +
>      for (i = 0; i < iterations; i++) {
>          e1000e_send_verify(d, data, alloc);
>          e1000e_receive_verify(d, data, alloc);
> diff --git a/tests/qtest/igb-test.c b/tests/qtest/igb-test.c
> index cf8b4131cf2..1bbb4bea4c1 100644
> --- a/tests/qtest/igb-test.c
> +++ b/tests/qtest/igb-test.c
> @@ -64,6 +64,9 @@ static void igb_send_verify(QE1000E *d, int
> *test_sockets, QGuestAllocator *allo
>                                            E1000_TXD_DTYP_D   |
>                                            sizeof(buffer));
>
> +    /* Ensure the interrupt has not been taken already */
> +    g_assert(!e1000e_seen_isr(d, E1000E_TX0_MSG_ID));
> +
>      /* Put descriptor to the ring */
>      e1000e_tx_ring_push(d, &descr);
>
> @@ -119,6 +122,9 @@ static void igb_receive_verify(QE1000E *d, int
> *test_sockets, QGuestAllocator *a
>      memset(&descr, 0, sizeof(descr));
>      descr.read.pkt_addr = cpu_to_le64(data);
>
> +    /* Ensure the interrupt has not been taken already */
> +    g_assert(!e1000e_seen_isr(d, E1000E_RX0_MSG_ID));
> +
>      /* Put descriptor to the ring */
>      e1000e_rx_ring_push(d, &descr);
>
> diff --git a/tests/qtest/libqos/e1000e.c b/tests/qtest/libqos/e1000e.c
> index 925654c7fd4..4e4c387b0bf 100644
> --- a/tests/qtest/libqos/e1000e.c
> +++ b/tests/qtest/libqos/e1000e.c
> @@ -77,13 +77,20 @@ static void e1000e_foreach_callback(QPCIDevice *dev,
> int devfn, void *data)
>      g_free(dev);
>  }
>
> +bool e1000e_seen_isr(QE1000E *d, uint16_t msg_id)
> +{
> +    QE1000E_PCI *d_pci = container_of(d, QE1000E_PCI, e1000e);
> +
> +    return qpci_msix_pending(&d_pci->pci_dev, msg_id);
> +}
> +
>  void e1000e_wait_isr(QE1000E *d, uint16_t msg_id)
>  {
>      QE1000E_PCI *d_pci = container_of(d, QE1000E_PCI, e1000e);
>      guint64 end_time = g_get_monotonic_time() + 5 * G_TIME_SPAN_SECOND;
>
>      do {
> -        if (qpci_msix_pending(&d_pci->pci_dev, msg_id)) {
> +        if (e1000e_seen_isr(d, msg_id)) {
>              return;
>          }
>          qtest_clock_step(d_pci->pci_dev.bus->qts, 10000);
> --
> 2.45.2
>
>
>

[-- Attachment #2: Type: text/html, Size: 6049 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/9] qtest/e1000e|igb: assert irqs are clear before triggering an irq
  2025-01-19  9:22   ` Yan Vugenfirer
@ 2025-01-21  4:45     ` Nicholas Piggin
  0 siblings, 0 replies; 17+ messages in thread
From: Nicholas Piggin @ 2025-01-21  4:45 UTC (permalink / raw)
  To: Yan Vugenfirer
  Cc: qemu-devel, Dmitry Fleytman, Akihiko Odaki, Jason Wang,
	Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini

On Sun Jan 19, 2025 at 7:22 PM AEST, Yan Vugenfirer wrote:
> On Fri, Jan 17, 2025 at 7:05 PM Nicholas Piggin <npiggin@gmail.com> wrote:
>
>> Assert there is no existing irq raised that would lead to a false
>> positive interrupt test.
>>
>> e1000e has to disable interrupt throttling for this test, because
>> it can cause delayed superfluous interrupts which trip the assertions.
>>
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>>  tests/qtest/libqos/e1000e.h |  1 +
>>  tests/qtest/e1000e-test.c   | 10 ++++++++++
>>  tests/qtest/igb-test.c      |  6 ++++++
>>  tests/qtest/libqos/e1000e.c |  9 ++++++++-
>>  4 files changed, 25 insertions(+), 1 deletion(-)
>>
>> diff --git a/tests/qtest/libqos/e1000e.h b/tests/qtest/libqos/e1000e.h
>> index 30643c80949..7154aec0339 100644
>> --- a/tests/qtest/libqos/e1000e.h
>> +++ b/tests/qtest/libqos/e1000e.h
>> @@ -54,6 +54,7 @@ static inline uint32_t e1000e_macreg_read(QE1000E *d,
>> uint32_t reg)
>>      return qpci_io_readl(&d_pci->pci_dev, d_pci->mac_regs, reg);
>>  }
>>
>> +bool e1000e_seen_isr(QE1000E *d, uint16_t msg_id);
>>  void e1000e_wait_isr(QE1000E *d, uint16_t msg_id);
>>  void e1000e_tx_ring_push(QE1000E *d, void *descr);
>>  void e1000e_rx_ring_push(QE1000E *d, void *descr);
>> diff --git a/tests/qtest/e1000e-test.c b/tests/qtest/e1000e-test.c
>> index 746d26cfb67..9ab81ecff5b 100644
>> --- a/tests/qtest/e1000e-test.c
>> +++ b/tests/qtest/e1000e-test.c
>> @@ -61,6 +61,9 @@ static void e1000e_send_verify(QE1000E *d, int
>> *test_sockets, QGuestAllocator *a
>>                                     E1000_TXD_DTYP_D   |
>>                                     sizeof(buffer));
>>
>> +    /* Ensure the interrupt has not been taken already */
>> +    g_assert(!e1000e_seen_isr(d, E1000E_TX0_MSG_ID));
>> +
>>      /* Put descriptor to the ring */
>>      e1000e_tx_ring_push(d, &descr);
>>
>> @@ -105,6 +108,9 @@ static void e1000e_receive_verify(QE1000E *d, int
>> *test_sockets, QGuestAllocator
>>      char buffer[64];
>>      int ret;
>>
>> +    /* Ensure the interrupt has not been taken already */
>> +    g_assert(!e1000e_seen_isr(d, E1000E_RX0_MSG_ID));
>> +
>>
> I don't think potentially crashing the guest is the right solution.
> I suggest maybe substituting with logging.

This is just for qtest where assertions are used to indicate
failure.

Thanks,
Nick


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2025-01-21  4:46 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-17 17:02 [PATCH 0/9] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
2025-01-17 17:02 ` [PATCH 1/9] qtest/e1000e|igb: Clear interrupt-cause and msix pending bits after irq Nicholas Piggin
2025-01-17 17:02 ` [PATCH 2/9] net/e1000e: Permit disabling interrupt throttling Nicholas Piggin
2025-01-17 17:02 ` [PATCH 3/9] qtest/e1000e|igb: assert irqs are clear before triggering an irq Nicholas Piggin
2025-01-18  8:14   ` Akihiko Odaki
2025-01-19  9:22   ` Yan Vugenfirer
2025-01-21  4:45     ` Nicholas Piggin
2025-01-17 17:03 ` [PATCH 4/9] net/igb: Fix interrupt throttling interval calculation Nicholas Piggin
2025-01-18  8:22   ` Akihiko Odaki
2025-01-17 17:03 ` [PATCH 5/9] net/igb: Fix EITR LLI and counter fields Nicholas Piggin
2025-01-18  8:37   ` Akihiko Odaki
2025-01-17 17:03 ` [PATCH 6/9] net/e1000e|igb: Fix interrupt throttling logic Nicholas Piggin
2025-01-18  9:50   ` Akihiko Odaki
2025-01-17 17:03 ` [PATCH 7/9] qtest/e1000e|igb: Test interrupt throttling in multiple_transfers test Nicholas Piggin
2025-01-17 17:03 ` [PATCH 8/9] net/e1000e: Fix xITR minimum value Nicholas Piggin
2025-01-18  7:50   ` Akihiko Odaki
2025-01-17 17:03 ` [PATCH 9/9] hw/net/e1000e|igb: Remove xitr_guest_value logic Nicholas Piggin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).