qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 00/12] hw/e1000e|igb: interrupts and qtests fixes
@ 2025-05-02  3:16 Nicholas Piggin
  2025-05-02  3:16 ` [PATCH v3 01/12] qtest/e1000e|igb: Clear interrupt-cause and msix pending bits after irq Nicholas Piggin
                   ` (11 more replies)
  0 siblings, 12 replies; 24+ messages in thread
From: Nicholas Piggin @ 2025-05-02  3:16 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Nicholas Piggin, qemu-devel, Dmitry Fleytman, Jason Wang,
	Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini

Since v2:
https://lore.kernel.org/qemu-devel/20250411043128.201289-1-npiggin@gmail.com/

Mostly fixed comments from Akihiko Odaki

- e1000e change initial ITR as well as EITR
- Preserve EITR writable bits
- Use SCALE_US constant instead of 1000
- Fix register field definitions to not trample on code from Linux header
- Split the big fixes patch into several pieces

It is difficult to split the big fix into pieces because some of the
bugs are inter-dependent. I think I came up with a way to do it.

Thanks,
Nick

Nicholas Piggin (12):
  qtest/e1000e|igb: Clear interrupt-cause and msix pending bits after
    irq
  net/e1000e: Permit disabling interrupt throttling
  hw/net/e1000e|igb: Remove xitr_guest_value logic
  qtest/e1000e|igb: assert irqs are clear before triggering an irq
  net/igb: Fix interrupt throttling interval calculation
  net/igb: Implement EITR Moderation Counter
  igb: Add a note about re-loading timers breaking deterministic replay
  hw/net/e1000e: Postponed msix interrupt processing should auto-clear
    cause
  hw/net/e1000e: Do not auto-clear cause on postponed msix interrupt
  net/e1000e|igb: Only send delayed msix interrupts that have a cause
  net/e1000e|igb: Fix interrupt throttling rearming
  qtest/e1000e|igb: Test interrupt throttling in multiple_transfers test

 hw/net/igb_regs.h           |   9 +++
 tests/qtest/libqos/e1000e.h |   1 +
 hw/net/e1000e_core.c        | 146 +++++++++++++++++++++++++-----------
 hw/net/igb_core.c           | 109 +++++++++++++++++++++------
 tests/qtest/e1000e-test.c   |  21 +++++-
 tests/qtest/igb-test.c      |  18 +++++
 tests/qtest/libqos/e1000e.c |   9 ++-
 7 files changed, 246 insertions(+), 67 deletions(-)

-- 
2.47.1



^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v3 01/12] qtest/e1000e|igb: Clear interrupt-cause and msix pending bits after irq
  2025-05-02  3:16 [PATCH v3 00/12] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
@ 2025-05-02  3:16 ` Nicholas Piggin
  2025-05-19 15:06   ` Fabiano Rosas
  2025-05-02  3:16 ` [PATCH v3 02/12] net/e1000e: Permit disabling interrupt throttling Nicholas Piggin
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Nicholas Piggin @ 2025-05-02  3:16 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Nicholas Piggin, qemu-devel, Dmitry Fleytman, Jason Wang,
	Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum

The e1000e and igb tests do not clear the ICR/EICR cause bits (or
set auto-clear) on seeing queue interrupts, which inhibits the
triggering of a new interrupt. The msix pending bit which is used
to test for the interrupt is also not cleared (the vector is masked).

Fix this by clearing the ICR/EICR cause bits, and the msix pending
bit using the PBACLR device register.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Dmitry Fleytman <dmitry.fleytman@gmail.com>
Cc: Akihiko Odaki <akihiko.odaki@daynix.com>
Cc: Sriram Yagnaraman <sriram.yagnaraman@ericsson.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 tests/qtest/e1000e-test.c | 9 ++++++++-
 tests/qtest/igb-test.c    | 8 ++++++++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/tests/qtest/e1000e-test.c b/tests/qtest/e1000e-test.c
index de9738fdb74..746d26cfb67 100644
--- a/tests/qtest/e1000e-test.c
+++ b/tests/qtest/e1000e-test.c
@@ -66,6 +66,10 @@ static void e1000e_send_verify(QE1000E *d, int *test_sockets, QGuestAllocator *a
 
     /* Wait for TX WB interrupt */
     e1000e_wait_isr(d, E1000E_TX0_MSG_ID);
+    /* Read ICR to make it ready for next interrupt, assert TXQ0 cause */
+    g_assert(e1000e_macreg_read(d, E1000_ICR) & E1000_ICR_TXQ0);
+    /* Write PBACLR to clear the MSIX pending bit */
+    e1000e_macreg_write(d, E1000_PBACLR, (1 << E1000E_TX0_MSG_ID));
 
     /* Check DD bit */
     g_assert_cmphex(le32_to_cpu(descr.upper.data) & E1000_TXD_STAT_DD, ==,
@@ -117,7 +121,10 @@ static void e1000e_receive_verify(QE1000E *d, int *test_sockets, QGuestAllocator
 
     /* Wait for TX WB interrupt */
     e1000e_wait_isr(d, E1000E_RX0_MSG_ID);
-
+    /* Read ICR to make it ready for next interrupt, assert RXQ0 cause */
+    g_assert(e1000e_macreg_read(d, E1000_ICR) & E1000_ICR_RXQ0);
+    /* Write PBACLR to clear the MSIX pending bit */
+    e1000e_macreg_write(d, E1000_PBACLR, (1 << E1000E_RX0_MSG_ID));
     /* Check DD bit */
     g_assert_cmphex(le32_to_cpu(descr.wb.upper.status_error) &
         E1000_RXD_STAT_DD, ==, E1000_RXD_STAT_DD);
diff --git a/tests/qtest/igb-test.c b/tests/qtest/igb-test.c
index 3d397ea6973..cf8b4131cf2 100644
--- a/tests/qtest/igb-test.c
+++ b/tests/qtest/igb-test.c
@@ -69,6 +69,10 @@ static void igb_send_verify(QE1000E *d, int *test_sockets, QGuestAllocator *allo
 
     /* Wait for TX WB interrupt */
     e1000e_wait_isr(d, E1000E_TX0_MSG_ID);
+    /* Read EICR which clears it ready for next interrupt, assert TXQ0 cause */
+    g_assert(e1000e_macreg_read(d, E1000_EICR) & (1 << E1000E_TX0_MSG_ID));
+    /* Write PBACLR to clear the MSIX pending bit */
+    e1000e_macreg_write(d, E1000_PBACLR, (1 << E1000E_TX0_MSG_ID));
 
     /* Check DD bit */
     g_assert_cmphex(le32_to_cpu(descr.wb.status) & E1000_TXD_STAT_DD, ==,
@@ -120,6 +124,10 @@ static void igb_receive_verify(QE1000E *d, int *test_sockets, QGuestAllocator *a
 
     /* Wait for TX WB interrupt */
     e1000e_wait_isr(d, E1000E_RX0_MSG_ID);
+    /* Read EICR which clears it ready for next interrupt, assert RXQ0 cause */
+    g_assert(e1000e_macreg_read(d, E1000_EICR) & (1 << E1000E_RX0_MSG_ID));
+    /* Write PBACLR to clear the MSIX pending bit */
+    e1000e_macreg_write(d, E1000_PBACLR, (1 << E1000E_RX0_MSG_ID));
 
     /* Check DD bit */
     g_assert_cmphex(le32_to_cpu(descr.wb.upper.status_error) &
-- 
2.47.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 02/12] net/e1000e: Permit disabling interrupt throttling
  2025-05-02  3:16 [PATCH v3 00/12] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
  2025-05-02  3:16 ` [PATCH v3 01/12] qtest/e1000e|igb: Clear interrupt-cause and msix pending bits after irq Nicholas Piggin
@ 2025-05-02  3:16 ` Nicholas Piggin
  2025-05-05  5:41   ` Akihiko Odaki
  2025-05-02  3:16 ` [PATCH v3 03/12] hw/net/e1000e|igb: Remove xitr_guest_value logic Nicholas Piggin
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Nicholas Piggin @ 2025-05-02  3:16 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Nicholas Piggin, qemu-devel, Dmitry Fleytman, Jason Wang,
	Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini

The spec explicitly permits xITR register interval field to have a value
of zero to disable throttling. The e1000e model already allows for this
in the throttling logic, so remove the minimum value for the register.

The spec appears to say there is a maximum observable interrupt rate
when throttling is enabled, regardless of ITR value, so throttle timer
calculation is clamped to that minimum value.

EITR registers default to 0, as specified in spec 7.4.4.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/net/e1000e_core.c | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 24138587905..96f74f1ea14 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -51,8 +51,17 @@
 
 #include "trace.h"
 
-/* No more then 7813 interrupts per second according to spec 10.2.4.2 */
-#define E1000E_MIN_XITR     (500)
+/*
+ * A suggested range for ITR is 651-5580, according to spec 10.2.4.2, but
+ * QEMU has traditionally set 500 here.
+ */
+#define E1000E_DEFAULT_ITR (500)
+
+/*
+ * spec 7.4.4 ITR rules says the maximum observable interrupt rate from the
+ * adapter should not exceed 7813/s (corresponding to 500).
+ */
+#define E1000E_EFFECTIVE_MIN_XITR (500)
 
 #define E1000E_MAX_TX_FRAGS (64)
 
@@ -105,8 +114,9 @@ e1000e_lower_legacy_irq(E1000ECore *core)
 static inline void
 e1000e_intrmgr_rearm_timer(E1000IntrDelayTimer *timer)
 {
-    int64_t delay_ns = (int64_t) timer->core->mac[timer->delay_reg] *
-                                 timer->delay_resolution_ns;
+    uint32_t delay = MAX(timer->core->mac[timer->delay_reg],
+                         E1000E_EFFECTIVE_MIN_XITR);
+    int64_t delay_ns = (int64_t)delay * timer->delay_resolution_ns;
 
     trace_e1000e_irq_rearm_timer(timer->delay_reg << 2, delay_ns);
 
@@ -2783,7 +2793,7 @@ e1000e_set_itr(E1000ECore *core, int index, uint32_t val)
     trace_e1000e_irq_itr_set(val);
 
     core->itr_guest_value = interval;
-    core->mac[index] = MAX(interval, E1000E_MIN_XITR);
+    core->mac[index] = interval;
 }
 
 static void
@@ -2795,7 +2805,7 @@ e1000e_set_eitr(E1000ECore *core, int index, uint32_t val)
     trace_e1000e_irq_eitr_set(eitr_num, val);
 
     core->eitr_guest_value[eitr_num] = interval;
-    core->mac[index] = MAX(interval, E1000E_MIN_XITR);
+    core->mac[index] = interval;
 }
 
 static void
@@ -3444,8 +3454,7 @@ static const uint32_t e1000e_mac_reg_init[] = {
     [FACTPS]        = E1000_FACTPS_LAN0_ON | 0x20000000,
     [SWSM]          = 1,
     [RXCSUM]        = E1000_RXCSUM_IPOFLD | E1000_RXCSUM_TUOFLD,
-    [ITR]           = E1000E_MIN_XITR,
-    [EITR...EITR + E1000E_MSIX_VEC_NUM - 1] = E1000E_MIN_XITR,
+    [ITR]           = E1000E_DEFAULT_ITR,
 };
 
 static void e1000e_reset(E1000ECore *core, bool sw)
-- 
2.47.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 03/12] hw/net/e1000e|igb: Remove xitr_guest_value logic
  2025-05-02  3:16 [PATCH v3 00/12] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
  2025-05-02  3:16 ` [PATCH v3 01/12] qtest/e1000e|igb: Clear interrupt-cause and msix pending bits after irq Nicholas Piggin
  2025-05-02  3:16 ` [PATCH v3 02/12] net/e1000e: Permit disabling interrupt throttling Nicholas Piggin
@ 2025-05-02  3:16 ` Nicholas Piggin
  2025-05-05  5:45   ` Akihiko Odaki
  2025-05-02  3:16 ` [PATCH v3 04/12] qtest/e1000e|igb: assert irqs are clear before triggering an irq Nicholas Piggin
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Nicholas Piggin @ 2025-05-02  3:16 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Nicholas Piggin, qemu-devel, Dmitry Fleytman, Jason Wang,
	Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini

The guest value xITR logic is not required now that the write functions
store necessary data to be read back, and internal users mask and shift
fields they need as they go.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/net/e1000e_core.c | 31 +++++++++++++++----------------
 hw/net/igb_core.c    | 16 +++++++++++++---
 2 files changed, 28 insertions(+), 19 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 96f74f1ea14..f8e6522f810 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -2563,18 +2563,6 @@ e1000e_mac_swsm_read(E1000ECore *core, int index)
     return val;
 }
 
-static uint32_t
-e1000e_mac_itr_read(E1000ECore *core, int index)
-{
-    return core->itr_guest_value;
-}
-
-static uint32_t
-e1000e_mac_eitr_read(E1000ECore *core, int index)
-{
-    return core->eitr_guest_value[index - EITR];
-}
-
 static uint32_t
 e1000e_mac_icr_read(E1000ECore *core, int index)
 {
@@ -2792,7 +2780,6 @@ e1000e_set_itr(E1000ECore *core, int index, uint32_t val)
 
     trace_e1000e_irq_itr_set(val);
 
-    core->itr_guest_value = interval;
     core->mac[index] = interval;
 }
 
@@ -2804,7 +2791,6 @@ e1000e_set_eitr(E1000ECore *core, int index, uint32_t val)
 
     trace_e1000e_irq_eitr_set(eitr_num, val);
 
-    core->eitr_guest_value[eitr_num] = interval;
     core->mac[index] = interval;
 }
 
@@ -3029,6 +3015,7 @@ static const readops e1000e_macreg_readops[] = {
     e1000e_getreg(GSCN_1),
     e1000e_getreg(FCAL),
     e1000e_getreg(FLSWCNT),
+    e1000e_getreg(ITR),
 
     [TOTH]    = e1000e_mac_read_clr8,
     [GOTCH]   = e1000e_mac_read_clr8,
@@ -3062,7 +3049,6 @@ static const readops e1000e_macreg_readops[] = {
     [MPRC]    = e1000e_mac_read_clr4,
     [BPTC]    = e1000e_mac_read_clr4,
     [TSCTC]   = e1000e_mac_read_clr4,
-    [ITR]     = e1000e_mac_itr_read,
     [CTRL]    = e1000e_get_ctrl,
     [TARC1]   = e1000e_get_tarc,
     [SWSM]    = e1000e_mac_swsm_read,
@@ -3087,7 +3073,7 @@ static const readops e1000e_macreg_readops[] = {
     [RETA ... RETA + 31]   = e1000e_mac_readreg,
     [RSSRK ... RSSRK + 31] = e1000e_mac_readreg,
     [MAVTV0 ... MAVTV3]    = e1000e_mac_readreg,
-    [EITR...EITR + E1000E_MSIX_VEC_NUM - 1] = e1000e_mac_eitr_read
+    [EITR...EITR + E1000E_MSIX_VEC_NUM - 1] = e1000e_mac_readreg,
 };
 enum { E1000E_NREADOPS = ARRAY_SIZE(e1000e_macreg_readops) };
 
@@ -3517,13 +3503,26 @@ void e1000e_core_pre_save(E1000ECore *core)
             core->tx[i].skip_cp = true;
         }
     }
+
+    /* back compat, QEMU moves xITR in itr_guest_value state */
+    core->itr_guest_value = core->mac[ITR];
+    for (i = 0; i < E1000E_MSIX_VEC_NUM; i++) {
+        core->eitr_guest_value[i] = core->mac[EITR + i];
+    }
 }
 
 int
 e1000e_core_post_load(E1000ECore *core)
 {
+    int i;
     NetClientState *nc = qemu_get_queue(core->owner_nic);
 
+    /* back compat */
+    core->mac[ITR] = core->itr_guest_value;
+    for (i = 0; i < E1000E_MSIX_VEC_NUM; i++) {
+        core->mac[EITR + i] = core->eitr_guest_value[i];
+    }
+
     /*
      * nc.link_down can't be migrated, so infer link_down according
      * to link status bit in core.mac[STATUS].
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 39e3ce1c8fe..271c54380e9 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -2880,7 +2880,7 @@ igb_mac_swsm_read(IGBCore *core, int index)
 static uint32_t
 igb_mac_eitr_read(IGBCore *core, int index)
 {
-    return core->eitr_guest_value[index - EITR0];
+    return core->mac[index - EITR0];
 }
 
 static uint32_t igb_mac_vfmailbox_read(IGBCore *core, int index)
@@ -3046,8 +3046,7 @@ igb_set_eitr(IGBCore *core, int index, uint32_t val)
 
     trace_igb_irq_eitr_set(eitr_num, val);
 
-    core->eitr_guest_value[eitr_num] = val & ~E1000_EITR_CNT_IGNR;
-    core->mac[index] = val & 0x7FFE;
+    core->mac[index] = val;
 }
 
 static void
@@ -4527,13 +4526,24 @@ void igb_core_pre_save(IGBCore *core)
             core->tx[i].skip_cp = true;
         }
     }
+
+    /* back compat, QEMU moves EITR in eitr_guest_value state */
+    for (i = 0; i < IGB_INTR_NUM; i++) {
+        core->eitr_guest_value[i] = core->mac[EITR0 + i];
+    }
 }
 
 int
 igb_core_post_load(IGBCore *core)
 {
+    int i;
     NetClientState *nc = qemu_get_queue(core->owner_nic);
 
+    /* back compat */
+    for (i = 0; i < IGB_INTR_NUM; i++) {
+        core->mac[EITR0 + i] = core->eitr_guest_value[i];
+    }
+
     /*
      * nc.link_down can't be migrated, so infer link_down according
      * to link status bit in core.mac[STATUS].
-- 
2.47.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 04/12] qtest/e1000e|igb: assert irqs are clear before triggering an irq
  2025-05-02  3:16 [PATCH v3 00/12] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
                   ` (2 preceding siblings ...)
  2025-05-02  3:16 ` [PATCH v3 03/12] hw/net/e1000e|igb: Remove xitr_guest_value logic Nicholas Piggin
@ 2025-05-02  3:16 ` Nicholas Piggin
  2025-05-19 15:07   ` Fabiano Rosas
  2025-05-02  3:16 ` [PATCH v3 05/12] net/igb: Fix interrupt throttling interval calculation Nicholas Piggin
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Nicholas Piggin @ 2025-05-02  3:16 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Nicholas Piggin, qemu-devel, Dmitry Fleytman, Jason Wang,
	Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum

Assert there is no existing irq raised that would lead to a false
positive interrupt test.

e1000e has to disable interrupt throttling for this test, because
it can cause delayed superfluous interrupts which trip the assertions.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Dmitry Fleytman <dmitry.fleytman@gmail.com>
Cc: Akihiko Odaki <akihiko.odaki@daynix.com>
Cc: Sriram Yagnaraman <sriram.yagnaraman@ericsson.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 tests/qtest/libqos/e1000e.h |  1 +
 tests/qtest/e1000e-test.c   | 10 ++++++++++
 tests/qtest/igb-test.c      |  6 ++++++
 tests/qtest/libqos/e1000e.c |  9 ++++++++-
 4 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/tests/qtest/libqos/e1000e.h b/tests/qtest/libqos/e1000e.h
index 30643c80949..e693670119f 100644
--- a/tests/qtest/libqos/e1000e.h
+++ b/tests/qtest/libqos/e1000e.h
@@ -54,6 +54,7 @@ static inline uint32_t e1000e_macreg_read(QE1000E *d, uint32_t reg)
     return qpci_io_readl(&d_pci->pci_dev, d_pci->mac_regs, reg);
 }
 
+bool e1000e_pending_isr(QE1000E *d, uint16_t msg_id);
 void e1000e_wait_isr(QE1000E *d, uint16_t msg_id);
 void e1000e_tx_ring_push(QE1000E *d, void *descr);
 void e1000e_rx_ring_push(QE1000E *d, void *descr);
diff --git a/tests/qtest/e1000e-test.c b/tests/qtest/e1000e-test.c
index 746d26cfb67..a538c72cc84 100644
--- a/tests/qtest/e1000e-test.c
+++ b/tests/qtest/e1000e-test.c
@@ -61,6 +61,9 @@ static void e1000e_send_verify(QE1000E *d, int *test_sockets, QGuestAllocator *a
                                    E1000_TXD_DTYP_D   |
                                    sizeof(buffer));
 
+    /* Ensure the interrupt has not been taken already */
+    g_assert(!e1000e_pending_isr(d, E1000E_TX0_MSG_ID));
+
     /* Put descriptor to the ring */
     e1000e_tx_ring_push(d, &descr);
 
@@ -105,6 +108,9 @@ static void e1000e_receive_verify(QE1000E *d, int *test_sockets, QGuestAllocator
     char buffer[64];
     int ret;
 
+    /* Ensure the interrupt has not been taken already */
+    g_assert(!e1000e_pending_isr(d, E1000E_RX0_MSG_ID));
+
     /* Send a dummy packet to device's socket*/
     ret = iov_send(test_sockets[0], iov, 2, 0, sizeof(len) + sizeof(packet));
     g_assert_cmpint(ret, == , sizeof(packet) + sizeof(len));
@@ -188,6 +194,10 @@ static void test_e1000e_multiple_transfers(void *obj, void *data,
         return;
     }
 
+    /* Clear EITR because buggy QEMU throttle timer causes superfluous irqs */
+    e1000e_macreg_write(d, E1000_EITR + E1000E_RX0_MSG_ID * 4, 0);
+    e1000e_macreg_write(d, E1000_EITR + E1000E_TX0_MSG_ID * 4, 0);
+
     for (i = 0; i < iterations; i++) {
         e1000e_send_verify(d, data, alloc);
         e1000e_receive_verify(d, data, alloc);
diff --git a/tests/qtest/igb-test.c b/tests/qtest/igb-test.c
index cf8b4131cf2..12cdd8b498a 100644
--- a/tests/qtest/igb-test.c
+++ b/tests/qtest/igb-test.c
@@ -64,6 +64,9 @@ static void igb_send_verify(QE1000E *d, int *test_sockets, QGuestAllocator *allo
                                           E1000_TXD_DTYP_D   |
                                           sizeof(buffer));
 
+    /* Ensure the interrupt has not been taken already */
+    g_assert(!e1000e_pending_isr(d, E1000E_TX0_MSG_ID));
+
     /* Put descriptor to the ring */
     e1000e_tx_ring_push(d, &descr);
 
@@ -119,6 +122,9 @@ static void igb_receive_verify(QE1000E *d, int *test_sockets, QGuestAllocator *a
     memset(&descr, 0, sizeof(descr));
     descr.read.pkt_addr = cpu_to_le64(data);
 
+    /* Ensure the interrupt has not been taken already */
+    g_assert(!e1000e_pending_isr(d, E1000E_RX0_MSG_ID));
+
     /* Put descriptor to the ring */
     e1000e_rx_ring_push(d, &descr);
 
diff --git a/tests/qtest/libqos/e1000e.c b/tests/qtest/libqos/e1000e.c
index 925654c7fd4..0bf70e25389 100644
--- a/tests/qtest/libqos/e1000e.c
+++ b/tests/qtest/libqos/e1000e.c
@@ -77,13 +77,20 @@ static void e1000e_foreach_callback(QPCIDevice *dev, int devfn, void *data)
     g_free(dev);
 }
 
+bool e1000e_pending_isr(QE1000E *d, uint16_t msg_id)
+{
+    QE1000E_PCI *d_pci = container_of(d, QE1000E_PCI, e1000e);
+
+    return qpci_msix_pending(&d_pci->pci_dev, msg_id);
+}
+
 void e1000e_wait_isr(QE1000E *d, uint16_t msg_id)
 {
     QE1000E_PCI *d_pci = container_of(d, QE1000E_PCI, e1000e);
     guint64 end_time = g_get_monotonic_time() + 5 * G_TIME_SPAN_SECOND;
 
     do {
-        if (qpci_msix_pending(&d_pci->pci_dev, msg_id)) {
+        if (e1000e_pending_isr(d, msg_id)) {
             return;
         }
         qtest_clock_step(d_pci->pci_dev.bus->qts, 10000);
-- 
2.47.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 05/12] net/igb: Fix interrupt throttling interval calculation
  2025-05-02  3:16 [PATCH v3 00/12] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
                   ` (3 preceding siblings ...)
  2025-05-02  3:16 ` [PATCH v3 04/12] qtest/e1000e|igb: assert irqs are clear before triggering an irq Nicholas Piggin
@ 2025-05-02  3:16 ` Nicholas Piggin
  2025-05-02  3:16 ` [PATCH v3 06/12] net/igb: Implement EITR Moderation Counter Nicholas Piggin
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 24+ messages in thread
From: Nicholas Piggin @ 2025-05-02  3:16 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Nicholas Piggin, qemu-devel, Dmitry Fleytman, Jason Wang,
	Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini

IGB throttling granularity is 1us, and interval field is in bits 2..14
of the EITRx registers.

Fixes: 3a977deebe6b ("Intrdocue igb device emulation")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/net/igb_regs.h | 3 +++
 hw/net/igb_core.c | 7 ++++---
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/hw/net/igb_regs.h b/hw/net/igb_regs.h
index 4dc4c31da27..a87aa44f5f3 100644
--- a/hw/net/igb_regs.h
+++ b/hw/net/igb_regs.h
@@ -146,6 +146,9 @@ union e1000_adv_rx_desc {
 #define IGB_82576_VF_DEV_ID        0x10CA
 #define IGB_I350_VF_DEV_ID         0x1520
 
+/* Delay increments in nanoseconds for interrupt throttling registers */
+#define IGB_INTR_THROTTLING_NS_RES (1 * SCALE_US)
+
 /* VLAN info */
 #define IGB_TX_FLAGS_VLAN_MASK     0xffff0000
 #define IGB_TX_FLAGS_VLAN_SHIFT    16
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 271c54380e9..9e9e6e3354f 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -142,8 +142,9 @@ static void igb_msix_notify(IGBCore *core, unsigned int cause)
 static inline void
 igb_intrmgr_rearm_timer(IGBIntrDelayTimer *timer)
 {
-    int64_t delay_ns = (int64_t) timer->core->mac[timer->delay_reg] *
-                                 timer->delay_resolution_ns;
+    int64_t delay_ns =
+            (int64_t)((timer->core->mac[timer->delay_reg] & 0x7FFC) >> 2) *
+                     timer->delay_resolution_ns;
 
     trace_e1000e_irq_rearm_timer(timer->delay_reg << 2, delay_ns);
 
@@ -180,7 +181,7 @@ igb_intrmgr_initialize_all_timers(IGBCore *core, bool create)
     for (i = 0; i < IGB_INTR_NUM; i++) {
         core->eitr[i].core = core;
         core->eitr[i].delay_reg = EITR0 + i;
-        core->eitr[i].delay_resolution_ns = E1000_INTR_DELAY_NS_RES;
+        core->eitr[i].delay_resolution_ns = IGB_INTR_THROTTLING_NS_RES;
     }
 
     if (!create) {
-- 
2.47.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 06/12] net/igb: Implement EITR Moderation Counter
  2025-05-02  3:16 [PATCH v3 00/12] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
                   ` (4 preceding siblings ...)
  2025-05-02  3:16 ` [PATCH v3 05/12] net/igb: Fix interrupt throttling interval calculation Nicholas Piggin
@ 2025-05-02  3:16 ` Nicholas Piggin
  2025-05-02  3:16 ` [PATCH v3 07/12] igb: Add a note about re-loading timers breaking deterministic replay Nicholas Piggin
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 24+ messages in thread
From: Nicholas Piggin @ 2025-05-02  3:16 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Nicholas Piggin, qemu-devel, Dmitry Fleytman, Jason Wang,
	Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini

IGB EITR registers have counter fields which reflect the current ITR
and LLI counter values, as well as a bit to enable LLI moderation,
and a bit to write the register without modifying the counter fields.

Implement the EITR Moderation Counter (aka EITR counter), and counter
ignore bit. The EITR counter is the time remaining in the interrupt
moderation delay which is implemented as a QEMU timer.

Log an unimp message if software tries to enable LLI moderation.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/net/igb_regs.h |  6 ++++++
 hw/net/igb_core.c | 48 +++++++++++++++++++++++++++++++++++++++++------
 2 files changed, 48 insertions(+), 6 deletions(-)

diff --git a/hw/net/igb_regs.h b/hw/net/igb_regs.h
index a87aa44f5f3..c4785eda0ce 100644
--- a/hw/net/igb_regs.h
+++ b/hw/net/igb_regs.h
@@ -631,6 +631,12 @@ union e1000_adv_rx_desc {
 #define E1000_EICR_MSIX_MASK   0x01FFFFFF /* Bits used in MSI-X mode */
 #define E1000_EICR_LEGACY_MASK 0x4000FFFF /* Bits used in non MSI-X mode */
 
+/* These are only for 82576 and newer */
+#define E1000_EITR_INTERVAL     0x00007FFC
+#define E1000_EITR_LLI_EN       0x00008000
+#define E1000_EITR_LLI_CNT      0x001F0000
+#define E1000_EITR_ITR_CNT      0x7FE00000
+
 /* Mirror VF Control (only RST bit); RW */
 #define E1000_PVTCTRL(_n) (0x10000 + (_n) * 0x100)
 
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 9e9e6e3354f..eedc341f298 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -140,12 +140,8 @@ static void igb_msix_notify(IGBCore *core, unsigned int cause)
 }
 
 static inline void
-igb_intrmgr_rearm_timer(IGBIntrDelayTimer *timer)
+igb_intrmgr_arm_timer(IGBIntrDelayTimer *timer, int64_t delay_ns)
 {
-    int64_t delay_ns =
-            (int64_t)((timer->core->mac[timer->delay_reg] & 0x7FFC) >> 2) *
-                     timer->delay_resolution_ns;
-
     trace_e1000e_irq_rearm_timer(timer->delay_reg << 2, delay_ns);
 
     timer_mod(timer->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + delay_ns);
@@ -153,6 +149,16 @@ igb_intrmgr_rearm_timer(IGBIntrDelayTimer *timer)
     timer->running = true;
 }
 
+static inline void
+igb_intrmgr_rearm_timer(IGBIntrDelayTimer *timer)
+{
+    uint32_t interval = (timer->core->mac[timer->delay_reg] &
+                         E1000_EITR_INTERVAL) >> 2;
+    int64_t delay_ns = (int64_t)interval * timer->delay_resolution_ns;
+
+    igb_intrmgr_arm_timer(timer, delay_ns);
+}
+
 static void
 igb_intmgr_timer_resume(IGBIntrDelayTimer *timer)
 {
@@ -2881,7 +2887,21 @@ igb_mac_swsm_read(IGBCore *core, int index)
 static uint32_t
 igb_mac_eitr_read(IGBCore *core, int index)
 {
-    return core->mac[index - EITR0];
+    uint32_t eitr_num = index - EITR0;
+    uint32_t val = core->mac[eitr_num];
+    IGBIntrDelayTimer *timer = &core->eitr[eitr_num];
+
+    if (timer->running) { /* timer is pending, find time remaining */
+        int64_t remains = timer_expire_time_ns(timer->timer) -
+                          qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
+        if (remains > 0) {
+            uint32_t cnt; /* CNT is the most significant 10 of 12 bits */
+            cnt = remains / timer->delay_resolution_ns;
+            val |= ((cnt >> 2) << 21) & E1000_EITR_ITR_CNT;
+        }
+    }
+
+    return val;
 }
 
 static uint32_t igb_mac_vfmailbox_read(IGBCore *core, int index)
@@ -3047,6 +3067,22 @@ igb_set_eitr(IGBCore *core, int index, uint32_t val)
 
     trace_igb_irq_eitr_set(eitr_num, val);
 
+    if (val & (E1000_EITR_LLI_EN | E1000_EITR_LLI_CNT)) {
+        qemu_log_mask(LOG_UNIMP, "%s: LLI moderation not supported, ignoring\n",
+                                 __func__);
+    }
+
+    if (!(val & E1000_EITR_CNT_IGNR)) {
+        IGBIntrDelayTimer *timer = &core->eitr[eitr_num];
+        uint32_t itr_cnt = (val & E1000_EITR_ITR_CNT) >> 21;
+        /* CNT is the most significant 10 of 12 bits */
+        uint64_t ns = (itr_cnt << 2) * timer->delay_resolution_ns;
+
+        igb_intrmgr_arm_timer(timer, ns);
+    }
+
+    val &= E1000_EITR_INTERVAL | E1000_EITR_LLI_EN;
+
     core->mac[index] = val;
 }
 
-- 
2.47.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 07/12] igb: Add a note about re-loading timers breaking deterministic replay
  2025-05-02  3:16 [PATCH v3 00/12] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
                   ` (5 preceding siblings ...)
  2025-05-02  3:16 ` [PATCH v3 06/12] net/igb: Implement EITR Moderation Counter Nicholas Piggin
@ 2025-05-02  3:16 ` Nicholas Piggin
  2025-05-02  3:17 ` [PATCH v3 08/12] hw/net/e1000e: Postponed msix interrupt processing should auto-clear cause Nicholas Piggin
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 24+ messages in thread
From: Nicholas Piggin @ 2025-05-02  3:16 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Nicholas Piggin, qemu-devel, Dmitry Fleytman, Jason Wang,
	Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini

Add a note about the problem re-loading timers with default values.
Deterministic replay requires the timers be migrated with the same
value.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/net/igb_core.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index eedc341f298..3ae3e53530b 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -4589,7 +4589,9 @@ igb_core_post_load(IGBCore *core)
 
     /*
      * we need to restart intrmgr timers, as an older version of
-     * QEMU can have stopped them before migration
+     * QEMU can have stopped them before migration.
+     * XXX: re-setting timers with fresh values breaks deterministic
+     * replay.
      */
     igb_intrmgr_resume(core);
     igb_autoneg_resume(core);
-- 
2.47.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 08/12] hw/net/e1000e: Postponed msix interrupt processing should auto-clear cause
  2025-05-02  3:16 [PATCH v3 00/12] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
                   ` (6 preceding siblings ...)
  2025-05-02  3:16 ` [PATCH v3 07/12] igb: Add a note about re-loading timers breaking deterministic replay Nicholas Piggin
@ 2025-05-02  3:17 ` Nicholas Piggin
  2025-05-02  3:17 ` [PATCH v3 09/12] hw/net/e1000e: Do not auto-clear cause on postponed msix interrupt Nicholas Piggin
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 24+ messages in thread
From: Nicholas Piggin @ 2025-05-02  3:17 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Nicholas Piggin, qemu-devel, Dmitry Fleytman, Jason Wang,
	Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini

Cause auto-clearing and masking should be performed during msix
interrupt processing.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/net/e1000e_core.c | 86 +++++++++++++++++++++++++++++++++++---------
 1 file changed, 69 insertions(+), 17 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index f8e6522f810..5969f49e8fd 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -178,16 +178,62 @@ e1000e_intrmgr_on_throttling_timer(void *opaque)
     }
 }
 
+/* returns the bitmap of causes that are mapped to a given vector */
+static uint32_t find_msix_causes(E1000ECore *core, int vec)
+{
+    uint32_t causes = 0;
+    uint32_t int_cfg;
+
+    int_cfg = E1000_IVAR_RXQ0(core->mac[IVAR]);
+    if (E1000_IVAR_ENTRY_VALID(int_cfg) &&
+        E1000_IVAR_ENTRY_VEC(int_cfg) == vec) {
+        causes |= E1000_ICR_RXQ0;
+    }
+
+    int_cfg = E1000_IVAR_RXQ1(core->mac[IVAR]);
+    if (E1000_IVAR_ENTRY_VALID(int_cfg) &&
+        E1000_IVAR_ENTRY_VEC(int_cfg) == vec) {
+        causes |= E1000_ICR_RXQ1;
+    }
+
+    int_cfg = E1000_IVAR_TXQ0(core->mac[IVAR]);
+    if (E1000_IVAR_ENTRY_VALID(int_cfg) &&
+        E1000_IVAR_ENTRY_VEC(int_cfg) == vec) {
+        causes |= E1000_ICR_TXQ0;
+    }
+
+    int_cfg = E1000_IVAR_TXQ1(core->mac[IVAR]);
+    if (E1000_IVAR_ENTRY_VALID(int_cfg) &&
+        E1000_IVAR_ENTRY_VEC(int_cfg) == vec) {
+        causes |= E1000_ICR_TXQ1;
+    }
+
+    int_cfg = E1000_IVAR_OTHER(core->mac[IVAR]);
+    if (E1000_IVAR_ENTRY_VALID(int_cfg) &&
+        E1000_IVAR_ENTRY_VEC(int_cfg) == vec) {
+        causes |= E1000_ICR_OTHER;
+    }
+
+    return causes;
+}
+
+static void
+e1000e_msix_auto_clear_mask(E1000ECore *core, uint32_t cause);
+
 static void
 e1000e_intrmgr_on_msix_throttling_timer(void *opaque)
 {
     E1000IntrDelayTimer *timer = opaque;
-    int idx = timer - &timer->core->eitr[0];
+    E1000ECore *core = timer->core;
+    int idx = timer - &core->eitr[0];
+    uint32_t causes;
 
     timer->running = false;
 
+    causes = find_msix_causes(core, idx);
     trace_e1000e_irq_msix_notify_postponed_vec(idx);
-    msix_notify(timer->core->owner, idx);
+    msix_notify(core->owner, idx);
+    e1000e_msix_auto_clear_mask(core, causes);
 }
 
 static void
@@ -1985,24 +2031,10 @@ e1000e_eitr_should_postpone(E1000ECore *core, int idx)
 }
 
 static void
-e1000e_msix_notify_one(E1000ECore *core, uint32_t cause, uint32_t int_cfg)
+e1000e_msix_auto_clear_mask(E1000ECore *core, uint32_t cause)
 {
     uint32_t effective_eiac;
 
-    if (E1000_IVAR_ENTRY_VALID(int_cfg)) {
-        uint32_t vec = E1000_IVAR_ENTRY_VEC(int_cfg);
-        if (vec < E1000E_MSIX_VEC_NUM) {
-            if (!e1000e_eitr_should_postpone(core, vec)) {
-                trace_e1000e_irq_msix_notify_vec(vec);
-                msix_notify(core->owner, vec);
-            }
-        } else {
-            trace_e1000e_wrn_msix_vec_wrong(cause, int_cfg);
-        }
-    } else {
-        trace_e1000e_wrn_msix_invalid(cause, int_cfg);
-    }
-
     if (core->mac[CTRL_EXT] & E1000_CTRL_EXT_EIAME) {
         trace_e1000e_irq_iam_clear_eiame(core->mac[IAM], cause);
         core->mac[IAM] &= ~cause;
@@ -2019,6 +2051,26 @@ e1000e_msix_notify_one(E1000ECore *core, uint32_t cause, uint32_t int_cfg)
     }
 }
 
+static void
+e1000e_msix_notify_one(E1000ECore *core, uint32_t cause, uint32_t int_cfg)
+{
+    if (E1000_IVAR_ENTRY_VALID(int_cfg)) {
+        uint32_t vec = E1000_IVAR_ENTRY_VEC(int_cfg);
+        if (vec < E1000E_MSIX_VEC_NUM) {
+            if (!e1000e_eitr_should_postpone(core, vec)) {
+                trace_e1000e_irq_msix_notify_vec(vec);
+                msix_notify(core->owner, vec);
+            }
+        } else {
+            trace_e1000e_wrn_msix_vec_wrong(cause, int_cfg);
+        }
+    } else {
+        trace_e1000e_wrn_msix_invalid(cause, int_cfg);
+    }
+
+    e1000e_msix_auto_clear_mask(core, cause);
+}
+
 static void
 e1000e_msix_notify(E1000ECore *core, uint32_t causes)
 {
-- 
2.47.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 09/12] hw/net/e1000e: Do not auto-clear cause on postponed msix interrupt
  2025-05-02  3:16 [PATCH v3 00/12] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
                   ` (7 preceding siblings ...)
  2025-05-02  3:17 ` [PATCH v3 08/12] hw/net/e1000e: Postponed msix interrupt processing should auto-clear cause Nicholas Piggin
@ 2025-05-02  3:17 ` Nicholas Piggin
  2025-05-02  3:17 ` [PATCH v3 10/12] net/e1000e|igb: Only send delayed msix interrupts that have a cause Nicholas Piggin
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 24+ messages in thread
From: Nicholas Piggin @ 2025-05-02  3:17 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Nicholas Piggin, qemu-devel, Dmitry Fleytman, Jason Wang,
	Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini

If an interrupt is postponed, it should not do cause auto-clearing
or auto-masking. That is done when the interrupt processing occurs.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/net/e1000e_core.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 5969f49e8fd..022884a2ea0 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -2057,10 +2057,11 @@ e1000e_msix_notify_one(E1000ECore *core, uint32_t cause, uint32_t int_cfg)
     if (E1000_IVAR_ENTRY_VALID(int_cfg)) {
         uint32_t vec = E1000_IVAR_ENTRY_VEC(int_cfg);
         if (vec < E1000E_MSIX_VEC_NUM) {
-            if (!e1000e_eitr_should_postpone(core, vec)) {
-                trace_e1000e_irq_msix_notify_vec(vec);
-                msix_notify(core->owner, vec);
+            if (e1000e_eitr_should_postpone(core, vec)) {
+                return;
             }
+            trace_e1000e_irq_msix_notify_vec(vec);
+            msix_notify(core->owner, vec);
         } else {
             trace_e1000e_wrn_msix_vec_wrong(cause, int_cfg);
         }
-- 
2.47.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 10/12] net/e1000e|igb: Only send delayed msix interrupts that have a cause
  2025-05-02  3:16 [PATCH v3 00/12] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
                   ` (8 preceding siblings ...)
  2025-05-02  3:17 ` [PATCH v3 09/12] hw/net/e1000e: Do not auto-clear cause on postponed msix interrupt Nicholas Piggin
@ 2025-05-02  3:17 ` Nicholas Piggin
  2025-05-05  5:51   ` Akihiko Odaki
  2025-05-02  3:17 ` [PATCH v3 11/12] net/e1000e|igb: Fix interrupt throttling rearming Nicholas Piggin
  2025-05-02  3:17 ` [PATCH v3 12/12] qtest/e1000e|igb: Test interrupt throttling in multiple_transfers test Nicholas Piggin
  11 siblings, 1 reply; 24+ messages in thread
From: Nicholas Piggin @ 2025-05-02  3:17 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Nicholas Piggin, qemu-devel, Dmitry Fleytman, Jason Wang,
	Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini

The msix interrupt throttling timer expiry sends an interrupt even if
there is no unmasked interrupt causes. This can be observed by seeing
two interrupts in response to a single event when throttling is active.

The e1000e non-msix paths seem to get this right by masking and testing
ICR and IMS. Add similar checks for the msix cases in e1000e and igb.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/net/e1000e_core.c | 10 ++++++----
 hw/net/igb_core.c    | 11 ++++++++---
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 022884a2ea0..d53f70065ef 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -230,10 +230,12 @@ e1000e_intrmgr_on_msix_throttling_timer(void *opaque)
 
     timer->running = false;
 
-    causes = find_msix_causes(core, idx);
-    trace_e1000e_irq_msix_notify_postponed_vec(idx);
-    msix_notify(core->owner, idx);
-    e1000e_msix_auto_clear_mask(core, causes);
+    causes = find_msix_causes(core, idx) & core->mac[IMS] & core->mac[ICR];
+    if (causes) {
+        trace_e1000e_irq_msix_notify_postponed_vec(idx);
+        msix_notify(core->owner, causes);
+        e1000e_msix_auto_clear_mask(core, causes);
+    }
 }
 
 static void
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 3ae3e53530b..035637f81f8 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -171,12 +171,17 @@ static void
 igb_intrmgr_on_msix_throttling_timer(void *opaque)
 {
     IGBIntrDelayTimer *timer = opaque;
-    int idx = timer - &timer->core->eitr[0];
+    IGBCore *core = timer->core;
+    int vector = timer - &core->eitr[0];
+    uint32_t causes;
 
     timer->running = false;
 
-    trace_e1000e_irq_msix_notify_postponed_vec(idx);
-    igb_msix_notify(timer->core, idx);
+    causes = core->mac[EICR] & core->mac[EIMS];
+    if (causes & BIT(vector)) {
+        trace_e1000e_irq_msix_notify_postponed_vec(vector);
+        igb_msix_notify(core, vector);
+    }
 }
 
 static void
-- 
2.47.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 11/12] net/e1000e|igb: Fix interrupt throttling rearming
  2025-05-02  3:16 [PATCH v3 00/12] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
                   ` (9 preceding siblings ...)
  2025-05-02  3:17 ` [PATCH v3 10/12] net/e1000e|igb: Only send delayed msix interrupts that have a cause Nicholas Piggin
@ 2025-05-02  3:17 ` Nicholas Piggin
  2025-05-05  6:03   ` Akihiko Odaki
  2025-05-02  3:17 ` [PATCH v3 12/12] qtest/e1000e|igb: Test interrupt throttling in multiple_transfers test Nicholas Piggin
  11 siblings, 1 reply; 24+ messages in thread
From: Nicholas Piggin @ 2025-05-02  3:17 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Nicholas Piggin, qemu-devel, Dmitry Fleytman, Jason Wang,
	Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini

Timer expiry that results in an interrupt does not rearm the timer so
an interrupt can appear immediately after the interrupt generated by
timer expiry.

Fix this by rearming the throttle timer when a delayed interrupt is
processed. e1000e gets this by reusing the e1000e_msix_notify()
logic, igb calls igb_intrmgr_rearm_timer() directly.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/net/e1000e_core.c |  5 ++--
 hw/net/igb_core.c    | 55 ++++++++++++++++++++++++++------------------
 2 files changed, 35 insertions(+), 25 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index d53f70065ef..2932122c04b 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -218,7 +218,7 @@ static uint32_t find_msix_causes(E1000ECore *core, int vec)
 }
 
 static void
-e1000e_msix_auto_clear_mask(E1000ECore *core, uint32_t cause);
+e1000e_msix_notify(E1000ECore *core, uint32_t causes);
 
 static void
 e1000e_intrmgr_on_msix_throttling_timer(void *opaque)
@@ -233,8 +233,7 @@ e1000e_intrmgr_on_msix_throttling_timer(void *opaque)
     causes = find_msix_causes(core, idx) & core->mac[IMS] & core->mac[ICR];
     if (causes) {
         trace_e1000e_irq_msix_notify_postponed_vec(idx);
-        msix_notify(core->owner, causes);
-        e1000e_msix_auto_clear_mask(core, causes);
+        e1000e_msix_notify(core, causes);
     }
 }
 
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 035637f81f8..cc25a1d5baa 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -152,11 +152,14 @@ igb_intrmgr_arm_timer(IGBIntrDelayTimer *timer, int64_t delay_ns)
 static inline void
 igb_intrmgr_rearm_timer(IGBIntrDelayTimer *timer)
 {
-    uint32_t interval = (timer->core->mac[timer->delay_reg] &
-                         E1000_EITR_INTERVAL) >> 2;
-    int64_t delay_ns = (int64_t)interval * timer->delay_resolution_ns;
+    uint32_t eitr = timer->core->mac[timer->delay_reg];
 
-    igb_intrmgr_arm_timer(timer, delay_ns);
+    if (eitr != 0) {
+        uint32_t interval = (eitr & E1000_EITR_INTERVAL) >> 2;
+        int64_t delay_ns = (int64_t)interval * timer->delay_resolution_ns;
+
+        igb_intrmgr_arm_timer(timer, delay_ns);
+    }
 }
 
 static void
@@ -168,21 +171,7 @@ igb_intmgr_timer_resume(IGBIntrDelayTimer *timer)
 }
 
 static void
-igb_intrmgr_on_msix_throttling_timer(void *opaque)
-{
-    IGBIntrDelayTimer *timer = opaque;
-    IGBCore *core = timer->core;
-    int vector = timer - &core->eitr[0];
-    uint32_t causes;
-
-    timer->running = false;
-
-    causes = core->mac[EICR] & core->mac[EIMS];
-    if (causes & BIT(vector)) {
-        trace_e1000e_irq_msix_notify_postponed_vec(vector);
-        igb_msix_notify(core, vector);
-    }
-}
+igb_intrmgr_on_msix_throttling_timer(void *opaque);
 
 static void
 igb_intrmgr_initialize_all_timers(IGBCore *core, bool create)
@@ -2258,9 +2247,7 @@ igb_postpone_interrupt(IGBIntrDelayTimer *timer)
         return true;
     }
 
-    if (timer->core->mac[timer->delay_reg] != 0) {
-        igb_intrmgr_rearm_timer(timer);
-    }
+    igb_intrmgr_rearm_timer(timer);
 
     return false;
 }
@@ -2284,6 +2271,30 @@ static void igb_send_msix(IGBCore *core, uint32_t causes)
     }
 }
 
+static void
+igb_intrmgr_on_msix_throttling_timer(void *opaque)
+{
+    IGBIntrDelayTimer *timer = opaque;
+    IGBCore *core = timer->core;
+    int vector = timer - &core->eitr[0];
+    uint32_t causes;
+
+    timer->running = false;
+
+    causes = core->mac[EICR] & core->mac[EIMS];
+    if (causes & BIT(vector)) {
+        /*
+         * The moderation counter is loaded with interval value whenever the
+         * interrupt is signaled. This includes when the interrupt is signaled
+         * by the counter reaching 0.
+         */
+        igb_intrmgr_rearm_timer(timer);
+
+        trace_e1000e_irq_msix_notify_postponed_vec(vector);
+        igb_msix_notify(core, vector);
+    }
+}
+
 static inline void
 igb_fix_icr_asserted(IGBCore *core)
 {
-- 
2.47.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 12/12] qtest/e1000e|igb: Test interrupt throttling in multiple_transfers test
  2025-05-02  3:16 [PATCH v3 00/12] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
                   ` (10 preceding siblings ...)
  2025-05-02  3:17 ` [PATCH v3 11/12] net/e1000e|igb: Fix interrupt throttling rearming Nicholas Piggin
@ 2025-05-02  3:17 ` Nicholas Piggin
  2025-05-19 15:08   ` Fabiano Rosas
  11 siblings, 1 reply; 24+ messages in thread
From: Nicholas Piggin @ 2025-05-02  3:17 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: Nicholas Piggin, qemu-devel, Dmitry Fleytman, Jason Wang,
	Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini

Enable interrupt throtling on one of the two queue interrupts used
in the multiple_transfers test, to improve coverage. The number of
interrupts for the e1000e test is reduced because it has a long minimum
throttling delay so without reducing iterations throttling adds about
40s to the test runtime.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 tests/qtest/e1000e-test.c | 6 +++---
 tests/qtest/igb-test.c    | 4 ++++
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/tests/qtest/e1000e-test.c b/tests/qtest/e1000e-test.c
index a538c72cc84..645b31127f0 100644
--- a/tests/qtest/e1000e-test.c
+++ b/tests/qtest/e1000e-test.c
@@ -181,7 +181,7 @@ static void test_e1000e_rx(void *obj, void *data, QGuestAllocator * alloc)
 static void test_e1000e_multiple_transfers(void *obj, void *data,
                                            QGuestAllocator *alloc)
 {
-    static const long iterations = 4 * 1024;
+    static const long iterations = 1 * 1024;
     long i;
 
     QE1000E_PCI *e1000e = obj;
@@ -194,8 +194,8 @@ static void test_e1000e_multiple_transfers(void *obj, void *data,
         return;
     }
 
-    /* Clear EITR because buggy QEMU throttle timer causes superfluous irqs */
-    e1000e_macreg_write(d, E1000_EITR + E1000E_RX0_MSG_ID * 4, 0);
+    /* Use EITR for one irq and disable it for the other, for testing */
+    e1000e_macreg_write(d, E1000_EITR + E1000E_RX0_MSG_ID * 4, 500);
     e1000e_macreg_write(d, E1000_EITR + E1000E_TX0_MSG_ID * 4, 0);
 
     for (i = 0; i < iterations; i++) {
diff --git a/tests/qtest/igb-test.c b/tests/qtest/igb-test.c
index 12cdd8b498a..c1877a77be4 100644
--- a/tests/qtest/igb-test.c
+++ b/tests/qtest/igb-test.c
@@ -198,6 +198,10 @@ static void test_igb_multiple_transfers(void *obj, void *data,
         return;
     }
 
+    /* Use EITR for one irq and disable it for the other, for testing */
+    e1000e_macreg_write(d, E1000_EITR(E1000E_RX0_MSG_ID), 0);
+    e1000e_macreg_write(d, E1000_EITR(E1000E_TX0_MSG_ID), 10 << 2); /* 10us */
+
     for (i = 0; i < iterations; i++) {
         igb_send_verify(d, data, alloc);
         igb_receive_verify(d, data, alloc);
-- 
2.47.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 02/12] net/e1000e: Permit disabling interrupt throttling
  2025-05-02  3:16 ` [PATCH v3 02/12] net/e1000e: Permit disabling interrupt throttling Nicholas Piggin
@ 2025-05-05  5:41   ` Akihiko Odaki
  2025-05-05  6:36     ` Nicholas Piggin
  0 siblings, 1 reply; 24+ messages in thread
From: Akihiko Odaki @ 2025-05-05  5:41 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: qemu-devel, Dmitry Fleytman, Jason Wang, Sriram Yagnaraman,
	Fabiano Rosas, Laurent Vivier, Paolo Bonzini

On 2025/05/02 12:16, Nicholas Piggin wrote:
> The spec explicitly permits xITR register interval field to have a value
> of zero to disable throttling. The e1000e model already allows for this
> in the throttling logic, so remove the minimum value for the register.
> 
> The spec appears to say there is a maximum observable interrupt rate
> when throttling is enabled, regardless of ITR value, so throttle timer
> calculation is clamped to that minimum value.
> 
> EITR registers default to 0, as specified in spec 7.4.4.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   hw/net/e1000e_core.c | 25 +++++++++++++++++--------
>   1 file changed, 17 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
> index 24138587905..96f74f1ea14 100644
> --- a/hw/net/e1000e_core.c
> +++ b/hw/net/e1000e_core.c
> @@ -51,8 +51,17 @@
>   
>   #include "trace.h"
>   
> -/* No more then 7813 interrupts per second according to spec 10.2.4.2 */
> -#define E1000E_MIN_XITR     (500)
> +/*
> + * A suggested range for ITR is 651-5580, according to spec 10.2.4.2, but
> + * QEMU has traditionally set 500 here.
> + */
> +#define E1000E_DEFAULT_ITR (500)

The cover letter says this version changes "initial ITR as well as EITR" 
but the ITR value is unchanged here. Forgot to commit the change?

> +
> +/*
> + * spec 7.4.4 ITR rules says the maximum observable interrupt rate from the
> + * adapter should not exceed 7813/s (corresponding to 500).
> + */
> +#define E1000E_EFFECTIVE_MIN_XITR (500)
>   
>   #define E1000E_MAX_TX_FRAGS (64)
>   
> @@ -105,8 +114,9 @@ e1000e_lower_legacy_irq(E1000ECore *core)
>   static inline void
>   e1000e_intrmgr_rearm_timer(E1000IntrDelayTimer *timer)
>   {
> -    int64_t delay_ns = (int64_t) timer->core->mac[timer->delay_reg] *
> -                                 timer->delay_resolution_ns;
> +    uint32_t delay = MAX(timer->core->mac[timer->delay_reg],
> +                         E1000E_EFFECTIVE_MIN_XITR);
> +    int64_t delay_ns = (int64_t)delay * timer->delay_resolution_ns;
>   
>       trace_e1000e_irq_rearm_timer(timer->delay_reg << 2, delay_ns);
>   
> @@ -2783,7 +2793,7 @@ e1000e_set_itr(E1000ECore *core, int index, uint32_t val)
>       trace_e1000e_irq_itr_set(val);
>   
>       core->itr_guest_value = interval;
> -    core->mac[index] = MAX(interval, E1000E_MIN_XITR);
> +    core->mac[index] = interval;
>   }
>   
>   static void
> @@ -2795,7 +2805,7 @@ e1000e_set_eitr(E1000ECore *core, int index, uint32_t val)
>       trace_e1000e_irq_eitr_set(eitr_num, val);
>   
>       core->eitr_guest_value[eitr_num] = interval;
> -    core->mac[index] = MAX(interval, E1000E_MIN_XITR);
> +    core->mac[index] = interval;
>   }
>   
>   static void
> @@ -3444,8 +3454,7 @@ static const uint32_t e1000e_mac_reg_init[] = {
>       [FACTPS]        = E1000_FACTPS_LAN0_ON | 0x20000000,
>       [SWSM]          = 1,
>       [RXCSUM]        = E1000_RXCSUM_IPOFLD | E1000_RXCSUM_TUOFLD,
> -    [ITR]           = E1000E_MIN_XITR,
> -    [EITR...EITR + E1000E_MSIX_VEC_NUM - 1] = E1000E_MIN_XITR,
> +    [ITR]           = E1000E_DEFAULT_ITR,
>   };
>   
>   static void e1000e_reset(E1000ECore *core, bool sw)



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 03/12] hw/net/e1000e|igb: Remove xitr_guest_value logic
  2025-05-02  3:16 ` [PATCH v3 03/12] hw/net/e1000e|igb: Remove xitr_guest_value logic Nicholas Piggin
@ 2025-05-05  5:45   ` Akihiko Odaki
  2025-05-05  6:38     ` Nicholas Piggin
  0 siblings, 1 reply; 24+ messages in thread
From: Akihiko Odaki @ 2025-05-05  5:45 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: qemu-devel, Dmitry Fleytman, Jason Wang, Sriram Yagnaraman,
	Fabiano Rosas, Laurent Vivier, Paolo Bonzini

On 2025/05/02 12:16, Nicholas Piggin wrote:
> The guest value xITR logic is not required now that the write functions
> store necessary data to be read back, and internal users mask and shift
> fields they need as they go.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   hw/net/e1000e_core.c | 31 +++++++++++++++----------------
>   hw/net/igb_core.c    | 16 +++++++++++++---
>   2 files changed, 28 insertions(+), 19 deletions(-)
> 
> diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
> index 96f74f1ea14..f8e6522f810 100644
> --- a/hw/net/e1000e_core.c
> +++ b/hw/net/e1000e_core.c
> @@ -2563,18 +2563,6 @@ e1000e_mac_swsm_read(E1000ECore *core, int index)
>       return val;
>   }
>   
> -static uint32_t
> -e1000e_mac_itr_read(E1000ECore *core, int index)
> -{
> -    return core->itr_guest_value;
> -}
> -
> -static uint32_t
> -e1000e_mac_eitr_read(E1000ECore *core, int index)
> -{
> -    return core->eitr_guest_value[index - EITR];
> -}
> -
>   static uint32_t
>   e1000e_mac_icr_read(E1000ECore *core, int index)
>   {
> @@ -2792,7 +2780,6 @@ e1000e_set_itr(E1000ECore *core, int index, uint32_t val)
>   
>       trace_e1000e_irq_itr_set(val);
>   
> -    core->itr_guest_value = interval;
>       core->mac[index] = interval;
>   }
>   
> @@ -2804,7 +2791,6 @@ e1000e_set_eitr(E1000ECore *core, int index, uint32_t val)
>   
>       trace_e1000e_irq_eitr_set(eitr_num, val);
>   
> -    core->eitr_guest_value[eitr_num] = interval;
>       core->mac[index] = interval;
>   }
>   
> @@ -3029,6 +3015,7 @@ static const readops e1000e_macreg_readops[] = {
>       e1000e_getreg(GSCN_1),
>       e1000e_getreg(FCAL),
>       e1000e_getreg(FLSWCNT),
> +    e1000e_getreg(ITR),
>   
>       [TOTH]    = e1000e_mac_read_clr8,
>       [GOTCH]   = e1000e_mac_read_clr8,
> @@ -3062,7 +3049,6 @@ static const readops e1000e_macreg_readops[] = {
>       [MPRC]    = e1000e_mac_read_clr4,
>       [BPTC]    = e1000e_mac_read_clr4,
>       [TSCTC]   = e1000e_mac_read_clr4,
> -    [ITR]     = e1000e_mac_itr_read,
>       [CTRL]    = e1000e_get_ctrl,
>       [TARC1]   = e1000e_get_tarc,
>       [SWSM]    = e1000e_mac_swsm_read,
> @@ -3087,7 +3073,7 @@ static const readops e1000e_macreg_readops[] = {
>       [RETA ... RETA + 31]   = e1000e_mac_readreg,
>       [RSSRK ... RSSRK + 31] = e1000e_mac_readreg,
>       [MAVTV0 ... MAVTV3]    = e1000e_mac_readreg,
> -    [EITR...EITR + E1000E_MSIX_VEC_NUM - 1] = e1000e_mac_eitr_read
> +    [EITR...EITR + E1000E_MSIX_VEC_NUM - 1] = e1000e_mac_readreg,
>   };
>   enum { E1000E_NREADOPS = ARRAY_SIZE(e1000e_macreg_readops) };
>   
> @@ -3517,13 +3503,26 @@ void e1000e_core_pre_save(E1000ECore *core)
>               core->tx[i].skip_cp = true;
>           }
>       }
> +
> +    /* back compat, QEMU moves xITR in itr_guest_value state */
> +    core->itr_guest_value = core->mac[ITR];
> +    for (i = 0; i < E1000E_MSIX_VEC_NUM; i++) {
> +        core->eitr_guest_value[i] = core->mac[EITR + i];
> +    }
>   }
>   
>   int
>   e1000e_core_post_load(E1000ECore *core)
>   {
> +    int i;
>       NetClientState *nc = qemu_get_queue(core->owner_nic);
>   
> +    /* back compat */
> +    core->mac[ITR] = core->itr_guest_value;
> +    for (i = 0; i < E1000E_MSIX_VEC_NUM; i++) {
> +        core->mac[EITR + i] = core->eitr_guest_value[i];
> +    }
> +
>       /*
>        * nc.link_down can't be migrated, so infer link_down according
>        * to link status bit in core.mac[STATUS].
> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
> index 39e3ce1c8fe..271c54380e9 100644
> --- a/hw/net/igb_core.c
> +++ b/hw/net/igb_core.c
> @@ -2880,7 +2880,7 @@ igb_mac_swsm_read(IGBCore *core, int index)
>   static uint32_t
>   igb_mac_eitr_read(IGBCore *core, int index)
>   {
> -    return core->eitr_guest_value[index - EITR0];
> +    return core->mac[index - EITR0];
>   }
>   
>   static uint32_t igb_mac_vfmailbox_read(IGBCore *core, int index)
> @@ -3046,8 +3046,7 @@ igb_set_eitr(IGBCore *core, int index, uint32_t val)
>   
>       trace_igb_irq_eitr_set(eitr_num, val);
>   
> -    core->eitr_guest_value[eitr_num] = val & ~E1000_EITR_CNT_IGNR;
> -    core->mac[index] = val & 0x7FFE;
> +    core->mac[index] = val;

This change will keep the CNT_INGR although the specification says it 
"is always read as zero".

>   }
>   
>   static void
> @@ -4527,13 +4526,24 @@ void igb_core_pre_save(IGBCore *core)
>               core->tx[i].skip_cp = true;
>           }
>       }
> +
> +    /* back compat, QEMU moves EITR in eitr_guest_value state */
> +    for (i = 0; i < IGB_INTR_NUM; i++) {
> +        core->eitr_guest_value[i] = core->mac[EITR0 + i];
> +    }
>   }
>   
>   int
>   igb_core_post_load(IGBCore *core)
>   {
> +    int i;
>       NetClientState *nc = qemu_get_queue(core->owner_nic);
>   
> +    /* back compat */
> +    for (i = 0; i < IGB_INTR_NUM; i++) {
> +        core->mac[EITR0 + i] = core->eitr_guest_value[i];
> +    }
> +
>       /*
>        * nc.link_down can't be migrated, so infer link_down according
>        * to link status bit in core.mac[STATUS].



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 10/12] net/e1000e|igb: Only send delayed msix interrupts that have a cause
  2025-05-02  3:17 ` [PATCH v3 10/12] net/e1000e|igb: Only send delayed msix interrupts that have a cause Nicholas Piggin
@ 2025-05-05  5:51   ` Akihiko Odaki
  2025-05-05  6:48     ` Nicholas Piggin
  0 siblings, 1 reply; 24+ messages in thread
From: Akihiko Odaki @ 2025-05-05  5:51 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: qemu-devel, Dmitry Fleytman, Jason Wang, Sriram Yagnaraman,
	Fabiano Rosas, Laurent Vivier, Paolo Bonzini

On 2025/05/02 12:17, Nicholas Piggin wrote:
> The msix interrupt throttling timer expiry sends an interrupt even if
> there is no unmasked interrupt causes. This can be observed by seeing
> two interrupts in response to a single event when throttling is active.
> 
> The e1000e non-msix paths seem to get this right by masking and testing
> ICR and IMS. Add similar checks for the msix cases in e1000e and igb.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   hw/net/e1000e_core.c | 10 ++++++----
>   hw/net/igb_core.c    | 11 ++++++++---
>   2 files changed, 14 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
> index 022884a2ea0..d53f70065ef 100644
> --- a/hw/net/e1000e_core.c
> +++ b/hw/net/e1000e_core.c
> @@ -230,10 +230,12 @@ e1000e_intrmgr_on_msix_throttling_timer(void *opaque)
>   
>       timer->running = false;
>   
> -    causes = find_msix_causes(core, idx);
> -    trace_e1000e_irq_msix_notify_postponed_vec(idx);
> -    msix_notify(core->owner, idx);
> -    e1000e_msix_auto_clear_mask(core, causes);
> +    causes = find_msix_causes(core, idx) & core->mac[IMS] & core->mac[ICR];
> +    if (causes) {
> +        trace_e1000e_irq_msix_notify_postponed_vec(idx);
> +        msix_notify(core->owner, causes);

I think this should be: msix_notify(core->owner, idx);

> +        e1000e_msix_auto_clear_mask(core, causes);
> +    }
>   }
>   
>   static void
> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
> index 3ae3e53530b..035637f81f8 100644
> --- a/hw/net/igb_core.c
> +++ b/hw/net/igb_core.c
> @@ -171,12 +171,17 @@ static void
>   igb_intrmgr_on_msix_throttling_timer(void *opaque)
>   {
>       IGBIntrDelayTimer *timer = opaque;
> -    int idx = timer - &timer->core->eitr[0];
> +    IGBCore *core = timer->core;
> +    int vector = timer - &core->eitr[0];
> +    uint32_t causes;
>   
>       timer->running = false;
>   
> -    trace_e1000e_irq_msix_notify_postponed_vec(idx);
> -    igb_msix_notify(timer->core, idx);
> +    causes = core->mac[EICR] & core->mac[EIMS];
> +    if (causes & BIT(vector)) {
> +        trace_e1000e_irq_msix_notify_postponed_vec(vector);
> +        igb_msix_notify(core, vector);
> +    }
>   }
>   
>   static void



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 11/12] net/e1000e|igb: Fix interrupt throttling rearming
  2025-05-02  3:17 ` [PATCH v3 11/12] net/e1000e|igb: Fix interrupt throttling rearming Nicholas Piggin
@ 2025-05-05  6:03   ` Akihiko Odaki
  2025-05-05  6:49     ` Nicholas Piggin
  0 siblings, 1 reply; 24+ messages in thread
From: Akihiko Odaki @ 2025-05-05  6:03 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: qemu-devel, Dmitry Fleytman, Jason Wang, Sriram Yagnaraman,
	Fabiano Rosas, Laurent Vivier, Paolo Bonzini

On 2025/05/02 12:17, Nicholas Piggin wrote:
> Timer expiry that results in an interrupt does not rearm the timer so
> an interrupt can appear immediately after the interrupt generated by
> timer expiry.
> 
> Fix this by rearming the throttle timer when a delayed interrupt is
> processed. e1000e gets this by reusing the e1000e_msix_notify()
> logic, igb calls igb_intrmgr_rearm_timer() directly.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   hw/net/e1000e_core.c |  5 ++--
>   hw/net/igb_core.c    | 55 ++++++++++++++++++++++++++------------------
>   2 files changed, 35 insertions(+), 25 deletions(-)
> 
> diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
> index d53f70065ef..2932122c04b 100644
> --- a/hw/net/e1000e_core.c
> +++ b/hw/net/e1000e_core.c
> @@ -218,7 +218,7 @@ static uint32_t find_msix_causes(E1000ECore *core, int vec)
>   }
>   
>   static void
> -e1000e_msix_auto_clear_mask(E1000ECore *core, uint32_t cause);
> +e1000e_msix_notify(E1000ECore *core, uint32_t causes);
>   
>   static void
>   e1000e_intrmgr_on_msix_throttling_timer(void *opaque)
> @@ -233,8 +233,7 @@ e1000e_intrmgr_on_msix_throttling_timer(void *opaque)
>       causes = find_msix_causes(core, idx) & core->mac[IMS] & core->mac[ICR];
>       if (causes) {
>           trace_e1000e_irq_msix_notify_postponed_vec(idx);
> -        msix_notify(core->owner, causes);
> -        e1000e_msix_auto_clear_mask(core, causes);
> +        e1000e_msix_notify(core, causes);
>       }
>   }
>   
> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
> index 035637f81f8..cc25a1d5baa 100644
> --- a/hw/net/igb_core.c
> +++ b/hw/net/igb_core.c
> @@ -152,11 +152,14 @@ igb_intrmgr_arm_timer(IGBIntrDelayTimer *timer, int64_t delay_ns)
>   static inline void
>   igb_intrmgr_rearm_timer(IGBIntrDelayTimer *timer)
>   {
> -    uint32_t interval = (timer->core->mac[timer->delay_reg] &
> -                         E1000_EITR_INTERVAL) >> 2;
> -    int64_t delay_ns = (int64_t)interval * timer->delay_resolution_ns;
> +    uint32_t eitr = timer->core->mac[timer->delay_reg];
>   
> -    igb_intrmgr_arm_timer(timer, delay_ns);
> +    if (eitr != 0) {
> +        uint32_t interval = (eitr & E1000_EITR_INTERVAL) >> 2;
> +        int64_t delay_ns = (int64_t)interval * timer->delay_resolution_ns;
> +
> +        igb_intrmgr_arm_timer(timer, delay_ns);
> +    }
>   }
>   
>   static void
> @@ -168,21 +171,7 @@ igb_intmgr_timer_resume(IGBIntrDelayTimer *timer)
>   }
>   
>   static void
> -igb_intrmgr_on_msix_throttling_timer(void *opaque)
> -{
> -    IGBIntrDelayTimer *timer = opaque;
> -    IGBCore *core = timer->core;
> -    int vector = timer - &core->eitr[0];
> -    uint32_t causes;
> -
> -    timer->running = false;
> -
> -    causes = core->mac[EICR] & core->mac[EIMS];
> -    if (causes & BIT(vector)) {
> -        trace_e1000e_irq_msix_notify_postponed_vec(vector);
> -        igb_msix_notify(core, vector);
> -    }
> -}
> +igb_intrmgr_on_msix_throttling_timer(void *opaque);
>   
>   static void
>   igb_intrmgr_initialize_all_timers(IGBCore *core, bool create)
> @@ -2258,9 +2247,7 @@ igb_postpone_interrupt(IGBIntrDelayTimer *timer)
>           return true;
>       }
>   
> -    if (timer->core->mac[timer->delay_reg] != 0) {
> -        igb_intrmgr_rearm_timer(timer);
> -    }
> +    igb_intrmgr_rearm_timer(timer);
>   
>       return false;
>   }
> @@ -2284,6 +2271,30 @@ static void igb_send_msix(IGBCore *core, uint32_t causes)
>       }
>   }
>   
> +static void
> +igb_intrmgr_on_msix_throttling_timer(void *opaque)
> +{
> +    IGBIntrDelayTimer *timer = opaque;
> +    IGBCore *core = timer->core;
> +    int vector = timer - &core->eitr[0];
> +    uint32_t causes;
> +
> +    timer->running = false;
> +
> +    causes = core->mac[EICR] & core->mac[EIMS];
> +    if (causes & BIT(vector)) {
> +        /*
> +         * The moderation counter is loaded with interval value whenever the
> +         * interrupt is signaled. This includes when the interrupt is signaled
> +         * by the counter reaching 0.
> +         */
> +        igb_intrmgr_rearm_timer(timer);
> +
> +        trace_e1000e_irq_msix_notify_postponed_vec(vector);
> +        igb_msix_notify(core, vector);
> +    }
> +}
> +

I wonder why the definition is moved. This patch adds a 
igb_intrmgr_rearm_timer() call but it's already placed earlier than this 
function.

>   static inline void
>   igb_fix_icr_asserted(IGBCore *core)
>   {



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 02/12] net/e1000e: Permit disabling interrupt throttling
  2025-05-05  5:41   ` Akihiko Odaki
@ 2025-05-05  6:36     ` Nicholas Piggin
  0 siblings, 0 replies; 24+ messages in thread
From: Nicholas Piggin @ 2025-05-05  6:36 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: qemu-devel, Dmitry Fleytman, Jason Wang, Sriram Yagnaraman,
	Fabiano Rosas, Laurent Vivier, Paolo Bonzini

On Mon May 5, 2025 at 3:41 PM AEST, Akihiko Odaki wrote:
> On 2025/05/02 12:16, Nicholas Piggin wrote:
>> The spec explicitly permits xITR register interval field to have a value
>> of zero to disable throttling. The e1000e model already allows for this
>> in the throttling logic, so remove the minimum value for the register.
>> 
>> The spec appears to say there is a maximum observable interrupt rate
>> when throttling is enabled, regardless of ITR value, so throttle timer
>> calculation is clamped to that minimum value.
>> 
>> EITR registers default to 0, as specified in spec 7.4.4.
>> 
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>>   hw/net/e1000e_core.c | 25 +++++++++++++++++--------
>>   1 file changed, 17 insertions(+), 8 deletions(-)
>> 
>> diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
>> index 24138587905..96f74f1ea14 100644
>> --- a/hw/net/e1000e_core.c
>> +++ b/hw/net/e1000e_core.c
>> @@ -51,8 +51,17 @@
>>   
>>   #include "trace.h"
>>   
>> -/* No more then 7813 interrupts per second according to spec 10.2.4.2 */
>> -#define E1000E_MIN_XITR     (500)
>> +/*
>> + * A suggested range for ITR is 651-5580, according to spec 10.2.4.2, but
>> + * QEMU has traditionally set 500 here.
>> + */
>> +#define E1000E_DEFAULT_ITR (500)
>
> The cover letter says this version changes "initial ITR as well as EITR" 
> but the ITR value is unchanged here. Forgot to commit the change?

Hmm yes I must have, thanks good catch.

Thanks,
Nick


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 03/12] hw/net/e1000e|igb: Remove xitr_guest_value logic
  2025-05-05  5:45   ` Akihiko Odaki
@ 2025-05-05  6:38     ` Nicholas Piggin
  0 siblings, 0 replies; 24+ messages in thread
From: Nicholas Piggin @ 2025-05-05  6:38 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: qemu-devel, Dmitry Fleytman, Jason Wang, Sriram Yagnaraman,
	Fabiano Rosas, Laurent Vivier, Paolo Bonzini

On Mon May 5, 2025 at 3:45 PM AEST, Akihiko Odaki wrote:
> On 2025/05/02 12:16, Nicholas Piggin wrote:
>> The guest value xITR logic is not required now that the write functions
>> store necessary data to be read back, and internal users mask and shift
>> fields they need as they go.
>> 
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>>   hw/net/e1000e_core.c | 31 +++++++++++++++----------------
>>   hw/net/igb_core.c    | 16 +++++++++++++---
>>   2 files changed, 28 insertions(+), 19 deletions(-)
>> 
>> diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
>> index 96f74f1ea14..f8e6522f810 100644
>> --- a/hw/net/e1000e_core.c
>> +++ b/hw/net/e1000e_core.c
>> @@ -2563,18 +2563,6 @@ e1000e_mac_swsm_read(E1000ECore *core, int index)
>>       return val;
>>   }
>>   
>> -static uint32_t
>> -e1000e_mac_itr_read(E1000ECore *core, int index)
>> -{
>> -    return core->itr_guest_value;
>> -}
>> -
>> -static uint32_t
>> -e1000e_mac_eitr_read(E1000ECore *core, int index)
>> -{
>> -    return core->eitr_guest_value[index - EITR];
>> -}
>> -
>>   static uint32_t
>>   e1000e_mac_icr_read(E1000ECore *core, int index)
>>   {
>> @@ -2792,7 +2780,6 @@ e1000e_set_itr(E1000ECore *core, int index, uint32_t val)
>>   
>>       trace_e1000e_irq_itr_set(val);
>>   
>> -    core->itr_guest_value = interval;
>>       core->mac[index] = interval;
>>   }
>>   
>> @@ -2804,7 +2791,6 @@ e1000e_set_eitr(E1000ECore *core, int index, uint32_t val)
>>   
>>       trace_e1000e_irq_eitr_set(eitr_num, val);
>>   
>> -    core->eitr_guest_value[eitr_num] = interval;
>>       core->mac[index] = interval;
>>   }
>>   
>> @@ -3029,6 +3015,7 @@ static const readops e1000e_macreg_readops[] = {
>>       e1000e_getreg(GSCN_1),
>>       e1000e_getreg(FCAL),
>>       e1000e_getreg(FLSWCNT),
>> +    e1000e_getreg(ITR),
>>   
>>       [TOTH]    = e1000e_mac_read_clr8,
>>       [GOTCH]   = e1000e_mac_read_clr8,
>> @@ -3062,7 +3049,6 @@ static const readops e1000e_macreg_readops[] = {
>>       [MPRC]    = e1000e_mac_read_clr4,
>>       [BPTC]    = e1000e_mac_read_clr4,
>>       [TSCTC]   = e1000e_mac_read_clr4,
>> -    [ITR]     = e1000e_mac_itr_read,
>>       [CTRL]    = e1000e_get_ctrl,
>>       [TARC1]   = e1000e_get_tarc,
>>       [SWSM]    = e1000e_mac_swsm_read,
>> @@ -3087,7 +3073,7 @@ static const readops e1000e_macreg_readops[] = {
>>       [RETA ... RETA + 31]   = e1000e_mac_readreg,
>>       [RSSRK ... RSSRK + 31] = e1000e_mac_readreg,
>>       [MAVTV0 ... MAVTV3]    = e1000e_mac_readreg,
>> -    [EITR...EITR + E1000E_MSIX_VEC_NUM - 1] = e1000e_mac_eitr_read
>> +    [EITR...EITR + E1000E_MSIX_VEC_NUM - 1] = e1000e_mac_readreg,
>>   };
>>   enum { E1000E_NREADOPS = ARRAY_SIZE(e1000e_macreg_readops) };
>>   
>> @@ -3517,13 +3503,26 @@ void e1000e_core_pre_save(E1000ECore *core)
>>               core->tx[i].skip_cp = true;
>>           }
>>       }
>> +
>> +    /* back compat, QEMU moves xITR in itr_guest_value state */
>> +    core->itr_guest_value = core->mac[ITR];
>> +    for (i = 0; i < E1000E_MSIX_VEC_NUM; i++) {
>> +        core->eitr_guest_value[i] = core->mac[EITR + i];
>> +    }
>>   }
>>   
>>   int
>>   e1000e_core_post_load(E1000ECore *core)
>>   {
>> +    int i;
>>       NetClientState *nc = qemu_get_queue(core->owner_nic);
>>   
>> +    /* back compat */
>> +    core->mac[ITR] = core->itr_guest_value;
>> +    for (i = 0; i < E1000E_MSIX_VEC_NUM; i++) {
>> +        core->mac[EITR + i] = core->eitr_guest_value[i];
>> +    }
>> +
>>       /*
>>        * nc.link_down can't be migrated, so infer link_down according
>>        * to link status bit in core.mac[STATUS].
>> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
>> index 39e3ce1c8fe..271c54380e9 100644
>> --- a/hw/net/igb_core.c
>> +++ b/hw/net/igb_core.c
>> @@ -2880,7 +2880,7 @@ igb_mac_swsm_read(IGBCore *core, int index)
>>   static uint32_t
>>   igb_mac_eitr_read(IGBCore *core, int index)
>>   {
>> -    return core->eitr_guest_value[index - EITR0];
>> +    return core->mac[index - EITR0];
>>   }
>>   
>>   static uint32_t igb_mac_vfmailbox_read(IGBCore *core, int index)
>> @@ -3046,8 +3046,7 @@ igb_set_eitr(IGBCore *core, int index, uint32_t val)
>>   
>>       trace_igb_irq_eitr_set(eitr_num, val);
>>   
>> -    core->eitr_guest_value[eitr_num] = val & ~E1000_EITR_CNT_IGNR;
>> -    core->mac[index] = val & 0x7FFE;
>> +    core->mac[index] = val;
>
> This change will keep the CNT_INGR although the specification says it 
> "is always read as zero".

I will fix.

Thanks,
Nick


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 10/12] net/e1000e|igb: Only send delayed msix interrupts that have a cause
  2025-05-05  5:51   ` Akihiko Odaki
@ 2025-05-05  6:48     ` Nicholas Piggin
  0 siblings, 0 replies; 24+ messages in thread
From: Nicholas Piggin @ 2025-05-05  6:48 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: qemu-devel, Dmitry Fleytman, Jason Wang, Sriram Yagnaraman,
	Fabiano Rosas, Laurent Vivier, Paolo Bonzini

On Mon May 5, 2025 at 3:51 PM AEST, Akihiko Odaki wrote:
> On 2025/05/02 12:17, Nicholas Piggin wrote:
>> The msix interrupt throttling timer expiry sends an interrupt even if
>> there is no unmasked interrupt causes. This can be observed by seeing
>> two interrupts in response to a single event when throttling is active.
>> 
>> The e1000e non-msix paths seem to get this right by masking and testing
>> ICR and IMS. Add similar checks for the msix cases in e1000e and igb.
>> 
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>>   hw/net/e1000e_core.c | 10 ++++++----
>>   hw/net/igb_core.c    | 11 ++++++++---
>>   2 files changed, 14 insertions(+), 7 deletions(-)
>> 
>> diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
>> index 022884a2ea0..d53f70065ef 100644
>> --- a/hw/net/e1000e_core.c
>> +++ b/hw/net/e1000e_core.c
>> @@ -230,10 +230,12 @@ e1000e_intrmgr_on_msix_throttling_timer(void *opaque)
>>   
>>       timer->running = false;
>>   
>> -    causes = find_msix_causes(core, idx);
>> -    trace_e1000e_irq_msix_notify_postponed_vec(idx);
>> -    msix_notify(core->owner, idx);
>> -    e1000e_msix_auto_clear_mask(core, causes);
>> +    causes = find_msix_causes(core, idx) & core->mac[IMS] & core->mac[ICR];
>> +    if (causes) {
>> +        trace_e1000e_irq_msix_notify_postponed_vec(idx);
>> +        msix_notify(core->owner, causes);
>
> I think this should be: msix_notify(core->owner, idx);

Yes good catch, it was fixed in the next patch but it's a but from
splitting. Thank you.

Thanks,
Nick


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 11/12] net/e1000e|igb: Fix interrupt throttling rearming
  2025-05-05  6:03   ` Akihiko Odaki
@ 2025-05-05  6:49     ` Nicholas Piggin
  0 siblings, 0 replies; 24+ messages in thread
From: Nicholas Piggin @ 2025-05-05  6:49 UTC (permalink / raw)
  To: Akihiko Odaki
  Cc: qemu-devel, Dmitry Fleytman, Jason Wang, Sriram Yagnaraman,
	Fabiano Rosas, Laurent Vivier, Paolo Bonzini

On Mon May 5, 2025 at 4:03 PM AEST, Akihiko Odaki wrote:
> On 2025/05/02 12:17, Nicholas Piggin wrote:
>> Timer expiry that results in an interrupt does not rearm the timer so
>> an interrupt can appear immediately after the interrupt generated by
>> timer expiry.
>> 
>> Fix this by rearming the throttle timer when a delayed interrupt is
>> processed. e1000e gets this by reusing the e1000e_msix_notify()
>> logic, igb calls igb_intrmgr_rearm_timer() directly.
>> 
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>>   hw/net/e1000e_core.c |  5 ++--
>>   hw/net/igb_core.c    | 55 ++++++++++++++++++++++++++------------------
>>   2 files changed, 35 insertions(+), 25 deletions(-)
>> 
>> diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
>> index d53f70065ef..2932122c04b 100644
>> --- a/hw/net/e1000e_core.c
>> +++ b/hw/net/e1000e_core.c
>> @@ -218,7 +218,7 @@ static uint32_t find_msix_causes(E1000ECore *core, int vec)
>>   }
>>   
>>   static void
>> -e1000e_msix_auto_clear_mask(E1000ECore *core, uint32_t cause);
>> +e1000e_msix_notify(E1000ECore *core, uint32_t causes);
>>   
>>   static void
>>   e1000e_intrmgr_on_msix_throttling_timer(void *opaque)
>> @@ -233,8 +233,7 @@ e1000e_intrmgr_on_msix_throttling_timer(void *opaque)
>>       causes = find_msix_causes(core, idx) & core->mac[IMS] & core->mac[ICR];
>>       if (causes) {
>>           trace_e1000e_irq_msix_notify_postponed_vec(idx);
>> -        msix_notify(core->owner, causes);
>> -        e1000e_msix_auto_clear_mask(core, causes);
>> +        e1000e_msix_notify(core, causes);
>>       }
>>   }
>>   
>> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
>> index 035637f81f8..cc25a1d5baa 100644
>> --- a/hw/net/igb_core.c
>> +++ b/hw/net/igb_core.c
>> @@ -152,11 +152,14 @@ igb_intrmgr_arm_timer(IGBIntrDelayTimer *timer, int64_t delay_ns)
>>   static inline void
>>   igb_intrmgr_rearm_timer(IGBIntrDelayTimer *timer)
>>   {
>> -    uint32_t interval = (timer->core->mac[timer->delay_reg] &
>> -                         E1000_EITR_INTERVAL) >> 2;
>> -    int64_t delay_ns = (int64_t)interval * timer->delay_resolution_ns;
>> +    uint32_t eitr = timer->core->mac[timer->delay_reg];
>>   
>> -    igb_intrmgr_arm_timer(timer, delay_ns);
>> +    if (eitr != 0) {
>> +        uint32_t interval = (eitr & E1000_EITR_INTERVAL) >> 2;
>> +        int64_t delay_ns = (int64_t)interval * timer->delay_resolution_ns;
>> +
>> +        igb_intrmgr_arm_timer(timer, delay_ns);
>> +    }
>>   }
>>   
>>   static void
>> @@ -168,21 +171,7 @@ igb_intmgr_timer_resume(IGBIntrDelayTimer *timer)
>>   }
>>   
>>   static void
>> -igb_intrmgr_on_msix_throttling_timer(void *opaque)
>> -{
>> -    IGBIntrDelayTimer *timer = opaque;
>> -    IGBCore *core = timer->core;
>> -    int vector = timer - &core->eitr[0];
>> -    uint32_t causes;
>> -
>> -    timer->running = false;
>> -
>> -    causes = core->mac[EICR] & core->mac[EIMS];
>> -    if (causes & BIT(vector)) {
>> -        trace_e1000e_irq_msix_notify_postponed_vec(vector);
>> -        igb_msix_notify(core, vector);
>> -    }
>> -}
>> +igb_intrmgr_on_msix_throttling_timer(void *opaque);
>>   
>>   static void
>>   igb_intrmgr_initialize_all_timers(IGBCore *core, bool create)
>> @@ -2258,9 +2247,7 @@ igb_postpone_interrupt(IGBIntrDelayTimer *timer)
>>           return true;
>>       }
>>   
>> -    if (timer->core->mac[timer->delay_reg] != 0) {
>> -        igb_intrmgr_rearm_timer(timer);
>> -    }
>> +    igb_intrmgr_rearm_timer(timer);
>>   
>>       return false;
>>   }
>> @@ -2284,6 +2271,30 @@ static void igb_send_msix(IGBCore *core, uint32_t causes)
>>       }
>>   }
>>   
>> +static void
>> +igb_intrmgr_on_msix_throttling_timer(void *opaque)
>> +{
>> +    IGBIntrDelayTimer *timer = opaque;
>> +    IGBCore *core = timer->core;
>> +    int vector = timer - &core->eitr[0];
>> +    uint32_t causes;
>> +
>> +    timer->running = false;
>> +
>> +    causes = core->mac[EICR] & core->mac[EIMS];
>> +    if (causes & BIT(vector)) {
>> +        /*
>> +         * The moderation counter is loaded with interval value whenever the
>> +         * interrupt is signaled. This includes when the interrupt is signaled
>> +         * by the counter reaching 0.
>> +         */
>> +        igb_intrmgr_rearm_timer(timer);
>> +
>> +        trace_e1000e_irq_msix_notify_postponed_vec(vector);
>> +        igb_msix_notify(core, vector);
>> +    }
>> +}
>> +
>
> I wonder why the definition is moved. This patch adds a 
> igb_intrmgr_rearm_timer() call but it's already placed earlier than this 
> function.

Hmm, I did require moving it at one point, but I must have reworked it
enough to avoid it. That makes the patch smaller and nicer. Another
good catch.

Thanks,
Nick


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 01/12] qtest/e1000e|igb: Clear interrupt-cause and msix pending bits after irq
  2025-05-02  3:16 ` [PATCH v3 01/12] qtest/e1000e|igb: Clear interrupt-cause and msix pending bits after irq Nicholas Piggin
@ 2025-05-19 15:06   ` Fabiano Rosas
  0 siblings, 0 replies; 24+ messages in thread
From: Fabiano Rosas @ 2025-05-19 15:06 UTC (permalink / raw)
  To: Nicholas Piggin, Akihiko Odaki
  Cc: Nicholas Piggin, qemu-devel, Dmitry Fleytman, Jason Wang,
	Sriram Yagnaraman, Laurent Vivier, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum

Nicholas Piggin <npiggin@gmail.com> writes:

> The e1000e and igb tests do not clear the ICR/EICR cause bits (or
> set auto-clear) on seeing queue interrupts, which inhibits the
> triggering of a new interrupt. The msix pending bit which is used
> to test for the interrupt is also not cleared (the vector is masked).
>
> Fix this by clearing the ICR/EICR cause bits, and the msix pending
> bit using the PBACLR device register.
>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
> Cc: Dmitry Fleytman <dmitry.fleytman@gmail.com>
> Cc: Akihiko Odaki <akihiko.odaki@daynix.com>
> Cc: Sriram Yagnaraman <sriram.yagnaraman@ericsson.com>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Reviewed-by: Fabiano Rosas <farosas@suse.de>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 04/12] qtest/e1000e|igb: assert irqs are clear before triggering an irq
  2025-05-02  3:16 ` [PATCH v3 04/12] qtest/e1000e|igb: assert irqs are clear before triggering an irq Nicholas Piggin
@ 2025-05-19 15:07   ` Fabiano Rosas
  0 siblings, 0 replies; 24+ messages in thread
From: Fabiano Rosas @ 2025-05-19 15:07 UTC (permalink / raw)
  To: Nicholas Piggin, Akihiko Odaki
  Cc: Nicholas Piggin, qemu-devel, Dmitry Fleytman, Jason Wang,
	Sriram Yagnaraman, Laurent Vivier, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum

Nicholas Piggin <npiggin@gmail.com> writes:

> Assert there is no existing irq raised that would lead to a false
> positive interrupt test.
>
> e1000e has to disable interrupt throttling for this test, because
> it can cause delayed superfluous interrupts which trip the assertions.
>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
> Cc: Dmitry Fleytman <dmitry.fleytman@gmail.com>
> Cc: Akihiko Odaki <akihiko.odaki@daynix.com>
> Cc: Sriram Yagnaraman <sriram.yagnaraman@ericsson.com>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Reviewed-by: Fabiano Rosas <farosas@suse.de>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 12/12] qtest/e1000e|igb: Test interrupt throttling in multiple_transfers test
  2025-05-02  3:17 ` [PATCH v3 12/12] qtest/e1000e|igb: Test interrupt throttling in multiple_transfers test Nicholas Piggin
@ 2025-05-19 15:08   ` Fabiano Rosas
  0 siblings, 0 replies; 24+ messages in thread
From: Fabiano Rosas @ 2025-05-19 15:08 UTC (permalink / raw)
  To: Nicholas Piggin, Akihiko Odaki
  Cc: Nicholas Piggin, qemu-devel, Dmitry Fleytman, Jason Wang,
	Sriram Yagnaraman, Laurent Vivier, Paolo Bonzini

Nicholas Piggin <npiggin@gmail.com> writes:

> Enable interrupt throtling on one of the two queue interrupts used
> in the multiple_transfers test, to improve coverage. The number of
> interrupts for the e1000e test is reduced because it has a long minimum
> throttling delay so without reducing iterations throttling adds about
> 40s to the test runtime.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Reviewed-by: Fabiano Rosas <farosas@suse.de>


^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2025-05-19 15:08 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-02  3:16 [PATCH v3 00/12] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
2025-05-02  3:16 ` [PATCH v3 01/12] qtest/e1000e|igb: Clear interrupt-cause and msix pending bits after irq Nicholas Piggin
2025-05-19 15:06   ` Fabiano Rosas
2025-05-02  3:16 ` [PATCH v3 02/12] net/e1000e: Permit disabling interrupt throttling Nicholas Piggin
2025-05-05  5:41   ` Akihiko Odaki
2025-05-05  6:36     ` Nicholas Piggin
2025-05-02  3:16 ` [PATCH v3 03/12] hw/net/e1000e|igb: Remove xitr_guest_value logic Nicholas Piggin
2025-05-05  5:45   ` Akihiko Odaki
2025-05-05  6:38     ` Nicholas Piggin
2025-05-02  3:16 ` [PATCH v3 04/12] qtest/e1000e|igb: assert irqs are clear before triggering an irq Nicholas Piggin
2025-05-19 15:07   ` Fabiano Rosas
2025-05-02  3:16 ` [PATCH v3 05/12] net/igb: Fix interrupt throttling interval calculation Nicholas Piggin
2025-05-02  3:16 ` [PATCH v3 06/12] net/igb: Implement EITR Moderation Counter Nicholas Piggin
2025-05-02  3:16 ` [PATCH v3 07/12] igb: Add a note about re-loading timers breaking deterministic replay Nicholas Piggin
2025-05-02  3:17 ` [PATCH v3 08/12] hw/net/e1000e: Postponed msix interrupt processing should auto-clear cause Nicholas Piggin
2025-05-02  3:17 ` [PATCH v3 09/12] hw/net/e1000e: Do not auto-clear cause on postponed msix interrupt Nicholas Piggin
2025-05-02  3:17 ` [PATCH v3 10/12] net/e1000e|igb: Only send delayed msix interrupts that have a cause Nicholas Piggin
2025-05-05  5:51   ` Akihiko Odaki
2025-05-05  6:48     ` Nicholas Piggin
2025-05-02  3:17 ` [PATCH v3 11/12] net/e1000e|igb: Fix interrupt throttling rearming Nicholas Piggin
2025-05-05  6:03   ` Akihiko Odaki
2025-05-05  6:49     ` Nicholas Piggin
2025-05-02  3:17 ` [PATCH v3 12/12] qtest/e1000e|igb: Test interrupt throttling in multiple_transfers test Nicholas Piggin
2025-05-19 15:08   ` Fabiano Rosas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).