* [PATCH v2 0/8] hw/e1000e|igb: interrupts and qtests fixes
@ 2025-04-11 4:31 Nicholas Piggin
2025-04-11 4:31 ` [PATCH v2 1/8] qtest/e1000e|igb: Clear interrupt-cause and msix pending bits after irq Nicholas Piggin
` (7 more replies)
0 siblings, 8 replies; 14+ messages in thread
From: Nicholas Piggin @ 2025-04-11 4:31 UTC (permalink / raw)
To: Akihiko Odaki
Cc: Nicholas Piggin, qemu-devel, Dmitry Fleytman, Jason Wang,
Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini
Changes since v1:
https://lore.kernel.org/qemu-devel/20250115150112.346497-1-npiggin@gmail.com/
- Made changes as suggested by Odaki san in v1. Renamed some
functions and changed some comments, added a Fixes: tag.
- Bug fix in "net/e1000e|igb: Fix interrupt throttling logic"
patch to notify only causes that were not masked.
- Squashed patch 8 into patch 2 and improved changelog and comments,
retained the 7813 interrupts/sec limit for e1000e mitigation.
- Reordered patches in the series.
- Improved the changelog for "net/e1000e|igb: Fix interrupt throttling
logic" to be clearer about the problems and fixes.
- In that patch, made the delayed irq timer rearming path a bit clearer.
- Reduced test iterations to avoid increasing qtest time too much
for e1000e interrupt throttling test.
Thanks,
Nick
Nicholas Piggin (8):
qtest/e1000e|igb: Clear interrupt-cause and msix pending bits after
irq
net/e1000e: Permit disabling interrupt throttling
hw/net/e1000e|igb: Remove xitr_guest_value logic
qtest/e1000e|igb: assert irqs are clear before triggering an irq
net/igb: Fix interrupt throttling interval calculation
net/igb: Implement EITR Moderation Counter
net/e1000e|igb: Fix interrupt throttling logic
qtest/e1000e|igb: Test interrupt throttling in multiple_transfers test
hw/net/igb_regs.h | 11 +++-
tests/qtest/libqos/e1000e.h | 1 +
hw/net/e1000e_core.c | 115 ++++++++++++++++++++++++++----------
hw/net/igb_core.c | 109 +++++++++++++++++++++++++++-------
tests/qtest/e1000e-test.c | 21 ++++++-
tests/qtest/igb-test.c | 18 ++++++
tests/qtest/libqos/e1000e.c | 9 ++-
7 files changed, 227 insertions(+), 57 deletions(-)
--
2.47.1
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v2 1/8] qtest/e1000e|igb: Clear interrupt-cause and msix pending bits after irq
2025-04-11 4:31 [PATCH v2 0/8] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
@ 2025-04-11 4:31 ` Nicholas Piggin
2025-04-11 4:31 ` [PATCH v2 2/8] net/e1000e: Permit disabling interrupt throttling Nicholas Piggin
` (6 subsequent siblings)
7 siblings, 0 replies; 14+ messages in thread
From: Nicholas Piggin @ 2025-04-11 4:31 UTC (permalink / raw)
To: Akihiko Odaki
Cc: Nicholas Piggin, qemu-devel, Dmitry Fleytman, Jason Wang,
Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Michael S. Tsirkin, Marcel Apfelbaum
The e1000e and igb tests do not clear the ICR/EICR cause bits (or
set auto-clear) on seeing queue interrupts, which inhibits the
triggering of a new interrupt. The msix pending bit which is used
to test for the interrupt is also not cleared (the vector is masked).
Fix this by clearing the ICR/EICR cause bits, and the msix pending
bit using the PBACLR device register.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Dmitry Fleytman <dmitry.fleytman@gmail.com>
Cc: Akihiko Odaki <akihiko.odaki@daynix.com>
Cc: Sriram Yagnaraman <sriram.yagnaraman@ericsson.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
tests/qtest/e1000e-test.c | 9 ++++++++-
tests/qtest/igb-test.c | 8 ++++++++
2 files changed, 16 insertions(+), 1 deletion(-)
diff --git a/tests/qtest/e1000e-test.c b/tests/qtest/e1000e-test.c
index de9738fdb74..746d26cfb67 100644
--- a/tests/qtest/e1000e-test.c
+++ b/tests/qtest/e1000e-test.c
@@ -66,6 +66,10 @@ static void e1000e_send_verify(QE1000E *d, int *test_sockets, QGuestAllocator *a
/* Wait for TX WB interrupt */
e1000e_wait_isr(d, E1000E_TX0_MSG_ID);
+ /* Read ICR to make it ready for next interrupt, assert TXQ0 cause */
+ g_assert(e1000e_macreg_read(d, E1000_ICR) & E1000_ICR_TXQ0);
+ /* Write PBACLR to clear the MSIX pending bit */
+ e1000e_macreg_write(d, E1000_PBACLR, (1 << E1000E_TX0_MSG_ID));
/* Check DD bit */
g_assert_cmphex(le32_to_cpu(descr.upper.data) & E1000_TXD_STAT_DD, ==,
@@ -117,7 +121,10 @@ static void e1000e_receive_verify(QE1000E *d, int *test_sockets, QGuestAllocator
/* Wait for TX WB interrupt */
e1000e_wait_isr(d, E1000E_RX0_MSG_ID);
-
+ /* Read ICR to make it ready for next interrupt, assert RXQ0 cause */
+ g_assert(e1000e_macreg_read(d, E1000_ICR) & E1000_ICR_RXQ0);
+ /* Write PBACLR to clear the MSIX pending bit */
+ e1000e_macreg_write(d, E1000_PBACLR, (1 << E1000E_RX0_MSG_ID));
/* Check DD bit */
g_assert_cmphex(le32_to_cpu(descr.wb.upper.status_error) &
E1000_RXD_STAT_DD, ==, E1000_RXD_STAT_DD);
diff --git a/tests/qtest/igb-test.c b/tests/qtest/igb-test.c
index 3d397ea6973..cf8b4131cf2 100644
--- a/tests/qtest/igb-test.c
+++ b/tests/qtest/igb-test.c
@@ -69,6 +69,10 @@ static void igb_send_verify(QE1000E *d, int *test_sockets, QGuestAllocator *allo
/* Wait for TX WB interrupt */
e1000e_wait_isr(d, E1000E_TX0_MSG_ID);
+ /* Read EICR which clears it ready for next interrupt, assert TXQ0 cause */
+ g_assert(e1000e_macreg_read(d, E1000_EICR) & (1 << E1000E_TX0_MSG_ID));
+ /* Write PBACLR to clear the MSIX pending bit */
+ e1000e_macreg_write(d, E1000_PBACLR, (1 << E1000E_TX0_MSG_ID));
/* Check DD bit */
g_assert_cmphex(le32_to_cpu(descr.wb.status) & E1000_TXD_STAT_DD, ==,
@@ -120,6 +124,10 @@ static void igb_receive_verify(QE1000E *d, int *test_sockets, QGuestAllocator *a
/* Wait for TX WB interrupt */
e1000e_wait_isr(d, E1000E_RX0_MSG_ID);
+ /* Read EICR which clears it ready for next interrupt, assert RXQ0 cause */
+ g_assert(e1000e_macreg_read(d, E1000_EICR) & (1 << E1000E_RX0_MSG_ID));
+ /* Write PBACLR to clear the MSIX pending bit */
+ e1000e_macreg_write(d, E1000_PBACLR, (1 << E1000E_RX0_MSG_ID));
/* Check DD bit */
g_assert_cmphex(le32_to_cpu(descr.wb.upper.status_error) &
--
2.47.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v2 2/8] net/e1000e: Permit disabling interrupt throttling
2025-04-11 4:31 [PATCH v2 0/8] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
2025-04-11 4:31 ` [PATCH v2 1/8] qtest/e1000e|igb: Clear interrupt-cause and msix pending bits after irq Nicholas Piggin
@ 2025-04-11 4:31 ` Nicholas Piggin
2025-04-19 7:15 ` Akihiko Odaki
2025-04-11 4:31 ` [PATCH v2 3/8] hw/net/e1000e|igb: Remove xitr_guest_value logic Nicholas Piggin
` (5 subsequent siblings)
7 siblings, 1 reply; 14+ messages in thread
From: Nicholas Piggin @ 2025-04-11 4:31 UTC (permalink / raw)
To: Akihiko Odaki
Cc: Nicholas Piggin, qemu-devel, Dmitry Fleytman, Jason Wang,
Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini
The spec explicitly permits xITR register interval field to have a value
of zero to disable throttling. The e1000e model already allows for this
in the throttling logic, so remove the minimum value for the register.
The spec appears to say there is a maximum observable interrupt rate
when throttling is enabled, regardless of ITR value, so throttle timer
calculation is clamped to that minimum value.
EITR registers default to 0, as specified in spec 7.4.4.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
hw/net/e1000e_core.c | 25 +++++++++++++++++--------
1 file changed, 17 insertions(+), 8 deletions(-)
diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 24138587905..96f74f1ea14 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -51,8 +51,17 @@
#include "trace.h"
-/* No more then 7813 interrupts per second according to spec 10.2.4.2 */
-#define E1000E_MIN_XITR (500)
+/*
+ * A suggested range for ITR is 651-5580, according to spec 10.2.4.2, but
+ * QEMU has traditionally set 500 here.
+ */
+#define E1000E_DEFAULT_ITR (500)
+
+/*
+ * spec 7.4.4 ITR rules says the maximum observable interrupt rate from the
+ * adapter should not exceed 7813/s (corresponding to 500).
+ */
+#define E1000E_EFFECTIVE_MIN_XITR (500)
#define E1000E_MAX_TX_FRAGS (64)
@@ -105,8 +114,9 @@ e1000e_lower_legacy_irq(E1000ECore *core)
static inline void
e1000e_intrmgr_rearm_timer(E1000IntrDelayTimer *timer)
{
- int64_t delay_ns = (int64_t) timer->core->mac[timer->delay_reg] *
- timer->delay_resolution_ns;
+ uint32_t delay = MAX(timer->core->mac[timer->delay_reg],
+ E1000E_EFFECTIVE_MIN_XITR);
+ int64_t delay_ns = (int64_t)delay * timer->delay_resolution_ns;
trace_e1000e_irq_rearm_timer(timer->delay_reg << 2, delay_ns);
@@ -2783,7 +2793,7 @@ e1000e_set_itr(E1000ECore *core, int index, uint32_t val)
trace_e1000e_irq_itr_set(val);
core->itr_guest_value = interval;
- core->mac[index] = MAX(interval, E1000E_MIN_XITR);
+ core->mac[index] = interval;
}
static void
@@ -2795,7 +2805,7 @@ e1000e_set_eitr(E1000ECore *core, int index, uint32_t val)
trace_e1000e_irq_eitr_set(eitr_num, val);
core->eitr_guest_value[eitr_num] = interval;
- core->mac[index] = MAX(interval, E1000E_MIN_XITR);
+ core->mac[index] = interval;
}
static void
@@ -3444,8 +3454,7 @@ static const uint32_t e1000e_mac_reg_init[] = {
[FACTPS] = E1000_FACTPS_LAN0_ON | 0x20000000,
[SWSM] = 1,
[RXCSUM] = E1000_RXCSUM_IPOFLD | E1000_RXCSUM_TUOFLD,
- [ITR] = E1000E_MIN_XITR,
- [EITR...EITR + E1000E_MSIX_VEC_NUM - 1] = E1000E_MIN_XITR,
+ [ITR] = E1000E_DEFAULT_ITR,
};
static void e1000e_reset(E1000ECore *core, bool sw)
--
2.47.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v2 3/8] hw/net/e1000e|igb: Remove xitr_guest_value logic
2025-04-11 4:31 [PATCH v2 0/8] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
2025-04-11 4:31 ` [PATCH v2 1/8] qtest/e1000e|igb: Clear interrupt-cause and msix pending bits after irq Nicholas Piggin
2025-04-11 4:31 ` [PATCH v2 2/8] net/e1000e: Permit disabling interrupt throttling Nicholas Piggin
@ 2025-04-11 4:31 ` Nicholas Piggin
2025-04-19 7:12 ` Akihiko Odaki
2025-04-11 4:31 ` [PATCH v2 4/8] qtest/e1000e|igb: assert irqs are clear before triggering an irq Nicholas Piggin
` (4 subsequent siblings)
7 siblings, 1 reply; 14+ messages in thread
From: Nicholas Piggin @ 2025-04-11 4:31 UTC (permalink / raw)
To: Akihiko Odaki
Cc: Nicholas Piggin, qemu-devel, Dmitry Fleytman, Jason Wang,
Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini
The guest value xITR logic is not required now that the write functions
store necessary data to be read back, and internal users mask and shift
fields they need as they go.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
hw/net/e1000e_core.c | 31 +++++++++++++++----------------
hw/net/igb_core.c | 14 ++++++++++++--
2 files changed, 27 insertions(+), 18 deletions(-)
diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 96f74f1ea14..f8e6522f810 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -2563,18 +2563,6 @@ e1000e_mac_swsm_read(E1000ECore *core, int index)
return val;
}
-static uint32_t
-e1000e_mac_itr_read(E1000ECore *core, int index)
-{
- return core->itr_guest_value;
-}
-
-static uint32_t
-e1000e_mac_eitr_read(E1000ECore *core, int index)
-{
- return core->eitr_guest_value[index - EITR];
-}
-
static uint32_t
e1000e_mac_icr_read(E1000ECore *core, int index)
{
@@ -2792,7 +2780,6 @@ e1000e_set_itr(E1000ECore *core, int index, uint32_t val)
trace_e1000e_irq_itr_set(val);
- core->itr_guest_value = interval;
core->mac[index] = interval;
}
@@ -2804,7 +2791,6 @@ e1000e_set_eitr(E1000ECore *core, int index, uint32_t val)
trace_e1000e_irq_eitr_set(eitr_num, val);
- core->eitr_guest_value[eitr_num] = interval;
core->mac[index] = interval;
}
@@ -3029,6 +3015,7 @@ static const readops e1000e_macreg_readops[] = {
e1000e_getreg(GSCN_1),
e1000e_getreg(FCAL),
e1000e_getreg(FLSWCNT),
+ e1000e_getreg(ITR),
[TOTH] = e1000e_mac_read_clr8,
[GOTCH] = e1000e_mac_read_clr8,
@@ -3062,7 +3049,6 @@ static const readops e1000e_macreg_readops[] = {
[MPRC] = e1000e_mac_read_clr4,
[BPTC] = e1000e_mac_read_clr4,
[TSCTC] = e1000e_mac_read_clr4,
- [ITR] = e1000e_mac_itr_read,
[CTRL] = e1000e_get_ctrl,
[TARC1] = e1000e_get_tarc,
[SWSM] = e1000e_mac_swsm_read,
@@ -3087,7 +3073,7 @@ static const readops e1000e_macreg_readops[] = {
[RETA ... RETA + 31] = e1000e_mac_readreg,
[RSSRK ... RSSRK + 31] = e1000e_mac_readreg,
[MAVTV0 ... MAVTV3] = e1000e_mac_readreg,
- [EITR...EITR + E1000E_MSIX_VEC_NUM - 1] = e1000e_mac_eitr_read
+ [EITR...EITR + E1000E_MSIX_VEC_NUM - 1] = e1000e_mac_readreg,
};
enum { E1000E_NREADOPS = ARRAY_SIZE(e1000e_macreg_readops) };
@@ -3517,13 +3503,26 @@ void e1000e_core_pre_save(E1000ECore *core)
core->tx[i].skip_cp = true;
}
}
+
+ /* back compat, QEMU moves xITR in itr_guest_value state */
+ core->itr_guest_value = core->mac[ITR];
+ for (i = 0; i < E1000E_MSIX_VEC_NUM; i++) {
+ core->eitr_guest_value[i] = core->mac[EITR + i];
+ }
}
int
e1000e_core_post_load(E1000ECore *core)
{
+ int i;
NetClientState *nc = qemu_get_queue(core->owner_nic);
+ /* back compat */
+ core->mac[ITR] = core->itr_guest_value;
+ for (i = 0; i < E1000E_MSIX_VEC_NUM; i++) {
+ core->mac[EITR + i] = core->eitr_guest_value[i];
+ }
+
/*
* nc.link_down can't be migrated, so infer link_down according
* to link status bit in core.mac[STATUS].
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 39e3ce1c8fe..5d169c059d9 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -2880,7 +2880,7 @@ igb_mac_swsm_read(IGBCore *core, int index)
static uint32_t
igb_mac_eitr_read(IGBCore *core, int index)
{
- return core->eitr_guest_value[index - EITR0];
+ return core->mac[index - EITR0];
}
static uint32_t igb_mac_vfmailbox_read(IGBCore *core, int index)
@@ -3046,7 +3046,6 @@ igb_set_eitr(IGBCore *core, int index, uint32_t val)
trace_igb_irq_eitr_set(eitr_num, val);
- core->eitr_guest_value[eitr_num] = val & ~E1000_EITR_CNT_IGNR;
core->mac[index] = val & 0x7FFE;
}
@@ -4527,13 +4526,24 @@ void igb_core_pre_save(IGBCore *core)
core->tx[i].skip_cp = true;
}
}
+
+ /* back compat, QEMU moves EITR in eitr_guest_value state */
+ for (i = 0; i < IGB_INTR_NUM; i++) {
+ core->eitr_guest_value[i] = core->mac[EITR0 + i];
+ }
}
int
igb_core_post_load(IGBCore *core)
{
+ int i;
NetClientState *nc = qemu_get_queue(core->owner_nic);
+ /* back compat */
+ for (i = 0; i < IGB_INTR_NUM; i++) {
+ core->mac[EITR0 + i] = core->eitr_guest_value[i];
+ }
+
/*
* nc.link_down can't be migrated, so infer link_down according
* to link status bit in core.mac[STATUS].
--
2.47.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v2 4/8] qtest/e1000e|igb: assert irqs are clear before triggering an irq
2025-04-11 4:31 [PATCH v2 0/8] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
` (2 preceding siblings ...)
2025-04-11 4:31 ` [PATCH v2 3/8] hw/net/e1000e|igb: Remove xitr_guest_value logic Nicholas Piggin
@ 2025-04-11 4:31 ` Nicholas Piggin
2025-04-11 4:31 ` [PATCH v2 5/8] net/igb: Fix interrupt throttling interval calculation Nicholas Piggin
` (3 subsequent siblings)
7 siblings, 0 replies; 14+ messages in thread
From: Nicholas Piggin @ 2025-04-11 4:31 UTC (permalink / raw)
To: Akihiko Odaki
Cc: Nicholas Piggin, qemu-devel, Dmitry Fleytman, Jason Wang,
Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Michael S. Tsirkin, Marcel Apfelbaum
Assert there is no existing irq raised that would lead to a false
positive interrupt test.
e1000e has to disable interrupt throttling for this test, because
it can cause delayed superfluous interrupts which trip the assertions.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Dmitry Fleytman <dmitry.fleytman@gmail.com>
Cc: Akihiko Odaki <akihiko.odaki@daynix.com>
Cc: Sriram Yagnaraman <sriram.yagnaraman@ericsson.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
tests/qtest/libqos/e1000e.h | 1 +
tests/qtest/e1000e-test.c | 10 ++++++++++
tests/qtest/igb-test.c | 6 ++++++
tests/qtest/libqos/e1000e.c | 9 ++++++++-
4 files changed, 25 insertions(+), 1 deletion(-)
diff --git a/tests/qtest/libqos/e1000e.h b/tests/qtest/libqos/e1000e.h
index 30643c80949..e693670119f 100644
--- a/tests/qtest/libqos/e1000e.h
+++ b/tests/qtest/libqos/e1000e.h
@@ -54,6 +54,7 @@ static inline uint32_t e1000e_macreg_read(QE1000E *d, uint32_t reg)
return qpci_io_readl(&d_pci->pci_dev, d_pci->mac_regs, reg);
}
+bool e1000e_pending_isr(QE1000E *d, uint16_t msg_id);
void e1000e_wait_isr(QE1000E *d, uint16_t msg_id);
void e1000e_tx_ring_push(QE1000E *d, void *descr);
void e1000e_rx_ring_push(QE1000E *d, void *descr);
diff --git a/tests/qtest/e1000e-test.c b/tests/qtest/e1000e-test.c
index 746d26cfb67..a538c72cc84 100644
--- a/tests/qtest/e1000e-test.c
+++ b/tests/qtest/e1000e-test.c
@@ -61,6 +61,9 @@ static void e1000e_send_verify(QE1000E *d, int *test_sockets, QGuestAllocator *a
E1000_TXD_DTYP_D |
sizeof(buffer));
+ /* Ensure the interrupt has not been taken already */
+ g_assert(!e1000e_pending_isr(d, E1000E_TX0_MSG_ID));
+
/* Put descriptor to the ring */
e1000e_tx_ring_push(d, &descr);
@@ -105,6 +108,9 @@ static void e1000e_receive_verify(QE1000E *d, int *test_sockets, QGuestAllocator
char buffer[64];
int ret;
+ /* Ensure the interrupt has not been taken already */
+ g_assert(!e1000e_pending_isr(d, E1000E_RX0_MSG_ID));
+
/* Send a dummy packet to device's socket*/
ret = iov_send(test_sockets[0], iov, 2, 0, sizeof(len) + sizeof(packet));
g_assert_cmpint(ret, == , sizeof(packet) + sizeof(len));
@@ -188,6 +194,10 @@ static void test_e1000e_multiple_transfers(void *obj, void *data,
return;
}
+ /* Clear EITR because buggy QEMU throttle timer causes superfluous irqs */
+ e1000e_macreg_write(d, E1000_EITR + E1000E_RX0_MSG_ID * 4, 0);
+ e1000e_macreg_write(d, E1000_EITR + E1000E_TX0_MSG_ID * 4, 0);
+
for (i = 0; i < iterations; i++) {
e1000e_send_verify(d, data, alloc);
e1000e_receive_verify(d, data, alloc);
diff --git a/tests/qtest/igb-test.c b/tests/qtest/igb-test.c
index cf8b4131cf2..12cdd8b498a 100644
--- a/tests/qtest/igb-test.c
+++ b/tests/qtest/igb-test.c
@@ -64,6 +64,9 @@ static void igb_send_verify(QE1000E *d, int *test_sockets, QGuestAllocator *allo
E1000_TXD_DTYP_D |
sizeof(buffer));
+ /* Ensure the interrupt has not been taken already */
+ g_assert(!e1000e_pending_isr(d, E1000E_TX0_MSG_ID));
+
/* Put descriptor to the ring */
e1000e_tx_ring_push(d, &descr);
@@ -119,6 +122,9 @@ static void igb_receive_verify(QE1000E *d, int *test_sockets, QGuestAllocator *a
memset(&descr, 0, sizeof(descr));
descr.read.pkt_addr = cpu_to_le64(data);
+ /* Ensure the interrupt has not been taken already */
+ g_assert(!e1000e_pending_isr(d, E1000E_RX0_MSG_ID));
+
/* Put descriptor to the ring */
e1000e_rx_ring_push(d, &descr);
diff --git a/tests/qtest/libqos/e1000e.c b/tests/qtest/libqos/e1000e.c
index 925654c7fd4..0bf70e25389 100644
--- a/tests/qtest/libqos/e1000e.c
+++ b/tests/qtest/libqos/e1000e.c
@@ -77,13 +77,20 @@ static void e1000e_foreach_callback(QPCIDevice *dev, int devfn, void *data)
g_free(dev);
}
+bool e1000e_pending_isr(QE1000E *d, uint16_t msg_id)
+{
+ QE1000E_PCI *d_pci = container_of(d, QE1000E_PCI, e1000e);
+
+ return qpci_msix_pending(&d_pci->pci_dev, msg_id);
+}
+
void e1000e_wait_isr(QE1000E *d, uint16_t msg_id)
{
QE1000E_PCI *d_pci = container_of(d, QE1000E_PCI, e1000e);
guint64 end_time = g_get_monotonic_time() + 5 * G_TIME_SPAN_SECOND;
do {
- if (qpci_msix_pending(&d_pci->pci_dev, msg_id)) {
+ if (e1000e_pending_isr(d, msg_id)) {
return;
}
qtest_clock_step(d_pci->pci_dev.bus->qts, 10000);
--
2.47.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v2 5/8] net/igb: Fix interrupt throttling interval calculation
2025-04-11 4:31 [PATCH v2 0/8] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
` (3 preceding siblings ...)
2025-04-11 4:31 ` [PATCH v2 4/8] qtest/e1000e|igb: assert irqs are clear before triggering an irq Nicholas Piggin
@ 2025-04-11 4:31 ` Nicholas Piggin
2025-04-19 7:22 ` Akihiko Odaki
2025-04-11 4:31 ` [PATCH v2 6/8] net/igb: Implement EITR Moderation Counter Nicholas Piggin
` (2 subsequent siblings)
7 siblings, 1 reply; 14+ messages in thread
From: Nicholas Piggin @ 2025-04-11 4:31 UTC (permalink / raw)
To: Akihiko Odaki
Cc: Nicholas Piggin, qemu-devel, Dmitry Fleytman, Jason Wang,
Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini
IGB throttling granularity is 1us, and interval field is in bits 2..14
of the EITRx registers.
Fixes: 3a977deebe6b ("Intrdocue igb device emulation")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
hw/net/igb_regs.h | 3 +++
hw/net/igb_core.c | 7 ++++---
2 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/hw/net/igb_regs.h b/hw/net/igb_regs.h
index 4dc4c31da27..1ed5ee5039a 100644
--- a/hw/net/igb_regs.h
+++ b/hw/net/igb_regs.h
@@ -146,6 +146,9 @@ union e1000_adv_rx_desc {
#define IGB_82576_VF_DEV_ID 0x10CA
#define IGB_I350_VF_DEV_ID 0x1520
+/* Delay increments in nanoseconds for interrupt throttling registers */
+#define IGB_INTR_THROTTLING_NS_RES (1000)
+
/* VLAN info */
#define IGB_TX_FLAGS_VLAN_MASK 0xffff0000
#define IGB_TX_FLAGS_VLAN_SHIFT 16
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 5d169c059d9..8fcc872a7c0 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -142,8 +142,9 @@ static void igb_msix_notify(IGBCore *core, unsigned int cause)
static inline void
igb_intrmgr_rearm_timer(IGBIntrDelayTimer *timer)
{
- int64_t delay_ns = (int64_t) timer->core->mac[timer->delay_reg] *
- timer->delay_resolution_ns;
+ int64_t delay_ns =
+ (int64_t)((timer->core->mac[timer->delay_reg] & 0x7FFC) >> 2) *
+ timer->delay_resolution_ns;
trace_e1000e_irq_rearm_timer(timer->delay_reg << 2, delay_ns);
@@ -180,7 +181,7 @@ igb_intrmgr_initialize_all_timers(IGBCore *core, bool create)
for (i = 0; i < IGB_INTR_NUM; i++) {
core->eitr[i].core = core;
core->eitr[i].delay_reg = EITR0 + i;
- core->eitr[i].delay_resolution_ns = E1000_INTR_DELAY_NS_RES;
+ core->eitr[i].delay_resolution_ns = IGB_INTR_THROTTLING_NS_RES;
}
if (!create) {
--
2.47.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v2 6/8] net/igb: Implement EITR Moderation Counter
2025-04-11 4:31 [PATCH v2 0/8] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
` (4 preceding siblings ...)
2025-04-11 4:31 ` [PATCH v2 5/8] net/igb: Fix interrupt throttling interval calculation Nicholas Piggin
@ 2025-04-11 4:31 ` Nicholas Piggin
2025-04-19 7:40 ` Akihiko Odaki
2025-04-11 4:31 ` [PATCH v2 7/8] net/e1000e|igb: Fix interrupt throttling logic Nicholas Piggin
2025-04-11 4:31 ` [PATCH v2 8/8] qtest/e1000e|igb: Test interrupt throttling in multiple_transfers test Nicholas Piggin
7 siblings, 1 reply; 14+ messages in thread
From: Nicholas Piggin @ 2025-04-11 4:31 UTC (permalink / raw)
To: Akihiko Odaki
Cc: Nicholas Piggin, qemu-devel, Dmitry Fleytman, Jason Wang,
Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini
IGB EITR registers have counter fields which reflect the current ITR
and LLI counter values, as well as a bit to enable LLI moderation,
and a bit to write the register without modifying the counter fields.
Implement the EITR Moderation Counter (aka EITR counter), and counter
ignore bit. The EITR counter is the time remaining in the interrupt
moderation delay which is implemented as a QEMU timer.
Log an unimp message if software tries to enable LLI moderation.
Add a note about the problem with reloading timers after migration.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
hw/net/igb_regs.h | 8 +++++--
hw/net/igb_core.c | 54 ++++++++++++++++++++++++++++++++++++++++-------
2 files changed, 52 insertions(+), 10 deletions(-)
diff --git a/hw/net/igb_regs.h b/hw/net/igb_regs.h
index 1ed5ee5039a..b612248264a 100644
--- a/hw/net/igb_regs.h
+++ b/hw/net/igb_regs.h
@@ -321,8 +321,12 @@ union e1000_adv_rx_desc {
#define E1000_EICR_TX_QUEUE3 0x00000800 /* Tx Queue 3 Interrupt */
#define E1000_EICR_OTHER 0x80000000 /* Interrupt Cause Active */
-/* Extended Interrupt Cause Set */
-/* E1000_EITR_CNT_IGNR is only for 82576 and newer */
+/* Extended Interrupt Throttle */
+/* These are only for 82576 and newer */
+#define E1000_EITR_INTERVAL 0x00007FFC
+#define E1000_EITR_LLI_EN 0x00008000
+#define E1000_EITR_LLI_CNT 0x001F0000
+#define E1000_EITR_ITR_CNT 0x7FE00000
#define E1000_EITR_CNT_IGNR 0x80000000 /* Don't reset counters on write */
#define E1000_TSYNCTXCTL_VALID 0x00000001 /* tx timestamp valid */
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 8fcc872a7c0..3ae3e53530b 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -140,12 +140,8 @@ static void igb_msix_notify(IGBCore *core, unsigned int cause)
}
static inline void
-igb_intrmgr_rearm_timer(IGBIntrDelayTimer *timer)
+igb_intrmgr_arm_timer(IGBIntrDelayTimer *timer, int64_t delay_ns)
{
- int64_t delay_ns =
- (int64_t)((timer->core->mac[timer->delay_reg] & 0x7FFC) >> 2) *
- timer->delay_resolution_ns;
-
trace_e1000e_irq_rearm_timer(timer->delay_reg << 2, delay_ns);
timer_mod(timer->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + delay_ns);
@@ -153,6 +149,16 @@ igb_intrmgr_rearm_timer(IGBIntrDelayTimer *timer)
timer->running = true;
}
+static inline void
+igb_intrmgr_rearm_timer(IGBIntrDelayTimer *timer)
+{
+ uint32_t interval = (timer->core->mac[timer->delay_reg] &
+ E1000_EITR_INTERVAL) >> 2;
+ int64_t delay_ns = (int64_t)interval * timer->delay_resolution_ns;
+
+ igb_intrmgr_arm_timer(timer, delay_ns);
+}
+
static void
igb_intmgr_timer_resume(IGBIntrDelayTimer *timer)
{
@@ -2881,7 +2887,21 @@ igb_mac_swsm_read(IGBCore *core, int index)
static uint32_t
igb_mac_eitr_read(IGBCore *core, int index)
{
- return core->mac[index - EITR0];
+ uint32_t eitr_num = index - EITR0;
+ uint32_t val = core->mac[eitr_num];
+ IGBIntrDelayTimer *timer = &core->eitr[eitr_num];
+
+ if (timer->running) { /* timer is pending, find time remaining */
+ int64_t remains = timer_expire_time_ns(timer->timer) -
+ qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
+ if (remains > 0) {
+ uint32_t cnt; /* CNT is the most significant 10 of 12 bits */
+ cnt = remains / timer->delay_resolution_ns;
+ val |= ((cnt >> 2) << 21) & E1000_EITR_ITR_CNT;
+ }
+ }
+
+ return val;
}
static uint32_t igb_mac_vfmailbox_read(IGBCore *core, int index)
@@ -3047,7 +3067,23 @@ igb_set_eitr(IGBCore *core, int index, uint32_t val)
trace_igb_irq_eitr_set(eitr_num, val);
- core->mac[index] = val & 0x7FFE;
+ if (val & (E1000_EITR_LLI_EN | E1000_EITR_LLI_CNT)) {
+ qemu_log_mask(LOG_UNIMP, "%s: LLI moderation not supported, ignoring\n",
+ __func__);
+ }
+
+ if (!(val & E1000_EITR_CNT_IGNR)) {
+ IGBIntrDelayTimer *timer = &core->eitr[eitr_num];
+ uint32_t itr_cnt = (val & E1000_EITR_ITR_CNT) >> 21;
+ /* CNT is the most significant 10 of 12 bits */
+ uint64_t ns = (itr_cnt << 2) * timer->delay_resolution_ns;
+
+ igb_intrmgr_arm_timer(timer, ns);
+ }
+
+ val &= E1000_EITR_INTERVAL | E1000_EITR_LLI_EN;
+
+ core->mac[index] = val;
}
static void
@@ -4553,7 +4589,9 @@ igb_core_post_load(IGBCore *core)
/*
* we need to restart intrmgr timers, as an older version of
- * QEMU can have stopped them before migration
+ * QEMU can have stopped them before migration.
+ * XXX: re-setting timers with fresh values breaks deterministic
+ * replay.
*/
igb_intrmgr_resume(core);
igb_autoneg_resume(core);
--
2.47.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v2 7/8] net/e1000e|igb: Fix interrupt throttling logic
2025-04-11 4:31 [PATCH v2 0/8] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
` (5 preceding siblings ...)
2025-04-11 4:31 ` [PATCH v2 6/8] net/igb: Implement EITR Moderation Counter Nicholas Piggin
@ 2025-04-11 4:31 ` Nicholas Piggin
2025-04-19 7:55 ` Akihiko Odaki
2025-04-11 4:31 ` [PATCH v2 8/8] qtest/e1000e|igb: Test interrupt throttling in multiple_transfers test Nicholas Piggin
7 siblings, 1 reply; 14+ messages in thread
From: Nicholas Piggin @ 2025-04-11 4:31 UTC (permalink / raw)
To: Akihiko Odaki
Cc: Nicholas Piggin, qemu-devel, Dmitry Fleytman, Jason Wang,
Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini
Interrupt throttling is broken in several ways:
- Timer expiry sends an interrupt even if there is no cause.
- (e1000e) Mitigated interrupts still auto-clear cause bits.
- Timer expiry that results in an interrupt does not re-arm the timer so
an interrupt can appear immediately after the timer expiry interrupt.
To fix:
- When the throttle timer expires, check the cause bits corresponding to
the msix vector before sending an irq.
- (e1000e) Skip the auto-clear logic if an interrupt is delayed, and
send delayed irqs using e1000e_msix_notify() to perform auto-clear.
- Re-load the throttle timer when a delayed interrupt is signaled. e1000e
gets this by signaling them with e1000e_msix_notify(), igb calls
igb_intrmgr_rearm_timer() directly.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
hw/net/e1000e_core.c | 59 +++++++++++++++++++++++++++++++++++++++-----
hw/net/igb_core.c | 50 ++++++++++++++++++++++++-------------
2 files changed, 86 insertions(+), 23 deletions(-)
diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index f8e6522f810..6fb8da32e4d 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -178,16 +178,62 @@ e1000e_intrmgr_on_throttling_timer(void *opaque)
}
}
+static uint32_t find_msix_causes(E1000ECore *core, int vec)
+{
+ uint32_t causes = 0;
+ uint32_t int_cfg;
+
+ int_cfg = E1000_IVAR_RXQ0(core->mac[IVAR]);
+ if (E1000_IVAR_ENTRY_VALID(int_cfg) &&
+ E1000_IVAR_ENTRY_VEC(int_cfg) == vec) {
+ causes |= E1000_ICR_RXQ0;
+ }
+
+ int_cfg = E1000_IVAR_RXQ1(core->mac[IVAR]);
+ if (E1000_IVAR_ENTRY_VALID(int_cfg) &&
+ E1000_IVAR_ENTRY_VEC(int_cfg) == vec) {
+ causes |= E1000_ICR_RXQ1;
+ }
+
+ int_cfg = E1000_IVAR_TXQ0(core->mac[IVAR]);
+ if (E1000_IVAR_ENTRY_VALID(int_cfg) &&
+ E1000_IVAR_ENTRY_VEC(int_cfg) == vec) {
+ causes |= E1000_ICR_TXQ0;
+ }
+
+ int_cfg = E1000_IVAR_TXQ1(core->mac[IVAR]);
+ if (E1000_IVAR_ENTRY_VALID(int_cfg) &&
+ E1000_IVAR_ENTRY_VEC(int_cfg) == vec) {
+ causes |= E1000_ICR_TXQ1;
+ }
+
+ int_cfg = E1000_IVAR_OTHER(core->mac[IVAR]);
+ if (E1000_IVAR_ENTRY_VALID(int_cfg) &&
+ E1000_IVAR_ENTRY_VEC(int_cfg) == vec) {
+ causes |= E1000_ICR_OTHER;
+ }
+
+ return causes;
+}
+
+static void
+e1000e_msix_notify(E1000ECore *core, uint32_t causes);
+
static void
e1000e_intrmgr_on_msix_throttling_timer(void *opaque)
{
E1000IntrDelayTimer *timer = opaque;
- int idx = timer - &timer->core->eitr[0];
+ E1000ECore *core = timer->core;
+ int idx = timer - &core->eitr[0];
+ uint32_t causes;
timer->running = false;
- trace_e1000e_irq_msix_notify_postponed_vec(idx);
- msix_notify(timer->core->owner, idx);
+ causes = find_msix_causes(core, idx) & core->mac[IMS] & core->mac[ICR];
+ if (causes) {
+ trace_e1000e_irq_msix_notify_postponed_vec(idx);
+ e1000e_msix_notify(core, causes);
+ }
}
static void
@@ -1992,10 +2038,11 @@ e1000e_msix_notify_one(E1000ECore *core, uint32_t cause, uint32_t int_cfg)
if (E1000_IVAR_ENTRY_VALID(int_cfg)) {
uint32_t vec = E1000_IVAR_ENTRY_VEC(int_cfg);
if (vec < E1000E_MSIX_VEC_NUM) {
- if (!e1000e_eitr_should_postpone(core, vec)) {
- trace_e1000e_irq_msix_notify_vec(vec);
- msix_notify(core->owner, vec);
+ if (e1000e_eitr_should_postpone(core, vec)) {
+ return;
}
+ trace_e1000e_irq_msix_notify_vec(vec);
+ msix_notify(core->owner, vec);
} else {
trace_e1000e_wrn_msix_vec_wrong(cause, int_cfg);
}
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 3ae3e53530b..cc25a1d5baa 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -152,11 +152,14 @@ igb_intrmgr_arm_timer(IGBIntrDelayTimer *timer, int64_t delay_ns)
static inline void
igb_intrmgr_rearm_timer(IGBIntrDelayTimer *timer)
{
- uint32_t interval = (timer->core->mac[timer->delay_reg] &
- E1000_EITR_INTERVAL) >> 2;
- int64_t delay_ns = (int64_t)interval * timer->delay_resolution_ns;
+ uint32_t eitr = timer->core->mac[timer->delay_reg];
- igb_intrmgr_arm_timer(timer, delay_ns);
+ if (eitr != 0) {
+ uint32_t interval = (eitr & E1000_EITR_INTERVAL) >> 2;
+ int64_t delay_ns = (int64_t)interval * timer->delay_resolution_ns;
+
+ igb_intrmgr_arm_timer(timer, delay_ns);
+ }
}
static void
@@ -168,16 +171,7 @@ igb_intmgr_timer_resume(IGBIntrDelayTimer *timer)
}
static void
-igb_intrmgr_on_msix_throttling_timer(void *opaque)
-{
- IGBIntrDelayTimer *timer = opaque;
- int idx = timer - &timer->core->eitr[0];
-
- timer->running = false;
-
- trace_e1000e_irq_msix_notify_postponed_vec(idx);
- igb_msix_notify(timer->core, idx);
-}
+igb_intrmgr_on_msix_throttling_timer(void *opaque);
static void
igb_intrmgr_initialize_all_timers(IGBCore *core, bool create)
@@ -2253,9 +2247,7 @@ igb_postpone_interrupt(IGBIntrDelayTimer *timer)
return true;
}
- if (timer->core->mac[timer->delay_reg] != 0) {
- igb_intrmgr_rearm_timer(timer);
- }
+ igb_intrmgr_rearm_timer(timer);
return false;
}
@@ -2279,6 +2271,30 @@ static void igb_send_msix(IGBCore *core, uint32_t causes)
}
}
+static void
+igb_intrmgr_on_msix_throttling_timer(void *opaque)
+{
+ IGBIntrDelayTimer *timer = opaque;
+ IGBCore *core = timer->core;
+ int vector = timer - &core->eitr[0];
+ uint32_t causes;
+
+ timer->running = false;
+
+ causes = core->mac[EICR] & core->mac[EIMS];
+ if (causes & BIT(vector)) {
+ /*
+ * The moderation counter is loaded with interval value whenever the
+ * interrupt is signaled. This includes when the interrupt is signaled
+ * by the counter reaching 0.
+ */
+ igb_intrmgr_rearm_timer(timer);
+
+ trace_e1000e_irq_msix_notify_postponed_vec(vector);
+ igb_msix_notify(core, vector);
+ }
+}
+
static inline void
igb_fix_icr_asserted(IGBCore *core)
{
--
2.47.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v2 8/8] qtest/e1000e|igb: Test interrupt throttling in multiple_transfers test
2025-04-11 4:31 [PATCH v2 0/8] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
` (6 preceding siblings ...)
2025-04-11 4:31 ` [PATCH v2 7/8] net/e1000e|igb: Fix interrupt throttling logic Nicholas Piggin
@ 2025-04-11 4:31 ` Nicholas Piggin
7 siblings, 0 replies; 14+ messages in thread
From: Nicholas Piggin @ 2025-04-11 4:31 UTC (permalink / raw)
To: Akihiko Odaki
Cc: Nicholas Piggin, qemu-devel, Dmitry Fleytman, Jason Wang,
Sriram Yagnaraman, Fabiano Rosas, Laurent Vivier, Paolo Bonzini
Enable interrupt throtling on one of the two queue interrupts used
in the multiple_transfers test, to improve coverage. The number of
interrupts for the e1000e test is reduced because it has a long minimum
throttling delay so without reducing iterations throttling adds about
40s to the test runtime.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
tests/qtest/e1000e-test.c | 6 +++---
tests/qtest/igb-test.c | 4 ++++
2 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/tests/qtest/e1000e-test.c b/tests/qtest/e1000e-test.c
index a538c72cc84..645b31127f0 100644
--- a/tests/qtest/e1000e-test.c
+++ b/tests/qtest/e1000e-test.c
@@ -181,7 +181,7 @@ static void test_e1000e_rx(void *obj, void *data, QGuestAllocator * alloc)
static void test_e1000e_multiple_transfers(void *obj, void *data,
QGuestAllocator *alloc)
{
- static const long iterations = 4 * 1024;
+ static const long iterations = 1 * 1024;
long i;
QE1000E_PCI *e1000e = obj;
@@ -194,8 +194,8 @@ static void test_e1000e_multiple_transfers(void *obj, void *data,
return;
}
- /* Clear EITR because buggy QEMU throttle timer causes superfluous irqs */
- e1000e_macreg_write(d, E1000_EITR + E1000E_RX0_MSG_ID * 4, 0);
+ /* Use EITR for one irq and disable it for the other, for testing */
+ e1000e_macreg_write(d, E1000_EITR + E1000E_RX0_MSG_ID * 4, 500);
e1000e_macreg_write(d, E1000_EITR + E1000E_TX0_MSG_ID * 4, 0);
for (i = 0; i < iterations; i++) {
diff --git a/tests/qtest/igb-test.c b/tests/qtest/igb-test.c
index 12cdd8b498a..c1877a77be4 100644
--- a/tests/qtest/igb-test.c
+++ b/tests/qtest/igb-test.c
@@ -198,6 +198,10 @@ static void test_igb_multiple_transfers(void *obj, void *data,
return;
}
+ /* Use EITR for one irq and disable it for the other, for testing */
+ e1000e_macreg_write(d, E1000_EITR(E1000E_RX0_MSG_ID), 0);
+ e1000e_macreg_write(d, E1000_EITR(E1000E_TX0_MSG_ID), 10 << 2); /* 10us */
+
for (i = 0; i < iterations; i++) {
igb_send_verify(d, data, alloc);
igb_receive_verify(d, data, alloc);
--
2.47.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v2 3/8] hw/net/e1000e|igb: Remove xitr_guest_value logic
2025-04-11 4:31 ` [PATCH v2 3/8] hw/net/e1000e|igb: Remove xitr_guest_value logic Nicholas Piggin
@ 2025-04-19 7:12 ` Akihiko Odaki
0 siblings, 0 replies; 14+ messages in thread
From: Akihiko Odaki @ 2025-04-19 7:12 UTC (permalink / raw)
To: Nicholas Piggin
Cc: qemu-devel, Dmitry Fleytman, Jason Wang, Sriram Yagnaraman,
Fabiano Rosas, Laurent Vivier, Paolo Bonzini
On 2025/04/11 13:31, Nicholas Piggin wrote:
> The guest value xITR logic is not required now that the write functions
> store necessary data to be read back, and internal users mask and shift
> fields they need as they go.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> hw/net/e1000e_core.c | 31 +++++++++++++++----------------
> hw/net/igb_core.c | 14 ++++++++++++--
> 2 files changed, 27 insertions(+), 18 deletions(-)
>
> diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
> index 96f74f1ea14..f8e6522f810 100644
> --- a/hw/net/e1000e_core.c
> +++ b/hw/net/e1000e_core.c
> @@ -2563,18 +2563,6 @@ e1000e_mac_swsm_read(E1000ECore *core, int index)
> return val;
> }
>
> -static uint32_t
> -e1000e_mac_itr_read(E1000ECore *core, int index)
> -{
> - return core->itr_guest_value;
> -}
> -
> -static uint32_t
> -e1000e_mac_eitr_read(E1000ECore *core, int index)
> -{
> - return core->eitr_guest_value[index - EITR];
> -}
> -
> static uint32_t
> e1000e_mac_icr_read(E1000ECore *core, int index)
> {
> @@ -2792,7 +2780,6 @@ e1000e_set_itr(E1000ECore *core, int index, uint32_t val)
>
> trace_e1000e_irq_itr_set(val);
>
> - core->itr_guest_value = interval;
> core->mac[index] = interval;
> }
>
> @@ -2804,7 +2791,6 @@ e1000e_set_eitr(E1000ECore *core, int index, uint32_t val)
>
> trace_e1000e_irq_eitr_set(eitr_num, val);
>
> - core->eitr_guest_value[eitr_num] = interval;
> core->mac[index] = interval;
> }
>
> @@ -3029,6 +3015,7 @@ static const readops e1000e_macreg_readops[] = {
> e1000e_getreg(GSCN_1),
> e1000e_getreg(FCAL),
> e1000e_getreg(FLSWCNT),
> + e1000e_getreg(ITR),
>
> [TOTH] = e1000e_mac_read_clr8,
> [GOTCH] = e1000e_mac_read_clr8,
> @@ -3062,7 +3049,6 @@ static const readops e1000e_macreg_readops[] = {
> [MPRC] = e1000e_mac_read_clr4,
> [BPTC] = e1000e_mac_read_clr4,
> [TSCTC] = e1000e_mac_read_clr4,
> - [ITR] = e1000e_mac_itr_read,
> [CTRL] = e1000e_get_ctrl,
> [TARC1] = e1000e_get_tarc,
> [SWSM] = e1000e_mac_swsm_read,
> @@ -3087,7 +3073,7 @@ static const readops e1000e_macreg_readops[] = {
> [RETA ... RETA + 31] = e1000e_mac_readreg,
> [RSSRK ... RSSRK + 31] = e1000e_mac_readreg,
> [MAVTV0 ... MAVTV3] = e1000e_mac_readreg,
> - [EITR...EITR + E1000E_MSIX_VEC_NUM - 1] = e1000e_mac_eitr_read
> + [EITR...EITR + E1000E_MSIX_VEC_NUM - 1] = e1000e_mac_readreg,
> };
> enum { E1000E_NREADOPS = ARRAY_SIZE(e1000e_macreg_readops) };
>
> @@ -3517,13 +3503,26 @@ void e1000e_core_pre_save(E1000ECore *core)
> core->tx[i].skip_cp = true;
> }
> }
> +
> + /* back compat, QEMU moves xITR in itr_guest_value state */
> + core->itr_guest_value = core->mac[ITR];
> + for (i = 0; i < E1000E_MSIX_VEC_NUM; i++) {
> + core->eitr_guest_value[i] = core->mac[EITR + i];
> + }
> }
>
> int
> e1000e_core_post_load(E1000ECore *core)
> {
> + int i;
> NetClientState *nc = qemu_get_queue(core->owner_nic);
>
> + /* back compat */
> + core->mac[ITR] = core->itr_guest_value;
> + for (i = 0; i < E1000E_MSIX_VEC_NUM; i++) {
> + core->mac[EITR + i] = core->eitr_guest_value[i];
> + }
> +
> /*
> * nc.link_down can't be migrated, so infer link_down according
> * to link status bit in core.mac[STATUS].
> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
> index 39e3ce1c8fe..5d169c059d9 100644
> --- a/hw/net/igb_core.c
> +++ b/hw/net/igb_core.c
> @@ -2880,7 +2880,7 @@ igb_mac_swsm_read(IGBCore *core, int index)
> static uint32_t
> igb_mac_eitr_read(IGBCore *core, int index)
> {
> - return core->eitr_guest_value[index - EITR0];
> + return core->mac[index - EITR0];
> }
>
> static uint32_t igb_mac_vfmailbox_read(IGBCore *core, int index)
> @@ -3046,7 +3046,6 @@ igb_set_eitr(IGBCore *core, int index, uint32_t val)
>
> trace_igb_irq_eitr_set(eitr_num, val);
>
> - core->eitr_guest_value[eitr_num] = val & ~E1000_EITR_CNT_IGNR;
> core->mac[index] = val & 0x7FFE;
EITR has writable bits not covered with 0x7FFE and they should be preserved.
> }
>
> @@ -4527,13 +4526,24 @@ void igb_core_pre_save(IGBCore *core)
> core->tx[i].skip_cp = true;
> }
> }
> +
> + /* back compat, QEMU moves EITR in eitr_guest_value state */
> + for (i = 0; i < IGB_INTR_NUM; i++) {
> + core->eitr_guest_value[i] = core->mac[EITR0 + i];
> + }
> }
>
> int
> igb_core_post_load(IGBCore *core)
> {
> + int i;
> NetClientState *nc = qemu_get_queue(core->owner_nic);
>
> + /* back compat */
> + for (i = 0; i < IGB_INTR_NUM; i++) {
> + core->mac[EITR0 + i] = core->eitr_guest_value[i];
> + }
> +
> /*
> * nc.link_down can't be migrated, so infer link_down according
> * to link status bit in core.mac[STATUS].
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 2/8] net/e1000e: Permit disabling interrupt throttling
2025-04-11 4:31 ` [PATCH v2 2/8] net/e1000e: Permit disabling interrupt throttling Nicholas Piggin
@ 2025-04-19 7:15 ` Akihiko Odaki
0 siblings, 0 replies; 14+ messages in thread
From: Akihiko Odaki @ 2025-04-19 7:15 UTC (permalink / raw)
To: Nicholas Piggin
Cc: qemu-devel, Dmitry Fleytman, Jason Wang, Sriram Yagnaraman,
Fabiano Rosas, Laurent Vivier, Paolo Bonzini
On 2025/04/11 13:31, Nicholas Piggin wrote:
> The spec explicitly permits xITR register interval field to have a value
> of zero to disable throttling. The e1000e model already allows for this
> in the throttling logic, so remove the minimum value for the register.
>
> The spec appears to say there is a maximum observable interrupt rate
> when throttling is enabled, regardless of ITR value, so throttle timer
> calculation is clamped to that minimum value.
>
> EITR registers default to 0, as specified in spec 7.4.4.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> hw/net/e1000e_core.c | 25 +++++++++++++++++--------
> 1 file changed, 17 insertions(+), 8 deletions(-)
>
> diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
> index 24138587905..96f74f1ea14 100644
> --- a/hw/net/e1000e_core.c
> +++ b/hw/net/e1000e_core.c
> @@ -51,8 +51,17 @@
>
> #include "trace.h"
>
> -/* No more then 7813 interrupts per second according to spec 10.2.4.2 */
> -#define E1000E_MIN_XITR (500)
> +/*
> + * A suggested range for ITR is 651-5580, according to spec 10.2.4.2, but
> + * QEMU has traditionally set 500 here.
> + */
> +#define E1000E_DEFAULT_ITR (500)
> +
> +/*
> + * spec 7.4.4 ITR rules says the maximum observable interrupt rate from the
> + * adapter should not exceed 7813/s (corresponding to 500).
> + */
> +#define E1000E_EFFECTIVE_MIN_XITR (500)
>
> #define E1000E_MAX_TX_FRAGS (64)
>
> @@ -105,8 +114,9 @@ e1000e_lower_legacy_irq(E1000ECore *core)
> static inline void
> e1000e_intrmgr_rearm_timer(E1000IntrDelayTimer *timer)
> {
> - int64_t delay_ns = (int64_t) timer->core->mac[timer->delay_reg] *
> - timer->delay_resolution_ns;
> + uint32_t delay = MAX(timer->core->mac[timer->delay_reg],
> + E1000E_EFFECTIVE_MIN_XITR);
> + int64_t delay_ns = (int64_t)delay * timer->delay_resolution_ns;
>
> trace_e1000e_irq_rearm_timer(timer->delay_reg << 2, delay_ns);
>
> @@ -2783,7 +2793,7 @@ e1000e_set_itr(E1000ECore *core, int index, uint32_t val)
> trace_e1000e_irq_itr_set(val);
>
> core->itr_guest_value = interval;
> - core->mac[index] = MAX(interval, E1000E_MIN_XITR);
> + core->mac[index] = interval;
> }
>
> static void
> @@ -2795,7 +2805,7 @@ e1000e_set_eitr(E1000ECore *core, int index, uint32_t val)
> trace_e1000e_irq_eitr_set(eitr_num, val);
>
> core->eitr_guest_value[eitr_num] = interval;
> - core->mac[index] = MAX(interval, E1000E_MIN_XITR);
> + core->mac[index] = interval;
> }
>
> static void
> @@ -3444,8 +3454,7 @@ static const uint32_t e1000e_mac_reg_init[] = {
> [FACTPS] = E1000_FACTPS_LAN0_ON | 0x20000000,
> [SWSM] = 1,
> [RXCSUM] = E1000_RXCSUM_IPOFLD | E1000_RXCSUM_TUOFLD,
> - [ITR] = E1000E_MIN_XITR,
> - [EITR...EITR + E1000E_MSIX_VEC_NUM - 1] = E1000E_MIN_XITR,
> + [ITR] = E1000E_DEFAULT_ITR,
I think you can change the initial ITR value as you do for EITR.
> };
>
> static void e1000e_reset(E1000ECore *core, bool sw)
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 5/8] net/igb: Fix interrupt throttling interval calculation
2025-04-11 4:31 ` [PATCH v2 5/8] net/igb: Fix interrupt throttling interval calculation Nicholas Piggin
@ 2025-04-19 7:22 ` Akihiko Odaki
0 siblings, 0 replies; 14+ messages in thread
From: Akihiko Odaki @ 2025-04-19 7:22 UTC (permalink / raw)
To: Nicholas Piggin
Cc: qemu-devel, Dmitry Fleytman, Jason Wang, Sriram Yagnaraman,
Fabiano Rosas, Laurent Vivier, Paolo Bonzini
On 2025/04/11 13:31, Nicholas Piggin wrote:
> IGB throttling granularity is 1us, and interval field is in bits 2..14
> of the EITRx registers.
>
> Fixes: 3a977deebe6b ("Intrdocue igb device emulation")
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> hw/net/igb_regs.h | 3 +++
> hw/net/igb_core.c | 7 ++++---
> 2 files changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/hw/net/igb_regs.h b/hw/net/igb_regs.h
> index 4dc4c31da27..1ed5ee5039a 100644
> --- a/hw/net/igb_regs.h
> +++ b/hw/net/igb_regs.h
> @@ -146,6 +146,9 @@ union e1000_adv_rx_desc {
> #define IGB_82576_VF_DEV_ID 0x10CA
> #define IGB_I350_VF_DEV_ID 0x1520
>
> +/* Delay increments in nanoseconds for interrupt throttling registers */
> +#define IGB_INTR_THROTTLING_NS_RES (1000)
Let's use the SCALE_US definition.
> +
> /* VLAN info */
> #define IGB_TX_FLAGS_VLAN_MASK 0xffff0000
> #define IGB_TX_FLAGS_VLAN_SHIFT 16
> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
> index 5d169c059d9..8fcc872a7c0 100644
> --- a/hw/net/igb_core.c
> +++ b/hw/net/igb_core.c
> @@ -142,8 +142,9 @@ static void igb_msix_notify(IGBCore *core, unsigned int cause)
> static inline void
> igb_intrmgr_rearm_timer(IGBIntrDelayTimer *timer)
> {
> - int64_t delay_ns = (int64_t) timer->core->mac[timer->delay_reg] *
> - timer->delay_resolution_ns;
> + int64_t delay_ns =
> + (int64_t)((timer->core->mac[timer->delay_reg] & 0x7FFC) >> 2) *
> + timer->delay_resolution_ns;
>
> trace_e1000e_irq_rearm_timer(timer->delay_reg << 2, delay_ns);
>
> @@ -180,7 +181,7 @@ igb_intrmgr_initialize_all_timers(IGBCore *core, bool create)
> for (i = 0; i < IGB_INTR_NUM; i++) {
> core->eitr[i].core = core;
> core->eitr[i].delay_reg = EITR0 + i;
> - core->eitr[i].delay_resolution_ns = E1000_INTR_DELAY_NS_RES;
> + core->eitr[i].delay_resolution_ns = IGB_INTR_THROTTLING_NS_RES;
> }
>
> if (!create) {
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 6/8] net/igb: Implement EITR Moderation Counter
2025-04-11 4:31 ` [PATCH v2 6/8] net/igb: Implement EITR Moderation Counter Nicholas Piggin
@ 2025-04-19 7:40 ` Akihiko Odaki
0 siblings, 0 replies; 14+ messages in thread
From: Akihiko Odaki @ 2025-04-19 7:40 UTC (permalink / raw)
To: Nicholas Piggin
Cc: qemu-devel, Dmitry Fleytman, Jason Wang, Sriram Yagnaraman,
Fabiano Rosas, Laurent Vivier, Paolo Bonzini
On 2025/04/11 13:31, Nicholas Piggin wrote:
> IGB EITR registers have counter fields which reflect the current ITR
> and LLI counter values, as well as a bit to enable LLI moderation,
> and a bit to write the register without modifying the counter fields.
>
> Implement the EITR Moderation Counter (aka EITR counter), and counter
> ignore bit. The EITR counter is the time remaining in the interrupt
> moderation delay which is implemented as a QEMU timer.
>
> Log an unimp message if software tries to enable LLI moderation.
>
> Add a note about the problem with reloading timers after migration.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> hw/net/igb_regs.h | 8 +++++--
> hw/net/igb_core.c | 54 ++++++++++++++++++++++++++++++++++++++++-------
> 2 files changed, 52 insertions(+), 10 deletions(-)
>
> diff --git a/hw/net/igb_regs.h b/hw/net/igb_regs.h
> index 1ed5ee5039a..b612248264a 100644
> --- a/hw/net/igb_regs.h
> +++ b/hw/net/igb_regs.h
> @@ -321,8 +321,12 @@ union e1000_adv_rx_desc {
> #define E1000_EICR_TX_QUEUE3 0x00000800 /* Tx Queue 3 Interrupt */
> #define E1000_EICR_OTHER 0x80000000 /* Interrupt Cause Active */
>
> -/* Extended Interrupt Cause Set */
> -/* E1000_EITR_CNT_IGNR is only for 82576 and newer */
> +/* Extended Interrupt Throttle */
> +/* These are only for 82576 and newer */
> +#define E1000_EITR_INTERVAL 0x00007FFC
> +#define E1000_EITR_LLI_EN 0x00008000
> +#define E1000_EITR_LLI_CNT 0x001F0000
> +#define E1000_EITR_ITR_CNT 0x7FE00000
This part is copied from Linux. Please put the new definitions after the
line saying: /* new */
> #define E1000_EITR_CNT_IGNR 0x80000000 /* Don't reset counters on write */
>
> #define E1000_TSYNCTXCTL_VALID 0x00000001 /* tx timestamp valid */
> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
> index 8fcc872a7c0..3ae3e53530b 100644
> --- a/hw/net/igb_core.c
> +++ b/hw/net/igb_core.c
> @@ -140,12 +140,8 @@ static void igb_msix_notify(IGBCore *core, unsigned int cause)
> }
>
> static inline void
> -igb_intrmgr_rearm_timer(IGBIntrDelayTimer *timer)
> +igb_intrmgr_arm_timer(IGBIntrDelayTimer *timer, int64_t delay_ns)
> {
> - int64_t delay_ns =
> - (int64_t)((timer->core->mac[timer->delay_reg] & 0x7FFC) >> 2) *
> - timer->delay_resolution_ns;
> -
> trace_e1000e_irq_rearm_timer(timer->delay_reg << 2, delay_ns);
>
> timer_mod(timer->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + delay_ns);
> @@ -153,6 +149,16 @@ igb_intrmgr_rearm_timer(IGBIntrDelayTimer *timer)
> timer->running = true;
> }
>
> +static inline void
> +igb_intrmgr_rearm_timer(IGBIntrDelayTimer *timer)
> +{
> + uint32_t interval = (timer->core->mac[timer->delay_reg] &
> + E1000_EITR_INTERVAL) >> 2;
> + int64_t delay_ns = (int64_t)interval * timer->delay_resolution_ns;
> +
> + igb_intrmgr_arm_timer(timer, delay_ns);
> +}
> +
> static void
> igb_intmgr_timer_resume(IGBIntrDelayTimer *timer)
> {
> @@ -2881,7 +2887,21 @@ igb_mac_swsm_read(IGBCore *core, int index)
> static uint32_t
> igb_mac_eitr_read(IGBCore *core, int index)
> {
> - return core->mac[index - EITR0];
> + uint32_t eitr_num = index - EITR0;
> + uint32_t val = core->mac[eitr_num];
> + IGBIntrDelayTimer *timer = &core->eitr[eitr_num];
> +
> + if (timer->running) { /* timer is pending, find time remaining */
> + int64_t remains = timer_expire_time_ns(timer->timer) -
> + qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
> + if (remains > 0) {
> + uint32_t cnt; /* CNT is the most significant 10 of 12 bits */
> + cnt = remains / timer->delay_resolution_ns;
> + val |= ((cnt >> 2) << 21) & E1000_EITR_ITR_CNT;
> + }
> + }
> +
> + return val;
> }
>
> static uint32_t igb_mac_vfmailbox_read(IGBCore *core, int index)
> @@ -3047,7 +3067,23 @@ igb_set_eitr(IGBCore *core, int index, uint32_t val)
>
> trace_igb_irq_eitr_set(eitr_num, val);
>
> - core->mac[index] = val & 0x7FFE;
> + if (val & (E1000_EITR_LLI_EN | E1000_EITR_LLI_CNT)) {
> + qemu_log_mask(LOG_UNIMP, "%s: LLI moderation not supported, ignoring\n",
> + __func__);
> + }
> +
> + if (!(val & E1000_EITR_CNT_IGNR)) {
> + IGBIntrDelayTimer *timer = &core->eitr[eitr_num];
> + uint32_t itr_cnt = (val & E1000_EITR_ITR_CNT) >> 21;
> + /* CNT is the most significant 10 of 12 bits */
> + uint64_t ns = (itr_cnt << 2) * timer->delay_resolution_ns;
> +
> + igb_intrmgr_arm_timer(timer, ns);
> + }
> +
> + val &= E1000_EITR_INTERVAL | E1000_EITR_LLI_EN;
> +
> + core->mac[index] = val;
> }
>
> static void
> @@ -4553,7 +4589,9 @@ igb_core_post_load(IGBCore *core)
>
> /*
> * we need to restart intrmgr timers, as an older version of
> - * QEMU can have stopped them before migration
> + * QEMU can have stopped them before migration.
> + * XXX: re-setting timers with fresh values breaks deterministic
> + * replay.
> */
> igb_intrmgr_resume(core);
> igb_autoneg_resume(core);
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 7/8] net/e1000e|igb: Fix interrupt throttling logic
2025-04-11 4:31 ` [PATCH v2 7/8] net/e1000e|igb: Fix interrupt throttling logic Nicholas Piggin
@ 2025-04-19 7:55 ` Akihiko Odaki
0 siblings, 0 replies; 14+ messages in thread
From: Akihiko Odaki @ 2025-04-19 7:55 UTC (permalink / raw)
To: Nicholas Piggin
Cc: qemu-devel, Dmitry Fleytman, Jason Wang, Sriram Yagnaraman,
Fabiano Rosas, Laurent Vivier, Paolo Bonzini
On 2025/04/11 13:31, Nicholas Piggin wrote:
> Interrupt throttling is broken in several ways:
> - Timer expiry sends an interrupt even if there is no cause.
> - (e1000e) Mitigated interrupts still auto-clear cause bits.
> - Timer expiry that results in an interrupt does not re-arm the timer so
> an interrupt can appear immediately after the timer expiry interrupt.
>
> To fix:
>
> - When the throttle timer expires, check the cause bits corresponding to
> the msix vector before sending an irq.
> - (e1000e) Skip the auto-clear logic if an interrupt is delayed, and
> send delayed irqs using e1000e_msix_notify() to perform auto-clear.
> - Re-load the throttle timer when a delayed interrupt is signaled. e1000e
> gets this by signaling them with e1000e_msix_notify(), igb calls
> igb_intrmgr_rearm_timer() directly.
Please split this patch into independent changes.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> hw/net/e1000e_core.c | 59 +++++++++++++++++++++++++++++++++++++++-----
> hw/net/igb_core.c | 50 ++++++++++++++++++++++++-------------
> 2 files changed, 86 insertions(+), 23 deletions(-)
>
> diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
> index f8e6522f810..6fb8da32e4d 100644
> --- a/hw/net/e1000e_core.c
> +++ b/hw/net/e1000e_core.c
> @@ -178,16 +178,62 @@ e1000e_intrmgr_on_throttling_timer(void *opaque)
> }
> }
>
> +static uint32_t find_msix_causes(E1000ECore *core, int vec)
> +{
> + uint32_t causes = 0;
> + uint32_t int_cfg;
> +
> + int_cfg = E1000_IVAR_RXQ0(core->mac[IVAR]);
> + if (E1000_IVAR_ENTRY_VALID(int_cfg) &&
> + E1000_IVAR_ENTRY_VEC(int_cfg) == vec) {
> + causes |= E1000_ICR_RXQ0;
> + }
> +
> + int_cfg = E1000_IVAR_RXQ1(core->mac[IVAR]);
> + if (E1000_IVAR_ENTRY_VALID(int_cfg) &&
> + E1000_IVAR_ENTRY_VEC(int_cfg) == vec) {
> + causes |= E1000_ICR_RXQ1;
> + }
> +
> + int_cfg = E1000_IVAR_TXQ0(core->mac[IVAR]);
> + if (E1000_IVAR_ENTRY_VALID(int_cfg) &&
> + E1000_IVAR_ENTRY_VEC(int_cfg) == vec) {
> + causes |= E1000_ICR_TXQ0;
> + }
> +
> + int_cfg = E1000_IVAR_TXQ1(core->mac[IVAR]);
> + if (E1000_IVAR_ENTRY_VALID(int_cfg) &&
> + E1000_IVAR_ENTRY_VEC(int_cfg) == vec) {
> + causes |= E1000_ICR_TXQ1;
> + }
> +
> + int_cfg = E1000_IVAR_OTHER(core->mac[IVAR]);
> + if (E1000_IVAR_ENTRY_VALID(int_cfg) &&
> + E1000_IVAR_ENTRY_VEC(int_cfg) == vec) {
> + causes |= E1000_ICR_OTHER;
> + }
> +
> + return causes;
> +}
> +
> +static void
> +e1000e_msix_notify(E1000ECore *core, uint32_t causes);
> +
> static void
> e1000e_intrmgr_on_msix_throttling_timer(void *opaque)
> {
> E1000IntrDelayTimer *timer = opaque;
> - int idx = timer - &timer->core->eitr[0];
> + E1000ECore *core = timer->core;
> + int idx = timer - &core->eitr[0];
> + uint32_t causes;
>
> timer->running = false;
>
> - trace_e1000e_irq_msix_notify_postponed_vec(idx);
> - msix_notify(timer->core->owner, idx);
> + causes = find_msix_causes(core, idx) & core->mac[IMS] & core->mac[ICR];
> + if (causes) {
> + trace_e1000e_irq_msix_notify_postponed_vec(idx);
> + e1000e_msix_notify(core, causes);
> + }
> }
>
> static void
> @@ -1992,10 +2038,11 @@ e1000e_msix_notify_one(E1000ECore *core, uint32_t cause, uint32_t int_cfg)
> if (E1000_IVAR_ENTRY_VALID(int_cfg)) {
> uint32_t vec = E1000_IVAR_ENTRY_VEC(int_cfg);
> if (vec < E1000E_MSIX_VEC_NUM) {
> - if (!e1000e_eitr_should_postpone(core, vec)) {
> - trace_e1000e_irq_msix_notify_vec(vec);
> - msix_notify(core->owner, vec);
> + if (e1000e_eitr_should_postpone(core, vec)) {
> + return;
> }
> + trace_e1000e_irq_msix_notify_vec(vec);
> + msix_notify(core->owner, vec);
> } else {
> trace_e1000e_wrn_msix_vec_wrong(cause, int_cfg);
> }
> diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
> index 3ae3e53530b..cc25a1d5baa 100644
> --- a/hw/net/igb_core.c
> +++ b/hw/net/igb_core.c
> @@ -152,11 +152,14 @@ igb_intrmgr_arm_timer(IGBIntrDelayTimer *timer, int64_t delay_ns)
> static inline void
> igb_intrmgr_rearm_timer(IGBIntrDelayTimer *timer)
> {
> - uint32_t interval = (timer->core->mac[timer->delay_reg] &
> - E1000_EITR_INTERVAL) >> 2;
> - int64_t delay_ns = (int64_t)interval * timer->delay_resolution_ns;
> + uint32_t eitr = timer->core->mac[timer->delay_reg];
>
> - igb_intrmgr_arm_timer(timer, delay_ns);
> + if (eitr != 0) {
> + uint32_t interval = (eitr & E1000_EITR_INTERVAL) >> 2;
> + int64_t delay_ns = (int64_t)interval * timer->delay_resolution_ns;
> +
> + igb_intrmgr_arm_timer(timer, delay_ns);
> + }
> }
>
> static void
> @@ -168,16 +171,7 @@ igb_intmgr_timer_resume(IGBIntrDelayTimer *timer)
> }
>
> static void
> -igb_intrmgr_on_msix_throttling_timer(void *opaque)
> -{
> - IGBIntrDelayTimer *timer = opaque;
> - int idx = timer - &timer->core->eitr[0];
> -
> - timer->running = false;
> -
> - trace_e1000e_irq_msix_notify_postponed_vec(idx);
> - igb_msix_notify(timer->core, idx);
> -}
> +igb_intrmgr_on_msix_throttling_timer(void *opaque);
>
> static void
> igb_intrmgr_initialize_all_timers(IGBCore *core, bool create)
> @@ -2253,9 +2247,7 @@ igb_postpone_interrupt(IGBIntrDelayTimer *timer)
> return true;
> }
>
> - if (timer->core->mac[timer->delay_reg] != 0) {
> - igb_intrmgr_rearm_timer(timer);
> - }
> + igb_intrmgr_rearm_timer(timer);
>
> return false;
> }
> @@ -2279,6 +2271,30 @@ static void igb_send_msix(IGBCore *core, uint32_t causes)
> }
> }
>
> +static void
> +igb_intrmgr_on_msix_throttling_timer(void *opaque)
> +{
> + IGBIntrDelayTimer *timer = opaque;
> + IGBCore *core = timer->core;
> + int vector = timer - &core->eitr[0];
> + uint32_t causes;
> +
> + timer->running = false;
> +
> + causes = core->mac[EICR] & core->mac[EIMS];
> + if (causes & BIT(vector)) {
> + /*
> + * The moderation counter is loaded with interval value whenever the
> + * interrupt is signaled. This includes when the interrupt is signaled
> + * by the counter reaching 0.
> + */
> + igb_intrmgr_rearm_timer(timer);
> +
> + trace_e1000e_irq_msix_notify_postponed_vec(vector);
> + igb_msix_notify(core, vector);
> + }
> +}
> +
> static inline void
> igb_fix_icr_asserted(IGBCore *core)
> {
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2025-04-19 7:56 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-11 4:31 [PATCH v2 0/8] hw/e1000e|igb: interrupts and qtests fixes Nicholas Piggin
2025-04-11 4:31 ` [PATCH v2 1/8] qtest/e1000e|igb: Clear interrupt-cause and msix pending bits after irq Nicholas Piggin
2025-04-11 4:31 ` [PATCH v2 2/8] net/e1000e: Permit disabling interrupt throttling Nicholas Piggin
2025-04-19 7:15 ` Akihiko Odaki
2025-04-11 4:31 ` [PATCH v2 3/8] hw/net/e1000e|igb: Remove xitr_guest_value logic Nicholas Piggin
2025-04-19 7:12 ` Akihiko Odaki
2025-04-11 4:31 ` [PATCH v2 4/8] qtest/e1000e|igb: assert irqs are clear before triggering an irq Nicholas Piggin
2025-04-11 4:31 ` [PATCH v2 5/8] net/igb: Fix interrupt throttling interval calculation Nicholas Piggin
2025-04-19 7:22 ` Akihiko Odaki
2025-04-11 4:31 ` [PATCH v2 6/8] net/igb: Implement EITR Moderation Counter Nicholas Piggin
2025-04-19 7:40 ` Akihiko Odaki
2025-04-11 4:31 ` [PATCH v2 7/8] net/e1000e|igb: Fix interrupt throttling logic Nicholas Piggin
2025-04-19 7:55 ` Akihiko Odaki
2025-04-11 4:31 ` [PATCH v2 8/8] qtest/e1000e|igb: Test interrupt throttling in multiple_transfers test Nicholas Piggin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).