public inbox for linux-usb@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v11 0/2] usb: xhci: route device to secondary interrupters
@ 2026-01-27  2:34 raoxu
  2026-01-27  2:38 ` [PATCH v11 1/2] usb: xhci: refactor IRQ/interrupter plumbing for multi-vector support raoxu
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: raoxu @ 2026-01-27  2:34 UTC (permalink / raw)
  To: mathias.nyman
  Cc: gregkh, linux-usb, michal.pecio, niklas.neronin, raoxu, zhanjun,
	kenny

From: Xu Rao <raoxu@uniontech.com>

This series is split into two steps: patch 1/2 refactors xHCI IRQ and
interrupter handling to make multi-vector operation possible without
changing behavior; patch 2/2 then enables a small capped set of secondary
interrupters/vectors and routes transfer completions per device (slot) to
reduce contention on interrupter 0.

This is not about increasing USB link throughput, but about avoiding a
driver-imposed single hot spot. On hosts that already provide multiple
MSI/MSI-X vectors and independent event rings, routing all completions
through interrupter 0 creates unnecessary contention (serialized event
handling/locks and coupled moderation), increasing CPU cost and tail
latency under many active devices/endpoints. Using secondary interrupters
simply matches the hardware's design, similar in spirit to merged
xHCI-sideband work: exploit available parallel paths rather than
funneling all events through one.

Xu Rao (2):
  usb: xhci: refactor IRQ/interrupter plumbing for multi-vector support
  usb: xhci: enable secondary interrupters and route transfers per slot

Signed-off-by: Xu Rao <raoxu@uniontech.com>
---
Changelog:
v1 -> v2:
  - Bind interrupters to endpoints at enable time instead of selecting
    per transfer.
  - Store the selected interrupter in struct xhci_virt_ep and program
    TRB_INTR_TARGET() from the bound interrupter.
  - Use a single IRQ handler for both primary and secondary vectors,
    with STS_EINT handling restricted to interrupter 0.
  - Keep a common dev_id for IRQ registration to match the existing
    xhci_cleanup_msix() teardown constraints and avoid dev_id
    lifetime issues.
  - Clarify secondary interrupter teardown to avoid double-free or
    use-after-free during xHCI removal.
v2 -> v3:
  - modify commit information
v3 -> v4:
  - Bind interrupters per USB device (slot) via struct xhci_virt_device,
    program TRB_INTR_TARGET() from vdev->interrupter for bulk/ctrl/isoc.
  - Drop xfer_interrupters and unify on xhci->interrupters[] for both
    primary and secondary event rings and IRQ routing.
  - Allocate secondary interrupters in xhci_mem_init; on any allocation
    failure, rollback and continue with primary interrupter only.
  - Cap secondary interrupter creation with MAX_SECONDARY_INTRNUM,
    defaulting to 4.
  - Route xhci_msi_irq by irq handler_data token (intr_num + 1) to keep
    correct interrupter selection across resume/power_lost.
  - Apply MSI-X affinity hints for secondary vectors.
v4 -> v5:
  - Fix min() signedness build error reported by 0day CI.
  - Rebase onto v6.19-rc2.
v5 -> v6:
  - Route secondary MSI/MSI-X IRQs by storing struct xhci_interrupter in
    irq handler_data, instead of using an (intr_num + 1) token mapping.
  - Program Slot Context Interrupter Target (tt_info[31:22]) from
    vdev->interrupter to keep slot default routing aligned with TRB
    TRB_INTR_TARGET() selection.
v6 -> v7:
  - Add xhci_quiesce_interrupter() and use it for secondary
    interrupters in xhci_stop() and the power_lost path of xhci_resume(),
    ensuring IMAN.IP (RW1C) and ERDP.EHB are properly cleared.
v7 -> v8:
  - Sync secondary MSI/MSI-X vectors in xhci_msix_sync_irqs() with
    synchronize_irq().
  - Fix build errors by adding missing header includes for the IRQ helper APIs.
v8 -> v9:
  - Use PCI_IRQ_AFFINITY to let PCI core spread MSI/MSI-X vectors across CPUs.
  - Route each MSI/MSI-X vector to its interrupter via per-vector irq_ctx dev_id.
  - Fix modpost error: xhci_msix_set_handler_data undefined (0-day CI).
    https://lore.kernel.org/oe-kbuild-all/202601171743.omq3DpnM-lkp@intel.com/
  - Rebase onto v6.19-rc6.
v9 -> v10:
  - refactor IRQ/interrupter plumbing for multi-vector support.
  - add xhci_handle_slot_secondary_events to handle secondary event ring.
v10 -> v11:
  - Fix build warnings:-Wsometimes-uninitialized.
    Closes: https://lore.kernel.org/oe-kbuild-all/202601262208.UybEjc9X-lkp@intel.com/
---
 drivers/usb/host/xhci-mem.c  |  48 +++++++++++
 drivers/usb/host/xhci-pci.c  |  57 ++++++++++---
 drivers/usb/host/xhci-ring.c | 156 +++++++++++++++++++++++++++++------
 drivers/usb/host/xhci.c      |  37 ++++++---
 drivers/usb/host/xhci.h      |  26 +++++-
 5 files changed, 274 insertions(+), 50 deletions(-)
---

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v11 1/2] usb: xhci: refactor IRQ/interrupter plumbing for multi-vector support
  2026-01-27  2:34 [PATCH v11 0/2] usb: xhci: route device to secondary interrupters raoxu
@ 2026-01-27  2:38 ` raoxu
  2026-01-27 12:54   ` Neronin, Niklas
  2026-01-27  2:39 ` [PATCH v11 2/2] usb: xhci: enable secondary interrupters and route transfers per slot raoxu
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 18+ messages in thread
From: raoxu @ 2026-01-27  2:38 UTC (permalink / raw)
  To: raoxu
  Cc: gregkh, kenny, linux-usb, mathias.nyman, michal.pecio,
	niklas.neronin, zhanjun

From: Xu Rao <raoxu@uniontech.com>

Prepare xHCI for multiple IRQ vectors/interrupters without changing the
current behavior.

Introduce a per-vector irq context (hcd + intr_num) to use as request_irq()
dev_id, and track the active vector count in xhci->irqs_enabled.  Use this
single bound to enable/disable interrupters consistently across run/stop/
resume and to sync/free IRQs.

Legacy IRQ fallback also keeps irqs_enabled >= 1 so interrupter 0 remains
functional when MSI/MSI-X is unavailable.

No functional change intended: still uses interrupter 0 only.

Signed-off-by: Xu Rao <raoxu@uniontech.com>

---
 drivers/usb/host/xhci-mem.c  | 17 +++++++++++++++++
 drivers/usb/host/xhci-pci.c  | 34 +++++++++++++++++++++++++---------
 drivers/usb/host/xhci-ring.c |  9 +++++++--
 drivers/usb/host/xhci.c      | 23 +++++++++++++----------
 drivers/usb/host/xhci.h      | 16 +++++++++++++++-
 5 files changed, 77 insertions(+), 22 deletions(-)

diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index c708bdd69f16..524c58a149d3 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -2402,6 +2402,7 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t flags)
 {
 	struct device	*dev = xhci_to_hcd(xhci)->self.sysdev;
 	dma_addr_t	dma;
+	unsigned int    i;

 	/*
 	 * xHCI section 5.4.6 - Device Context array must be
@@ -2495,6 +2496,22 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t flags)
 	if (!xhci->interrupters[0])
 		goto fail;

+	/* Since only the main interrupt is used, secondary_irqs_alloc is set to 0. */
+	xhci->secondary_irqs_alloc = 0;
+
+	/*
+	 * Initialize all allocated interrupters here.
+	 * Use allocated count as loop bound to avoid touching non-allocated
+	 * or non-operational interrupters.
+	 */
+	xhci_dbg_trace(xhci, trace_xhci_dbg_init, "Add allocated interrupters");
+	for (i = 0; i < xhci->secondary_irqs_alloc + 1; i++) {
+		if (!xhci->interrupters[i])
+			continue;
+		xhci_add_interrupter(xhci, i);
+		xhci->interrupters[i]->isoc_bei_interval = AVOID_BEI_INTERVAL_MAX;
+	}
+
 	if (scratchpad_alloc(xhci, flags))
 		goto fail;

diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 585b2f3117b0..b3efd6d8fd9c 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -116,13 +116,12 @@ static const struct xhci_driver_overrides xhci_pci_overrides __initconst = {
 static void xhci_msix_sync_irqs(struct xhci_hcd *xhci)
 {
 	struct usb_hcd *hcd = xhci_to_hcd(xhci);
+	struct pci_dev *pdev = to_pci_dev(hcd->self.controller);
+	int i;

-	if (hcd->msix_enabled) {
-		struct pci_dev *pdev = to_pci_dev(hcd->self.controller);
-
-		/* for now, the driver only supports one primary interrupter */
-		synchronize_irq(pci_irq_vector(pdev, 0));
-	}
+	if (hcd->msix_enabled)
+		for (i = 0; i < xhci->irqs_enabled; i++)
+			synchronize_irq(pci_irq_vector(pdev, i));
 }

 /* Legacy IRQ is freed by usb_remove_hcd() or usb_hcd_pci_shutdown() */
@@ -130,11 +129,17 @@ static void xhci_cleanup_msix(struct xhci_hcd *xhci)
 {
 	struct usb_hcd *hcd = xhci_to_hcd(xhci);
 	struct pci_dev *pdev = to_pci_dev(hcd->self.controller);
+	int i;

 	if (hcd->irq > 0)
 		return;

-	free_irq(pci_irq_vector(pdev, 0), xhci_to_hcd(xhci));
+	for (i = 0; i < xhci->irqs_enabled; i++)
+		free_irq(pci_irq_vector(pdev, i), &xhci->irq_ctx[i]);
+
+	xhci->irqs_enabled = 0;
+	memset(xhci->irq_ctx, 0, sizeof(xhci->irq_ctx));
+
 	pci_free_irq_vectors(pdev);
 	hcd->msix_enabled = 0;
 }
@@ -145,6 +150,7 @@ static int xhci_try_enable_msi(struct usb_hcd *hcd)
 	struct pci_dev *pdev = to_pci_dev(hcd->self.controller);
 	struct xhci_hcd *xhci = hcd_to_xhci(hcd);
 	int ret;
+	struct xhci_irq_ctx *ctx;

 	/*
 	 * Some Fresco Logic host controllers advertise MSI, but fail to
@@ -174,11 +180,17 @@ static int xhci_try_enable_msi(struct usb_hcd *hcd)
 		goto legacy_irq;
 	}

-	ret = request_irq(pci_irq_vector(pdev, 0), xhci_msi_irq, 0, "xhci_hcd",
-			  xhci_to_hcd(xhci));
+	memset(xhci->irq_ctx, 0, sizeof(xhci->irq_ctx));
+	xhci->irqs_enabled = 0;
+
+	ctx = &xhci->irq_ctx[0];
+	ctx->hcd = hcd;
+	ctx->intr_num = 0;
+	ret = request_irq(pci_irq_vector(pdev, 0), xhci_msi_irq, 0, "xhci_hcd", ctx);
 	if (ret)
 		goto free_irq_vectors;

+	xhci->irqs_enabled = 1;
 	hcd->msi_enabled = 1;
 	hcd->msix_enabled = pdev->msix_enabled;
 	return 0;
@@ -186,6 +198,7 @@ static int xhci_try_enable_msi(struct usb_hcd *hcd)
 free_irq_vectors:
 	xhci_dbg_trace(xhci, trace_xhci_dbg_init, "disable %s interrupt",
 		       pdev->msix_enabled ? "MSI-X" : "MSI");
+	xhci->irqs_enabled = 0;
 	pci_free_irq_vectors(pdev);

 legacy_irq:
@@ -198,6 +211,9 @@ static int xhci_try_enable_msi(struct usb_hcd *hcd)
 		snprintf(hcd->irq_descr, sizeof(hcd->irq_descr), "%s:usb%d",
 			 hcd->driver->description, hcd->self.busnum);

+	/* legacy IRQ path still needs interrupter 0 */
+	xhci->irqs_enabled = 1;
+
 	/* fall back to legacy interrupt */
 	ret = request_irq(pdev->irq, &usb_hcd_irq, IRQF_SHARED, hcd->irq_descr, hcd);
 	if (ret) {
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 9315ba18310d..3ea134c07c5f 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3220,9 +3220,14 @@ irqreturn_t xhci_irq(struct usb_hcd *hcd)
 	return ret;
 }

-irqreturn_t xhci_msi_irq(int irq, void *hcd)
+irqreturn_t xhci_msi_irq(int irq, void *data)
 {
-	return xhci_irq(hcd);
+	struct xhci_irq_ctx *ctx = data;
+
+	/* For now only vector 0 is requested; keep behavior unchanged. */
+	if (!ctx || !ctx->hcd)
+		return IRQ_NONE;
+	return xhci_irq(ctx->hcd);
 }
 EXPORT_SYMBOL_GPL(xhci_msi_irq);

diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index b3ba16b9718c..cbf96cb51583 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -576,10 +576,6 @@ static int xhci_init(struct usb_hcd *hcd)
 	/* Set USB 3.0 device notifications for function remote wake */
 	xhci_set_dev_notifications(xhci);

-	/* Initialize the Primary interrupter */
-	xhci_add_interrupter(xhci, 0);
-	xhci->interrupters[0]->isoc_bei_interval = AVOID_BEI_INTERVAL_MAX;
-
 	/* Initializing Compliance Mode Recovery Data If Needed */
 	if (xhci_compliance_mode_recovery_timer_quirk_check()) {
 		xhci->quirks |= XHCI_COMP_MODE_QUIRK;
@@ -594,9 +590,9 @@ static int xhci_init(struct usb_hcd *hcd)

 static int xhci_run_finished(struct xhci_hcd *xhci)
 {
-	struct xhci_interrupter *ir = xhci->interrupters[0];
 	unsigned long	flags;
 	u32		temp;
+	int		i;

 	/*
 	 * Enable interrupts before starting the host (xhci 4.2 and 5.5.2).
@@ -609,8 +605,10 @@ static int xhci_run_finished(struct xhci_hcd *xhci)
 	temp |= (CMD_EIE);
 	writel(temp, &xhci->op_regs->command);

-	xhci_dbg_trace(xhci, trace_xhci_dbg_init, "Enable primary interrupter");
-	xhci_enable_interrupter(ir);
+	xhci_dbg_trace(xhci, trace_xhci_dbg_init, "Enable all interrupters");
+	for (i = 0; i < xhci->irqs_enabled; i++)
+		if (xhci->interrupters[i])
+			xhci_enable_interrupter(xhci->interrupters[i]);

 	if (xhci_start(xhci)) {
 		xhci_halt(xhci);
@@ -707,7 +705,7 @@ void xhci_stop(struct usb_hcd *hcd)
 {
 	u32 temp;
 	struct xhci_hcd *xhci = hcd_to_xhci(hcd);
-	struct xhci_interrupter *ir = xhci->interrupters[0];
+	int i;

 	mutex_lock(&xhci->mutex);

@@ -742,7 +740,9 @@ void xhci_stop(struct usb_hcd *hcd)
 			"// Disabling event ring interrupts");
 	temp = readl(&xhci->op_regs->status);
 	writel((temp & ~0x1fff) | STS_EINT, &xhci->op_regs->status);
-	xhci_disable_interrupter(xhci, ir);
+	for (i = 0; i < xhci->irqs_enabled; i++)
+		if (xhci->interrupters[i])
+			xhci_disable_interrupter(xhci, xhci->interrupters[i]);

 	xhci_dbg_trace(xhci, trace_xhci_dbg_init, "cleaning up memory");
 	xhci_mem_cleanup(xhci);
@@ -1087,6 +1087,7 @@ int xhci_resume(struct xhci_hcd *xhci, bool power_lost, bool is_auto_resume)
 	bool			comp_timer_running = false;
 	bool			pending_portevent = false;
 	bool			suspended_usb3_devs = false;
+	int			i;

 	if (!hcd->state)
 		return 0;
@@ -1180,7 +1181,9 @@ int xhci_resume(struct xhci_hcd *xhci, bool power_lost, bool is_auto_resume)
 		xhci_dbg(xhci, "// Disabling event ring interrupts\n");
 		temp = readl(&xhci->op_regs->status);
 		writel((temp & ~0x1fff) | STS_EINT, &xhci->op_regs->status);
-		xhci_disable_interrupter(xhci, xhci->interrupters[0]);
+		for (i = 0; i < xhci->irqs_enabled; i++)
+			if (xhci->interrupters[i])
+				xhci_disable_interrupter(xhci, xhci->interrupters[i]);

 		xhci_dbg(xhci, "cleaning up memory\n");
 		xhci_mem_cleanup(xhci);
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 2b0796f6d00e..7707fd7564c5 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -45,6 +45,9 @@
  */
 #define MAX_HC_INTRS		128

+/* Software cap for secondary interrupters; not a hardware limit. */
+#define MAX_SECONDARY_INTRNUM	4
+
 /*
  * xHCI register interface.
  * This corresponds to the eXtensible Host Controller Interface (xHCI)
@@ -1497,6 +1500,11 @@ struct xhci_hub {
 	u8			min_rev;
 };

+struct xhci_irq_ctx {
+	struct usb_hcd	*hcd;
+	u8		intr_num;
+};
+
 /* There is one xhci_hcd structure per controller */
 struct xhci_hcd {
 	struct usb_hcd *main_hcd;
@@ -1533,6 +1541,12 @@ struct xhci_hcd {
 	/* data structures */
 	struct xhci_device_context_array *dcbaa;
 	struct xhci_interrupter **interrupters;
+	/* Number of secondary interrupters successfully allocated */
+	unsigned int		secondary_irqs_alloc;
+	/* Number of IRQ vectors successfully requested (includes vector 0) */
+	unsigned int		irqs_enabled;
+	/* MSI/MSI-X vector contexts. Vector 0 uses [0], secondary use [1..] */
+	struct xhci_irq_ctx	irq_ctx[MAX_SECONDARY_INTRNUM + 1];
 	struct xhci_ring	*cmd_ring;
 	unsigned int            cmd_ring_state;
 #define CMD_RING_STATE_RUNNING         (1 << 0)
@@ -1895,7 +1909,7 @@ int xhci_suspend(struct xhci_hcd *xhci, bool do_wakeup);
 int xhci_resume(struct xhci_hcd *xhci, bool power_lost, bool is_auto_resume);

 irqreturn_t xhci_irq(struct usb_hcd *hcd);
-irqreturn_t xhci_msi_irq(int irq, void *hcd);
+irqreturn_t xhci_msi_irq(int irq, void *data);
 int xhci_alloc_dev(struct usb_hcd *hcd, struct usb_device *udev);
 int xhci_alloc_tt_info(struct xhci_hcd *xhci,
 		struct xhci_virt_device *virt_dev,
--
2.50.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v11 2/2] usb: xhci: enable secondary interrupters and route transfers per slot
  2026-01-27  2:34 [PATCH v11 0/2] usb: xhci: route device to secondary interrupters raoxu
  2026-01-27  2:38 ` [PATCH v11 1/2] usb: xhci: refactor IRQ/interrupter plumbing for multi-vector support raoxu
@ 2026-01-27  2:39 ` raoxu
  2026-01-27  7:39   ` Greg KH
  2026-01-27 12:57   ` [PATCH v11 " Neronin, Niklas
  2026-01-27  7:33 ` [PATCH v11 0/2] usb: xhci: route device to secondary interrupters Greg KH
  2026-01-28 13:19 ` Kenneth Crudup
  3 siblings, 2 replies; 18+ messages in thread
From: raoxu @ 2026-01-27  2:39 UTC (permalink / raw)
  To: raoxu
  Cc: gregkh, kenny, linux-usb, mathias.nyman, michal.pecio,
	niklas.neronin, zhanjun

From: Xu Rao <raoxu@uniontech.com>

Some xHCI hosts expose multiple MSI/MSI-X vectors and support multiple
interrupters with independent event rings, but Linux commonly routes all
transfer completions through interrupter 0.

This is not about increasing USB link throughput, but about avoiding a
driver-imposed single hot spot. On hosts that already provide multiple
MSI/MSI-X vectors and independent event rings, routing all completions
through interrupter 0 creates unnecessary contention (serialized event
handling/locks and coupled moderation), increasing CPU cost and tail
latency under many active devices/endpoints. Using secondary interrupters
simply matches the hardware's design, similar in spirit to merged
xHCI-sideband work: exploit available parallel paths rather than
funneling all events through one.

Allocate a small capped set of secondary interrupters in xhci_mem_init()
(MAX_SECONDARY_INTRNUM, default 4) and request up to the matching number
of IRQ vectors (bounded by what PCI core provides).  Dispatch each vector
to its interrupter via the per-vector irq_ctx.

Route transfers per USB device (slot) by assigning vdev->interrupter at
device allocation time and programming the interrupter target (slot
context + TRB_INTR_TARGET) from that selection, so completions land on the
selected event ring.  Drain a slot's secondary event ring at selected
command completion boundaries to reduce late-event artifacts when teardown
happens on interrupter 0.

Signed-off-by: Xu Rao <raoxu@uniontech.com>

---
 drivers/usb/host/xhci-mem.c  |  35 +++++++-
 drivers/usb/host/xhci-pci.c  |  35 +++++---
 drivers/usb/host/xhci-ring.c | 151 ++++++++++++++++++++++++++++-------
 drivers/usb/host/xhci.c      |  14 ++++
 drivers/usb/host/xhci.h      |  10 +++
 5 files changed, 207 insertions(+), 38 deletions(-)

diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 524c58a149d3..cc9612873566 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -1197,6 +1197,15 @@ int xhci_setup_addressable_virt_dev(struct xhci_hcd *xhci, struct usb_device *ud

 	ep0_ctx->tx_info = cpu_to_le32(EP_AVG_TRB_LENGTH(8));

+	/* Match Slot Context interrupter target to vdev->interrupter. */
+	if (dev->interrupter) {
+		u32 tt = le32_to_cpu(slot_ctx->tt_info);
+
+		tt &= ~SLOT_INTR_TARGET_MASK;
+		tt |= SLOT_INTR_TARGET(dev->interrupter->intr_num);
+		slot_ctx->tt_info = cpu_to_le32(tt);
+	}
+
 	trace_xhci_setup_addressable_virt_device(dev);

 	/* Steps 7 and 8 were done in xhci_alloc_virt_device() */
@@ -2402,7 +2411,8 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t flags)
 {
 	struct device	*dev = xhci_to_hcd(xhci)->self.sysdev;
 	dma_addr_t	dma;
-	unsigned int    i;
+	unsigned int	secondary_intr_num = 0;
+	int		i;

 	/*
 	 * xHCI section 5.4.6 - Device Context array must be
@@ -2496,9 +2506,30 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t flags)
 	if (!xhci->interrupters[0])
 		goto fail;

-	/* Since only the main interrupt is used, secondary_irqs_alloc is set to 0. */
+	xhci_dbg_trace(xhci, trace_xhci_dbg_init,
+			"Allocating secondary event ring");
 	xhci->secondary_irqs_alloc = 0;

+	if (xhci->max_interrupters > 1)
+		secondary_intr_num = min_t(unsigned int,
+				xhci->max_interrupters - 1,
+				(unsigned int)MAX_SECONDARY_INTRNUM);
+
+	for (i = 1; i <= secondary_intr_num; i++) {
+		if (xhci->interrupters[i])
+			continue;
+		xhci->interrupters[i] = xhci_alloc_interrupter(xhci, i, flags);
+		if (!xhci->interrupters[i]) {
+			while (--i >= 1) {
+				xhci_free_interrupter(xhci, xhci->interrupters[i]);
+				xhci->interrupters[i] = NULL;
+			}
+			xhci->secondary_irqs_alloc = 0;
+			break;
+		}
+		xhci->secondary_irqs_alloc++;
+	}
+
 	/*
 	 * Initialize all allocated interrupters here.
 	 * Use allocated count as loop bound to avoid touching non-allocated
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index b3efd6d8fd9c..4ff3e29070b1 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -150,6 +150,8 @@ static int xhci_try_enable_msi(struct usb_hcd *hcd)
 	struct pci_dev *pdev = to_pci_dev(hcd->self.controller);
 	struct xhci_hcd *xhci = hcd_to_xhci(hcd);
 	int ret;
+	unsigned int irqs_num;
+	int i;
 	struct xhci_irq_ctx *ctx;

 	/*
@@ -173,7 +175,7 @@ static int xhci_try_enable_msi(struct usb_hcd *hcd)

 	/* TODO: Check with MSI Soc for sysdev */
 	xhci->nvecs = pci_alloc_irq_vectors(pdev, 1, xhci->nvecs,
-					    PCI_IRQ_MSIX | PCI_IRQ_MSI);
+					    PCI_IRQ_MSIX | PCI_IRQ_MSI | PCI_IRQ_AFFINITY);
 	if (xhci->nvecs < 0) {
 		xhci_dbg_trace(xhci, trace_xhci_dbg_init,
 			       "failed to allocate IRQ vectors");
@@ -183,14 +185,30 @@ static int xhci_try_enable_msi(struct usb_hcd *hcd)
 	memset(xhci->irq_ctx, 0, sizeof(xhci->irq_ctx));
 	xhci->irqs_enabled = 0;

-	ctx = &xhci->irq_ctx[0];
-	ctx->hcd = hcd;
-	ctx->intr_num = 0;
-	ret = request_irq(pci_irq_vector(pdev, 0), xhci_msi_irq, 0, "xhci_hcd", ctx);
-	if (ret)
-		goto free_irq_vectors;
+	/*
+	 * Request up to 1 + secondary_irqs_alloc vectors (vector 0 + secondary),
+	 * limited by what PCI core actually allocated.
+	 */
+	irqs_num = min_t(unsigned int,
+		     (unsigned int)xhci->nvecs,
+		     (unsigned int)(1 + xhci->secondary_irqs_alloc));
+
+	for (i = 0; i < irqs_num; i++) {
+		ctx = &xhci->irq_ctx[i];
+		ctx->hcd = hcd;
+		ctx->intr_num = i;
+		ret = request_irq(pci_irq_vector(pdev, i), xhci_msi_irq, 0,
+				  "xhci_hcd", ctx);
+		if (ret) {
+			while (--i >= 0)
+				free_irq(pci_irq_vector(pdev, i), &xhci->irq_ctx[i]);
+			xhci->irqs_enabled = 0;
+			memset(xhci->irq_ctx, 0, sizeof(xhci->irq_ctx));
+			goto free_irq_vectors;
+		}
+		xhci->irqs_enabled++;
+	}

-	xhci->irqs_enabled = 1;
 	hcd->msi_enabled = 1;
 	hcd->msix_enabled = pdev->msix_enabled;
 	return 0;
@@ -198,7 +216,6 @@ static int xhci_try_enable_msi(struct usb_hcd *hcd)
 free_irq_vectors:
 	xhci_dbg_trace(xhci, trace_xhci_dbg_init, "disable %s interrupt",
 		       pdev->msix_enabled ? "MSI-X" : "MSI");
-	xhci->irqs_enabled = 0;
 	pci_free_irq_vectors(pdev);

 legacy_irq:
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 3ea134c07c5f..f16339f47ed4 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -64,6 +64,9 @@ static int queue_command(struct xhci_hcd *xhci, struct xhci_command *cmd,
 			 u32 field1, u32 field2,
 			 u32 field3, u32 field4, bool command_must_succeed);

+static void xhci_handle_slot_secondary_events(struct xhci_hcd *xhci,
+		union xhci_trb *cmd_trb);
+
 /*
  * Returns zero if the TRB isn't in this segment, otherwise it returns the DMA
  * address of the TRB.
@@ -1859,6 +1862,9 @@ static void handle_cmd_completion(struct xhci_hcd *xhci,
 		}
 	}

+	/* Handle slot secondary events before command-specific teardown logic */
+	xhci_handle_slot_secondary_events(xhci, cmd_trb);
+
 	cmd_type = TRB_FIELD_TO_TYPE(le32_to_cpu(cmd_trb->generic.field[3]));
 	switch (cmd_type) {
 	case TRB_ENABLE_SLOT:
@@ -3138,6 +3144,72 @@ static int xhci_handle_events(struct xhci_hcd *xhci, struct xhci_interrupter *ir
 	return 0;
 }

+/*
+ * Handle a slot's secondary event ring at command completion boundaries.
+ *
+ * With secondary interrupters, transfer events may lag behind command
+ * completions (handled on interrupter 0). Commands that stop/reset/disable
+ * endpoints/slots can reclaim TD state before those transfer events are
+ * processed, leading to "Spurious event dma" reports. Call this from
+ * handle_cmd_completion() before command-specific completion handling.
+ */
+static void xhci_handle_slot_secondary_events(struct xhci_hcd *xhci,
+		union xhci_trb *cmd_trb)
+{
+	u32 field3, cmd_type, slot_id;
+	struct xhci_virt_device *vdev;
+	struct xhci_interrupter *sec_ir;
+	unsigned int intr;
+
+	/* No secondary routing -> nothing to do */
+	if (xhci->irqs_enabled <= 1)
+		return;
+
+	field3   = le32_to_cpu(cmd_trb->generic.field[3]);
+	cmd_type = TRB_FIELD_TO_TYPE(field3);
+	slot_id  = TRB_TO_SLOT_ID(field3);
+
+	if (!slot_id)
+		return;
+
+	/*
+	 * Handle only for commands that can stop/reset/disable endpoints/slots
+	 * and/or cause the driver to reclaim TD ownership.
+	 */
+	switch (cmd_type) {
+	case TRB_STOP_RING:
+	case TRB_SET_DEQ:
+	case TRB_RESET_EP:
+	case TRB_RESET_DEV:
+	case TRB_DISABLE_SLOT:
+		break;
+	case TRB_CONFIG_EP:
+		/* Only needed for deconfigure (drop endpoints) */
+		if (!(field3 & TRB_DC))
+			return;
+		break;
+	default:
+		return;
+	}
+
+	vdev = xhci->devs[slot_id];
+	if (!vdev)
+		return;
+
+	intr = vdev->interrupter->intr_num;
+	if (!intr)
+		return; /* slot is on primary interrupter */
+
+	sec_ir = xhci->interrupters[intr];
+	if (!sec_ir || !sec_ir->event_ring)
+		return;
+
+	lockdep_assert_held(&xhci->lock);
+
+	/* Handle pending events normally to complete URB/TD bookkeeping. */
+	xhci_handle_events(xhci, sec_ir, false);
+}
+
 /*
  * Move the event ring dequeue pointer to skip events kept in the secondary
  * event ring.  This is used to ensure that pending events in the ring are
@@ -3169,14 +3241,8 @@ void xhci_skip_sec_intr_events(struct xhci_hcd *xhci,
 	xhci_handle_events(xhci, ir, true);
 }

-/*
- * xHCI spec says we can get an interrupt, and if the HC has an error condition,
- * we might get bad data out of the event ring.  Section 4.10.2.7 has a list of
- * indicators of an event TRB error, but we check the status *first* to be safe.
- */
-irqreturn_t xhci_irq(struct usb_hcd *hcd)
+static irqreturn_t xhci_irq_ir(struct xhci_hcd *xhci, struct xhci_interrupter *ir)
 {
-	struct xhci_hcd *xhci = hcd_to_xhci(hcd);
 	irqreturn_t ret = IRQ_HANDLED;
 	u32 status;

@@ -3188,7 +3254,11 @@ irqreturn_t xhci_irq(struct usb_hcd *hcd)
 		goto out;
 	}

-	if (!(status & STS_EINT)) {
+	/*
+	 * STS_EINT is only meaningful for the primary interrupter (0).
+	 * Secondary vectors may interrupt without STS_EINT set.
+	 */
+	if (ir->intr_num == 0 && !(status & STS_EINT)) {
 		ret = IRQ_NONE;
 		goto out;
 	}
@@ -3204,30 +3274,52 @@ irqreturn_t xhci_irq(struct usb_hcd *hcd)
 		goto out;
 	}

-	/*
-	 * Clear the op reg interrupt status first,
-	 * so we can receive interrupts from other MSI-X interrupters.
-	 * Write 1 to clear the interrupt status.
-	 */
-	status |= STS_EINT;
-	writel(status, &xhci->op_regs->status);
+	if (ir->intr_num == 0) {
+		/*
+		 * Clear the op reg interrupt status first,
+		 * so we can receive interrupts from other MSI-X interrupters.
+		 * Write 1 to clear the interrupt status.
+		 */
+		status |= STS_EINT;
+		writel(status, &xhci->op_regs->status);
+	}

-	/* This is the handler of the primary interrupter */
-	xhci_handle_events(xhci, xhci->interrupters[0], false);
+	/* Handle events for this interrupter. */
+	xhci_handle_events(xhci, ir, false);
 out:
 	spin_unlock(&xhci->lock);

 	return ret;
 }

+/*
+ * xHCI spec says we can get an interrupt, and if the HC has an error condition,
+ * we might get bad data out of the event ring.  Section 4.10.2.7 has a list of
+ * indicators of an event TRB error, but we check the status *first* to be safe.
+ */
+irqreturn_t xhci_irq(struct usb_hcd *hcd)
+{
+	struct xhci_hcd *xhci = hcd_to_xhci(hcd);
+
+	return xhci_irq_ir(xhci, xhci->interrupters[0]);
+}
+
 irqreturn_t xhci_msi_irq(int irq, void *data)
 {
 	struct xhci_irq_ctx *ctx = data;
+	struct xhci_hcd *xhci;
+	struct xhci_interrupter *ir;

-	/* For now only vector 0 is requested; keep behavior unchanged. */
+	/* All MSI/MSI-X vectors use ctx (dev_id) to select interrupter.*/
 	if (!ctx || !ctx->hcd)
 		return IRQ_NONE;
-	return xhci_irq(ctx->hcd);
+
+	xhci = hcd_to_xhci(ctx->hcd);
+	ir = xhci->interrupters[ctx->intr_num];
+	if (!ir)
+		return IRQ_NONE;
+
+	return xhci_irq_ir(xhci, ir);
 }
 EXPORT_SYMBOL_GPL(xhci_msi_irq);

@@ -3629,6 +3721,7 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
 	int sent_len, ret;
 	u32 field, length_field, remainder;
 	u64 addr, send_addr;
+	struct xhci_interrupter *ir;

 	ring = xhci_urb_to_transfer_ring(xhci, urb);
 	if (!ring)
@@ -3669,6 +3762,7 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
 	start_trb = &ring->enqueue->generic;
 	start_cycle = ring->cycle_state;
 	send_addr = addr;
+	ir = xhci->devs[slot_id]->interrupter;

 	/* Queue the TRBs, even if they are zero-length */
 	for (enqd_len = 0; first_trb || enqd_len < full_len;
@@ -3729,7 +3823,7 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t mem_flags,

 		length_field = TRB_LEN(trb_buff_len) |
 			TRB_TD_SIZE(remainder) |
-			TRB_INTR_TARGET(0);
+			TRB_INTR_TARGET(ir->intr_num);

 		queue_trb(xhci, ring, more_trbs_coming | need_zero_pkt,
 				lower_32_bits(send_addr),
@@ -3761,7 +3855,7 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
 		urb_priv->td[1].end_trb = ring->enqueue;
 		urb_priv->td[1].end_seg = ring->enq_seg;
 		field = TRB_TYPE(TRB_NORMAL) | ring->cycle_state | TRB_IOC;
-		queue_trb(xhci, ring, 0, 0, 0, TRB_INTR_TARGET(0), field);
+		queue_trb(xhci, ring, 0, 0, 0, TRB_INTR_TARGET(ir->intr_num), field);
 	}

 	check_trb_math(urb, enqd_len);
@@ -3783,6 +3877,9 @@ int xhci_queue_ctrl_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
 	u32 field;
 	struct urb_priv *urb_priv;
 	struct xhci_td *td;
+	struct xhci_interrupter *ir;
+
+	ir = xhci->devs[slot_id]->interrupter;

 	ep_ring = xhci_urb_to_transfer_ring(xhci, urb);
 	if (!ep_ring)
@@ -3805,7 +3902,7 @@ int xhci_queue_ctrl_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
 		if (last_trb_on_seg(ep_ring->enq_seg, ep_ring->enqueue + 1)) {
 			field = TRB_TYPE(TRB_TR_NOOP) | ep_ring->cycle_state;
 			queue_trb(xhci, ep_ring, false, 0, 0,
-					TRB_INTR_TARGET(0), field);
+					TRB_INTR_TARGET(ir->intr_num), field);
 		}
 	}

@@ -3856,7 +3953,7 @@ int xhci_queue_ctrl_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
 	queue_trb(xhci, ep_ring, true,
 		  setup->bRequestType | setup->bRequest << 8 | le16_to_cpu(setup->wValue) << 16,
 		  le16_to_cpu(setup->wIndex) | le16_to_cpu(setup->wLength) << 16,
-		  TRB_LEN(8) | TRB_INTR_TARGET(0),
+		  TRB_LEN(8) | TRB_INTR_TARGET(ir->intr_num),
 		  /* Immediate data in pointer */
 		  field);

@@ -3886,7 +3983,7 @@ int xhci_queue_ctrl_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
 				urb, 1);
 		length_field = TRB_LEN(urb->transfer_buffer_length) |
 				TRB_TD_SIZE(remainder) |
-				TRB_INTR_TARGET(0);
+				TRB_INTR_TARGET(ir->intr_num);
 		if (setup->bRequestType & USB_DIR_IN)
 			field |= TRB_DIR_IN;
 		queue_trb(xhci, ep_ring, true,
@@ -3909,7 +4006,7 @@ int xhci_queue_ctrl_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
 	queue_trb(xhci, ep_ring, false,
 			0,
 			0,
-			TRB_INTR_TARGET(0),
+			TRB_INTR_TARGET(ir->intr_num),
 			/* Event on completion */
 			field | TRB_IOC | TRB_TYPE(TRB_STATUS) | ep_ring->cycle_state);

@@ -4099,7 +4196,7 @@ static int xhci_queue_isoc_tx(struct xhci_hcd *xhci, gfp_t mem_flags,

 	xep = &xhci->devs[slot_id]->eps[ep_index];
 	ep_ring = xhci->devs[slot_id]->eps[ep_index].ring;
-	ir = xhci->interrupters[0];
+	ir = xhci->devs[slot_id]->interrupter;

 	num_tds = urb->number_of_packets;
 	if (num_tds < 1) {
@@ -4200,7 +4297,7 @@ static int xhci_queue_isoc_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
 						   urb, more_trbs_coming);

 			length_field = TRB_LEN(trb_buff_len) |
-				TRB_INTR_TARGET(0);
+				TRB_INTR_TARGET(ir->intr_num);

 			/* xhci 1.1 with ETE uses TD Size field for TBC */
 			if (first_trb && xep->use_extended_tbc)
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index cbf96cb51583..96934a29e39b 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -4123,6 +4123,7 @@ static void xhci_free_dev(struct usb_hcd *hcd, struct usb_device *udev)
 		virt_dev->eps[i].ep_state &= ~EP_STOP_CMD_PENDING;
 	virt_dev->udev = NULL;
 	xhci_disable_slot(xhci, udev->slot_id);
+	virt_dev->interrupter = NULL;

 	spin_lock_irqsave(&xhci->lock, flags);
 	xhci_free_virt_device(xhci, virt_dev, udev->slot_id);
@@ -4219,6 +4220,7 @@ int xhci_alloc_dev(struct usb_hcd *hcd, struct usb_device *udev)
 	unsigned long flags;
 	int ret, slot_id;
 	struct xhci_command *command;
+	unsigned int sec_irqs, intr_num;

 	command = xhci_alloc_command(xhci, true, GFP_KERNEL);
 	if (!command)
@@ -4275,6 +4277,18 @@ int xhci_alloc_dev(struct usb_hcd *hcd, struct usb_device *udev)

 	udev->slot_id = slot_id;

+	/*
+	 * Spread devices across IRQ-backed secondary interrupters [1..].
+	 * If only vector 0 is enabled, default to interrupter 0.
+	 */
+	sec_irqs = (xhci->irqs_enabled > 1) ? (xhci->irqs_enabled - 1) : 0;
+	if (sec_irqs) {
+		intr_num = slot_id % sec_irqs;
+		vdev->interrupter = xhci->interrupters[intr_num + 1];
+	} else {
+		vdev->interrupter = xhci->interrupters[0];
+	}
+
 	xhci_debugfs_create_slot(xhci, slot_id);

 	/*
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 7707fd7564c5..b30ace66ff62 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -408,6 +408,11 @@ struct xhci_slot_ctx {
 #define SLOT_STATE_ADDRESSED	2
 #define SLOT_STATE_CONFIGURED	3

+/* Interrupter Target - tt_info[31:22] (xHCI Slot Context) */
+#define SLOT_INTR_TARGET_SHIFT 22
+#define SLOT_INTR_TARGET_MASK  (0x3ffU << SLOT_INTR_TARGET_SHIFT)
+#define SLOT_INTR_TARGET(p)    (((p) & 0x3ffU) << SLOT_INTR_TARGET_SHIFT)
+
 /**
  * struct xhci_ep_ctx
  * @ep_info:	endpoint state, streams, mult, and interval information.
@@ -767,6 +772,11 @@ struct xhci_virt_device {
 	void				*debugfs_private;
 	/* set if this endpoint is controlled via sideband access*/
 	struct xhci_sideband	*sideband;
+	/*
+	 * Default transfer-event interrupter for this USB device.
+	 * Queueing code programs TRB_INTR_TARGET() from vdev->interrupter.
+	 */
+	struct xhci_interrupter *interrupter;
 };

 /*
--
2.50.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v11 0/2] usb: xhci: route device to secondary interrupters
  2026-01-27  2:34 [PATCH v11 0/2] usb: xhci: route device to secondary interrupters raoxu
  2026-01-27  2:38 ` [PATCH v11 1/2] usb: xhci: refactor IRQ/interrupter plumbing for multi-vector support raoxu
  2026-01-27  2:39 ` [PATCH v11 2/2] usb: xhci: enable secondary interrupters and route transfers per slot raoxu
@ 2026-01-27  7:33 ` Greg KH
  2026-01-28 13:19 ` Kenneth Crudup
  3 siblings, 0 replies; 18+ messages in thread
From: Greg KH @ 2026-01-27  7:33 UTC (permalink / raw)
  To: raoxu
  Cc: mathias.nyman, linux-usb, michal.pecio, niklas.neronin, zhanjun,
	kenny

On Tue, Jan 27, 2026 at 10:34:46AM +0800, raoxu wrote:
> v10 -> v11:
>   - Fix build warnings:-Wsometimes-uninitialized.
>     Closes: https://lore.kernel.org/oe-kbuild-all/202601262208.UybEjc9X-lkp@intel.com/

didn't you also change the changelog text?  If not, why not, if so, why
was this not mentioned?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v11 2/2] usb: xhci: enable secondary interrupters and route transfers per slot
  2026-01-27  2:39 ` [PATCH v11 2/2] usb: xhci: enable secondary interrupters and route transfers per slot raoxu
@ 2026-01-27  7:39   ` Greg KH
  2026-01-27 10:55     ` raoxu
  2026-01-28  8:09     ` raoxu
  2026-01-27 12:57   ` [PATCH v11 " Neronin, Niklas
  1 sibling, 2 replies; 18+ messages in thread
From: Greg KH @ 2026-01-27  7:39 UTC (permalink / raw)
  To: raoxu
  Cc: kenny, linux-usb, mathias.nyman, michal.pecio, niklas.neronin,
	zhanjun

On Tue, Jan 27, 2026 at 10:39:00AM +0800, raoxu wrote:
> From: Xu Rao <raoxu@uniontech.com>
> 
> Some xHCI hosts expose multiple MSI/MSI-X vectors and support multiple
> interrupters with independent event rings, but Linux commonly routes all
> transfer completions through interrupter 0.
> 
> This is not about increasing USB link throughput, but about avoiding a
> driver-imposed single hot spot.

What do you mean exactly by "hot spot"?  Usually this is a good thing,
driver code is in the cache, as are the data structures, so this keeps
data flowing well with less latencies.  Why would you not want this?

> On hosts that already provide multiple
> MSI/MSI-X vectors and independent event rings, routing all completions
> through interrupter 0 creates unnecessary contention (serialized event
> handling/locks and coupled moderation), increasing CPU cost and tail
> latency under many active devices/endpoints.

So this is a MSI routing issue, not a cpu cache issue?

And exactly what type of contention is happening here?  How can it be
measured and noticed?  The latencies involved in USB are huge due to the
protocol and hardware, why would a CPU even notice such things?

> Using secondary interrupters
> simply matches the hardware's design, similar in spirit to merged
> xHCI-sideband work: exploit available parallel paths rather than
> funneling all events through one.

What do you mean by "secondary interrupters"?  The sideband work was for
a totally different issue, whereby the normal hardware and CPU was
bypassed to allow it to remain powered down and to save battery life.
Spreading interrupts across CPU cores does the exact opposite of that,
ensuring that cores can NOT go to sleep when the work should be only
done by one of them.

In short, please post numbers showing how this really is noticable
somehow.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v11 2/2] usb: xhci: enable secondary interrupters and route transfers per slot
  2026-01-27  7:39   ` Greg KH
@ 2026-01-27 10:55     ` raoxu
  2026-01-27 11:04       ` Greg KH
  2026-01-28  8:09     ` raoxu
  1 sibling, 1 reply; 18+ messages in thread
From: raoxu @ 2026-01-27 10:55 UTC (permalink / raw)
  To: gregkh
  Cc: kenny, linux-usb, mathias.nyman, michal.pecio, niklas.neronin,
	raoxu, zhanjun

Hi greg,

Thanks for the review. Below is why I believe enabling secondary
interrupters is justified.

Legitimacy
Most modern xHCI controllers report Max Interrupters > 1 and expose
multiple MSI/MSI-X vectors. Each interrupter has its own event ring,
ERST and IMAN/IMOD. This series only enables what HW advertises and what
PCI allocates, bounded by a small software cap, with a clean fallback to
vector 0.

Why it still matters
Even if USB link latency is high, the IRQ/event path is still a CPU-side
serialization point today: all completions funnel into one event ring and
one handler under the global xhci->lock. Under mixed workloads (e.g. isoc
video + periodic audio/bulk), unrelated devices can queue behind the same
lock/handler and increase jitter/tail latency.

Sideband reference
Upstream xhci-sideband work has already exercised and validated the
multi-interrupter/event-ring plumbing in the core driver. Using secondary
interrupters is therefore a proven in-tree mechanism, not an experimental
path unique to this series.

Why v11 may benchmark flat
v11 still takes the global xhci->lock in the IRQ/event path, so it does
not claim true parallel event processing yet. The goal is to make
multi-vector routing and lifecycle correctness solid (request/free/sync,
run/stop/resume, quiesce/teardown) and establish per-slot routing. This
is the required groundwork for a follow-up to reduce lock scope so
different devices' completions do not wait on the same lock.

thanks,

Xu Rao

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v11 2/2] usb: xhci: enable secondary interrupters and route transfers per slot
  2026-01-27 10:55     ` raoxu
@ 2026-01-27 11:04       ` Greg KH
  0 siblings, 0 replies; 18+ messages in thread
From: Greg KH @ 2026-01-27 11:04 UTC (permalink / raw)
  To: raoxu
  Cc: kenny, linux-usb, mathias.nyman, michal.pecio, niklas.neronin,
	zhanjun

On Tue, Jan 27, 2026 at 06:55:22PM +0800, raoxu wrote:
> Hi greg,
> 
> Thanks for the review. Below is why I believe enabling secondary
> interrupters is justified.

Sorry, but always properly quote emails.  I have no context here.
Remember, some of us get over 1000 emails a day to review and handle.

Please respond with in-line comments as is recommended and I will be
glad to reply.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v11 1/2] usb: xhci: refactor IRQ/interrupter plumbing for multi-vector support
  2026-01-27  2:38 ` [PATCH v11 1/2] usb: xhci: refactor IRQ/interrupter plumbing for multi-vector support raoxu
@ 2026-01-27 12:54   ` Neronin, Niklas
  0 siblings, 0 replies; 18+ messages in thread
From: Neronin, Niklas @ 2026-01-27 12:54 UTC (permalink / raw)
  To: raoxu; +Cc: gregkh, kenny, linux-usb, mathias.nyman, michal.pecio, zhanjun



On 27/01/2026 4.38, raoxu wrote:
> From: Xu Rao <raoxu@uniontech.com>
> 
> Prepare xHCI for multiple IRQ vectors/interrupters without changing the
> current behavior.
> 
> Introduce a per-vector irq context (hcd + intr_num) to use as request_irq()
> dev_id, and track the active vector count in xhci->irqs_enabled.  Use this
> single bound to enable/disable interrupters consistently across run/stop/
> resume and to sync/free IRQs.
> 
> Legacy IRQ fallback also keeps irqs_enabled >= 1 so interrupter 0 remains
> functional when MSI/MSI-X is unavailable.
> 
> No functional change intended: still uses interrupter 0 only.
> 
> Signed-off-by: Xu Rao <raoxu@uniontech.com>
> 
> ---
>  drivers/usb/host/xhci-mem.c  | 17 +++++++++++++++++
>  drivers/usb/host/xhci-pci.c  | 34 +++++++++++++++++++++++++---------
>  drivers/usb/host/xhci-ring.c |  9 +++++++--
>  drivers/usb/host/xhci.c      | 23 +++++++++++++----------
>  drivers/usb/host/xhci.h      | 16 +++++++++++++++-
>  5 files changed, 77 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
> index c708bdd69f16..524c58a149d3 100644
> --- a/drivers/usb/host/xhci-mem.c
> +++ b/drivers/usb/host/xhci-mem.c
> @@ -2402,6 +2402,7 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t flags)
>  {
>  	struct device	*dev = xhci_to_hcd(xhci)->self.sysdev;
>  	dma_addr_t	dma;
> +	unsigned int    i;
> 
>  	/*
>  	 * xHCI section 5.4.6 - Device Context array must be
> @@ -2495,6 +2496,22 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t flags)
>  	if (!xhci->interrupters[0])
>  		goto fail;
> 
> +	/* Since only the main interrupt is used, secondary_irqs_alloc is set to 0. */
> +	xhci->secondary_irqs_alloc = 0;
> +
> +	/*
> +	 * Initialize all allocated interrupters here.
> +	 * Use allocated count as loop bound to avoid touching non-allocated
> +	 * or non-operational interrupters.
> +	 */
> +	xhci_dbg_trace(xhci, trace_xhci_dbg_init, "Add allocated interrupters");
> +	for (i = 0; i < xhci->secondary_irqs_alloc + 1; i++) {
> +		if (!xhci->interrupters[i])
> +			continue;
> +		xhci_add_interrupter(xhci, i);
> +		xhci->interrupters[i]->isoc_bei_interval = AVOID_BEI_INTERVAL_MAX;
> +	}

Could you keep the above initialization code in xhci_init()?

Last year, I moved most register initialization from xhci_mem_init()
into xhci_init(). Keeping this separation has been helpful for debugging
and will also align better with the upcoming suspend/resume rework.

> +
>  	if (scratchpad_alloc(xhci, flags))
>  		goto fail;
> 
> diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
> index 585b2f3117b0..b3efd6d8fd9c 100644
> --- a/drivers/usb/host/xhci-pci.c
> +++ b/drivers/usb/host/xhci-pci.c
> @@ -116,13 +116,12 @@ static const struct xhci_driver_overrides xhci_pci_overrides __initconst = {
>  static void xhci_msix_sync_irqs(struct xhci_hcd *xhci)
>  {
>  	struct usb_hcd *hcd = xhci_to_hcd(xhci);
> +	struct pci_dev *pdev = to_pci_dev(hcd->self.controller);

Is there a reason for moving this outside the 'if (hcd->msix_enabled)' scope?

> +	int i;
> 
> -	if (hcd->msix_enabled) {
> -		struct pci_dev *pdev = to_pci_dev(hcd->self.controller);
> -
> -		/* for now, the driver only supports one primary interrupter */
> -		synchronize_irq(pci_irq_vector(pdev, 0));
> -	}
> +	if (hcd->msix_enabled)
> +		for (i = 0; i < xhci->irqs_enabled; i++)
> +			synchronize_irq(pci_irq_vector(pdev, i));
>  }
> 
>  /* Legacy IRQ is freed by usb_remove_hcd() or usb_hcd_pci_shutdown() */
> @@ -130,11 +129,17 @@ static void xhci_cleanup_msix(struct xhci_hcd *xhci)
>  {
>  	struct usb_hcd *hcd = xhci_to_hcd(xhci);
>  	struct pci_dev *pdev = to_pci_dev(hcd->self.controller);
> +	int i;
> 
>  	if (hcd->irq > 0)
>  		return;
> 
> -	free_irq(pci_irq_vector(pdev, 0), xhci_to_hcd(xhci));
> +	for (i = 0; i < xhci->irqs_enabled; i++)
> +		free_irq(pci_irq_vector(pdev, i), &xhci->irq_ctx[i]);
> +
> +	xhci->irqs_enabled = 0;
> +	memset(xhci->irq_ctx, 0, sizeof(xhci->irq_ctx));
> +
>  	pci_free_irq_vectors(pdev);
>  	hcd->msix_enabled = 0;
>  }
> @@ -145,6 +150,7 @@ static int xhci_try_enable_msi(struct usb_hcd *hcd)
>  	struct pci_dev *pdev = to_pci_dev(hcd->self.controller);
>  	struct xhci_hcd *xhci = hcd_to_xhci(hcd);
>  	int ret;
> +	struct xhci_irq_ctx *ctx;
> 
>  	/*
>  	 * Some Fresco Logic host controllers advertise MSI, but fail to
> @@ -174,11 +180,17 @@ static int xhci_try_enable_msi(struct usb_hcd *hcd)
>  		goto legacy_irq;
>  	}
> 
> -	ret = request_irq(pci_irq_vector(pdev, 0), xhci_msi_irq, 0, "xhci_hcd",
> -			  xhci_to_hcd(xhci));
> +	memset(xhci->irq_ctx, 0, sizeof(xhci->irq_ctx));
> +	xhci->irqs_enabled = 0;
> +
> +	ctx = &xhci->irq_ctx[0];
> +	ctx->hcd = hcd;
> +	ctx->intr_num = 0;
> +	ret = request_irq(pci_irq_vector(pdev, 0), xhci_msi_irq, 0, "xhci_hcd", ctx);
>  	if (ret)
>  		goto free_irq_vectors;
> 
> +	xhci->irqs_enabled = 1;
>  	hcd->msi_enabled = 1;
>  	hcd->msix_enabled = pdev->msix_enabled;
>  	return 0;
> @@ -186,6 +198,7 @@ static int xhci_try_enable_msi(struct usb_hcd *hcd)
>  free_irq_vectors:
>  	xhci_dbg_trace(xhci, trace_xhci_dbg_init, "disable %s interrupt",
>  		       pdev->msix_enabled ? "MSI-X" : "MSI");
> +	xhci->irqs_enabled = 0;
>  	pci_free_irq_vectors(pdev);
> 
>  legacy_irq:
> @@ -198,6 +211,9 @@ static int xhci_try_enable_msi(struct usb_hcd *hcd)
>  		snprintf(hcd->irq_descr, sizeof(hcd->irq_descr), "%s:usb%d",
>  			 hcd->driver->description, hcd->self.busnum);
> 
> +	/* legacy IRQ path still needs interrupter 0 */
> +	xhci->irqs_enabled = 1;
> +
>  	/* fall back to legacy interrupt */
>  	ret = request_irq(pdev->irq, &usb_hcd_irq, IRQF_SHARED, hcd->irq_descr, hcd);
>  	if (ret) {
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index 9315ba18310d..3ea134c07c5f 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -3220,9 +3220,14 @@ irqreturn_t xhci_irq(struct usb_hcd *hcd)
>  	return ret;
>  }
> 
> -irqreturn_t xhci_msi_irq(int irq, void *hcd)
> +irqreturn_t xhci_msi_irq(int irq, void *data)
>  {
> -	return xhci_irq(hcd);
> +	struct xhci_irq_ctx *ctx = data;
> +
> +	/* For now only vector 0 is requested; keep behavior unchanged. */
> +	if (!ctx || !ctx->hcd)
> +		return IRQ_NONE;
> +	return xhci_irq(ctx->hcd);
>  }
>  EXPORT_SYMBOL_GPL(xhci_msi_irq);
> 
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index b3ba16b9718c..cbf96cb51583 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -576,10 +576,6 @@ static int xhci_init(struct usb_hcd *hcd)
>  	/* Set USB 3.0 device notifications for function remote wake */
>  	xhci_set_dev_notifications(xhci);
> 
> -	/* Initialize the Primary interrupter */
> -	xhci_add_interrupter(xhci, 0);
> -	xhci->interrupters[0]->isoc_bei_interval = AVOID_BEI_INTERVAL_MAX;
> -
>  	/* Initializing Compliance Mode Recovery Data If Needed */
>  	if (xhci_compliance_mode_recovery_timer_quirk_check()) {
>  		xhci->quirks |= XHCI_COMP_MODE_QUIRK;
> @@ -594,9 +590,9 @@ static int xhci_init(struct usb_hcd *hcd)
> 
>  static int xhci_run_finished(struct xhci_hcd *xhci)
>  {
> -	struct xhci_interrupter *ir = xhci->interrupters[0];
>  	unsigned long	flags;
>  	u32		temp;
> +	int		i;
> 
>  	/*
>  	 * Enable interrupts before starting the host (xhci 4.2 and 5.5.2).
> @@ -609,8 +605,10 @@ static int xhci_run_finished(struct xhci_hcd *xhci)
>  	temp |= (CMD_EIE);
>  	writel(temp, &xhci->op_regs->command);
> 
> -	xhci_dbg_trace(xhci, trace_xhci_dbg_init, "Enable primary interrupter");
> -	xhci_enable_interrupter(ir);
> +	xhci_dbg_trace(xhci, trace_xhci_dbg_init, "Enable all interrupters");
> +	for (i = 0; i < xhci->irqs_enabled; i++)
> +		if (xhci->interrupters[i])
> +			xhci_enable_interrupter(xhci->interrupters[i]);
> 
>  	if (xhci_start(xhci)) {
>  		xhci_halt(xhci);
> @@ -707,7 +705,7 @@ void xhci_stop(struct usb_hcd *hcd)
>  {
>  	u32 temp;
>  	struct xhci_hcd *xhci = hcd_to_xhci(hcd);
> -	struct xhci_interrupter *ir = xhci->interrupters[0];
> +	int i;
> 
>  	mutex_lock(&xhci->mutex);
> 
> @@ -742,7 +740,9 @@ void xhci_stop(struct usb_hcd *hcd)
>  			"// Disabling event ring interrupts");
>  	temp = readl(&xhci->op_regs->status);
>  	writel((temp & ~0x1fff) | STS_EINT, &xhci->op_regs->status);
> -	xhci_disable_interrupter(xhci, ir);
> +	for (i = 0; i < xhci->irqs_enabled; i++)
> +		if (xhci->interrupters[i])
> +			xhci_disable_interrupter(xhci, xhci->interrupters[i]);
> 
>  	xhci_dbg_trace(xhci, trace_xhci_dbg_init, "cleaning up memory");
>  	xhci_mem_cleanup(xhci);
> @@ -1087,6 +1087,7 @@ int xhci_resume(struct xhci_hcd *xhci, bool power_lost, bool is_auto_resume)
>  	bool			comp_timer_running = false;
>  	bool			pending_portevent = false;
>  	bool			suspended_usb3_devs = false;
> +	int			i;
> 
>  	if (!hcd->state)
>  		return 0;
> @@ -1180,7 +1181,9 @@ int xhci_resume(struct xhci_hcd *xhci, bool power_lost, bool is_auto_resume)
>  		xhci_dbg(xhci, "// Disabling event ring interrupts\n");
>  		temp = readl(&xhci->op_regs->status);
>  		writel((temp & ~0x1fff) | STS_EINT, &xhci->op_regs->status);
> -		xhci_disable_interrupter(xhci, xhci->interrupters[0]);
> +		for (i = 0; i < xhci->irqs_enabled; i++)
> +			if (xhci->interrupters[i])
> +				xhci_disable_interrupter(xhci, xhci->interrupters[i]);
> 
>  		xhci_dbg(xhci, "cleaning up memory\n");
>  		xhci_mem_cleanup(xhci);
> diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
> index 2b0796f6d00e..7707fd7564c5 100644
> --- a/drivers/usb/host/xhci.h
> +++ b/drivers/usb/host/xhci.h
> @@ -45,6 +45,9 @@
>   */
>  #define MAX_HC_INTRS		128
> 
> +/* Software cap for secondary interrupters; not a hardware limit. */
> +#define MAX_SECONDARY_INTRNUM	4
> +
>  /*
>   * xHCI register interface.
>   * This corresponds to the eXtensible Host Controller Interface (xHCI)
> @@ -1497,6 +1500,11 @@ struct xhci_hub {
>  	u8			min_rev;
>  };
> 
> +struct xhci_irq_ctx {
> +	struct usb_hcd	*hcd;
> +	u8		intr_num;

Could you keep this as an 'unsigned int' to stay consistent with the other
"intr_num" variables?

Thanks,
Niklas

> +};
> +
>  /* There is one xhci_hcd structure per controller */
>  struct xhci_hcd {
>  	struct usb_hcd *main_hcd;
> @@ -1533,6 +1541,12 @@ struct xhci_hcd {
>  	/* data structures */
>  	struct xhci_device_context_array *dcbaa;
>  	struct xhci_interrupter **interrupters;
> +	/* Number of secondary interrupters successfully allocated */
> +	unsigned int		secondary_irqs_alloc;
> +	/* Number of IRQ vectors successfully requested (includes vector 0) */
> +	unsigned int		irqs_enabled;
> +	/* MSI/MSI-X vector contexts. Vector 0 uses [0], secondary use [1..] */
> +	struct xhci_irq_ctx	irq_ctx[MAX_SECONDARY_INTRNUM + 1];
>  	struct xhci_ring	*cmd_ring;
>  	unsigned int            cmd_ring_state;
>  #define CMD_RING_STATE_RUNNING         (1 << 0)
> @@ -1895,7 +1909,7 @@ int xhci_suspend(struct xhci_hcd *xhci, bool do_wakeup);
>  int xhci_resume(struct xhci_hcd *xhci, bool power_lost, bool is_auto_resume);
> 
>  irqreturn_t xhci_irq(struct usb_hcd *hcd);
> -irqreturn_t xhci_msi_irq(int irq, void *hcd);
> +irqreturn_t xhci_msi_irq(int irq, void *data);
>  int xhci_alloc_dev(struct usb_hcd *hcd, struct usb_device *udev);
>  int xhci_alloc_tt_info(struct xhci_hcd *xhci,
>  		struct xhci_virt_device *virt_dev,
> --
> 2.50.1
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v11 2/2] usb: xhci: enable secondary interrupters and route transfers per slot
  2026-01-27  2:39 ` [PATCH v11 2/2] usb: xhci: enable secondary interrupters and route transfers per slot raoxu
  2026-01-27  7:39   ` Greg KH
@ 2026-01-27 12:57   ` Neronin, Niklas
  1 sibling, 0 replies; 18+ messages in thread
From: Neronin, Niklas @ 2026-01-27 12:57 UTC (permalink / raw)
  To: raoxu; +Cc: gregkh, kenny, linux-usb, mathias.nyman, michal.pecio, zhanjun



On 27/01/2026 4.39, raoxu wrote:
> From: Xu Rao <raoxu@uniontech.com>
> 
> Some xHCI hosts expose multiple MSI/MSI-X vectors and support multiple
> interrupters with independent event rings, but Linux commonly routes all
> transfer completions through interrupter 0.
> 
> This is not about increasing USB link throughput, but about avoiding a
> driver-imposed single hot spot. On hosts that already provide multiple
> MSI/MSI-X vectors and independent event rings, routing all completions
> through interrupter 0 creates unnecessary contention (serialized event
> handling/locks and coupled moderation), increasing CPU cost and tail
> latency under many active devices/endpoints. Using secondary interrupters
> simply matches the hardware's design, similar in spirit to merged
> xHCI-sideband work: exploit available parallel paths rather than
> funneling all events through one.
> 
> Allocate a small capped set of secondary interrupters in xhci_mem_init()
> (MAX_SECONDARY_INTRNUM, default 4) and request up to the matching number
> of IRQ vectors (bounded by what PCI core provides).  Dispatch each vector
> to its interrupter via the per-vector irq_ctx.
> 
> Route transfers per USB device (slot) by assigning vdev->interrupter at
> device allocation time and programming the interrupter target (slot
> context + TRB_INTR_TARGET) from that selection, so completions land on the
> selected event ring.  Drain a slot's secondary event ring at selected
> command completion boundaries to reduce late-event artifacts when teardown
> happens on interrupter 0.
> 
> Signed-off-by: Xu Rao <raoxu@uniontech.com>
> 

...

> diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
> index b3efd6d8fd9c..4ff3e29070b1 100644
> --- a/drivers/usb/host/xhci-pci.c
> +++ b/drivers/usb/host/xhci-pci.c
> @@ -150,6 +150,8 @@ static int xhci_try_enable_msi(struct usb_hcd *hcd)
>  	struct pci_dev *pdev = to_pci_dev(hcd->self.controller);
>  	struct xhci_hcd *xhci = hcd_to_xhci(hcd);
>  	int ret;
> +	unsigned int irqs_num;
> +	int i;
>  	struct xhci_irq_ctx *ctx;
> 
>  	/*
> @@ -173,7 +175,7 @@ static int xhci_try_enable_msi(struct usb_hcd *hcd)
> 
>  	/* TODO: Check with MSI Soc for sysdev */
>  	xhci->nvecs = pci_alloc_irq_vectors(pdev, 1, xhci->nvecs,
> -					    PCI_IRQ_MSIX | PCI_IRQ_MSI);
> +					    PCI_IRQ_MSIX | PCI_IRQ_MSI | PCI_IRQ_AFFINITY);
>  	if (xhci->nvecs < 0) {
>  		xhci_dbg_trace(xhci, trace_xhci_dbg_init,
>  			       "failed to allocate IRQ vectors");
> @@ -183,14 +185,30 @@ static int xhci_try_enable_msi(struct usb_hcd *hcd)
>  	memset(xhci->irq_ctx, 0, sizeof(xhci->irq_ctx));
>  	xhci->irqs_enabled = 0;
> 
> -	ctx = &xhci->irq_ctx[0];
> -	ctx->hcd = hcd;
> -	ctx->intr_num = 0;
> -	ret = request_irq(pci_irq_vector(pdev, 0), xhci_msi_irq, 0, "xhci_hcd", ctx);
> -	if (ret)
> -		goto free_irq_vectors;
> +	/*
> +	 * Request up to 1 + secondary_irqs_alloc vectors (vector 0 + secondary),
> +	 * limited by what PCI core actually allocated.
> +	 */
> +	irqs_num = min_t(unsigned int,
> +		     (unsigned int)xhci->nvecs,
> +		     (unsigned int)(1 + xhci->secondary_irqs_alloc));
> +
> +	for (i = 0; i < irqs_num; i++) {
> +		ctx = &xhci->irq_ctx[i];
> +		ctx->hcd = hcd;
> +		ctx->intr_num = i;
> +		ret = request_irq(pci_irq_vector(pdev, i), xhci_msi_irq, 0,
> +				  "xhci_hcd", ctx);
> +		if (ret) {
> +			while (--i >= 0)
> +				free_irq(pci_irq_vector(pdev, i), &xhci->irq_ctx[i]);
> +			xhci->irqs_enabled = 0;
> +			memset(xhci->irq_ctx, 0, sizeof(xhci->irq_ctx));
> +			goto free_irq_vectors;

Won't this 'goto' end up freeing all IRQ vectors?

Now, if requesting an IRQ for a secondary interrupter fails,
the xhci driver will then fall back to using the legacy IRQ?

> +		}
> +		xhci->irqs_enabled++;
> +	}

Maybe it would be useful to add a debug message here, for example:
	xhci_dbg_trace(xhci, trace_xhci_dbg_init, "enabled %u %s interrupters",
		       xhci->irqs_enabled, pdev->msix_enabled ? "MSI-X" : "MSI");

I'm not sure whether this information is already reported elsewhere.

Thanks,
Niklas


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v11 2/2] usb: xhci: enable secondary interrupters and route transfers per slot
  2026-01-27  7:39   ` Greg KH
  2026-01-27 10:55     ` raoxu
@ 2026-01-28  8:09     ` raoxu
  2026-01-28  8:35       ` Greg KH
  2026-01-29 14:22       ` Michal Pecio
  1 sibling, 2 replies; 18+ messages in thread
From: raoxu @ 2026-01-28  8:09 UTC (permalink / raw)
  To: gregkh
  Cc: kenny, linux-usb, mathias.nyman, michal.pecio, niklas.neronin,
	raoxu, zhanjun

Hi greg,

Thanks for the review. Below is why I believe enabling secondary
interrupters is justified.

> What do you mean exactly by "hot spot"?  Usually this is a good thing,
> driver code is in the cache, as are the data structures, so this keeps
> data flowing well with less latencies.  Why would you not want this?
>

By "hot spot" I do not mean a CPU cache problem. I mean a software-side
serialization point: all devices share a single event ring and a single
IRQ handler which processes completions under the global xhci->lock. That
centralizes the work into one place regardless of cache locality.

> > On hosts that already provide multiple
> > MSI/MSI-X vectors and independent event rings, routing all completions
> > through interrupter 0 creates unnecessary contention (serialized event
> > handling/locks and coupled moderation), increasing CPU cost and tail
> > latency under many active devices/endpoints.
>
> So this is a MSI routing issue, not a cpu cache issue?
>

Yes. The primary issue is routing and event-ring fan-in: everything
funnels through interrupter 0 today even when the controller advertises
multiple interrupters/vectors with independent event rings.

> And exactly what type of contention is happening here?  How can it be
> measured and noticed?  The latencies involved in USB are huge due to the
> protocol and hardware, why would a CPU even notice such things?
>

Even if USB link latency is high, the IRQ/event path is still a CPU-side
serialization point: all completions land in one event ring and are
drained by one handler under xhci->lock. Under mixed workloads (e.g. isoc
video plus periodic audio/bulk), events from unrelated devices can queue
behind the same lock/handler and increase jitter/tail latency, even if
the bus is not saturated.

> > Using secondary interrupters
> > simply matches the hardware's design, similar in spirit to merged
> > xHCI-sideband work: exploit available parallel paths rather than
> > funneling all events through one.
>
> What do you mean by "secondary interrupters"?  The sideband work was for
> a totally different issue, whereby the normal hardware and CPU was
> bypassed to allow it to remain powered down and to save battery life.
> Spreading interrupts across CPU cores does the exact opposite of that,
> ensuring that cores can NOT go to sleep when the work should be only
> done by one of them.
>

By "secondary interrupters" I mean the non-zero xHCI interrupters
(interrupter 1..N when Max Interrupters > 1), each with its own event
ring/ERST/IMAN/IMOD, backed by MSI/MSI-X vectors when available. This is
a standard xHCI capability; the series just enables it with a small cap
and a clean fallback to interrupter 0.

You're right that xhci-sideband targets a different goal (power), and I
did not mean to equate the motivation. I referenced it only because the
upstream xhci-sideband work has already exercised and validated the core
multi-interrupter/event-ring plumbing in the driver. So using additional
interrupters is a proven in-tree mechanism, not an experimental path
unique to this series.

> In short, please post numbers showing how this really is noticable
> somehow.

Understood. v11 still takes the global xhci->lock in the IRQ/event path,
so it does not claim true parallel event processing yet. The goal of v11
is to make multi-vector routing and lifecycle correctness solid
(request/free/sync, run/stop/resume, quiesce/teardown) and establish
per-slot routing as groundwork for a follow-up that can reduce lock scope.

Below is the exact test command and the fio results I observed.

Test script (perf + fio, 2 USB storage devices, 2 USB 2.0 uvc cameras,
60s time_based):
  sudo perf stat -a -e cycles,cache-misses \
    fio \
      --name=usb1 --filename=/media/uos/Ventoy1/fio.bin --size=1G \
      --rw=randrw --rwmixread=50 --bs=4k --iodepth=64 --numjobs=4 \
      --time_based=1 --runtime=60 --direct=1 --ioengine=libaio \
      --group_reporting --lat_percentiles=1 \
      --name=usb2 --filename=/media/uos/Ventoy2/fio.bin --size=1G \
      --rw=randrw --rwmixread=50 --bs=4k --iodepth=64 --numjobs=4 \
      --time_based=1 --runtime=60 --direct=1 --ioengine=libaio \
      --group_reporting --lat_percentiles=1

Baseline (v6.19-rc6, two-device randrw 4k):
  read:  IOPS=480,  BW=1923KiB/s, clat avg=484 ms, p50=79 ms
         p90=700 ms, p95=726 ms, p99=801 ms, p99.5=17.1 s, max=32 s
  write: IOPS=485,  BW=1943KiB/s, clat avg=470 ms, p50=81 ms
         p90=693 ms, p95=726 ms, p99=802 ms, p99.5=17.1 s, max=32 s

With v11 applied (same command, same devices):
  read:  IOPS=1376, BW=5505KiB/s, clat avg=157 ms, p50=20 ms
         p90=90 ms, p95=927 ms, p99=1003 ms, p99.9=17.1 s, max=32 s
  write: IOPS=1381, BW=5528KiB/s, clat avg=154 ms, p50=21 ms
         p90=91 ms, p95=927 ms, p99=1003 ms, p99.9=17.1 s, max=32 s

On this setup, this run shows higher throughput and lower typical
latencies with v11: read IOPS increased from 480 to 1376 (+186.7%) and
write IOPS from 485 to 1381 (+184.7%). Typical latency also improved:
read p50 dropped from 79 ms to 20 ms (-74.7%) and read p90 from ~700 ms
to ~90 ms (-87.1%); write p50 dropped from 81 ms to 21 ms (-74.1%) and
write p90 from ~693 ms to ~91 ms (-86.8%).
thanks,

Xu Rao

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v11 2/2] usb: xhci: enable secondary interrupters and route transfers per slot
  2026-01-28  8:09     ` raoxu
@ 2026-01-28  8:35       ` Greg KH
  2026-01-29 14:22       ` Michal Pecio
  1 sibling, 0 replies; 18+ messages in thread
From: Greg KH @ 2026-01-28  8:35 UTC (permalink / raw)
  To: raoxu
  Cc: kenny, linux-usb, mathias.nyman, michal.pecio, niklas.neronin,
	zhanjun

On Wed, Jan 28, 2026 at 04:09:39PM +0800, raoxu wrote:
> Hi greg,
> 
> Thanks for the review. Below is why I believe enabling secondary
> interrupters is justified.
> 
> > What do you mean exactly by "hot spot"?  Usually this is a good thing,
> > driver code is in the cache, as are the data structures, so this keeps
> > data flowing well with less latencies.  Why would you not want this?
> >
> 
> By "hot spot" I do not mean a CPU cache problem. I mean a software-side
> serialization point: all devices share a single event ring and a single
> IRQ handler which processes completions under the global xhci->lock. That
> centralizes the work into one place regardless of cache locality.

Ok, but is there actually any lock contention happening here?  Using the
perf tools should show that, but again, given the slow device speeds, is
the lock contention even measurable?

And if you still have a single lock, spreading out the irqs over
different cpus will make this contention WORSE, not better.

> > > On hosts that already provide multiple
> > > MSI/MSI-X vectors and independent event rings, routing all completions
> > > through interrupter 0 creates unnecessary contention (serialized event
> > > handling/locks and coupled moderation), increasing CPU cost and tail
> > > latency under many active devices/endpoints.
> >
> > So this is a MSI routing issue, not a cpu cache issue?
> >
> 
> Yes. The primary issue is routing and event-ring fan-in: everything
> funnels through interrupter 0 today even when the controller advertises
> multiple interrupters/vectors with independent event rings.

Ok, but given the single lock, that still makes sense as doing that
"fan-in" is faster than hitting lock contention/bouncing across cpus,
right?

> > And exactly what type of contention is happening here?  How can it be
> > measured and noticed?  The latencies involved in USB are huge due to the
> > protocol and hardware, why would a CPU even notice such things?
> >
> 
> Even if USB link latency is high, the IRQ/event path is still a CPU-side
> serialization point: all completions land in one event ring and are
> drained by one handler under xhci->lock. Under mixed workloads (e.g. isoc
> video plus periodic audio/bulk), events from unrelated devices can queue
> behind the same lock/handler and increase jitter/tail latency, even if
> the bus is not saturated.

Do you have measurements of such latency and jitter?  USB is horribly
bad in latency and jitter at the hardware level, and does moving the
data to different cpus like this actually change anything if you still
have the same global lock issue?  I would think it would make it much
worse, especially if you are sharing across big/little type cores.

> > > Using secondary interrupters
> > > simply matches the hardware's design, similar in spirit to merged
> > > xHCI-sideband work: exploit available parallel paths rather than
> > > funneling all events through one.
> >
> > What do you mean by "secondary interrupters"?  The sideband work was for
> > a totally different issue, whereby the normal hardware and CPU was
> > bypassed to allow it to remain powered down and to save battery life.
> > Spreading interrupts across CPU cores does the exact opposite of that,
> > ensuring that cores can NOT go to sleep when the work should be only
> > done by one of them.
> >
> 
> By "secondary interrupters" I mean the non-zero xHCI interrupters
> (interrupter 1..N when Max Interrupters > 1), each with its own event
> ring/ERST/IMAN/IMOD, backed by MSI/MSI-X vectors when available. This is
> a standard xHCI capability; the series just enables it with a small cap
> and a clean fallback to interrupter 0.
> 
> You're right that xhci-sideband targets a different goal (power), and I
> did not mean to equate the motivation. I referenced it only because the
> upstream xhci-sideband work has already exercised and validated the core
> multi-interrupter/event-ring plumbing in the driver. So using additional
> interrupters is a proven in-tree mechanism, not an experimental path
> unique to this series.
> 
> > In short, please post numbers showing how this really is noticable
> > somehow.
> 
> Understood. v11 still takes the global xhci->lock in the IRQ/event path,
> so it does not claim true parallel event processing yet. The goal of v11
> is to make multi-vector routing and lifecycle correctness solid
> (request/free/sync, run/stop/resume, quiesce/teardown) and establish
> per-slot routing as groundwork for a follow-up that can reduce lock scope.

I think you need to determine if you really do have a measurable lock
contention here, and work to make a patch series that removes that.
Just seeing this intermediate step which adds complexity for no
noticable gain is not really a good idea (especially given the number of
attempts this single patch has taken for basic things like "does not
break the build").

> Below is the exact test command and the fio results I observed.
> 
> Test script (perf + fio, 2 USB storage devices, 2 USB 2.0 uvc cameras,
> 60s time_based):
>   sudo perf stat -a -e cycles,cache-misses \
>     fio \
>       --name=usb1 --filename=/media/uos/Ventoy1/fio.bin --size=1G \
>       --rw=randrw --rwmixread=50 --bs=4k --iodepth=64 --numjobs=4 \
>       --time_based=1 --runtime=60 --direct=1 --ioengine=libaio \
>       --group_reporting --lat_percentiles=1 \
>       --name=usb2 --filename=/media/uos/Ventoy2/fio.bin --size=1G \
>       --rw=randrw --rwmixread=50 --bs=4k --iodepth=64 --numjobs=4 \
>       --time_based=1 --runtime=60 --direct=1 --ioengine=libaio \
>       --group_reporting --lat_percentiles=1
> 
> Baseline (v6.19-rc6, two-device randrw 4k):
>   read:  IOPS=480,  BW=1923KiB/s, clat avg=484 ms, p50=79 ms
>          p90=700 ms, p95=726 ms, p99=801 ms, p99.5=17.1 s, max=32 s
>   write: IOPS=485,  BW=1943KiB/s, clat avg=470 ms, p50=81 ms
>          p90=693 ms, p95=726 ms, p99=802 ms, p99.5=17.1 s, max=32 s
> 
> With v11 applied (same command, same devices):
>   read:  IOPS=1376, BW=5505KiB/s, clat avg=157 ms, p50=20 ms
>          p90=90 ms, p95=927 ms, p99=1003 ms, p99.9=17.1 s, max=32 s
>   write: IOPS=1381, BW=5528KiB/s, clat avg=154 ms, p50=21 ms
>          p90=91 ms, p95=927 ms, p99=1003 ms, p99.9=17.1 s, max=32 s
> 
> On this setup, this run shows higher throughput and lower typical
> latencies with v11: read IOPS increased from 480 to 1376 (+186.7%) and
> write IOPS from 485 to 1381 (+184.7%). Typical latency also improved:
> read p50 dropped from 79 ms to 20 ms (-74.7%) and read p90 from ~700 ms
> to ~90 ms (-87.1%); write p50 dropped from 81 ms to 21 ms (-74.1%) and
> write p90 from ~693 ms to ~91 ms (-86.8%).

Ok, yes, that's real, and frankly huge, speadups.  I take back what I
wrote above, this type of information should have been in the first
patch :)

But, is this an actual use case?  Your devices are not using USB
storage, but only uvc cameras, right?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v11 0/2] usb: xhci: route device to secondary interrupters
  2026-01-27  2:34 [PATCH v11 0/2] usb: xhci: route device to secondary interrupters raoxu
                   ` (2 preceding siblings ...)
  2026-01-27  7:33 ` [PATCH v11 0/2] usb: xhci: route device to secondary interrupters Greg KH
@ 2026-01-28 13:19 ` Kenneth Crudup
  3 siblings, 0 replies; 18+ messages in thread
From: Kenneth Crudup @ 2026-01-28 13:19 UTC (permalink / raw)
  To: raoxu, mathias.nyman
  Cc: gregkh, linux-usb, michal.pecio, niklas.neronin, zhanjun



On 1/26/26 18:34, raoxu wrote:
> From: Xu Rao <raoxu@uniontech.com>
> 
> This series is split into two steps: patch 1/2 refactors xHCI IRQ and
> interrupter handling to make multi-vector operation possible without
> changing behavior; patch 2/2 then enables a small capped set of secondary
> interrupters/vectors and routes transfer completions per device (slot) to
> reduce contention on interrupter 0.
> 
> This is not about increasing USB link throughput, but about avoiding a
> driver-imposed single hot spot. On hosts that already provide multiple
> MSI/MSI-X vectors and independent event rings, routing all completions
> through interrupter 0 creates unnecessary contention (serialized event
> handling/locks and coupled moderation), increasing CPU cost and tail
> latency under many active devices/endpoints. Using secondary interrupters
> simply matches the hardware's design, similar in spirit to merged
> xHCI-sideband work: exploit available parallel paths rather than
> funneling all events through one.
> 
> Xu Rao (2):
>    usb: xhci: refactor IRQ/interrupter plumbing for multi-vector support
>    usb: xhci: enable secondary interrupters and route transfers per slot
> 
> Signed-off-by: Xu Rao <raoxu@uniontech.com>

Tested-By: Kenneth R. Crudup <kenny@panix.com>

> ---
> Changelog:
> v1 -> v2:
>    - Bind interrupters to endpoints at enable time instead of selecting
>      per transfer.
>    - Store the selected interrupter in struct xhci_virt_ep and program
>      TRB_INTR_TARGET() from the bound interrupter.
>    - Use a single IRQ handler for both primary and secondary vectors,
>      with STS_EINT handling restricted to interrupter 0.
>    - Keep a common dev_id for IRQ registration to match the existing
>      xhci_cleanup_msix() teardown constraints and avoid dev_id
>      lifetime issues.
>    - Clarify secondary interrupter teardown to avoid double-free or
>      use-after-free during xHCI removal.
> v2 -> v3:
>    - modify commit information
> v3 -> v4:
>    - Bind interrupters per USB device (slot) via struct xhci_virt_device,
>      program TRB_INTR_TARGET() from vdev->interrupter for bulk/ctrl/isoc.
>    - Drop xfer_interrupters and unify on xhci->interrupters[] for both
>      primary and secondary event rings and IRQ routing.
>    - Allocate secondary interrupters in xhci_mem_init; on any allocation
>      failure, rollback and continue with primary interrupter only.
>    - Cap secondary interrupter creation with MAX_SECONDARY_INTRNUM,
>      defaulting to 4.
>    - Route xhci_msi_irq by irq handler_data token (intr_num + 1) to keep
>      correct interrupter selection across resume/power_lost.
>    - Apply MSI-X affinity hints for secondary vectors.
> v4 -> v5:
>    - Fix min() signedness build error reported by 0day CI.
>    - Rebase onto v6.19-rc2.
> v5 -> v6:
>    - Route secondary MSI/MSI-X IRQs by storing struct xhci_interrupter in
>      irq handler_data, instead of using an (intr_num + 1) token mapping.
>    - Program Slot Context Interrupter Target (tt_info[31:22]) from
>      vdev->interrupter to keep slot default routing aligned with TRB
>      TRB_INTR_TARGET() selection.
> v6 -> v7:
>    - Add xhci_quiesce_interrupter() and use it for secondary
>      interrupters in xhci_stop() and the power_lost path of xhci_resume(),
>      ensuring IMAN.IP (RW1C) and ERDP.EHB are properly cleared.
> v7 -> v8:
>    - Sync secondary MSI/MSI-X vectors in xhci_msix_sync_irqs() with
>      synchronize_irq().
>    - Fix build errors by adding missing header includes for the IRQ helper APIs.
> v8 -> v9:
>    - Use PCI_IRQ_AFFINITY to let PCI core spread MSI/MSI-X vectors across CPUs.
>    - Route each MSI/MSI-X vector to its interrupter via per-vector irq_ctx dev_id.
>    - Fix modpost error: xhci_msix_set_handler_data undefined (0-day CI).
>      https://lore.kernel.org/oe-kbuild-all/202601171743.omq3DpnM-lkp@intel.com/
>    - Rebase onto v6.19-rc6.
> v9 -> v10:
>    - refactor IRQ/interrupter plumbing for multi-vector support.
>    - add xhci_handle_slot_secondary_events to handle secondary event ring.
> v10 -> v11:
>    - Fix build warnings:-Wsometimes-uninitialized.
>      Closes: https://lore.kernel.org/oe-kbuild-all/202601262208.UybEjc9X-lkp@intel.com/
> ---
>   drivers/usb/host/xhci-mem.c  |  48 +++++++++++
>   drivers/usb/host/xhci-pci.c  |  57 ++++++++++---
>   drivers/usb/host/xhci-ring.c | 156 +++++++++++++++++++++++++++++------
>   drivers/usb/host/xhci.c      |  37 ++++++---
>   drivers/usb/host/xhci.h      |  26 +++++-
>   5 files changed, 274 insertions(+), 50 deletions(-)
> ---
> 

-- 
Kenneth R. Crudup / Sr. SW Engineer, Scott County Consulting, Orange 
County CA


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v11 2/2] usb: xhci: enable secondary interrupters and route transfers per slot
  2026-01-28  8:09     ` raoxu
  2026-01-28  8:35       ` Greg KH
@ 2026-01-29 14:22       ` Michal Pecio
  2026-01-29 19:43         ` Mathias Nyman
  1 sibling, 1 reply; 18+ messages in thread
From: Michal Pecio @ 2026-01-29 14:22 UTC (permalink / raw)
  To: raoxu; +Cc: gregkh, kenny, linux-usb, mathias.nyman, niklas.neronin, zhanjun

On Wed, 28 Jan 2026 16:09:39 +0800, raoxu wrote:
> By "hot spot" I do not mean a CPU cache problem. I mean a
> software-side serialization point: all devices share a single event
> ring and a single IRQ handler which processes completions under the
> global xhci->lock. That centralizes the work into one place
> regardless of cache locality.
> [...]
> Even if USB link latency is high, the IRQ/event path is still a
> CPU-side serialization point: all completions land in one event ring
> and are drained by one handler under xhci->lock. Under mixed
> workloads (e.g. isoc video plus periodic audio/bulk), events from
> unrelated devices can queue behind the same lock/handler and increase
> jitter/tail latency, even if the bus is not saturated.

But this patch doesn't address such contetion. It's still one big lock
per xHCI chip and a few CPUs take turns to do all work sequentially.

I find Greg's counterargument convincing. This change seems to only
harm xhci_hcd performance, and if your bottleneck truly were in this
driver, it's hard to imagine how anything could improve.

I suspect it's somewhere else, and considering the results you report,
quite likely in URB completion handlers. They used to be called from
this IRQ handler, but nowadays it's delegated to USB core.

Core uses BH workqueues. AFAIU this means completions run later on the
same CPU which handled the IRQ. So if you spread IRQs, you also spread
completions. It's easy to see how this could increase throughput or
remedy latency spikes caused by sluggish class drivers, because unlike
this IRQ, completions can usually run in parallel on multiple CPUs.

If that's the real problem here then no xHCI driver surgery should be
necessary to work around it, only updating the core.

> the upstream xhci-sideband work has already exercised and validated
> the core multi-interrupter/event-ring plumbing in the driver. So
> using additional interrupters is a proven in-tree mechanism, not an
> experimental path unique to this series.

I'm afraid the only driver functionality exercised by xhci-sideband is
allocation and freeing of extra event rings. Handling those rings in SW
and solving race conditions is new, as is having no way to tell which
order some events were generated in when things go wrong.

Regards,
Michal

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v11 2/2] usb: xhci: enable secondary interrupters and route transfers per slot
  2026-01-29 14:22       ` Michal Pecio
@ 2026-01-29 19:43         ` Mathias Nyman
  2026-01-29 20:03           ` Kenneth Crudup
  2026-01-30  3:48           ` raoxu
  0 siblings, 2 replies; 18+ messages in thread
From: Mathias Nyman @ 2026-01-29 19:43 UTC (permalink / raw)
  To: Michal Pecio, raoxu; +Cc: gregkh, kenny, linux-usb, niklas.neronin, zhanjun

On 1/29/26 16:22, Michal Pecio wrote:
> On Wed, 28 Jan 2026 16:09:39 +0800, raoxu wrote:
>> By "hot spot" I do not mean a CPU cache problem. I mean a
>> software-side serialization point: all devices share a single event
>> ring and a single IRQ handler which processes completions under the
>> global xhci->lock. That centralizes the work into one place
>> regardless of cache locality.
>> [...]
>> Even if USB link latency is high, the IRQ/event path is still a
>> CPU-side serialization point: all completions land in one event ring
>> and are drained by one handler under xhci->lock. Under mixed
>> workloads (e.g. isoc video plus periodic audio/bulk), events from
>> unrelated devices can queue behind the same lock/handler and increase
>> jitter/tail latency, even if the bus is not saturated.
> 
> But this patch doesn't address such contetion. It's still one big lock
> per xHCI chip and a few CPUs take turns to do all work sequentially.
> 
> I find Greg's counterargument convincing. This change seems to only
> harm xhci_hcd performance, and if your bottleneck truly were in this
> driver, it's hard to imagine how anything could improve.

We would need more fine-grained xhci locking, and dynamically on-demand
created and removed secondary interrupters to get any real performance
benefit here.

I'm have another, virtualization and VTIO related interest in seeing this
working.
Having a per-device interrupter is one step forward in implementing
xHCI section 8 virtualization, allowing xHIC to pass a usb device over from
VM host to a VM guest

> 
> I suspect it's somewhere else, and considering the results you report,
> quite likely in URB completion handlers. They used to be called from
> this IRQ handler, but nowadays it's delegated to USB core.
> 
> Core uses BH workqueues. AFAIU this means completions run later on the
> same CPU which handled the IRQ. So if you spread IRQs, you also spread
> completions. It's easy to see how this could increase throughput or
> remedy latency spikes caused by sluggish class drivers, because unlike
> this IRQ, completions can usually run in parallel on multiple CPUs.

Sounds plausible. Let me see if I understood this correctly.

When xhci only has one interrupt it will process all transfer events from
all  devices in the interrupt handler in one go, with xhci->lock held for
a fairly long time.
During this processing the interrupt handler calls usb core to queue all
URB completions on the system_bh_wq (BH workqueue) on that same CPU.

This workqueue only stars processing URBs in softirq context, which
is _after_ completing xhci interrupt handler.

With several interrupts on different CPUs there will be fewer events to
handle per interrupt handler in one go with xhci->lock held.
There will also be several BH workqueues, one per CPU? which can start
earlier and have fewer URB completions per workqueue.

As a downside we will have several CPUs spinning in interupt context,
waiting for their turn to get the xhci->lock and handle the interrupt.

So we get a bit shorter URB completion latency traded for hogging more
overall system resources

> 
> If that's the real problem here then no xHCI driver surgery should be
> necessary to work around it, only updating the core.

Could be worth trying to reduce xHCI interrupt moderation interval.
A xHC interrupter can't trigger a new interrupr until 40 microseconds has
passed since previous interrupt completed (EHB bit cleared).
This might cause some tranfer event build up and thus latency.

Xu Rao, can I ask you you run the same test as before with only primary
interrupter with interrupt moderation the change below?

diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 585b2f3117b0..9a2a4d17ed68 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -573,7 +573,7 @@ static int xhci_pci_setup(struct usb_hcd *hcd)
  	xhci = hcd_to_xhci(hcd);
  
  	/* imod_interval is the interrupt moderation value in nanoseconds. */
-	xhci->imod_interval = 40000;
+	xhci->imod_interval = 10000;
  
  	retval = xhci_gen_setup(hcd, xhci_pci_quirks);
  	if (retval)

> 
>> the upstream xhci-sideband work has already exercised and validated
>> the core multi-interrupter/event-ring plumbing in the driver. So
>> using additional interrupters is a proven in-tree mechanism, not an
>> experimental path unique to this series.
> 
> I'm afraid the only driver functionality exercised by xhci-sideband is
> allocation and freeing of extra event rings. Handling those rings in SW
> and solving race conditions is new, as is having no way to tell which
> order some events were generated in when things go wrong.

Agree, commands like 'stop endpoint' would trigger a transfer event on one cpu
and the command completion on another. PATCH V12 2/2 attempts to resolve
this by handling transfer events on the other interrupter before handling
specific command completions. This will get complicated and messy,
especially if we implement more fine-grained xhci locking.

The reward is too small compared to the risk of turning USB (more) unreliable.

Maybe we could add support for several interrupters but just use the
primary one as default.
Brave developers and testers could enable more interrupters via Kconfig,
debugfs, sysfs, or some other way.

Thanks
Mathias

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v11 2/2] usb: xhci: enable secondary interrupters and route transfers per slot
  2026-01-29 19:43         ` Mathias Nyman
@ 2026-01-29 20:03           ` Kenneth Crudup
  2026-01-30  3:48           ` raoxu
  1 sibling, 0 replies; 18+ messages in thread
From: Kenneth Crudup @ 2026-01-29 20:03 UTC (permalink / raw)
  To: Mathias Nyman, Michal Pecio, raoxu
  Cc: gregkh, linux-usb, niklas.neronin, zhanjun, Kenneth C


On 1/29/26 11:43, Mathias Nyman wrote:

> On 1/29/26 16:22, Michal Pecio wrote:

>> I'm afraid the only driver functionality exercised by xhci-sideband is
>> allocation and freeing of extra event rings. Handling those rings in SW
>> and solving race conditions is new, as is having no way to tell which
>> order some events were generated in when things go wrong.
> 
> Agree, commands like 'stop endpoint' would trigger a transfer event on 
> one cpu
> and the command completion on another. PATCH V12 2/2 attempts to resolve
> this by handling transfer events on the other interrupter before handling
> specific command completions. This will get complicated and messy,
> especially if we implement more fine-grained xhci locking.
> 
> The reward is too small compared to the risk of turning USB (more) 
> unreliable.

I've been reading these comment series and understand everyone's 
objections, but FWIW I've been running this patch series since something 
like its V3 on my laptop, moving between various USB network interfaces, 
hubs, storage devices and even headphones, with sudden disconnections, 
media errors and over dozens of suspend/hibernate/resume cycles, for a 
month now and haven't had any adverse issues.

(I haven't been running any numbers or tests, I just picked up the patch 
series' cause it looked interesting so I thought I'd try it out.)

-Kenny

-- 
Kenneth R. Crudup / Sr. SW Engineer, Scott County Consulting, Orange 
County CA


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v11 2/2] usb: xhci: enable secondary interrupters and route transfers per slot
  2026-01-29 19:43         ` Mathias Nyman
  2026-01-29 20:03           ` Kenneth Crudup
@ 2026-01-30  3:48           ` raoxu
  2026-01-30  5:34             ` Greg KH
  1 sibling, 1 reply; 18+ messages in thread
From: raoxu @ 2026-01-30  3:48 UTC (permalink / raw)
  To: mathias.nyman
  Cc: gregkh, kenny, linux-usb, michal.pecio, niklas.neronin, raoxu,
	zhanjun

Hi Michal, Mathias,

Thanks for the detailed feedback.

> Xu Rao, can I ask you you run the same test as before with only primary
> interrupter with interrupt moderation the change below?
>
> diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
> index 585b2f3117b0..9a2a4d17ed68 100644
> --- a/drivers/usb/host/xhci-pci.c
> +++ b/drivers/usb/host/xhci-pci.c
> @@ -573,7 +573,7 @@ static int xhci_pci_setup(struct usb_hcd *hcd)
>   xhci = hcd_to_xhci(hcd);
>
>   /* imod_interval is the interrupt moderation value in nanoseconds. */
> - xhci->imod_interval = 40000;
> + xhci->imod_interval = 10000;
>
>   retval = xhci_gen_setup(hcd, xhci_pci_quirks);
>   if (retval)

Test device:
ID 090c:2320 Silicon Motion, Inc. - Taiwan (formerly Feiya Technology Corp.) FF952

The results of testing a single USB drive are as follows: there is no
significant difference in speed. Perhaps I can test more devices to
see the differences.

6.19.0-rc6 result:
dd if=/dev/zero of=./test bs=10M count=1024 status=progress oflag=direct
10611589120 bytes (11 GB, 9.9 GiB) copied, 30 s, 353 MB/s
1024+0 records in
1024+0 records out
10737418240 bytes (11 GB, 10 GiB) copied, 30.3882 s, 353 MB/s

6.19.0-rc6 + v12 patch result:
dd if=/dev/zero of=./test bs=10M count=1024 status=progress oflag=direct
10475274240 bytes (10 GB, 9.8 GiB) copied, 30 s, 349 MB/s
1024+0 records in
1024+0 records out
10737418240 bytes (11 GB, 10 GiB) copied, 30.7994 s, 349 MB/s

6.19.0-rc6 + v12 patch + imod_interval =10000 result:
dd if=/dev/zero of=./test bs=10M count=1024 status=progress oflag=direct
10643046400 bytes (11 GB, 9.9 GiB) copied, 30 s, 355 MB/s
1024+0 records in
1024+0 records out
10737418240 bytes (11 GB, 10 GiB) copied, 30.2851 s, 355 MB/s

> Agree, commands like 'stop endpoint' would trigger a transfer event on one cpu
> and the command completion on another. PATCH V12 2/2 attempts to resolve
> this by handling transfer events on the other interrupter before handling
> specific command completions. This will get complicated and messy,
> especially if we implement more fine-grained xhci locking.
>
> The reward is too small compared to the risk of turning USB (more) unreliable.
>
> Maybe we could add support for several interrupters but just use the
> primary one as default.
> Brave developers and testers could enable more interrupters via Kconfig,
> debugfs, sysfs, or some other way.

For patch v13, I will make the several interrupter support opt-in via Kconfig,
defaulting to off, so only interested users enable it.

My next step is to try using more fine-grained XHCI locking to attempt a real
improvement in USB performance.

thanks,

Xu Rao

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v11 2/2] usb: xhci: enable secondary interrupters and route transfers per slot
  2026-01-30  3:48           ` raoxu
@ 2026-01-30  5:34             ` Greg KH
  2026-02-02 13:29               ` [PATCH v12 " raoxu
  0 siblings, 1 reply; 18+ messages in thread
From: Greg KH @ 2026-01-30  5:34 UTC (permalink / raw)
  To: raoxu
  Cc: mathias.nyman, kenny, linux-usb, michal.pecio, niklas.neronin,
	zhanjun

On Fri, Jan 30, 2026 at 11:48:47AM +0800, raoxu wrote:
> Hi Michal, Mathias,
> 
> Thanks for the detailed feedback.
> 
> > Xu Rao, can I ask you you run the same test as before with only primary
> > interrupter with interrupt moderation the change below?
> >
> > diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
> > index 585b2f3117b0..9a2a4d17ed68 100644
> > --- a/drivers/usb/host/xhci-pci.c
> > +++ b/drivers/usb/host/xhci-pci.c
> > @@ -573,7 +573,7 @@ static int xhci_pci_setup(struct usb_hcd *hcd)
> >   xhci = hcd_to_xhci(hcd);
> >
> >   /* imod_interval is the interrupt moderation value in nanoseconds. */
> > - xhci->imod_interval = 40000;
> > + xhci->imod_interval = 10000;
> >
> >   retval = xhci_gen_setup(hcd, xhci_pci_quirks);
> >   if (retval)
> 
> Test device:
> ID 090c:2320 Silicon Motion, Inc. - Taiwan (formerly Feiya Technology Corp.) FF952
> 
> The results of testing a single USB drive are as follows: there is no
> significant difference in speed. Perhaps I can test more devices to
> see the differences.
> 
> 6.19.0-rc6 result:
> dd if=/dev/zero of=./test bs=10M count=1024 status=progress oflag=direct
> 10611589120 bytes (11 GB, 9.9 GiB) copied, 30 s, 353 MB/s
> 1024+0 records in
> 1024+0 records out
> 10737418240 bytes (11 GB, 10 GiB) copied, 30.3882 s, 353 MB/s
> 
> 6.19.0-rc6 + v12 patch result:
> dd if=/dev/zero of=./test bs=10M count=1024 status=progress oflag=direct
> 10475274240 bytes (10 GB, 9.8 GiB) copied, 30 s, 349 MB/s
> 1024+0 records in
> 1024+0 records out
> 10737418240 bytes (11 GB, 10 GiB) copied, 30.7994 s, 349 MB/s
> 
> 6.19.0-rc6 + v12 patch + imod_interval =10000 result:
> dd if=/dev/zero of=./test bs=10M count=1024 status=progress oflag=direct
> 10643046400 bytes (11 GB, 9.9 GiB) copied, 30 s, 355 MB/s
> 1024+0 records in
> 1024+0 records out
> 10737418240 bytes (11 GB, 10 GiB) copied, 30.2851 s, 355 MB/s

dd is a horrible benchmark tool and does not mean that the data really
is flushed to the devicei (oflag=direct _should_ bypass the page cache,
but I don't remember if that's always the case.)

fio would be more interesting, spreading transactions across multiple
devices at the same time, what about running the benchmark you showed
earlier in the thread?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v12 2/2] usb: xhci: enable secondary interrupters and route transfers per slot
  2026-01-30  5:34             ` Greg KH
@ 2026-02-02 13:29               ` raoxu
  0 siblings, 0 replies; 18+ messages in thread
From: raoxu @ 2026-02-02 13:29 UTC (permalink / raw)
  To: gregkh
  Cc: kenny, linux-usb, mathias.nyman, michal.pecio, niklas.neronin,
	raoxu, zhanjun

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=y, Size: 6587 bytes --]

Hi greg ,

Thanks for the review.

> > Test device:
> > ID 090c:2320 Silicon Motion, Inc. - Taiwan (formerly Feiya Technology Corp.) FF952
> >
> > The results of testing a single USB drive are as follows: there is no
> > significant difference in speed. Perhaps I can test more devices to
> > see the differences.
> >
> > 6.19.0-rc6 result:
> > dd if=/dev/zero of=./test bs=10M count=1024 status=progress oflag=direct
> > 10611589120 bytes (11 GB, 9.9 GiB) copied, 30 s, 353 MB/s
> > 1024+0 records in
> > 1024+0 records out
> > 10737418240 bytes (11 GB, 10 GiB) copied, 30.3882 s, 353 MB/s
> >
> > 6.19.0-rc6 + v12 patch result:
> > dd if=/dev/zero of=./test bs=10M count=1024 status=progress oflag=direct
> > 10475274240 bytes (10 GB, 9.8 GiB) copied, 30 s, 349 MB/s
> > 1024+0 records in
> > 1024+0 records out
> > 10737418240 bytes (11 GB, 10 GiB) copied, 30.7994 s, 349 MB/s
> >
> > 6.19.0-rc6 + v12 patch + imod_interval =10000 result:
> > dd if=/dev/zero of=./test bs=10M count=1024 status=progress oflag=direct
> > 10643046400 bytes (11 GB, 9.9 GiB) copied, 30 s, 355 MB/s
> > 1024+0 records in
> > 1024+0 records out
> > 10737418240 bytes (11 GB, 10 GiB) copied, 30.2851 s, 355 MB/s
>
> dd is a horrible benchmark tool and does not mean that the data really
> is flushed to the devicei (oflag=direct _should_ bypass the page cache,
> but I don't remember if that's always the case.)
>
> fio would be more interesting, spreading transactions across multiple
> devices at the same time, what about running the benchmark you showed
> earlier in the thread?

I have added the following test results.

Test device:
ID 090c:2320 Silicon Motion, Inc. - Taiwan (formerly Feiya Technology Corp.) FF952

Test script (perf + fio, 1 USB storage device):
  sudo perf stat -a -e cycles,cache-misses \
    fio \
      --name=usb1 --filename=/media/uos/Ventoy1/fio.bin --size=1G \
      --rw=randrw --rwmixread=50 --bs=4k --iodepth=64 --numjobs=4 \
      --time_based=1 --runtime=60 --direct=1 --ioengine=libaio \
      --group_reporting --lat_percentiles=1 \

6.19.0-rc6 result:
  Run#1 (pid=5545)
  read: IOPS=3159, BW=12645KiB/s, clat avg=38.0 ms, p50=12 ms
  p90=89 ms, p95=148 ms, p99=163 ms, p99.5=217 ms, p99.9=228 ms, max=369 ms
  write: IOPS=3158, BW=12638KiB/s, clat avg=41.7 ms, p50=14 ms
  p90=92 ms, p95=153 ms, p99=167 ms, p99.5=222 ms, p99.9=230 ms, max=369 ms

  Run#2 (pid=6633)
  read: IOPS=3039, BW=12159KiB/s, clat avg=39.4 ms, p50=12 ms
  p90=90 ms, p95=150 ms, p99=165 ms, p99.5=221 ms, p99.9=230 ms, max=396 ms
  write: IOPS=3038, BW=12155KiB/s, clat avg=43.3 ms, p50=14 ms
  p90=93 ms, p95=153 ms, p99=171 ms, p99.5=224 ms, p99.9=232 ms, max=396 ms

  Run#3 (pid=8091)
  read: IOPS=3063, BW=12259KiB/s, clat avg=39.1 ms, p50=12 ms
  p90=90 ms, p95=150 ms, p99=167 ms, p99.5=223 ms, p99.9=234 ms, max=318 ms
  write: IOPS=3062, BW=12254KiB/s, clat avg=43.0 ms, p50=14 ms
  p90=93 ms, p95=153 ms, p99=220 ms, p99.5=226 ms, p99.9=236 ms, max=318 ms

  Average(3 times)
  read: IOPS=3087, BW=12354KiB/s, clat avg=38.8 ms, p50=12 ms
  p90=90 ms, p95=150 ms, p99=165 ms, p99.5=221 ms, p99.9=230 ms, max=361 ms
  write: IOPS=3086, BW=12349KiB/s, clat avg=42.7 ms, p50=14 ms
  p90=93 ms, p95=153 ms, p99=186 ms, p99.5=224 ms, p99.9=233 ms, max=361 ms

6.19.0-rc6 + v12 patch result:
  Run#1 (pid=3447359)
  read: IOPS=4682, BW=18732KiB/s, clat avg=26.3 ms, p50=15 ms
  p90=85 ms, p95=94 ms, p99=155 ms, p99.5=163 ms, p99.9=180 ms, max=322 ms
  write: IOPS=4683, BW=18734KiB/s, clat avg=27.4 ms, p50=15 ms
  p90=86 ms, p95=95 ms, p99=157 ms, p99.5=165 ms, p99.9=186 ms, max=322 ms

  Run#2 (pid=3450671)
  read: IOPS=4780, BW=19122KiB/s, clat avg=25.6 ms, p50=15 ms
  p90=84 ms, p95=93 ms, p99=155 ms, p99.5=163 ms, p99.9=192 ms, max=381 ms
  write: IOPS=4779, BW=19119KiB/s, clat avg=27.0 ms, p50=15 ms
  p90=86 ms, p95=94 ms, p99=157 ms, p99.5=165 ms, p99.9=226 ms, max=381 ms

  Run#3 (pid=3454140)
  read: IOPS=4892, BW=19576KiB/s, clat avg=25.2 ms, p50=14 ms
  p90=84 ms, p95=93 ms, p99=122 ms, p99.5=163 ms, p99.9=176 ms, max=300 ms
  write: IOPS=4889, BW=19565KiB/s, clat avg=26.3 ms, p50=15 ms
  p90=85 ms, p95=94 ms, p99=153 ms, p99.5=163 ms, p99.9=176 ms, max=300 ms

  Average(3 times)
  read: IOPS=4785, BW=19143KiB/s, clat avg=25.7 ms, p50=15 ms
  p90=84 ms, p95=93 ms, p99=144 ms, p99.5=163 ms, p99.9=183 ms, max=334 ms
  write: IOPS=4784, BW=19139KiB/s, clat avg=26.9 ms, p50=15 ms
  p90=86 ms, p95=94 ms, p99=156 ms, p99.5=164 ms, p99.9=196 ms, max=334 ms

6.19.0-rc6 + v12 patch + imod_interval =10000 result:
  Run#1 (pid=4172)
  read: IOPS=4669, BW=18677KiB/s, clat avg=26.3 ms, p50=15 ms
  p90=85 ms, p95=93 ms, p99=155 ms, p99.5=163 ms, p99.9=182 ms, max=258 ms
  write: IOPS=4666, BW=18669KiB/s, clat avg=27.6 ms, p50=16 ms
  p90=86 ms, p95=94 ms, p99=157 ms, p99.5=165 ms, p99.9=184 ms, max=261 ms

  Run#2 (pid=5621)
  read: IOPS=4888, BW=19557KiB/s, clat avg=25.1 ms, p50=14 ms
  p90=83 ms, p95=93 ms, p99=157 ms, p99.5=165 ms, p99.9=182 ms, max=301 ms
  write: IOPS=4887, BW=19555KiB/s, clat avg=26.3 ms, p50=15 ms
  p90=84 ms, p95=94 ms, p99=159 ms, p99.5=167 ms, p99.9=205 ms, max=308 ms

  Run#3 (pid=7179)
  read: IOPS=4521, BW=18085KiB/s, clat avg=27.1 ms, p50=15 ms
  p90=86 ms, p95=94 ms, p99=157 ms, p99.5=167 ms, p99.9=230 ms, max=310 ms
  write: IOPS=4521, BW=18088KiB/s, clat avg=28.5 ms, p50=16 ms
  p90=87 ms, p95=95 ms, p99=161 ms, p99.5=169 ms, p99.9=234 ms, max=310 ms

  Average(3 times)
  read: IOPS=4693, BW=18773KiB/s, clat avg=26.2 ms, p50=15 ms
  p90=85 ms, p95=93 ms, p99=156 ms, p99.5=165 ms, p99.9=198 ms, max=290 ms
  write: IOPS=4691, BW=18771KiB/s, clat avg=27.5 ms, p50=16 ms
  p90=86 ms, p95=94 ms, p99=159 ms, p99.5=167 ms, p99.9=208 ms, max=293 ms

With the patch applied, fio randrw 4k performance improves
substantially versus the baseline (rc6): throughput increases
from ~3.1k IOPS (~12.3 MiB/s) to ~4.7–4.8k IOPS (~18.8–19.1 MiB/s),
and average clat drops from ~39–43 ms to ~26–28 ms. High-percentile
latency also improves (p95 ~150 ms -> ~93–94 ms; p99.9 ~230 ms -> ~180–200 ms).
The only regression is a slightly higher median latency (p50 ~12–14 ms -> ~14–16 ms).

Based on the fio results, performance improved after applying the
patch on my setup. I haven't confirmed whether the same improvement
holds in other environments. My test hardware and commands are shown
above. Note that a simple dd throughput test did not show any speed
difference.

I haven't fully figured out the reason for the speed difference.
Please let me know if you need other testing methods.

thanks,

Xu Rao

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2026-02-02 13:30 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-27  2:34 [PATCH v11 0/2] usb: xhci: route device to secondary interrupters raoxu
2026-01-27  2:38 ` [PATCH v11 1/2] usb: xhci: refactor IRQ/interrupter plumbing for multi-vector support raoxu
2026-01-27 12:54   ` Neronin, Niklas
2026-01-27  2:39 ` [PATCH v11 2/2] usb: xhci: enable secondary interrupters and route transfers per slot raoxu
2026-01-27  7:39   ` Greg KH
2026-01-27 10:55     ` raoxu
2026-01-27 11:04       ` Greg KH
2026-01-28  8:09     ` raoxu
2026-01-28  8:35       ` Greg KH
2026-01-29 14:22       ` Michal Pecio
2026-01-29 19:43         ` Mathias Nyman
2026-01-29 20:03           ` Kenneth Crudup
2026-01-30  3:48           ` raoxu
2026-01-30  5:34             ` Greg KH
2026-02-02 13:29               ` [PATCH v12 " raoxu
2026-01-27 12:57   ` [PATCH v11 " Neronin, Niklas
2026-01-27  7:33 ` [PATCH v11 0/2] usb: xhci: route device to secondary interrupters Greg KH
2026-01-28 13:19 ` Kenneth Crudup

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox