linux-usb.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] xhci fixes for usb-linus
@ 2025-11-07 16:28 Mathias Nyman
  2025-11-07 16:28 ` [PATCH 1/3] xhci: fix stale flag preventig URBs after link state error is cleared Mathias Nyman
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Mathias Nyman @ 2025-11-07 16:28 UTC (permalink / raw)
  To: gregkh; +Cc: linux-usb, Mathias Nyman

Hi Greg

A few more fixes for usb-linus 6.18-rc
Includes the xhci sideband locking fix and a DbC data corruption fix

Thanks
Mathias

Mathias Nyman (3):
  xhci: fix stale flag preventig URBs after link state error is cleared
  xhci: dbgtty: Fix data corruption when transmitting data form DbC to
    host
  xhci: sideband: Fix race condition in sideband unregister

 drivers/usb/host/xhci-dbgcap.h   |   1 +
 drivers/usb/host/xhci-dbgtty.c   |  17 +++++-
 drivers/usb/host/xhci-ring.c     |  15 +++--
 drivers/usb/host/xhci-sideband.c | 102 ++++++++++++++++++-------------
 drivers/usb/host/xhci.c          |   1 +
 5 files changed, 86 insertions(+), 50 deletions(-)

-- 
2.43.0


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 1/3] xhci: fix stale flag preventig URBs after link state error is cleared
  2025-11-07 16:28 [PATCH 0/3] xhci fixes for usb-linus Mathias Nyman
@ 2025-11-07 16:28 ` Mathias Nyman
  2025-11-07 16:28 ` [PATCH 2/3] xhci: dbgtty: Fix data corruption when transmitting data form DbC to host Mathias Nyman
  2025-11-07 16:28 ` [PATCH 3/3] xhci: sideband: Fix race condition in sideband unregister Mathias Nyman
  2 siblings, 0 replies; 4+ messages in thread
From: Mathias Nyman @ 2025-11-07 16:28 UTC (permalink / raw)
  To: gregkh; +Cc: linux-usb, Mathias Nyman, stable

A usb device caught behind a link in ss.Inactive error state needs to
be reset to recover. A VDEV_PORT_ERROR flag is used to track this state,
preventing new transfers from being queued until error is cleared.

This flag may be left uncleared if link goes to error state between two
resets, and print the following message:

"xhci_hcd 0000:00:14.0: Can't queue urb, port error, link inactive"

Fix setting and clearing the flag.

The flag is cleared after hub driver has successfully reset the device
when hcd->reset_device is called. xhci-hcd issues an internal "reset
device" command in this callback, and clear all flags once the command
completes successfully.

This command may complete with a context state error if slot was recently
reset and is already in the defauilt state. This is treated as a success
but flag was left uncleared.

The link state field is also unreliable if port is currently in reset,
so don't set the flag in active reset cases.
Also clear the flag immediately when link is no longer in ss.Inactive
state and port event handler detects a completed reset.

This issue was discovered while debugging kernel bugzilla issue 220491.
It is likely one small part of the problem, causing some of the failures,
but root cause remains unknown

Link: https://bugzilla.kernel.org/show_bug.cgi?id=220491
Fixes: b8c3b718087b ("usb: xhci: Don't try to recover an endpoint if port is in error state.")
Cc: stable@vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
---
 drivers/usb/host/xhci-ring.c | 15 ++++++++++-----
 drivers/usb/host/xhci.c      |  1 +
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 8e209aa33ea7..5bdcf9ab2b99 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1985,6 +1985,7 @@ static void xhci_cavium_reset_phy_quirk(struct xhci_hcd *xhci)
 
 static void handle_port_status(struct xhci_hcd *xhci, union xhci_trb *event)
 {
+	struct xhci_virt_device *vdev = NULL;
 	struct usb_hcd *hcd;
 	u32 port_id;
 	u32 portsc, cmd_reg;
@@ -2016,6 +2017,9 @@ static void handle_port_status(struct xhci_hcd *xhci, union xhci_trb *event)
 		goto cleanup;
 	}
 
+	if (port->slot_id)
+		vdev = xhci->devs[port->slot_id];
+
 	/* We might get interrupts after shared_hcd is removed */
 	if (port->rhub == &xhci->usb3_rhub && xhci->shared_hcd == NULL) {
 		xhci_dbg(xhci, "ignore port event for removed USB3 hcd\n");
@@ -2038,10 +2042,11 @@ static void handle_port_status(struct xhci_hcd *xhci, union xhci_trb *event)
 		usb_hcd_resume_root_hub(hcd);
 	}
 
-	if (hcd->speed >= HCD_USB3 &&
-	    (portsc & PORT_PLS_MASK) == XDEV_INACTIVE) {
-		if (port->slot_id && xhci->devs[port->slot_id])
-			xhci->devs[port->slot_id]->flags |= VDEV_PORT_ERROR;
+	if (vdev && (portsc & PORT_PLS_MASK) == XDEV_INACTIVE) {
+		if (!(portsc & PORT_RESET))
+			vdev->flags |= VDEV_PORT_ERROR;
+	} else if (vdev && portsc & PORT_RC) {
+		vdev->flags &= ~VDEV_PORT_ERROR;
 	}
 
 	if ((portsc & PORT_PLC) && (portsc & PORT_PLS_MASK) == XDEV_RESUME) {
@@ -2099,7 +2104,7 @@ static void handle_port_status(struct xhci_hcd *xhci, union xhci_trb *event)
 		 * so the roothub behavior is consistent with external
 		 * USB 3.0 hub behavior.
 		 */
-		if (port->slot_id && xhci->devs[port->slot_id])
+		if (vdev)
 			xhci_ring_device(xhci, port->slot_id);
 		if (bus_state->port_remote_wakeup & (1 << hcd_portnum)) {
 			xhci_test_and_clear_bit(xhci, port, PORT_PLC);
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 0cb45b95e4f5..a148a1280126 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -4007,6 +4007,7 @@ static int xhci_discover_or_reset_device(struct usb_hcd *hcd,
 				xhci_get_slot_state(xhci, virt_dev->out_ctx));
 		xhci_dbg(xhci, "Not freeing device rings.\n");
 		/* Don't treat this as an error.  May change my mind later. */
+		virt_dev->flags = 0;
 		ret = 0;
 		goto command_cleanup;
 	case COMP_SUCCESS:
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 2/3] xhci: dbgtty: Fix data corruption when transmitting data form DbC to host
  2025-11-07 16:28 [PATCH 0/3] xhci fixes for usb-linus Mathias Nyman
  2025-11-07 16:28 ` [PATCH 1/3] xhci: fix stale flag preventig URBs after link state error is cleared Mathias Nyman
@ 2025-11-07 16:28 ` Mathias Nyman
  2025-11-07 16:28 ` [PATCH 3/3] xhci: sideband: Fix race condition in sideband unregister Mathias Nyman
  2 siblings, 0 replies; 4+ messages in thread
From: Mathias Nyman @ 2025-11-07 16:28 UTC (permalink / raw)
  To: gregkh; +Cc: linux-usb, Mathias Nyman, stable, Łukasz Bartosik

Data read from a DbC device may be corrupted due to a race between
ongoing write and write request completion handler both queuing new
transfer blocks (TRBs) if there are remining data in the kfifo.

TRBs may be in incorrct order compared to the data in the kfifo.

Driver fails to keep lock between reading data from kfifo into a
dbc request buffer, and queuing the request to the transfer ring.

This allows completed request to re-queue itself in the middle of
an ongoing transfer loop, forcing itself between a kfifo read and
request TRB write of another request

cpu0					cpu1 (re-queue completed req2)

lock(port_lock)
dbc_start_tx()
kfifo_out(fifo, req1->buffer)
unlock(port_lock)
					lock(port_lock)
					dbc_write_complete(req2)
					dbc_start_tx()
      					kfifo_out(fifo, req2->buffer)
					unlock(port_lock)
					lock(port_lock)
					req2->trb = ring->enqueue;
					ring->enqueue++
					unlock(port_lock)
lock(port_lock)
req1->trb = ring->enqueue;
ring->enqueue++
unlock(port_lock)

In the above scenario a kfifo containing "12345678" would read "1234" to
req1 and "5678" to req2, but req2 is queued before req1 leading to
data being transmitted as "56781234"

Solve this by adding a flag that prevents starting a new tx if we
are already mid dbc_start_tx() during the unlocked part.

The already running dbc_do_start_tx() will make sure the newly completed
request gets re-queued as it is added to the request write_pool while
holding the lock.

Cc: stable@vger.kernel.org
Fixes: dfba2174dc42 ("usb: xhci: Add DbC support in xHCI driver")
Tested-by: Łukasz Bartosik <ukaszb@chromium.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
---
 drivers/usb/host/xhci-dbgcap.h |  1 +
 drivers/usb/host/xhci-dbgtty.c | 17 ++++++++++++++++-
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/host/xhci-dbgcap.h b/drivers/usb/host/xhci-dbgcap.h
index 47ac72c2286d..5426c971d2d3 100644
--- a/drivers/usb/host/xhci-dbgcap.h
+++ b/drivers/usb/host/xhci-dbgcap.h
@@ -114,6 +114,7 @@ struct dbc_port {
 	unsigned int			tx_boundary;
 
 	bool				registered;
+	bool				tx_running;
 };
 
 struct dbc_driver {
diff --git a/drivers/usb/host/xhci-dbgtty.c b/drivers/usb/host/xhci-dbgtty.c
index d894081d8d15..b7f95565524d 100644
--- a/drivers/usb/host/xhci-dbgtty.c
+++ b/drivers/usb/host/xhci-dbgtty.c
@@ -47,7 +47,7 @@ dbc_kfifo_to_req(struct dbc_port *port, char *packet)
 	return len;
 }
 
-static int dbc_start_tx(struct dbc_port *port)
+static int dbc_do_start_tx(struct dbc_port *port)
 	__releases(&port->port_lock)
 	__acquires(&port->port_lock)
 {
@@ -57,6 +57,8 @@ static int dbc_start_tx(struct dbc_port *port)
 	bool			do_tty_wake = false;
 	struct list_head	*pool = &port->write_pool;
 
+	port->tx_running = true;
+
 	while (!list_empty(pool)) {
 		req = list_entry(pool->next, struct dbc_request, list_pool);
 		len = dbc_kfifo_to_req(port, req->buf);
@@ -77,12 +79,25 @@ static int dbc_start_tx(struct dbc_port *port)
 		}
 	}
 
+	port->tx_running = false;
+
 	if (do_tty_wake && port->port.tty)
 		tty_wakeup(port->port.tty);
 
 	return status;
 }
 
+/* must be called with port->port_lock held */
+static int dbc_start_tx(struct dbc_port *port)
+{
+	lockdep_assert_held(&port->port_lock);
+
+	if (port->tx_running)
+		return -EBUSY;
+
+	return dbc_do_start_tx(port);
+}
+
 static void dbc_start_rx(struct dbc_port *port)
 	__releases(&port->port_lock)
 	__acquires(&port->port_lock)
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 3/3] xhci: sideband: Fix race condition in sideband unregister
  2025-11-07 16:28 [PATCH 0/3] xhci fixes for usb-linus Mathias Nyman
  2025-11-07 16:28 ` [PATCH 1/3] xhci: fix stale flag preventig URBs after link state error is cleared Mathias Nyman
  2025-11-07 16:28 ` [PATCH 2/3] xhci: dbgtty: Fix data corruption when transmitting data form DbC to host Mathias Nyman
@ 2025-11-07 16:28 ` Mathias Nyman
  2 siblings, 0 replies; 4+ messages in thread
From: Mathias Nyman @ 2025-11-07 16:28 UTC (permalink / raw)
  To: gregkh; +Cc: linux-usb, Mathias Nyman, Uttkarsh Aggarwal

Uttkarsh Aggarwal observed a kernel panic during sideband un-register
and found it was caused by a race condition between sideband unregister,
and creating sideband interrupters.
The issue occurrs when thread T1 runs uaudio_disconnect() and released
sb->xhci via sideband_unregister, while thread T2 simultaneously accessed
the now-NULL sb->xhci in xhci_sideband_create_interrupter() resulting in
a crash.

Ensure new endpoints or interrupter can't be added to a sidenband after
xhci_sideband_unregister() cleared the existing ones, and unlocked the
sideband mutex.
Reorganize code so that mutex is only taken and released once in
xhci_sideband_unregister(), and clear sb->vdev while mutex is taken.

Use mutex guards to reduce human unlock errors in code

Refuse to add endpoints or interrupter if sb->vdev is not set.
sb->vdev is set when sideband is created and registered.

Reported-by: Uttkarsh Aggarwal <uttkarsh.aggarwal@oss.qualcomm.com>
Closes: https://lore.kernel.org/linux-usb/20251028080043.27760-1-uttkarsh.aggarwal@oss.qualcomm.com
Fixes: de66754e9f80 ("xhci: sideband: add initial api to register a secondary interrupter entity")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
---
 drivers/usb/host/xhci-sideband.c | 102 ++++++++++++++++++-------------
 1 file changed, 58 insertions(+), 44 deletions(-)

diff --git a/drivers/usb/host/xhci-sideband.c b/drivers/usb/host/xhci-sideband.c
index e771a476fef2..a85f62a73313 100644
--- a/drivers/usb/host/xhci-sideband.c
+++ b/drivers/usb/host/xhci-sideband.c
@@ -73,9 +73,12 @@ xhci_ring_to_sgtable(struct xhci_sideband *sb, struct xhci_ring *ring)
 	return NULL;
 }
 
+/* Caller must hold sb->mutex */
 static void
 __xhci_sideband_remove_endpoint(struct xhci_sideband *sb, struct xhci_virt_ep *ep)
 {
+	lockdep_assert_held(&sb->mutex);
+
 	/*
 	 * Issue a stop endpoint command when an endpoint is removed.
 	 * The stop ep cmd handler will handle the ring cleanup.
@@ -86,6 +89,25 @@ __xhci_sideband_remove_endpoint(struct xhci_sideband *sb, struct xhci_virt_ep *e
 	sb->eps[ep->ep_index] = NULL;
 }
 
+/* Caller must hold sb->mutex */
+static void
+__xhci_sideband_remove_interrupter(struct xhci_sideband *sb)
+{
+	struct usb_device *udev;
+
+	lockdep_assert_held(&sb->mutex);
+
+	if (!sb->ir)
+		return;
+
+	xhci_remove_secondary_interrupter(xhci_to_hcd(sb->xhci), sb->ir);
+	sb->ir = NULL;
+	udev = sb->vdev->udev;
+
+	if (udev->state != USB_STATE_NOTATTACHED)
+		usb_offload_put(udev);
+}
+
 /* sideband api functions */
 
 /**
@@ -131,14 +153,16 @@ xhci_sideband_add_endpoint(struct xhci_sideband *sb,
 	struct xhci_virt_ep *ep;
 	unsigned int ep_index;
 
-	mutex_lock(&sb->mutex);
+	guard(mutex)(&sb->mutex);
+
+	if (!sb->vdev)
+		return -ENODEV;
+
 	ep_index = xhci_get_endpoint_index(&host_ep->desc);
 	ep = &sb->vdev->eps[ep_index];
 
-	if (ep->ep_state & EP_HAS_STREAMS) {
-		mutex_unlock(&sb->mutex);
+	if (ep->ep_state & EP_HAS_STREAMS)
 		return -EINVAL;
-	}
 
 	/*
 	 * Note, we don't know the DMA mask of the audio DSP device, if its
@@ -148,14 +172,11 @@ xhci_sideband_add_endpoint(struct xhci_sideband *sb,
 	 * and let this function add the endpoint and allocate the ring buffer
 	 * with the smallest common DMA mask
 	 */
-	if (sb->eps[ep_index] || ep->sideband) {
-		mutex_unlock(&sb->mutex);
+	if (sb->eps[ep_index] || ep->sideband)
 		return -EBUSY;
-	}
 
 	ep->sideband = sb;
 	sb->eps[ep_index] = ep;
-	mutex_unlock(&sb->mutex);
 
 	return 0;
 }
@@ -180,18 +201,16 @@ xhci_sideband_remove_endpoint(struct xhci_sideband *sb,
 	struct xhci_virt_ep *ep;
 	unsigned int ep_index;
 
-	mutex_lock(&sb->mutex);
+	guard(mutex)(&sb->mutex);
+
 	ep_index = xhci_get_endpoint_index(&host_ep->desc);
 	ep = sb->eps[ep_index];
 
-	if (!ep || !ep->sideband || ep->sideband != sb) {
-		mutex_unlock(&sb->mutex);
+	if (!ep || !ep->sideband || ep->sideband != sb)
 		return -ENODEV;
-	}
 
 	__xhci_sideband_remove_endpoint(sb, ep);
 	xhci_initialize_ring_info(ep->ring);
-	mutex_unlock(&sb->mutex);
 
 	return 0;
 }
@@ -316,28 +335,25 @@ xhci_sideband_create_interrupter(struct xhci_sideband *sb, int num_seg,
 	if (!sb || !sb->xhci)
 		return -ENODEV;
 
-	mutex_lock(&sb->mutex);
-	if (sb->ir) {
-		ret = -EBUSY;
-		goto out;
-	}
+	guard(mutex)(&sb->mutex);
+
+	if (!sb->vdev)
+		return -ENODEV;
+
+	if (sb->ir)
+		return -EBUSY;
 
 	sb->ir = xhci_create_secondary_interrupter(xhci_to_hcd(sb->xhci),
 						   num_seg, imod_interval,
 						   intr_num);
-	if (!sb->ir) {
-		ret = -ENOMEM;
-		goto out;
-	}
+	if (!sb->ir)
+		return -ENOMEM;
 
 	udev = sb->vdev->udev;
 	ret = usb_offload_get(udev);
 
 	sb->ir->ip_autoclear = ip_autoclear;
 
-out:
-	mutex_unlock(&sb->mutex);
-
 	return ret;
 }
 EXPORT_SYMBOL_GPL(xhci_sideband_create_interrupter);
@@ -352,21 +368,12 @@ EXPORT_SYMBOL_GPL(xhci_sideband_create_interrupter);
 void
 xhci_sideband_remove_interrupter(struct xhci_sideband *sb)
 {
-	struct usb_device *udev;
-
-	if (!sb || !sb->ir)
+	if (!sb)
 		return;
 
-	mutex_lock(&sb->mutex);
-	xhci_remove_secondary_interrupter(xhci_to_hcd(sb->xhci), sb->ir);
-
-	sb->ir = NULL;
-	udev = sb->vdev->udev;
+	guard(mutex)(&sb->mutex);
 
-	if (udev->state != USB_STATE_NOTATTACHED)
-		usb_offload_put(udev);
-
-	mutex_unlock(&sb->mutex);
+	__xhci_sideband_remove_interrupter(sb);
 }
 EXPORT_SYMBOL_GPL(xhci_sideband_remove_interrupter);
 
@@ -465,6 +472,7 @@ EXPORT_SYMBOL_GPL(xhci_sideband_register);
 void
 xhci_sideband_unregister(struct xhci_sideband *sb)
 {
+	struct xhci_virt_device *vdev;
 	struct xhci_hcd *xhci;
 	int i;
 
@@ -473,17 +481,23 @@ xhci_sideband_unregister(struct xhci_sideband *sb)
 
 	xhci = sb->xhci;
 
-	mutex_lock(&sb->mutex);
-	for (i = 0; i < EP_CTX_PER_DEV; i++)
-		if (sb->eps[i])
-			__xhci_sideband_remove_endpoint(sb, sb->eps[i]);
-	mutex_unlock(&sb->mutex);
+	scoped_guard(mutex, &sb->mutex) {
+		vdev = sb->vdev;
+		if (!vdev)
+			return;
+
+		for (i = 0; i < EP_CTX_PER_DEV; i++)
+			if (sb->eps[i])
+				__xhci_sideband_remove_endpoint(sb, sb->eps[i]);
 
-	xhci_sideband_remove_interrupter(sb);
+		__xhci_sideband_remove_interrupter(sb);
+
+		sb->vdev = NULL;
+	}
 
 	spin_lock_irq(&xhci->lock);
 	sb->xhci = NULL;
-	sb->vdev->sideband = NULL;
+	vdev->sideband = NULL;
 	spin_unlock_irq(&xhci->lock);
 
 	kfree(sb);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2025-11-07 16:28 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-07 16:28 [PATCH 0/3] xhci fixes for usb-linus Mathias Nyman
2025-11-07 16:28 ` [PATCH 1/3] xhci: fix stale flag preventig URBs after link state error is cleared Mathias Nyman
2025-11-07 16:28 ` [PATCH 2/3] xhci: dbgtty: Fix data corruption when transmitting data form DbC to host Mathias Nyman
2025-11-07 16:28 ` [PATCH 3/3] xhci: sideband: Fix race condition in sideband unregister Mathias Nyman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).