From: Michal Pecio <michal.pecio@gmail.com>
To: Niklas Neronin <niklas.neronin@linux.intel.com>
Cc: mathias.nyman@linux.intel.com, linux-usb@vger.kernel.org
Subject: [PATCH RFC] xhci: fix Command Aborting, diffferently
Date: Tue, 31 Mar 2026 13:46:57 +0200 [thread overview]
Message-ID: <20260331134657.1aea50b3.michal.pecio@gmail.com> (raw)
In-Reply-To: <20260316142720.1471906-2-niklas.neronin@linux.intel.com>
OK, so here's the alternative approach I suggested: don't try to avoid
CRCR corruption but simply fix it later.
Motivation:
1. Inserting udelay() between low and high DWORD write reliably
corrupts internal CR Dequeue pointer on some HW.
2. It's unclear if such corruption happens in the field, but surely
nobody will prove that it can't.
3. Low DWORD gets corrupted even if both DWORDs are written with
their proper values on some HW.
4. It is indeed a violation of 5.1 to write high DWORD without writing
the low DWORD immediately prior. The earlier abort doesn't count.
5. Delaying high DWORD until we are sure low DWORD wasn't enough slows
us down on some HW (and currently disables IRQs for a long time).
6. Need to support 64 bit HCs on systems without writeq() somehow.
Solution:
We can't just write arbitrary deq pointer to CRCR, but we can create
a single-TRB segment which links to our arbitrary command.
TODO:
Make sure new deq and cycle are correct under all circumstances.
Don't panic if we get an event on the trampoline TRB (can it happen?)
Be sure nothing weird happens due to unexpected preexisting bugs -
that whole command management logic is a little scary.
Potential other use:
Avoid the need for wiping the command ring on suspend/resume. Allow
commands to survive the cycle and execute later. Unsure if useful.
Regards,
Michal
---
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 98ef014c8dee..2dabf295a8bc 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -486,6 +486,28 @@ static void xhci_handle_stopped_cmd_ring(struct xhci_hcd *xhci,
}
}
+/*
+ * CRCR doesn't support setting arbitrary Dequeue, so write it
+ * to a suitably aligned Link TRB and point the ring there.
+ * The ring must be stopped and not restarted until this returns.
+ */
+static void xhci_set_cr_deq(struct xhci_hcd *xhci, dma_addr_t deq, u32 cycle)
+{
+ u64 cr;
+ dma_addr_t trampoline_dma = xhci->dcbaa->dma +
+ offsetof(struct xhci_device_context_array, cr_trampoline);
+
+ xhci->dcbaa->cr_trampoline.segment_ptr = deq;
+ xhci->dcbaa->cr_trampoline.control = cpu_to_le32(TRB_TYPE(TRB_LINK) | cycle);
+
+ cr = xhci_read_64(xhci, &xhci->op_regs->cmd_ring);
+ cr &= CMD_RING_RSVDP;
+ cr |= trampoline_dma | cycle;
+ xhci_write_64(xhci, cr, &xhci->op_regs->cmd_ring);
+
+ xhci_info(xhci, "xhci_set_cr_deq deq %llx\n", deq);
+}
+
/* Must be called with xhci->lock held, releases and acquires lock back */
static int xhci_abort_cmd_ring(struct xhci_hcd *xhci, unsigned long flags)
{
@@ -493,6 +515,7 @@ static int xhci_abort_cmd_ring(struct xhci_hcd *xhci, unsigned long flags)
union xhci_trb *new_deq = xhci->cmd_ring->dequeue;
u64 crcr;
int ret;
+ u64 ns1, ns2;
xhci_dbg(xhci, "Abort command ring\n");
@@ -502,16 +525,15 @@ static int xhci_abort_cmd_ring(struct xhci_hcd *xhci, unsigned long flags)
* The control bits like command stop, abort are located in lower
* dword of the command ring control register.
* Some controllers require all 64 bits to be written to abort the ring.
- * Make sure the upper dword is valid, pointing to the next command,
- * avoiding corrupting the command ring pointer in case the command ring
- * is stopped by the time the upper dword is written.
+ * This may corrupt *both* DWORDS of Dequeue if high DWORD write is seen
+ * after CRR=0 by some other controllers. We will fix it up.
*/
next_trb(&new_seg, &new_deq);
if (trb_is_link(new_deq))
next_trb(&new_seg, &new_deq);
-
crcr = xhci_trb_virt_to_dma(new_seg, new_deq);
- xhci_write_64(xhci, crcr | CMD_RING_ABORT, &xhci->op_regs->cmd_ring);
+
+ xhci_write_64(xhci, CMD_RING_ABORT, &xhci->op_regs->cmd_ring);
/* Section 4.6.1.2 of xHCI 1.0 spec says software should also time the
* completion of the Command Abort operation. If CRR is not negated in 5
@@ -519,14 +541,21 @@ static int xhci_abort_cmd_ring(struct xhci_hcd *xhci, unsigned long flags)
* In the future we should distinguish between -ENODEV and -ETIMEDOUT
* and try to recover a -ETIMEDOUT with a host controller reset.
*/
+ ns1 = ktime_get_ns();
ret = xhci_handshake(&xhci->op_regs->cmd_ring,
CMD_RING_RUNNING, 0, 5 * 1000 * 1000);
+ ns2 = ktime_get_ns();
+ xhci_info(xhci, "abort took %lldns\n", ns2 - ns1);
if (ret < 0) {
xhci_err(xhci, "Abort failed to stop command ring: %d\n", ret);
xhci_halt(xhci);
xhci_hc_died(xhci);
return ret;
}
+
+ /* fixme: cycle may be wrong, not sure about crcr */
+ xhci_set_cr_deq(xhci, crcr, xhci->cmd_ring->cycle_state);
+
/*
* Writing the CMD_RING_ABORT bit should cause a cmd completion event,
* however on some host hw the CMD_RING_RUNNING bit is correctly cleared
@@ -1817,6 +1846,7 @@ static void handle_cmd_completion(struct xhci_hcd *xhci,
trace_xhci_handle_command(xhci->cmd_ring, &cmd_trb->generic, cmd_dma);
cmd_comp_code = GET_COMP_CODE(le32_to_cpu(event->status));
+ xhci_info(xhci, "handle_cmd_completion cmd_dma %llx cmd_comp_code %d\n", cmd_dma, cmd_comp_code);
/* If CMD ring stopped we own the trbs between enqueue and dequeue */
if (cmd_comp_code == COMP_COMMAND_RING_STOPPED) {
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 32617dc155ac..40069a2c7bf9 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -193,6 +193,7 @@ struct xhci_op_regs {
#define CMD_RING_ABORT (1 << 2)
/* true: command ring is running */
#define CMD_RING_RUNNING (1 << 3)
+#define CMD_RING_RSVDP GENMASK_ULL(5, 4)
/* bits 63:6 - Command Ring pointer */
#define CMD_RING_PTR_MASK GENMASK_ULL(63, 6)
@@ -789,22 +790,6 @@ struct xhci_tt_bw_info {
};
-/**
- * struct xhci_device_context_array
- * @dev_context_ptr array of 64-bit DMA addresses for device contexts
- */
-struct xhci_device_context_array {
- /* 64-bit device addresses; we only write 32-bit addresses */
- __le64 dev_context_ptrs[MAX_HC_SLOTS];
- /* private xHCD pointers */
- dma_addr_t dma;
-};
-/*
- * TODO: change this to be dynamically sized at HC mem init time since the HC
- * might not be able to handle the maximum number of devices possible.
- */
-
-
struct xhci_transfer_event {
/* 64-bit buffer address, or immediate data */
__le64 buffer;
@@ -956,6 +941,24 @@ struct xhci_link_trb {
/* control bitfields */
#define LINK_TOGGLE (0x1<<1)
+/**
+ * struct xhci_device_context_array
+ * @dev_context_ptr array of 64-bit DMA addresses for device contexts
+ */
+struct xhci_device_context_array {
+ /* 64-bit device addresses; we only write 32-bit addresses */
+ __le64 dev_context_ptrs[MAX_HC_SLOTS];
+ /* a Link TRB to jump into arbitrary command ring slot */
+ struct xhci_link_trb cr_trampoline __aligned(sizeof(64));
+ /* private xHCD pointers */
+ dma_addr_t dma;
+};
+/*
+ * TODO: change this to be dynamically sized at HC mem init time since the HC
+ * might not be able to handle the maximum number of devices possible.
+ */
+
+
/* Command completion event TRB */
struct xhci_event_cmd {
/* Pointer to command TRB, or the value passed by the event data trb */
next prev parent reply other threads:[~2026-03-31 11:47 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-16 14:27 [PATCH 0/2] usb: xhci: correct Command Abort bit handling Niklas Neronin
2026-03-16 14:27 ` [PATCH 1/2] usb: xhci: fix Command Aborting Niklas Neronin
2026-03-21 13:30 ` Michal Pecio
2026-03-23 10:25 ` Neronin, Niklas
2026-03-23 11:24 ` Michal Pecio
2026-03-23 14:00 ` Neronin, Niklas
2026-03-23 21:40 ` Michal Pecio
2026-03-31 11:46 ` Michal Pecio [this message]
2026-03-16 14:27 ` [PATCH 2/2] usb: xhci: use writeq() for CA write on 64-bit architectures Niklas Neronin
2026-03-22 11:36 ` Michal Pecio
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260331134657.1aea50b3.michal.pecio@gmail.com \
--to=michal.pecio@gmail.com \
--cc=linux-usb@vger.kernel.org \
--cc=mathias.nyman@linux.intel.com \
--cc=niklas.neronin@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox