From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Joe Lawrence <joe.lawrence@stratus.com>,
Mathias Nyman <mathias.nyman@linux.intel.com>
Subject: [PATCH 4.6 27/31] xhci: Fix handling timeouted commands on hosts in weird states.
Date: Wed, 6 Jul 2016 18:19:17 -0700 [thread overview]
Message-ID: <20160707011558.615027047@linuxfoundation.org> (raw)
In-Reply-To: <20160707011557.518104444@linuxfoundation.org>
4.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mathias Nyman <mathias.nyman@linux.intel.com>
commit 3425aa03f484d45dc21e0e791c2f6c74ea656421 upstream.
If commands timeout we mark them for abortion, then stop the command
ring, and turn the commands to no-ops and finally restart the command
ring.
If the host is working properly the no-op commands will finish and
pending completions are called.
If we notice the host is failing, driver clears the command ring and
completes, deletes and frees all pending commands.
There are two separate cases reported where host is believed to work
properly but is not. In the first case we successfully stop the ring
but no abort or stop command ring event is ever sent and host locks up.
The second case is if a host is removed, command times out and driver
believes the ring is stopped, and assumes it will be restarted, but
actually ends up timing out on the same command forever.
If one of the pending commands has the xhci->mutex held it will block
xhci_stop() in the remove codepath which otherwise would cleanup pending
commands.
Add a check that clears all pending commands in case host is removed,
or we are stuck timing out on the same command. Also restart the
command timeout timer when stopping the command ring to ensure we
recive an ring stop/abort event.
Tested-by: Joe Lawrence <joe.lawrence@stratus.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/usb/host/xhci-ring.c | 27 ++++++++++++++++++++++-----
1 file changed, 22 insertions(+), 5 deletions(-)
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -290,6 +290,14 @@ static int xhci_abort_cmd_ring(struct xh
temp_64 = xhci_read_64(xhci, &xhci->op_regs->cmd_ring);
xhci->cmd_ring_state = CMD_RING_STATE_ABORTED;
+
+ /*
+ * Writing the CMD_RING_ABORT bit should cause a cmd completion event,
+ * however on some host hw the CMD_RING_RUNNING bit is correctly cleared
+ * but the completion event in never sent. Use the cmd timeout timer to
+ * handle those cases. Use twice the time to cover the bit polling retry
+ */
+ mod_timer(&xhci->cmd_timer, jiffies + (2 * XHCI_CMD_DEFAULT_TIMEOUT));
xhci_write_64(xhci, temp_64 | CMD_RING_ABORT,
&xhci->op_regs->cmd_ring);
@@ -314,6 +322,7 @@ static int xhci_abort_cmd_ring(struct xh
xhci_err(xhci, "Stopped the command ring failed, "
"maybe the host is dead\n");
+ del_timer(&xhci->cmd_timer);
xhci->xhc_state |= XHCI_STATE_DYING;
xhci_quiesce(xhci);
xhci_halt(xhci);
@@ -1253,22 +1262,21 @@ void xhci_handle_command_timeout(unsigne
int ret;
unsigned long flags;
u64 hw_ring_state;
- struct xhci_command *cur_cmd = NULL;
+ bool second_timeout = false;
xhci = (struct xhci_hcd *) data;
/* mark this command to be cancelled */
spin_lock_irqsave(&xhci->lock, flags);
if (xhci->current_cmd) {
- cur_cmd = xhci->current_cmd;
- cur_cmd->status = COMP_CMD_ABORT;
+ if (xhci->current_cmd->status == COMP_CMD_ABORT)
+ second_timeout = true;
+ xhci->current_cmd->status = COMP_CMD_ABORT;
}
-
/* Make sure command ring is running before aborting it */
hw_ring_state = xhci_read_64(xhci, &xhci->op_regs->cmd_ring);
if ((xhci->cmd_ring_state & CMD_RING_STATE_RUNNING) &&
(hw_ring_state & CMD_RING_RUNNING)) {
-
spin_unlock_irqrestore(&xhci->lock, flags);
xhci_dbg(xhci, "Command timeout\n");
ret = xhci_abort_cmd_ring(xhci);
@@ -1280,6 +1288,15 @@ void xhci_handle_command_timeout(unsigne
}
return;
}
+
+ /* command ring failed to restart, or host removed. Bail out */
+ if (second_timeout || xhci->xhc_state & XHCI_STATE_REMOVING) {
+ spin_unlock_irqrestore(&xhci->lock, flags);
+ xhci_dbg(xhci, "command timed out twice, ring start fail?\n");
+ xhci_cleanup_command_queue(xhci);
+ return;
+ }
+
/* command timeout on stopped ring, ring can't be aborted */
xhci_dbg(xhci, "Command timeout on stopped ring\n");
xhci_handle_stopped_cmd_ring(xhci, xhci->current_cmd);
next prev parent reply other threads:[~2016-07-07 1:19 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-07-07 1:18 [PATCH 4.6 00/31] 4.6.4-stable review Greg Kroah-Hartman
2016-07-07 1:18 ` [PATCH 4.6 01/31] net_sched: fix pfifo_head_drop behavior vs backlog Greg Kroah-Hartman
2016-07-07 1:18 ` [PATCH 4.6 02/31] act_ipt: fix a bind refcnt leak Greg Kroah-Hartman
2016-07-07 1:18 ` [PATCH 4.6 03/31] net: Dont forget pr_fmt on net_dbg_ratelimited for CONFIG_DYNAMIC_DEBUG Greg Kroah-Hartman
2016-07-07 1:18 ` [PATCH 4.6 04/31] sit: correct IP protocol used in ipip6_err Greg Kroah-Hartman
2016-07-07 1:18 ` [PATCH 4.6 05/31] kcm: fix /proc memory leak Greg Kroah-Hartman
2016-07-07 1:18 ` [PATCH 4.6 06/31] esp: Fix ESN generation under UDP encapsulation Greg Kroah-Hartman
2016-07-07 1:18 ` [PATCH 4.6 07/31] netem: fix a use after free Greg Kroah-Hartman
2016-07-07 1:18 ` [PATCH 4.6 08/31] ipmr/ip6mr: Initialize the last assert time of mfc entries Greg Kroah-Hartman
2016-07-07 1:19 ` [PATCH 4.6 10/31] sock_diag: do not broadcast raw socket destruction Greg Kroah-Hartman
2016-07-07 1:19 ` [PATCH 4.6 11/31] bpf, perf: delay release of BPF prog after grace period Greg Kroah-Hartman
2016-07-07 1:19 ` [PATCH 4.6 12/31] neigh: Explicitly declare RCU-bh read side critical section in neigh_xmit() Greg Kroah-Hartman
2016-07-07 1:19 ` [PATCH 4.6 13/31] AX.25: Close socket connection on session completion Greg Kroah-Hartman
2016-07-07 1:19 ` [PATCH 4.6 14/31] crypto: vmx - Increase priority of aes-cbc cipher Greg Kroah-Hartman
2016-07-07 1:19 ` [PATCH 4.6 15/31] crypto: ux500 - memmove the right size Greg Kroah-Hartman
2016-07-07 1:19 ` [PATCH 4.6 16/31] crypto: user - re-add size check for CRYPTO_MSG_GETALG Greg Kroah-Hartman
2016-07-07 1:19 ` [PATCH 4.6 17/31] USB: uas: Fix slave queue_depth not being set Greg Kroah-Hartman
2016-07-07 1:19 ` [PATCH 4.6 18/31] usb: quirks: Fix sorting Greg Kroah-Hartman
2016-07-07 1:19 ` [PATCH 4.6 19/31] usb: quirks: Add no-lpm quirk for Acer C120 LED Projector Greg Kroah-Hartman
2016-07-07 1:19 ` [PATCH 4.6 20/31] usb: musb: only restore devctl when session was set in backup Greg Kroah-Hartman
2016-07-07 1:19 ` [PATCH 4.6 21/31] usb: musb: Stop bulk endpoint while queue is rotated Greg Kroah-Hartman
2016-07-07 1:19 ` [PATCH 4.6 22/31] usb: musb: Ensure rx reinit occurs for shared_fifo endpoints Greg Kroah-Hartman
2016-07-07 1:19 ` [PATCH 4.6 23/31] usb: musb: host: correct cppi dma channel for isoch transfer Greg Kroah-Hartman
2016-07-07 1:19 ` [PATCH 4.6 24/31] xhci: Cleanup only when releasing primary hcd Greg Kroah-Hartman
2016-07-07 1:19 ` [PATCH 4.6 25/31] usb: xhci-plat: properly handle probe deferral for devm_clk_get() Greg Kroah-Hartman
2016-07-07 1:19 ` [PATCH 4.6 26/31] USB: xhci: Add broken streams quirk for Frescologic device id 1009 Greg Kroah-Hartman
2016-07-07 1:19 ` Greg Kroah-Hartman [this message]
2016-07-07 1:19 ` [PATCH 4.6 28/31] USB: mos7720: delete parport Greg Kroah-Hartman
2016-07-07 1:19 ` [PATCH 4.6 29/31] usb: gadget: fix spinlock dead lock in gadgetfs Greg Kroah-Hartman
2016-07-07 1:19 ` [PATCH 4.6 30/31] usb: host: ehci-tegra: Grab the correct UTMI pads reset Greg Kroah-Hartman
2016-07-07 1:19 ` [PATCH 4.6 31/31] usb: dwc3: exynos: Fix deferred probing storm Greg Kroah-Hartman
2016-07-07 12:39 ` [PATCH 4.6 00/31] 4.6.4-stable review Heinz Diehl
2016-07-07 19:13 ` Greg Kroah-Hartman
2016-07-07 13:31 ` Guenter Roeck
2016-07-08 3:45 ` Shuah Khan
2016-07-09 5:13 ` Greg Kroah-Hartman
[not found] ` <577ffc00.48371c0a.49d2e.ffff82fc@mx.google.com>
[not found] ` <7hwpks6iwc.fsf@baylibre.com>
2016-07-11 20:29 ` Guenter Roeck
2016-07-12 4:16 ` Kevin Hilman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160707011558.615027047@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=joe.lawrence@stratus.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mathias.nyman@linux.intel.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).