linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>,
	Roger Quadros <rogerq@ti.com>,
	joel@jms.id.au, Mathias Nyman <mathias.nyman@linux.intel.com>
Subject: [PATCH 4.6 24/31] xhci: Cleanup only when releasing primary hcd
Date: Wed,  6 Jul 2016 18:19:14 -0700	[thread overview]
Message-ID: <20160707011558.486706027@linuxfoundation.org> (raw)
In-Reply-To: <20160707011557.518104444@linuxfoundation.org>

4.6-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>

commit 27a41a83ec54d0edfcaf079310244e7f013a7701 upstream.

Under stress occasions some TI devices might not return early when
reading the status register during the quirk invocation of xhci_irq made
by usb_hcd_pci_remove.  This means that instead of returning, we end up
handling this interruption in the middle of a shutdown.  Since
xhci->event_ring has already been freed in xhci_mem_cleanup, we end up
accessing freed memory, causing the Oops below.

commit 8c24d6d7b09d ("usb: xhci: stop everything on the first call to
xhci_stop") is the one that changed the instant in which we clean up the
event queue when stopping a device.  Before, we didn't call
xhci_mem_cleanup at the first time xhci_stop is executed (for the shared
HCD), instead, we only did it after the invocation for the primary HCD,
much later at the removal path.  The code flow for this oops looks like
this:

xhci_pci_remove()
	usb_remove_hcd(xhci->shared)
	        xhci_stop(xhci->shared)
 			xhci_halt()
			xhci_mem_cleanup(xhci);  // Free the event_queue
	usb_hcd_pci_remove(primary)
		xhci_irq()  // Access the event_queue if STS_EINT is set. Crash.
		xhci_stop()
			xhci_halt()
			// return early

The fix modifies xhci_stop to only cleanup the xhci data when releasing
the primary HCD.  This way, we still have the event_queue configured
when invoking xhci_irq.  We still halt the device on the first call to
xhci_stop, though.

I could reproduce this issue several times on the mainline kernel by
doing a bind-unbind stress test with a specific storage gadget attached.
I also ran the same test over-night with my patch applied and didn't
observe the issue anymore.

[  113.334124] Unable to handle kernel paging request for data at address 0x00000028
[  113.335514] Faulting instruction address: 0xd00000000d4f767c
[  113.336839] Oops: Kernel access of bad area, sig: 11 [#1]
[  113.338214] SMP NR_CPUS=1024 NUMA PowerNV

[c000000efe47ba90] c000000000720850 usb_hcd_irq+0x50/0x80
[c000000efe47bac0] c00000000073d328 usb_hcd_pci_remove+0x68/0x1f0
[c000000efe47bb00] d00000000daf0128 xhci_pci_remove+0x78/0xb0
[xhci_pci]
[c000000efe47bb30] c00000000055cf70 pci_device_remove+0x70/0x110
[c000000efe47bb70] c00000000061c6bc __device_release_driver+0xbc/0x190
[c000000efe47bba0] c00000000061c7d0 device_release_driver+0x40/0x70
[c000000efe47bbd0] c000000000619510 unbind_store+0x120/0x150
[c000000efe47bc20] c0000000006183c4 drv_attr_store+0x64/0xa0
[c000000efe47bc60] c00000000039f1d0 sysfs_kf_write+0x80/0xb0
[c000000efe47bca0] c00000000039e14c kernfs_fop_write+0x18c/0x1f0
[c000000efe47bcf0] c0000000002e962c __vfs_write+0x6c/0x190
[c000000efe47bd90] c0000000002eab40 vfs_write+0xc0/0x200
[c000000efe47bde0] c0000000002ec85c SyS_write+0x6c/0x110
[c000000efe47be30] c000000000009260 system_call+0x38/0x108

Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Cc: Roger Quadros <rogerq@ti.com>
Cc: joel@jms.id.au
Reviewed-by: Roger Quadros <rogerq@ti.com>
Tested-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/usb/host/xhci-ring.c |    3 ++-
 drivers/usb/host/xhci.c      |   29 ++++++++++++++++-------------
 2 files changed, 18 insertions(+), 14 deletions(-)

--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2728,7 +2728,8 @@ hw_died:
 		writel(irq_pending, &xhci->ir_set->irq_pending);
 	}
 
-	if (xhci->xhc_state & XHCI_STATE_DYING) {
+	if (xhci->xhc_state & XHCI_STATE_DYING ||
+	    xhci->xhc_state & XHCI_STATE_HALTED) {
 		xhci_dbg(xhci, "xHCI dying, ignoring interrupt. "
 				"Shouldn't IRQs be disabled?\n");
 		/* Clear the event handler busy flag (RW1C);
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -685,20 +685,23 @@ void xhci_stop(struct usb_hcd *hcd)
 	u32 temp;
 	struct xhci_hcd *xhci = hcd_to_xhci(hcd);
 
-	if (xhci->xhc_state & XHCI_STATE_HALTED)
-		return;
-
 	mutex_lock(&xhci->mutex);
-	spin_lock_irq(&xhci->lock);
-	xhci->xhc_state |= XHCI_STATE_HALTED;
-	xhci->cmd_ring_state = CMD_RING_STATE_STOPPED;
-
-	/* Make sure the xHC is halted for a USB3 roothub
-	 * (xhci_stop() could be called as part of failed init).
-	 */
-	xhci_halt(xhci);
-	xhci_reset(xhci);
-	spin_unlock_irq(&xhci->lock);
+
+	if (!(xhci->xhc_state & XHCI_STATE_HALTED)) {
+		spin_lock_irq(&xhci->lock);
+
+		xhci->xhc_state |= XHCI_STATE_HALTED;
+		xhci->cmd_ring_state = CMD_RING_STATE_STOPPED;
+		xhci_halt(xhci);
+		xhci_reset(xhci);
+
+		spin_unlock_irq(&xhci->lock);
+	}
+
+	if (!usb_hcd_is_primary_hcd(hcd)) {
+		mutex_unlock(&xhci->mutex);
+		return;
+	}
 
 	xhci_cleanup_msix(xhci);
 

  parent reply	other threads:[~2016-07-07  1:19 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-07  1:18 [PATCH 4.6 00/31] 4.6.4-stable review Greg Kroah-Hartman
2016-07-07  1:18 ` [PATCH 4.6 01/31] net_sched: fix pfifo_head_drop behavior vs backlog Greg Kroah-Hartman
2016-07-07  1:18 ` [PATCH 4.6 02/31] act_ipt: fix a bind refcnt leak Greg Kroah-Hartman
2016-07-07  1:18 ` [PATCH 4.6 03/31] net: Dont forget pr_fmt on net_dbg_ratelimited for CONFIG_DYNAMIC_DEBUG Greg Kroah-Hartman
2016-07-07  1:18 ` [PATCH 4.6 04/31] sit: correct IP protocol used in ipip6_err Greg Kroah-Hartman
2016-07-07  1:18 ` [PATCH 4.6 05/31] kcm: fix /proc memory leak Greg Kroah-Hartman
2016-07-07  1:18 ` [PATCH 4.6 06/31] esp: Fix ESN generation under UDP encapsulation Greg Kroah-Hartman
2016-07-07  1:18 ` [PATCH 4.6 07/31] netem: fix a use after free Greg Kroah-Hartman
2016-07-07  1:18 ` [PATCH 4.6 08/31] ipmr/ip6mr: Initialize the last assert time of mfc entries Greg Kroah-Hartman
2016-07-07  1:19 ` [PATCH 4.6 10/31] sock_diag: do not broadcast raw socket destruction Greg Kroah-Hartman
2016-07-07  1:19 ` [PATCH 4.6 11/31] bpf, perf: delay release of BPF prog after grace period Greg Kroah-Hartman
2016-07-07  1:19 ` [PATCH 4.6 12/31] neigh: Explicitly declare RCU-bh read side critical section in neigh_xmit() Greg Kroah-Hartman
2016-07-07  1:19 ` [PATCH 4.6 13/31] AX.25: Close socket connection on session completion Greg Kroah-Hartman
2016-07-07  1:19 ` [PATCH 4.6 14/31] crypto: vmx - Increase priority of aes-cbc cipher Greg Kroah-Hartman
2016-07-07  1:19 ` [PATCH 4.6 15/31] crypto: ux500 - memmove the right size Greg Kroah-Hartman
2016-07-07  1:19 ` [PATCH 4.6 16/31] crypto: user - re-add size check for CRYPTO_MSG_GETALG Greg Kroah-Hartman
2016-07-07  1:19 ` [PATCH 4.6 17/31] USB: uas: Fix slave queue_depth not being set Greg Kroah-Hartman
2016-07-07  1:19 ` [PATCH 4.6 18/31] usb: quirks: Fix sorting Greg Kroah-Hartman
2016-07-07  1:19 ` [PATCH 4.6 19/31] usb: quirks: Add no-lpm quirk for Acer C120 LED Projector Greg Kroah-Hartman
2016-07-07  1:19 ` [PATCH 4.6 20/31] usb: musb: only restore devctl when session was set in backup Greg Kroah-Hartman
2016-07-07  1:19 ` [PATCH 4.6 21/31] usb: musb: Stop bulk endpoint while queue is rotated Greg Kroah-Hartman
2016-07-07  1:19 ` [PATCH 4.6 22/31] usb: musb: Ensure rx reinit occurs for shared_fifo endpoints Greg Kroah-Hartman
2016-07-07  1:19 ` [PATCH 4.6 23/31] usb: musb: host: correct cppi dma channel for isoch transfer Greg Kroah-Hartman
2016-07-07  1:19 ` Greg Kroah-Hartman [this message]
2016-07-07  1:19 ` [PATCH 4.6 25/31] usb: xhci-plat: properly handle probe deferral for devm_clk_get() Greg Kroah-Hartman
2016-07-07  1:19 ` [PATCH 4.6 26/31] USB: xhci: Add broken streams quirk for Frescologic device id 1009 Greg Kroah-Hartman
2016-07-07  1:19 ` [PATCH 4.6 27/31] xhci: Fix handling timeouted commands on hosts in weird states Greg Kroah-Hartman
2016-07-07  1:19 ` [PATCH 4.6 28/31] USB: mos7720: delete parport Greg Kroah-Hartman
2016-07-07  1:19 ` [PATCH 4.6 29/31] usb: gadget: fix spinlock dead lock in gadgetfs Greg Kroah-Hartman
2016-07-07  1:19 ` [PATCH 4.6 30/31] usb: host: ehci-tegra: Grab the correct UTMI pads reset Greg Kroah-Hartman
2016-07-07  1:19 ` [PATCH 4.6 31/31] usb: dwc3: exynos: Fix deferred probing storm Greg Kroah-Hartman
2016-07-07 12:39 ` [PATCH 4.6 00/31] 4.6.4-stable review Heinz Diehl
2016-07-07 19:13   ` Greg Kroah-Hartman
2016-07-07 13:31 ` Guenter Roeck
2016-07-08  3:45 ` Shuah Khan
2016-07-09  5:13   ` Greg Kroah-Hartman
     [not found] ` <577ffc00.48371c0a.49d2e.ffff82fc@mx.google.com>
     [not found]   ` <7hwpks6iwc.fsf@baylibre.com>
2016-07-11 20:29     ` Guenter Roeck
2016-07-12  4:16       ` Kevin Hilman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160707011558.486706027@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=joel@jms.id.au \
    --cc=krisman@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathias.nyman@linux.intel.com \
    --cc=rogerq@ti.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).