public inbox for linux-scsi@vger.kernel.org
From: Sarah Sharp <sarah.a.sharp@linux.intel.com>
To: walt <w41ter@gmail.com>
Cc: Alan Stern <stern@rowland.harvard.edu>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	stable@vger.kernel.org, David Laight <david.laight@aculab.com>,
	linux-usb@vger.kernel.org, linux-scsi@vger.kernel.org
Subject: Re: [PATCH 3.12 033/118] usb: xhci: Link TRB must not occur within a USB payload burst
Date: Fri, 3 Jan 2014 15:29:29 -0800
Message-ID: <20140103232929.GD4193@xanatos>
In-Reply-To: <52C729CE.9050307@gmail.com>

On Fri, Jan 03, 2014 at 01:21:18PM -0800, walt wrote:
> I'm so sorry Sarah, that was another mistake.  The mistake is so stupid I'm not
> going to publish it here :(
> 
> Once I finally ran the kernel with debugging actually compiled in, dmesg contains
> xhci debugging messages.  Wow :)
> 
> It's a big file so I zipped and attached it, which I hope is acceptable in lkml.

Yep, that's fine.  Sticking it in pastebin (or up on your server) is
also fine, if it gets really big.

> BTW, this dmesg is from a kernel with sg_tablesize = 31, which as I said before
> doesn't fix the problem.  The cp stopped around 7GB just as before.
> 
> Sorry for the noise...

No worries! :)  With the dmesg, I can finally see what happened:

[  188.703059] xhci_hcd 0000:03:00.0: Cancel URB ffff8800b7d2e0c0, dev 1, ep 0x2, starting at offset 0xbb7b9000
[  188.703072] xhci_hcd 0000:03:00.0: // Ding dong!
[  193.711022] xhci_hcd 0000:03:00.0: xHCI host not responding to stop endpoint command.
[  193.711029] xhci_hcd 0000:03:00.0: Assuming host is dying, halting host.
[  193.711046] xhci_hcd 0000:03:00.0: // Halt the HC
[  193.711060] xhci_hcd 0000:03:00.0: Killing URBs for slot ID 1, ep index 0
[  193.711066] xhci_hcd 0000:03:00.0: Killing URBs for slot ID 1, ep index 2
[  193.711078] xhci_hcd 0000:03:00.0: Killing URBs for slot ID 1, ep index 3
[  193.711096] xhci_hcd 0000:03:00.0: Calling usb_hc_died()
[  193.711103] xhci_hcd 0000:03:00.0: HC died; cleaning up
[  193.711116] xhci_hcd 0000:03:00.0: xHCI host controller is dead.

It seems that the xHCI driver tried to stop the endpoint ring in order
to cancel a SCSI transfer, and the host never responded to the stop
endpoint command.
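
To make the timing in that excerpt explicit, here is a minimal sketch (not
part of the driver) that pulls the dmesg timestamps out of three of the
lines above and measures how long the stop endpoint command went
unanswered.  The helper name and the regular expression are illustrative
assumptions; only the log lines themselves come from the dmesg.

```python
import re

# Three lines copied verbatim from the dmesg excerpt above.
LOG = """\
[  188.703059] xhci_hcd 0000:03:00.0: Cancel URB ffff8800b7d2e0c0, dev 1, ep 0x2, starting at offset 0xbb7b9000
[  188.703072] xhci_hcd 0000:03:00.0: // Ding dong!
[  193.711022] xhci_hcd 0000:03:00.0: xHCI host not responding to stop endpoint command.
"""

# Matches "[  188.703059] message" and captures the timestamp and the message.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def seconds_between(log, start_marker, end_marker):
    """Return seconds from the first line containing start_marker to the
    first later line containing end_marker (hypothetical helper)."""
    start = None
    for line in log.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        timestamp, message = float(match.group(1)), match.group(2)
        if start is None:
            if start_marker in message:
                start = timestamp
        elif end_marker in message:
            return timestamp - start
    raise ValueError("markers not found in order")


gap = seconds_between(LOG, "Cancel URB",
                      "not responding to stop endpoint command")
print(f"stop endpoint command unanswered for {gap:.3f} s")
```

The gap comes out at roughly five seconds, which is what you would expect
if the driver gave up on a five-second timer.  That timer value is an
assumption here; the log does not state it.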

The offset is rather suspicious (0xbb7b9000), and it probably means the
driver attempted to cancel a transfer that had been moved to the
beginning of the ring segment, with no-op TRBs before the link TRB.
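
A quick way to see why that offset stands out: its low bits are all zero,
so it lands on the first TRB slot of a segment rather than partway through
one.  The sketch below checks this, assuming 16-byte TRBs, 64 TRBs per
segment, and segments aligned to their own size.  The per-segment count
and the alignment are assumptions for illustration and are not taken from
the dmesg.

```python
TRB_SIZE = 16            # bytes per TRB
TRBS_PER_SEGMENT = 64    # assumption: ring segment length in TRBs
SEGMENT_SIZE = TRB_SIZE * TRBS_PER_SEGMENT  # 1024 bytes under that assumption


def trb_index_in_segment(dma_addr, segment_size=SEGMENT_SIZE,
                         trb_size=TRB_SIZE):
    """Return which TRB slot an address falls on within its segment,
    assuming segments are aligned to segment_size (hypothetical helper)."""
    return (dma_addr % segment_size) // trb_size


offset = 0xbb7b9000  # from the "Cancel URB" line in the dmesg
print(trb_index_in_segment(offset))       # slot 0: first TRB of a segment
print(trb_index_in_segment(offset + 16))  # slot 1: the next TRB
```

An index of 0 is consistent with the transfer having been restarted at the
top of a new segment; it does not by itself prove that is what happened.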

I suspect David's patch triggers a bug in the command cancellation code.
There's also the unlikely possibility that the no-op TRBs did indeed
cause the host to hang.  Either way, I'll have to look into it.

I'll let you know when I have some diagnostic patches ready.

Sarah Sharp

