Since Linux 4.13 tlp or powertop usage cause "xHCI host controller not responding, assume dead" on Dell 5855

From: Mathias Nyman <mathias.nyman@linux.intel.com>
To: russianneuromancer@ya.ru, linux-usb@vger.kernel.org
Subject: Since Linux 4.13 tlp or powertop usage cause "xHCI host controller not responding, assume dead" on Dell 5855
Date: Mon, 16 Apr 2018 14:55:48 +0300
Message-ID: <16a67206-6dce-01f1-1074-ee5d3b7e2602@linux.intel.com>

On 10.04.2018 12:15, russianneuromancer@ya.ru wrote:
> Hello!
> 
> On a Dell Venue 8 Pro 5855 tablet, installing tlp or running
> "powertop --auto-tune" causes the "xHCI host controller not responding,
> assume dead" error. When the error happens, two integrated USB devices
> (Bluetooth adapter and LTE modem) disappear until reboot. This issue was
> first observed in Linux 4.13 and is still present in Linux 4.16.
> Blacklisting both "Linux Foundation 3.0 root hub" entries from
> autosuspend in the tlp configuration works around the issue. However,
> on other devices tlp works fine without blacklisting USB hub
> autosuspend, and this tablet had no such issue before (at least in the
> Linux ~4.8-4.12 range), so I assume there is a regression somewhere.
> 
> Are there any related commits between 4.12 and 4.13 that I could try to
> revert?
> 

In 4.12, sensitivity was added for reacting to hotplug-removed xHC
controllers, i.e. if we read 0xffffffff from an xhci register,
we assume the host is removed and start cleaning up.

commit d9f11ba9f107aa335091ab8d7ba5eea714e46e8b
     xhci: Rework how we handle unresponsive or hoptlug removed hosts

You can try to revert that, but as a final solution we should
find the real root cause.
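
To make the all-ones check described above concrete, here is a minimal
userspace sketch. It is not the driver code: the names model_hc_state,
model_read_looks_removed() and model_check_status() are made up for
illustration, and the only thing taken from the description is that a
register value of 0xffffffff is treated as "host removed".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical states for this model; not kernel definitions. */
enum model_hc_state { MODEL_HC_OK, MODEL_HC_REMOVED };

/*
 * A read from a device that no longer answers on the bus typically
 * comes back as all ones, so 0xffffffff is treated as "removed".
 */
bool model_read_looks_removed(uint32_t reg_value)
{
	return reg_value == UINT32_MAX;
}

/*
 * Classify a status register value: an all-ones value means the caller
 * should assume the host is gone and start cleaning up.
 */
enum model_hc_state model_check_status(uint32_t status)
{
	if (model_read_looks_removed(status))
		return MODEL_HC_REMOVED;
	return MODEL_HC_OK;
}
```

The weakness of such a check is that anything that makes the registers
temporarily unreadable can look the same as a removed host, which is
why reverting the commit is only a diagnostic step.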

> How issue looks like in logs:
> 
> [  227.258385] xhci_hcd 0000:00:14.0: xHC is not running.
> [  329.671544] xhci_hcd 0000:00:14.0: xHC is not running.
> [  416.695796] xhci_hcd 0000:00:14.0: xHC is not running.

The "xHC is not running" message comes from the xhci driver handling a port
event interrupt for a resuming port while the whole host controller is not
running. We stop the host controller in xhci_suspend() and start it in
xhci_resume().

I'm attaching a patch that does a better job of preventing xhci host
suspend during USB2 resume signaling.
It could help and is worth a shot.
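
The idea behind the patch can be sketched as a small userspace model.
This is an illustration only, assuming that
usb_hcd_start_port_resume() and usb_hcd_end_port_resume() keep a
per-port "resuming" bit and hold a runtime PM reference on the roothub
while that bit is set. The struct and the model_* functions below are
hypothetical names, not kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the roothub state relevant to autosuspend. */
struct roothub_model {
	unsigned long resuming_ports;	/* bitmask of ports in resume signaling */
	int pm_usage_count;		/* autosuspend is blocked while > 0 */
};

/* Mark a port as resuming and take one PM reference for it (once). */
void model_start_port_resume(struct roothub_model *rh, unsigned int port)
{
	unsigned long bit;

	assert(port < 32);
	bit = 1UL << port;
	if (!(rh->resuming_ports & bit)) {
		rh->resuming_ports |= bit;
		rh->pm_usage_count++;
	}
}

/* Clear the resuming mark and drop the PM reference taken for it. */
void model_end_port_resume(struct roothub_model *rh, unsigned int port)
{
	unsigned long bit;

	assert(port < 32);
	bit = 1UL << port;
	if (rh->resuming_ports & bit) {
		rh->resuming_ports &= ~bit;
		rh->pm_usage_count--;
	}
}

/* The roothub may only autosuspend when nothing holds a reference. */
bool model_may_autosuspend(const struct roothub_model *rh)
{
	return rh->pm_usage_count == 0;
}
```

In this model, calling start twice for the same port takes only one
reference, and an end call without a matching start does nothing, so
the count cannot go out of balance if an end call is reached on more
than one path.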

> [  416.695862] xhci_hcd 0000:00:14.0: xHCI host controller not responding, assume dead

This means xhci_hc_died() was called; there are many places it can be called from.
Adding the code below could give a hint:


> [  416.695900] xhci_hcd 0000:00:14.0: HC died; cleaning up
> [  416.696052] usb 1-3: USB disconnect, device number 2
> [  416.815610] cdc_mbim 1-3:1.12 wwp0s20u3i12: unregister 'cdc_mbim' usb-0000:00:14.0-3, CDC MBIM
> [  416.847934] usb 1-4: USB disconnect, device number 3
> 
> After that, the Bluetooth adapter and LTE modem disappear from lsusb
> output, while the xHCI controller itself remains visible.

We stop the host activity in xhci_hc_died(), so no USB devices under this host will work.

> Complete dmesg: https://paste.fedoraproject.org/paste/7aMpVGLfZ82zppdGs56Oqg
> lsusb -v: https://paste.fedoraproject.org/paste/c7y8GisC13YdzcYE9B-JIw
> dsdt.dsl: https://paste.fedoraproject.org/paste/8g6mp2dafypUkFT4sa43iA

xhci traces and dynamic debug could help:

mount -t debugfs none /sys/kernel/debug
echo 81920 > /sys/kernel/debug/tracing/buffer_size_kb
echo 1 > /sys/kernel/debug/tracing/events/xhci-hcd/enable

echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control

-Mathias

From 090b13a6df3f489a9781223dd959e03c2f81347b Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman@linux.intel.com>
Date: Thu, 1 Mar 2018 18:48:32 +0200
Subject: [PATCH] xhci: prevent USB 2 roothub autosuspend during port resume
 signaling

xhci USB 2 roothub tries to autosuspend itself again immediately after
being resumed by a remote wake. This can be avoided by calling the
usb_hcd_start_port_resume() and usb_hcd_end_port_resume() implemented
especially for this purpose.

Use them, and prevent roothub autosuspend during resume signaling.

Suggested-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
---
 drivers/usb/host/xhci-hub.c  | 3 +++
 drivers/usb/host/xhci-ring.c | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
index 72ebbc9..671a336 100644
--- a/drivers/usb/host/xhci-hub.c
+++ b/drivers/usb/host/xhci-hub.c
@@ -905,6 +905,7 @@ static u32 xhci_get_port_status(struct usb_hcd *hcd,
 
 				set_bit(wIndex, &bus_state->resuming_ports);
 				bus_state->resume_done[wIndex] = timeout;
+				usb_hcd_start_port_resume(&hcd->self, wIndex);
 				mod_timer(&hcd->rh_timer, timeout);
 			}
 		/* Has resume been signalled for USB_RESUME_TIME yet? */
@@ -930,6 +931,7 @@ static u32 xhci_get_port_status(struct usb_hcd *hcd,
 					msecs_to_jiffies(
 						XHCI_MAX_REXIT_TIMEOUT));
 			spin_lock_irqsave(&xhci->lock, flags);
+			usb_hcd_end_port_resume(&hcd->self, wIndex);
 
 			if (time_left) {
 				slot_id = xhci_find_slot_id_by_port(hcd,
@@ -970,6 +972,7 @@ static u32 xhci_get_port_status(struct usb_hcd *hcd,
 	    (raw_port_status & PORT_PLS_MASK) != XDEV_RESUME) {
 		bus_state->resume_done[wIndex] = 0;
 		clear_bit(wIndex, &bus_state->resuming_ports);
+		usb_hcd_end_port_resume(&hcd->self, wIndex);
 	}
 
 
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index daa94c3..a1cffe9 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1666,6 +1666,8 @@ static void handle_port_status(struct xhci_hcd *xhci,
 			bus_state->resume_done[faked_port_index] = jiffies +
 				msecs_to_jiffies(USB_RESUME_TIMEOUT);
 			set_bit(faked_port_index, &bus_state->resuming_ports);
+			usb_hcd_start_port_resume(&hcd->self, faked_port_index);
+
 			/* Do the rest in GetPortStatus after resume time delay.
 			 * Avoid polling roothub status before that so that a
 			 * usb device auto-resume latency around ~40ms.
-- 
2.7.4
