All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mathias Nyman <mathias.nyman@linux.intel.com>
To: Oliver Neukum <oneukum@suse.com>,
	Jonas Karlsson <jonas.karlsson@actia.se>,
	Greg KH <gregkh@linuxfoundation.org>
Cc: "linux-usb@vger.kernel.org" <linux-usb@vger.kernel.org>,
	Alan Stern <stern@rowland.harvard.edu>
Subject: Re: USB transaction errors causing RCU stalls and kernel panics
Date: Wed, 4 Mar 2020 18:21:38 +0200	[thread overview]
Message-ID: <4fa64e92-64ce-07f3-ed8e-ea4e07d091bb@linux.intel.com> (raw)
In-Reply-To: <1583331173.12738.26.camel@suse.com>

On 4.3.2020 16.12, Oliver Neukum wrote:
> Am Mittwoch, den 04.03.2020, 14:11 +0200 schrieb Mathias Nyman:
>> On 3.3.2020 22.08, Jonas Karlsson wrote:
> 
>>
>> I recently got a report about a bit similar issue on a 4.4 stable kernel, so this
>> might not be xhci-cdns specific.
>>
>> That case involved autosuspend of the cdc-acm, and there was only a short burst of
>> transaction erros and resubmitted URBs even if the device was supposed to be suspended.
>> It looks like cdc_acm autosuspended even if it had URBs pending.
> 
> That must not happen. Do you have details?

Shared what I got in:
https://drive.google.com/open?id=1PjwmIK97bIfugWL697lJCe65yuDVOZhx

cdc-acm is unfortunately not traced, but usb core and xhci is.
grep for "MATTU" to find the relevant places. 

> 
>> I'm guessing that in that case the transfer ring restarted even if link was already "suspeded",
>> causing transaction errors. Ring could be restarted if URBs were resubmitted
>> by the class driver when usb core suspends all interfaces, flushing all pending URBs which
>> calls the URB completion handler.
> 
> How should a class driver do that? It gets -EPROTO and that's it,
> 
> 	Regards
> 		Oliver
> 

This is just my speculation, I haven't checked details yet.

It's not just the class driver that causes this, but a combination of
the following gaps in xhci, cdc-acm, and usb core.

A class driver autosuspending with URBs pending,
and not killing all URBs synchronously when usb core calls suspend for the interface drivers.

xhci restarting an endpoint ring due to a race between xhci_stop_device(), handling 
a Transaction error, and having pending URB (unhandled trb) on the ring. [1]

Sleeping between setting port to U3 link state, and clearing udev->can_submit.
allowing URBs to be submitted during that gap when link is in U3 already.
URBs will complete with -EPROTO, and resubmitted until udev->can_submit is cleared.

usb_runtime_suspend()      
  usb_suspend_both() // suspend a device and its interfaces
    for (each udev->actconfig->interface)
      usb_suspend_interface(udev, intf, msg);
	driver = to_usb_driver(intf->dev.driver);
	status = driver->suspend(intf, msg);  // URBs shuold be killed, sync, miss one??
    usb_suspend_device()
      generic_suspend()
        usb_port_suspend()
          hub_set_port_link_state(USB_SS_PORT_LS_U3)
            xhci_stop_device()
            xhci_set_link_state(USB_SS_PORT_LS_U3)  // port link is in U3
            msleep()       // during this URBs can be resubmitteded and complete with -EPROTO loop       
    udev->can_submit = 0;  // we can submit URBs until here (except the killed ones, they are flagged).
    for (each endpoint)
      usb_hcd_flush_endpoint(udev, udev->ep_out[i]);
      usb_hcd_flush_endpoint(udev, udev->ep_in[i]);

[1]
xhci_stop_device()
  xhci_queue_stop_endpoint()
-> interrupt (transfer event, ring has not stopped yet)
  handle_tx_event()  // bulk transfer with Transaction error
    process_bulk_intr_td()
      finish_td()
	xhci_cleanup_halted_endpoint()
          xhci_queue_reset_ep()
          xhci_queue_new_dequeue_state()
-> interupt, command completion event for stop endpoint, 
-> interrupt handle reset_ep command, xhci_handle_cmd_stop_ep()
-> interrupt handle new deq state
      xhci_handle_cmd_set_deq()
	ring_doorbell_for_active_rings(xhci, slot_id, ep_index)  - this restarts the ring.

-Mathias

  reply	other threads:[~2020-03-04 16:19 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-03 15:05 USB transaction errors causing RCU stalls and kernel panics Jonas Karlsson
2020-03-03 16:39 ` Greg KH
2020-03-03 20:08   ` Jonas Karlsson
2020-03-04  6:37     ` Greg KH
2020-03-04 10:29     ` Oliver Neukum
2020-03-04 12:11     ` Mathias Nyman
2020-03-04 14:12       ` Oliver Neukum
2020-03-04 16:21         ` Mathias Nyman [this message]
2020-03-06  1:31           ` Peter Chen
2020-03-09 14:21             ` Jonas Karlsson
2020-03-10  8:14               ` Peter Chen
2020-03-10 10:04                 ` Jonas Karlsson
2020-03-10 11:04                   ` Oliver Neukum
2020-03-10 11:21                     ` Oliver Neukum
2020-03-10 12:26                       ` Jonas Karlsson
2020-03-10 16:04                         ` Jonas Karlsson
2020-03-10 16:11                           ` Fabio Estevam
2020-03-11  6:25                             ` Jonas Karlsson
2020-03-11 10:28                               ` Oliver Neukum
2020-03-11 14:59                                 ` Jonas Karlsson
2020-03-12 13:45                                   ` Oliver Neukum
2020-03-12 15:37                                     ` Jonas Karlsson
2020-03-13  9:27                                       ` Oliver Neukum
2020-03-16  7:07                                     ` Jonas Karlsson
2020-03-23 11:37                                       ` Jonas Karlsson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4fa64e92-64ce-07f3-ed8e-ea4e07d091bb@linux.intel.com \
    --to=mathias.nyman@linux.intel.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=jonas.karlsson@actia.se \
    --cc=linux-usb@vger.kernel.org \
    --cc=oneukum@suse.com \
    --cc=stern@rowland.harvard.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.