From: Greg KH <gregkh@linuxfoundation.org>
To: Jonas Karlsson <jonas.karlsson@actia.se>
Cc: "linux-usb@vger.kernel.org" <linux-usb@vger.kernel.org>
Subject: Re: USB transaction errors causing RCU stalls and kernel panics
Date: Wed, 4 Mar 2020 07:37:35 +0100
Message-ID: <20200304063735.GA1203111@kroah.com>
In-Reply-To: <ca6f029a57f24ee9aea39385a9ad55bd@actia.se>

On Tue, Mar 03, 2020 at 08:08:38PM +0000, Jonas Karlsson wrote:
> > 
> > On Tue, Mar 03, 2020 at 03:05:50PM +0000, Jonas Karlsson wrote:
> > > Hi,
> > >
> > > We have a board with an NXP i.MX8 SoC. We are running Linux 4.19.35 from
> > > NXP on the SoC.
> > >
> > > There is a modem connected to the SoC via USB through a USB hub.
> > > The modem presents itself as a cdc-acm device with 4 ttys.
> > >
> > > Sometimes we end up in a situation where all transfers over USB generate
> > > "USB transaction errors".
> > > It is likely that the modem is misbehaving. When this happens we get a lot of
> > > "xhci-cdns3: ERROR unknown event type 37"
> > > in the terminal, indicating that the xhci event ring is full. This often leads to RCU
> > > stalls and sometimes kernel panics.
> > >
> > > If I enable dynamic debug on xhci_hcd and cdc-acm I can see that all
> > > transfers have error code -71 (-EPROTO, which in xhci translates to
> > > "USB transaction error"). When this happens it seems like xhci resets
> > > the ep, sets TR Deq Ptr to unstall the ep and then a new transfer is
> > > started which also fails. This behavior generates a lot of events on
> > > the event ring which causes "ERROR unknown event type 37". This loop
> > > of failing transfers seems to continue until we either unbind the USB driver or
> > > get a kernel panic. The SoC becomes almost unresponsive since it spends most
> > > of the time executing USB interrupts.
> > >
> > > If I pull the reset pin of the USB hub and keep it in reset state at
> > > this point, the loop of failing transfers continues even though
> > > there is nothing on the USB bus any longer. The only way to get out of that
> > > loop is to either unbind the USB driver or power cycle the board.
> > >
> > > Is this the expected behavior when USB transaction errors happen for all
> > > transfers when using the cdc-acm class driver?
> > > Or could there be something wrong in the low level USB driver (Cadence
> > > in our case)? We need to figure out why we get all the transaction errors but
> > > we also need to make sure the kernel does not die on us when we have a
> > > misbehaving USB device.
> > > Does anyone have a suggestion on what we could do to improve the stability
> > > of the kernel in this situation?
> > 
> > I would blame the xhci-cdns driver as it is the one controlling all of this.
> > 
> > I don't see this driver in the 4.19 tree, so I think you are going to have to get
> > support from the company that provided you with that driver as you are already
> > paying for that support from them :)
> > 
> > good luck!
> > 
> > greg k-h
> 
> Thanks for the feedback! If the Cadence driver is the main suspect I totally agree with you.
> 
> The reason I posted on this mailing list was that I was afraid that the cdc-acm driver could
> be causing new transfers to be started when the previous one fails due to USB transaction errors,
> triggering this event storm.

Yes, it could, but the host controller should handle that just fine.

> The acm_ctrl_irq() function seems to submit a new urb directly if the previous one fails, but I cannot
> say that I understand that code very well yet. The acm_read_bulk_callback() function also seems
> to submit a new read urb on USB transaction errors. But if you think this could not cause this
> behavior I will ask our supplier to fix the cdns driver.

Please ask them to fix the driver and get it merged upstream :)

thanks,

greg k-h
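
For illustration only: the resubmit-on-error pattern discussed above
(a completion handler that immediately submits a new read after -EPROTO)
can be modelled in plain userspace C. This is a rough sketch, not the
cdc-acm or xhci code and not something proposed in this message. The
names read_state, should_resubmit(), count_resubmits() and the
MAX_CONSECUTIVE_ERRORS threshold are all hypothetical. It only shows how
a class driver could, in principle, bound the number of back-to-back
retries instead of resubmitting forever.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical threshold: give up after this many -EPROTO in a row. */
#define MAX_CONSECUTIVE_ERRORS 8

/* Hypothetical per-endpoint bookkeeping, not a real kernel structure. */
struct read_state {
    unsigned int consecutive_errors;
    bool stopped;
};

/*
 * Model of a read completion handler's decision.
 * Returns true if a new read should be submitted.
 */
static bool should_resubmit(struct read_state *st, int status)
{
    assert(st != NULL);

    if (st->stopped)
        return false;

    if (status == 0) {
        /* A successful transfer clears the error streak. */
        st->consecutive_errors = 0;
        return true;
    }

    if (status == -EPROTO) {
        /* Transaction error: retry, but only a bounded number of times. */
        st->consecutive_errors++;
        if (st->consecutive_errors >= MAX_CONSECUTIVE_ERRORS) {
            st->stopped = true;
            return false;
        }
        return true;
    }

    /* Any other error (e.g. device gone) stops resubmission in this model. */
    st->stopped = true;
    return false;
}

/*
 * Feed the same completion status repeatedly and count how many
 * resubmissions the policy allows before it stops (or max is reached).
 */
static unsigned int count_resubmits(int status, unsigned int max_completions)
{
    struct read_state st = { 0, false };
    unsigned int resubmits = 0;

    for (unsigned int i = 0; i < max_completions; i++) {
        if (!should_resubmit(&st, status))
            break;
        resubmits++;
    }
    return resubmits;
}
```

Calling count_resubmits(-EPROTO, 1000) models a device that fails every
transfer: the policy stops after a fixed number of retries rather than
looping until the driver is unbound. Whether such a limit belongs in the
class driver or in the host controller driver is exactly the question
raised in this thread.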
