linux-usb.vger.kernel.org archive mirror
From: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
To: Jeff Vanhoof <jdv1029@gmail.com>
Cc: Thinh Nguyen <Thinh.Nguyen@synopsys.com>,
	Jeffrey Vanhoof <jvanhoof@motorola.com>,
	"balbi@kernel.org" <balbi@kernel.org>,
	"corbet@lwn.net" <corbet@lwn.net>,
	"dan.scally@ideasonboard.com" <dan.scally@ideasonboard.com>,
	"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
	"laurent.pinchart@ideasonboard.com" <laurent.pinchart@ideasonboard.com>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-usb@vger.kernel.org" <linux-usb@vger.kernel.org>,
	"m.grzeschik@pengutronix.de" <m.grzeschik@pengutronix.de>,
	"paul.elder@ideasonboard.com" <paul.elder@ideasonboard.com>,
	"stable@vger.kernel.org" <stable@vger.kernel.org>,
	Dan Vacura <W36195@motorola.com>
Subject: Re: [PATCH v3 2/6] usb: dwc3: gadget: cancel requests instead of release after missed isoc
Date: Wed, 19 Oct 2022 02:02:53 +0000
Message-ID: <20221019020240.exujmo7uvae4xfdi@synopsys.com>
In-Reply-To: <20221019014108.GA5732@qjv001-XeonWs>

On Tue, Oct 18, 2022, Jeff Vanhoof wrote:
> Hi Thinh,
> 
> On Tue, Oct 18, 2022 at 10:35:30PM +0000, Thinh Nguyen wrote:
> > On Tue, Oct 18, 2022, Jeffrey Vanhoof wrote:
> > > Hi Thinh,
> > > 
> > > On Tue, Oct 18, 2022 at 06:45:40PM +0000, Thinh Nguyen wrote:
> > > > Hi Dan,
> > > > 
> > > > On Mon, Oct 17, 2022, Dan Vacura wrote:
> > > > > Hi Thinh,
> > > > > 
> > > > > On Mon, Oct 17, 2022 at 09:30:38PM +0000, Thinh Nguyen wrote:
> > > > > > On Mon, Oct 17, 2022, Dan Vacura wrote:
> > > > > > > From: Jeff Vanhoof <qjv001@motorola.com>
> > > > > > > 
> > > > > > > arm-smmu related crashes seen after a Missed ISOC interrupt when
> > > > > > > no_interrupt=1 is used. This can happen if the hardware is still using
> > > > > > > the data associated with a TRB after the usb_request's ->complete call
> > > > > > > has been made.  Instead of immediately releasing a request when a Missed
> > > > > > > ISOC interrupt has occurred, this change will add logic to cancel the
> > > > > > > request instead where it will eventually be released when the
> > > > > > > END_TRANSFER command has completed. This logic is similar to some of the
> > > > > > > cleanup done in dwc3_gadget_ep_dequeue.
> > > > > > 
> > > > > > This doesn't sound right. How did you determine that the hardware is
> > > > > > still using the data associated with the TRB? Did you check the TRB's
> > > > > > HWO bit?
> > > > > 
> > > > > The problem we're seeing was mentioned in the summary of this patch
> > > > > series, issue #1. Basically, with the following patch
> > > > > > https://patchwork.kernel.org/project/linux-usb/patch/20210628155311.16762-6-m.grzeschik@pengutronix.de/
> > > > > > integrated, an smmu panic occurs on our Android device with the 5.15
> > > > > > kernel:
> > > > > 
> > > > >     <3>[  718.314900][  T803] arm-smmu 15000000.apps-smmu: Unhandled arm-smmu context fault from a600000.dwc3!
> > > > > 
> > > > > The uvc gadget driver appears to be the first (and only) gadget that
> > > > > uses the no_interrupt=1 logic, so this seems to be a new condition for
> > > > > the dwc3 driver. In our configuration, we have up to 64 requests and the
> > > > > no_interrupt=1 for up to 15 requests. The list size of dep->started_list
> > > > > would get up to that amount when looping through to cleanup the
> > > > > completed requests. From testing and debugging the smmu panic occurs
> > > > > when a -EXDEV status shows up and right after
> > > > > dwc3_gadget_ep_cleanup_completed_request() was visited. The conclusion
> > > > > we had was the requests were getting returned to the gadget too early.
> > > > 
> > > > As I mentioned, if the status is updated to missed isoc, that means that
> > > > the controller returned ownership of the TRB to the driver. At least for
> > > > the particular request with -EXDEV, its TRBs are completed. I'm not
> > > > clear on your conclusion.
> > > > 
> > > > Do we know where the crash occurred? Is it in the dwc3 driver or the uvc
> > > > driver, and at what line? It'd be great if we can see the driver log.
> > > >
> > > 
> > > To interject, what should happen in dwc3_gadget_ep_reclaim_completed_trb if the
> > > IOC bit is not set (but the IMI bit is) and -EXDEV status is passed into it?
> > 
> > Hm... we may have overlooked this case for no_interrupt scenario. If IMI
> > is set, then there will be an interrupt when there's missed isoc
> > regardless of whether no_interrupt is set by the gadget driver.
> > 
> > > If the function returns 0, another attempt to reclaim may occur. If this
> > > happens and the next request did have the HWO bit set, the function would
> > > return 1 but dwc3_gadget_ep_cleanup_completed_request would still call
> > > dwc3_gadget_giveback.
> > > 
> > > As a test (without this patch), I added a check to see if HWO bit was set in
> > > dwc3_gadget_ep_cleanup_completed_requests(). If the usecase was ISOC and the
> > > HWO bit was set I avoided calling dwc3_gadget_ep_cleanup_completed_request().
> > > This seemed to also avoid the iommu related crash being seen.
> > > 
> > > Is there an issue in this area that needs to be corrected instead? Not having
> > > interrupts set for each request may be causing some new issues to be uncovered.
> > > 
> > > As far as the crash seen without this patch, no good stacktrace is given. Line
> > > provided for crash varied a bit, but tended to appear towards the end of
> > > dwc3_stop_active_transfer() or dwc3_gadget_endpoint_trbs_complete().
> > > 
> > > Since dwc3_gadget_endpoint_trbs_complete() can be called from multiple
> > > locations, I duplicated the function to help identify which path it was likely
> > > being called from. At the time of the crashes seen,
> > > dwc3_gadget_endpoint_transfer_in_progress() appeared to be the caller.
> > > 
> > > dwc3_gadget_endpoint_transfer_in_progress()
> > > ->dwc3_gadget_endpoint_trbs_complete() (crashed towards end of here)
> > > ->dwc3_stop_active_transfer() (sometimes crashed towards end of here)
> > > 
> > > I hope this clarifies things a bit.
> > >  
> > 
> > Can we try this? Let me know if it resolves your issue.
> > 
> > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > index 61fba2b7389b..8352f4b5dd9f 100644
> > --- a/drivers/usb/dwc3/gadget.c
> > +++ b/drivers/usb/dwc3/gadget.c
> > @@ -3657,6 +3657,10 @@ static int dwc3_gadget_ep_reclaim_completed_trb(struct dwc3_ep *dep,
> >  	if (event->status & DEPEVT_STATUS_SHORT && !chain)
> >  		return 1;
> >  
> > +	if (usb_endpoint_xfer_isoc(dep->endpoint.desc) &&
> > +	    (event->status & DEPEVT_STATUS_MISSED_ISOC) && !chain)
> > +		return 1;
> > +
> >  	if ((trb->ctrl & DWC3_TRB_CTRL_IOC) ||
> >  	    (trb->ctrl & DWC3_TRB_CTRL_LST))
> >  		return 1;
> >
> 
> With this change it doesn't seem to crash but unfortunately the output
> completely hangs after the first missed isoc. At the moment I do not understand
> why this might happen. 
> 

Can you capture the driver tracepoints with the change above?

> 
> Note that I haven't quite learned how to reply correctly to the mailing
> list.  I apologize for messing up the thread a bit.
> 

Seems fine to me. As long as I can read and understand, I've no issue. :)

Thanks,
Thinh
