From: Felipe Balbi <balbi@ti.com>
To: Paul Zimmerman <Paul.Zimmerman@synopsys.com>
Cc: "balbi@ti.com" <balbi@ti.com>,
	Alan Stern <stern@rowland.harvard.edu>,
	Robert Baldyga <r.baldyga@samsung.com>,
	"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
	"linux-usb@vger.kernel.org" <linux-usb@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"dinguyen@opensource.altera.com" <dinguyen@opensource.altera.com>,
	"yousaf.kaukab@intel.com" <yousaf.kaukab@intel.com>,
	"m.szyprowski@samsung.com" <m.szyprowski@samsung.com>
Subject: Re: [PATCH v2] usb: dwc2: call dwc2_is_controller_alive() under spinlock
Date: Thu, 15 Jan 2015 00:24:36 -0600
Message-ID: <20150115062436.GA6615@saruman>
In-Reply-To: <A2CA0424C0A6F04399FB9E1CD98E030484507789@US01WEMBX2.internal.synopsys.com>

Hi,

On Wed, Jan 14, 2015 at 11:04:27PM +0000, Paul Zimmerman wrote:
> > > > > > > > > > This is really, really odd. Register accesses are atomic, so the lock
> > > > > > > > > > isn't really doing anything. Besides, you're calling
> > > > > > > > > > dwc2_is_controller_alive() from within the IRQ handler, so IRQs are
> > > > > > > > > > already disabled.
> > > > > > > > >
> > > > > > > > > Spinlocks sometimes do more than you think.  For instance, here the
> > > > > > > > > lock prevents the register access from happening while some other CPU
> > > > > > > > > is holding the lock.  If a silicon quirk causes the register access to
> > > > > > > > > interfere with other activities, this could be important.
> > > > > > > >
> > > > > > > > readl() (which is used by dwc2_is_controller_alive()) adds a memory
> > > > > > > > barrier to the register accesses, that should force all register
> > > > > > > > > accesses to be correctly ordered.
> > > > > > >
> > > > > > > Memory barriers will order accesses that are all made on the same CPU
> > > > > > > with respect to each other.  They do not order these accesses against
> > > > > > > accesses made from another CPU -- that's why we have spinlocks.  :-)
> > > > > >
> > > > > > a fair point :-) The register is still read-only, so that shouldn't
> > > > > > matter either :-)
> > > > > >
> > > > > > > >  I fail to see how a silicon quirk
> > > > > > > > could cause this and if, indeed, it does, I'd be more comfortable with a
> > > > > > > > > proper STARS ticket number from synopsys :-s
> > > > > > >
> > > > > > > Maybe accessing this register somehow resets something else.  I don't
> > > > > > > know.  It seems unlikely, but at least it explains how adding a
> > > > > > > spinlock could fix the problem.
> > > > > >
> > > > > > I would really need Paul (or someone at Synopsys) to confirm this
> > > > > > somehow. Maybe it has something to do with how the register is
> > > > > > implemented, dunno.
> > > > > >
> > > > > > > Paul, do you have any idea what could cause this? Could the HW get into
> > > > > > > some weird state if we read GSNPSID at random locations or when data is
> > > > > > > being transferred, or anything like that?
> > > > >
> > > > > Only thing I can think of is that there is some silicon bug in Robert's
> > > > > platform. But I am not aware of any STARs that mention accesses to the
> > > > > GSNPSID register as being problematic.
> > > > >
> > > > > Funny thing is, this code has been basically the same since at least
> > > > > November 2013. So I think some other recent change must have modified
> > > > > the timing of the register accesses, or something like that. But that's
> > > > > just handwaving, really.
> > > >
> > > > Alright, I'll apply this patch but for 3.20 with a stable tag as I have
> > > > already sent my last pull request to Greg. Unless someone has a really
> > > > big complaint about doing things as such.
> > >
> > > It should go to 3.19-rc shouldn't it? It's a fix, and Robert's platform
> > > is broken without it, IIUC.
> > 
> > It can also be categorized as "has-never-worked-before" because the code
> > has been like this forever. Since we don't really have a git bisect
> > result pointing to a commit that went in v3.19 merge window, I'm not
> > sure how I can convince myself that this absolutely needs to be in
> > v3.19.
> > 
> > At a minimum, I need a proper bisection with a proper commit being
> > blamed (even if it's a commit from months ago). From my point of view,
> > debugging of this "regression" has not been finalized and we're just
> > "assuming" it's caused by GSNPSID because moving that inside the
> > spin_lock seems to fix the problem.
> 
> On further investigation, I was wrong about "this code has been
> basically the same since at least November 2013". Prior to commit
> db8178c33db "usb: dwc2: Update common interrupt handler to call gadget
> interrupt handler" from November 2014, the gadget interrupt handler
> did not read from the GSNPSID register.

Right, but the common IRQ handler always did. So if Robert's SoC has
only ever been used as a peripheral, then I agree with you that the
behavior did, in fact, change.

> So likely the bug in Robert's hardware has been there all along, and
> that commit just caused it to manifest itself.

Robert, out of curiosity, which SoC are you using? Is it UP or SMP?

I guess we need a mention in the commit log that at least SoC XYZ is
known to break unless the register access is done with the lock held.
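
To make the change under discussion concrete, here is a minimal userspace
sketch of "do the liveness check with the lock held". It is not the driver
code: struct fake_hsotg, the fake_* helpers and the C11 atomic_flag spinlock
are all hypothetical stand-ins for the driver state, spin_lock()/spin_unlock()
and readl(). It also assumes that the liveness check treats an all-ones
GSNPSID read as "controller gone", which this thread does not spell out.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver state, not the real dwc2 struct. */
struct fake_hsotg {
	atomic_flag lock;	/* models the driver's spinlock */
	uint32_t gsnpsid;	/* models the memory-mapped GSNPSID register */
};

static void fake_hsotg_init(struct fake_hsotg *hsotg, uint32_t gsnpsid)
{
	atomic_flag_clear(&hsotg->lock);
	hsotg->gsnpsid = gsnpsid;
}

/* Busy-wait until the flag is ours; acquire ordering on success. */
static void fake_spin_lock(struct fake_hsotg *hsotg)
{
	while (atomic_flag_test_and_set_explicit(&hsotg->lock,
						 memory_order_acquire))
		;
}

static void fake_spin_unlock(struct fake_hsotg *hsotg)
{
	atomic_flag_clear_explicit(&hsotg->lock, memory_order_release);
}

/* Assumption: an all-ones read means the controller is not responding. */
static bool fake_is_controller_alive(const struct fake_hsotg *hsotg)
{
	return hsotg->gsnpsid != 0xffffffffu;
}

/*
 * Models the patched handler: the register read happens only while the
 * lock is held, so it cannot overlap with another CPU's locked section.
 * Returns 1 if the interrupt would be handled, 0 otherwise.
 */
static int fake_handle_irq(struct fake_hsotg *hsotg)
{
	int handled = 0;

	fake_spin_lock(hsotg);
	if (fake_is_controller_alive(hsotg))
		handled = 1;	/* real handler would process interrupts here */
	fake_spin_unlock(hsotg);

	return handled;
}
```

The only point of the sketch is the ordering: the read sits between lock and
unlock, so it is serialized against anything another CPU does while holding
the same lock, which a memory barrier alone does not give you.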

-- 
balbi