From: Robert Baldyga <r.baldyga@samsung.com>
To: balbi@ti.com, Paul Zimmerman <Paul.Zimmerman@synopsys.com>
Cc: Alan Stern <stern@rowland.harvard.edu>,
"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
"linux-usb@vger.kernel.org" <linux-usb@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"dinguyen@opensource.altera.com" <dinguyen@opensource.altera.com>,
"yousaf.kaukab@intel.com" <yousaf.kaukab@intel.com>,
"m.szyprowski@samsung.com" <m.szyprowski@samsung.com>
Subject: Re: [PATCH v2] usb: dwc2: call dwc2_is_controller_alive() under spinlock
Date: Thu, 15 Jan 2015 11:23:50 +0100
Message-ID: <54B79536.5090300@samsung.com>
In-Reply-To: <20150115062436.GA6615@saruman>
Hi,
On 01/15/2015 07:24 AM, Felipe Balbi wrote:
>>>>>>>>>>> This is really, really odd. Register accesses are atomic, so the lock
>>>>>>>>>>> isn't really doing anything. Besides, you're calling
>>>>>>>>>>> dwc2_is_controller_alive() from within the IRQ handler, so IRQs are
>>>>>>>>>>> already disabled.
>>>>>>>>>>
>>>>>>>>>> Spinlocks sometimes do more than you think. For instance, here the
>>>>>>>>>> lock prevents the register access from happening while some other CPU
>>>>>>>>>> is holding the lock. If a silicon quirk causes the register access to
>>>>>>>>>> interfere with other activities, this could be important.
>>>>>>>>>
>>>>>>>>> readl() (which is used by dwc2_is_controller_alive()) adds a memory
>>>>>>>>> barrier to the register accesses, that should force all register
>>>>>>>>> accesses to be correctly ordered.
>>>>>>>>
>>>>>>>> Memory barriers will order accesses that are all made on the same CPU
>>>>>>>> with respect to each other. They do not order these accesses against
>>>>>>>> accesses made from another CPU -- that's why we have spinlocks. :-)
>>>>>>>
>>>>>>> a fair point :-) The register is still read-only, so that shouldn't
>>>>>>> matter either :-)
>>>>>>>
>>>>>>>>> I fail to see how a silicon quirk
>>>>>>>>> could cause this and if, indeed, it does, I'd be more comfortable with a
>>>>>>>>> proper STARS ticket number from synopsys :-s
>>>>>>>>
>>>>>>>> Maybe accessing this register somehow resets something else. I don't
>>>>>>>> know. It seems unlikely, but at least it explains how adding a
>>>>>>>> spinlock could fix the problem.
>>>>>>>
>>>>>>> I would really need Paul (or someone at Synopsys) to confirm this
>>>>>>> somehow. Maybe it has something to do with how the register is
>>>>>>> implemented, dunno.
>>>>>>>
>>>>>>> Paul, do you have any idea what could cause this ? Could the HW get into
>>>>>>> some weird state if we read GSNPSID at random locations or when data is
>>>>>>> being transferred, or anything like that ?
>>>>>>
>>>>>> Only thing I can think of is that there is some silicon bug in Robert's
>>>>>> platform. But I am not aware of any STARs that mention accesses to the
>>>>>> GSNPSID register as being problematic.
>>>>>>
>>>>>> Funny thing is, this code has been basically the same since at least
>>>>>> November 2013. So I think some other recent change must have modified
>>>>>> the timing of the register accesses, or something like that. But that's
>>>>>> just handwaving, really.
>>>>>
>>>>> Alright, I'll apply this patch but for 3.20 with a stable tag as I have
>>>>> already sent my last pull request to Greg. Unless someone has a really
>>>>> big complaint about doing things as such.
>>>>
>>>> It should go to 3.19-rc shouldn't it? It's a fix, and Robert's platform
>>>> is broken without it, IIUC.
>>>
>>> It can also be categorized as "has-never-worked-before" because the code
>>> has been like this forever. Since we don't really have a git bisect
>>> result pointing to a commit that went in v3.19 merge window, I'm not
>>> sure how I can convince myself that this absolutely needs to be in
>>> v3.19.
>>>
>>> At a minimum, I need a proper bisection with a proper commit being
>>> blamed (even if it's a commit from months ago). From my point of view,
>>> debugging of this "regression" has not been finalized and we're just
>>> "assuming" it's caused by GSNPSID because moving that inside the
>>> spin_lock seems to fix the problem.
>>
>> On further investigation, I was wrong about "this code has been
>> basically the same since at least November 2013". Prior to commit
>> db8178c33db "usb: dwc2: Update common interrupt handler to call gadget
>> interrupt handler" from November 2014, the gadget interrupt handler
>> did not read from the GSNPSID register.
>
> right, but the common IRQ always did. So unless Robert's SoC has always
> been used only for peripheral, then I agree with you that behavior did,
> in fact, change.
As far as I know, DWC2 on this platform has always been used as a
peripheral. Exynos SoCs have EHCI USB controllers, so in 99% of cases
there is simply no need to use DWC2 as host.
>
>> So likely the bug in Robert's hardware has been there all along, and
>> that commit just caused it to manifest itself.
>
> Robert, out of curiosity, which SoC are you using ? Is it UP or SMP ?
>
> I guess we need a mention on commit log that at least SoC XYZ is known
> to break unless the register access is done with locks held.
>
I'm using Exynos4412 (Odroid U3). The revision number of my DWC2 is 2.81a.
I will update the commit message and send patch v3.
Thanks,
Robert Baldyga
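
For reference, the pattern discussed in this thread (doing the
GSNPSID liveness check only while the driver lock is held) can be
sketched roughly as below. This is a user-space mock-up, not the actual
patch: struct mock_hsotg, mock_is_controller_alive() and
mock_irq_handler() are hypothetical names, a pthread mutex stands in for
the kernel spinlock, a plain variable stands in for the memory-mapped
register, and treating an all-ones read as "controller not alive" is an
assumption made for illustration.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver state; not the real driver struct. */
struct mock_hsotg {
	pthread_mutex_t lock;      /* stands in for the kernel spinlock */
	volatile uint32_t gsnpsid; /* stands in for the GSNPSID register */
};

/* Assumed liveness check: an all-ones read means the core is not responding. */
static bool mock_is_controller_alive(struct mock_hsotg *hsotg)
{
	return hsotg->gsnpsid != 0xffffffffu;
}

/*
 * Sketch of an interrupt handler that performs the register read with the
 * lock held, so it cannot overlap with another CPU that holds the same lock.
 * Returns 1 if the interrupt would be handled, 0 otherwise.
 */
static int mock_irq_handler(struct mock_hsotg *hsotg)
{
	int handled = 0;

	pthread_mutex_lock(&hsotg->lock);
	if (mock_is_controller_alive(hsotg)) {
		/* Interrupt status would be read and serviced here. */
		handled = 1;
	}
	pthread_mutex_unlock(&hsotg->lock);

	return handled;
}
```

The only point of the sketch is the ordering: the lock is taken before
the register read rather than after it, so the read is serialized
against every other code path that takes the same lock.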