* [PATCH v2] usb: dwc2: call dwc2_is_controller_alive() under spinlock
@ 2015-01-14 6:45 Robert Baldyga
  2015-01-14 19:03 ` Paul Zimmerman
  2015-01-14 19:37 ` Felipe Balbi
  0 siblings, 2 replies; 15+ messages in thread
From: Robert Baldyga @ 2015-01-14 6:45 UTC
To: paulz
Cc: balbi, gregkh, linux-usb, linux-kernel, dinguyen, yousaf.kaukab, m.szyprowski, Robert Baldyga

This patch fixes bug described here:
https://lkml.org/lkml/2014/12/22/185

Signed-off-by: Robert Baldyga <r.baldyga@samsung.com>
---

Changelog:

v2:
- fixed comment from Paul Zimmerman

v1: https://lkml.org/lkml/2015/1/13/186

 drivers/usb/dwc2/core_intr.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/dwc2/core_intr.c b/drivers/usb/dwc2/core_intr.c
index ad43c5b..02e3e2d 100644
--- a/drivers/usb/dwc2/core_intr.c
+++ b/drivers/usb/dwc2/core_intr.c
@@ -476,13 +476,13 @@ irqreturn_t dwc2_handle_common_intr(int irq, void *dev)
 	u32 gintsts;
 	irqreturn_t retval = IRQ_NONE;
 
+	spin_lock(&hsotg->lock);
+
 	if (!dwc2_is_controller_alive(hsotg)) {
 		dev_warn(hsotg->dev, "Controller is dead\n");
 		goto out;
 	}
 
-	spin_lock(&hsotg->lock);
-
 	gintsts = dwc2_read_common_intr(hsotg);
 	if (gintsts & ~GINTSTS_PRTINT)
 		retval = IRQ_HANDLED;
@@ -515,8 +515,8 @@ irqreturn_t dwc2_handle_common_intr(int irq, void *dev)
 		}
 	}
 
-	spin_unlock(&hsotg->lock);
 out:
+	spin_unlock(&hsotg->lock);
 	return retval;
 }
 EXPORT_SYMBOL_GPL(dwc2_handle_common_intr);
--
1.9.1
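For readers who prefer the resulting control flow to the diff, here is a minimal userspace sketch in C of the handler's shape once the patch is applied. It is not the driver code. Everything prefixed `fake_` or `FAKE_` is hypothetical: the spinlock is modelled with a C11 `atomic_flag`, the liveness check with a boolean, the GINTSTS read with a plain field, and the mask value is a placeholder. The only point illustrated is that the lock is now taken before the liveness check and released on both exit paths, because `out:` sits above the unlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Placeholder bit standing in for GINTSTS_PRTINT; the value is an assumption. */
#define FAKE_GINTSTS_PRTINT (1u << 24)

enum fake_irqreturn { FAKE_IRQ_NONE = 0, FAKE_IRQ_HANDLED = 1 };

/* Hypothetical stand-in for the driver's per-controller state. */
struct fake_hsotg {
	atomic_flag lock;     /* models hsotg->lock */
	bool alive;           /* models the result of dwc2_is_controller_alive() */
	unsigned int gintsts; /* models the value of dwc2_read_common_intr() */
};

static void fake_hsotg_init(struct fake_hsotg *hsotg, bool alive,
			    unsigned int gintsts)
{
	atomic_flag_clear(&hsotg->lock);
	hsotg->alive = alive;
	hsotg->gintsts = gintsts;
}

static void fake_spin_lock(atomic_flag *lock)
{
	while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
		; /* busy-wait until the holder clears the flag */
}

static void fake_spin_unlock(atomic_flag *lock)
{
	atomic_flag_clear_explicit(lock, memory_order_release);
}

/* Returns true if the lock was free, and leaves it free again. */
static bool fake_lock_is_free(atomic_flag *lock)
{
	bool was_held = atomic_flag_test_and_set(lock);

	if (!was_held)
		atomic_flag_clear(lock);
	return !was_held;
}

/* Same ordering as the patched handler: lock, check, work, single unlock. */
static enum fake_irqreturn fake_handle_common_intr(struct fake_hsotg *hsotg)
{
	enum fake_irqreturn retval = FAKE_IRQ_NONE;

	fake_spin_lock(&hsotg->lock);

	if (!hsotg->alive)
		goto out; /* the lock is held here, so out: must unlock */

	if (hsotg->gintsts & ~FAKE_GINTSTS_PRTINT)
		retval = FAKE_IRQ_HANDLED;

out:
	fake_spin_unlock(&hsotg->lock);
	return retval;
}

/* Usage example: both exit paths leave the lock released. */
static void fake_demo(void)
{
	struct fake_hsotg hsotg;
	enum fake_irqreturn ret;

	fake_hsotg_init(&hsotg, false, 0x1);
	ret = fake_handle_common_intr(&hsotg);
	assert(ret == FAKE_IRQ_NONE);
	assert(fake_lock_is_free(&hsotg.lock));

	fake_hsotg_init(&hsotg, true, 0x1);
	ret = fake_handle_common_intr(&hsotg);
	assert(ret == FAKE_IRQ_HANDLED);
	assert(fake_lock_is_free(&hsotg.lock));
}
```

Had the unlock stayed above `out:` while the lock moved up, the "dead controller" path would return with the lock still held; moving both lines together is what keeps the pairing balanced.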
* RE: [PATCH v2] usb: dwc2: call dwc2_is_controller_alive() under spinlock 2015-01-14 6:45 [PATCH v2] usb: dwc2: call dwc2_is_controller_alive() under spinlock Robert Baldyga @ 2015-01-14 19:03 ` Paul Zimmerman 2015-01-14 19:37 ` Felipe Balbi 1 sibling, 0 replies; 15+ messages in thread From: Paul Zimmerman @ 2015-01-14 19:03 UTC (permalink / raw) To: Robert Baldyga, balbi@ti.com Cc: gregkh@linuxfoundation.org, linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org, dinguyen@opensource.altera.com, yousaf.kaukab@intel.com, m.szyprowski@samsung.com > From: Robert Baldyga [mailto:r.baldyga@samsung.com] > Sent: Tuesday, January 13, 2015 10:46 PM > > This patch fixes bug described here: > https://lkml.org/lkml/2014/12/22/185 > > Signed-off-by: Robert Baldyga <r.baldyga@samsung.com> Although I don't understand *why* this fixes Robert's issue, it's certainly a harmless patch, so Acked-by: Paul Zimmerman <paulz@synopsys.com> But I suspect Felipe will want a better changelog, I don't think just a URL is good enough. 
-- Paul > --- > > Changelog: > > v2: > - fixed comment from Paul Zimmerman > > v1: https://lkml.org/lkml/2015/1/13/186 > > drivers/usb/dwc2/core_intr.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/drivers/usb/dwc2/core_intr.c b/drivers/usb/dwc2/core_intr.c > index ad43c5b..02e3e2d 100644 > --- a/drivers/usb/dwc2/core_intr.c > +++ b/drivers/usb/dwc2/core_intr.c > @@ -476,13 +476,13 @@ irqreturn_t dwc2_handle_common_intr(int irq, void *dev) > u32 gintsts; > irqreturn_t retval = IRQ_NONE; > > + spin_lock(&hsotg->lock); > + > if (!dwc2_is_controller_alive(hsotg)) { > dev_warn(hsotg->dev, "Controller is dead\n"); > goto out; > } > > - spin_lock(&hsotg->lock); > - > gintsts = dwc2_read_common_intr(hsotg); > if (gintsts & ~GINTSTS_PRTINT) > retval = IRQ_HANDLED; > @@ -515,8 +515,8 @@ irqreturn_t dwc2_handle_common_intr(int irq, void *dev) > } > } > > - spin_unlock(&hsotg->lock); > out: > + spin_unlock(&hsotg->lock); > return retval; > } > EXPORT_SYMBOL_GPL(dwc2_handle_common_intr); > -- > 1.9.1 ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2] usb: dwc2: call dwc2_is_controller_alive() under spinlock
From: Felipe Balbi @ 2015-01-14 19:37 UTC
To: Robert Baldyga
Cc: paulz, balbi, gregkh, linux-usb, linux-kernel, dinguyen, yousaf.kaukab, m.szyprowski

On Wed, Jan 14, 2015 at 07:45:31AM +0100, Robert Baldyga wrote:
> This patch fixes bug described here:
> https://lkml.org/lkml/2014/12/22/185
>
> Signed-off-by: Robert Baldyga <r.baldyga@samsung.com>
> ---
>
> Changelog:
>
> v2:
> - fixed comment from Paul Zimmerman
>
> v1: https://lkml.org/lkml/2015/1/13/186
>
>  drivers/usb/dwc2/core_intr.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/usb/dwc2/core_intr.c b/drivers/usb/dwc2/core_intr.c
> index ad43c5b..02e3e2d 100644
> --- a/drivers/usb/dwc2/core_intr.c
> +++ b/drivers/usb/dwc2/core_intr.c
> @@ -476,13 +476,13 @@ irqreturn_t dwc2_handle_common_intr(int irq, void *dev)
>  	u32 gintsts;
>  	irqreturn_t retval = IRQ_NONE;
>
> +	spin_lock(&hsotg->lock);
> +
>  	if (!dwc2_is_controller_alive(hsotg)) {

This is really, really odd. Register accesses are atomic, so the lock
isn't really doing anything. Besides, you're calling
dwc2_is_controller_alive() from within the IRQ handler, so IRQs are
already disabled.

When the problem happens, do you see this "Controller is dead" message ?

--
balbi
* Re: [PATCH v2] usb: dwc2: call dwc2_is_controller_alive() under spinlock
From: Alan Stern @ 2015-01-14 20:06 UTC
To: Felipe Balbi
Cc: Robert Baldyga, paulz, gregkh, linux-usb, linux-kernel, dinguyen, yousaf.kaukab, m.szyprowski

On Wed, 14 Jan 2015, Felipe Balbi wrote:

> On Wed, Jan 14, 2015 at 07:45:31AM +0100, Robert Baldyga wrote:
> > This patch fixes bug described here:
> > https://lkml.org/lkml/2014/12/22/185
> >
> > Signed-off-by: Robert Baldyga <r.baldyga@samsung.com>
> > ---
> >
> > Changelog:
> >
> > v2:
> > - fixed comment from Paul Zimmerman
> >
> > v1: https://lkml.org/lkml/2015/1/13/186
> >
> >  drivers/usb/dwc2/core_intr.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/usb/dwc2/core_intr.c b/drivers/usb/dwc2/core_intr.c
> > index ad43c5b..02e3e2d 100644
> > --- a/drivers/usb/dwc2/core_intr.c
> > +++ b/drivers/usb/dwc2/core_intr.c
> > @@ -476,13 +476,13 @@ irqreturn_t dwc2_handle_common_intr(int irq, void *dev)
> >  	u32 gintsts;
> >  	irqreturn_t retval = IRQ_NONE;
> >
> > +	spin_lock(&hsotg->lock);
> > +
> >  	if (!dwc2_is_controller_alive(hsotg)) {
>
> This is really, really odd. Register accesses are atomic, so the lock
> isn't really doing anything. Besides, you're calling
> dwc2_is_controller_alive() from within the IRQ handler, so IRQs are
> already disabled.

Spinlocks sometimes do more than you think. For instance, here the
lock prevents the register access from happening while some other CPU
is holding the lock. If a silicon quirk causes the register access to
interfere with other activities, this could be important.

Alan Stern
* Re: [PATCH v2] usb: dwc2: call dwc2_is_controller_alive() under spinlock 2015-01-14 20:06 ` Alan Stern @ 2015-01-14 21:14 ` Felipe Balbi 2015-01-14 21:41 ` Alan Stern 0 siblings, 1 reply; 15+ messages in thread From: Felipe Balbi @ 2015-01-14 21:14 UTC (permalink / raw) To: Alan Stern Cc: Felipe Balbi, Robert Baldyga, paulz, gregkh, linux-usb, linux-kernel, dinguyen, yousaf.kaukab, m.szyprowski [-- Attachment #1: Type: text/plain, Size: 2009 bytes --] Hi, On Wed, Jan 14, 2015 at 03:06:39PM -0500, Alan Stern wrote: > > > This patch fixes bug described here: > > > https://lkml.org/lkml/2014/12/22/185 > > > > > > Signed-off-by: Robert Baldyga <r.baldyga@samsung.com> > > > --- > > > > > > Changelog: > > > > > > v2: > > > - fixed comment from Paul Zimmerman > > > > > > v1: https://lkml.org/lkml/2015/1/13/186 > > > > > > drivers/usb/dwc2/core_intr.c | 6 +++--- > > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > > > diff --git a/drivers/usb/dwc2/core_intr.c b/drivers/usb/dwc2/core_intr.c > > > index ad43c5b..02e3e2d 100644 > > > --- a/drivers/usb/dwc2/core_intr.c > > > +++ b/drivers/usb/dwc2/core_intr.c > > > @@ -476,13 +476,13 @@ irqreturn_t dwc2_handle_common_intr(int irq, void *dev) > > > u32 gintsts; > > > irqreturn_t retval = IRQ_NONE; > > > > > > + spin_lock(&hsotg->lock); > > > + > > > if (!dwc2_is_controller_alive(hsotg)) { > > > > This is really, really odd. Register accesses are atomic, so the lock > > isn't really doing anything. Besides, you're calling > > dwc2_is_controller_alive() from within the IRQ handler, so IRQs are > > already disabled. > > Spinlocks sometimes do more than you think. For instance, here the > lock prevents the register access from happening while some other CPU > is holding the lock. If a silicon quirk causes the register access to > interfere with other activities, this could be important. 
readl() (which is used by dwc2_is_controller_alive()) adds a memory
barrier to the register accesses, which should force all register
accesses to be correctly ordered. I fail to see how a silicon quirk
could cause this and if, indeed, it does, I'd be more comfortable with a
proper STARS ticket number from Synopsys :-s

Then again, I don't even have a device with this controller and it seems
to only be a problem with Robert's setup, so maybe it's a silicon bug
caused by whoever integrated dwc2 in his silicon.

--
balbi
* Re: [PATCH v2] usb: dwc2: call dwc2_is_controller_alive() under spinlock
From: Alan Stern @ 2015-01-14 21:41 UTC
To: Felipe Balbi
Cc: Robert Baldyga, paulz, gregkh, linux-usb, linux-kernel, dinguyen, yousaf.kaukab, m.szyprowski

On Wed, 14 Jan 2015, Felipe Balbi wrote:

> > > This is really, really odd. Register accesses are atomic, so the lock
> > > isn't really doing anything. Besides, you're calling
> > > dwc2_is_controller_alive() from within the IRQ handler, so IRQs are
> > > already disabled.
> >
> > Spinlocks sometimes do more than you think. For instance, here the
> > lock prevents the register access from happening while some other CPU
> > is holding the lock. If a silicon quirk causes the register access to
> > interfere with other activities, this could be important.
>
> readl() (which is used by dwc2_is_controller_alive()) adds a memory
> barrier to the register accesses, that should force all register
> accesses the be correctly ordered.

Memory barriers will order accesses that are all made on the same CPU
with respect to each other. They do not order these accesses against
accesses made from another CPU -- that's why we have spinlocks. :-)

> I fail to see how a silicon quirk
> could cause this and if, indeed, it does, I'd be more comfortable with a
> proper STARS tickect number from synopsys :-s

Maybe accessing this register somehow resets something else. I don't
know. It seems unlikely, but at least it explains how adding a
spinlock could fix the problem.

> Then again, I don't even have a device with this controller and it seems
> to only be a problem with Robert's setup, so maybe it's a silicon bug
> caused by whoever integrated dwc2 in his silicon.

Alan Stern
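The distinction Alan draws, that a lock keeps the register access from overlapping with work done by another CPU, can be modelled in a few lines of userspace C. The sketch below is deliberately single-threaded so that it is deterministic: one pretend "CPU 0" holds the lock while a pretend "CPU 1" tries to read an ID register. All `sim_` names are invented for illustration and none of them exist in the dwc2 driver. `sim_read_id_locked()` uses a trylock and reports failure at the point where the kernel's `spin_lock()` would simply spin until the holder is done. The model says nothing about whether the hardware actually misbehaves; it only shows which accesses the lock rules out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical controller model; not part of the dwc2 driver. */
struct sim_ctrl {
	atomic_flag lock;        /* models hsotg->lock */
	bool busy;               /* true while the lock holder is mid-operation */
	unsigned int racy_reads; /* register reads that landed while busy */
};

static void sim_ctrl_init(struct sim_ctrl *c)
{
	atomic_flag_clear(&c->lock);
	c->busy = false;
	c->racy_reads = 0;
}

/* Returns true if the lock was acquired. */
static bool sim_trylock(struct sim_ctrl *c)
{
	return !atomic_flag_test_and_set_explicit(&c->lock,
						  memory_order_acquire);
}

static void sim_unlock(struct sim_ctrl *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/* The register access itself; counts overlaps with the holder's work. */
static void sim_touch_id_register(struct sim_ctrl *c)
{
	if (c->busy)
		c->racy_reads++;
}

/* Pre-patch shape: the read is not serialized against the lock holder. */
static void sim_read_id_unlocked(struct sim_ctrl *c)
{
	sim_touch_id_register(c);
}

/* Post-patch shape: the read only happens once the lock is ours. */
static bool sim_read_id_locked(struct sim_ctrl *c)
{
	if (!sim_trylock(c))
		return false; /* a real spin_lock() would wait here instead */
	sim_touch_id_register(c);
	sim_unlock(c);
	return true;
}

/* Usage example walking through the scenario described above. */
static void sim_demo(void)
{
	struct sim_ctrl c;
	bool ok;

	sim_ctrl_init(&c);

	/* "CPU 0" takes the lock and starts an operation. */
	ok = sim_trylock(&c);
	assert(ok);
	c.busy = true;

	/* "CPU 1", unlocked variant: the read overlaps with CPU 0's work. */
	sim_read_id_unlocked(&c);
	assert(c.racy_reads == 1);

	/* "CPU 1", locked variant: the read is held off, no new overlap. */
	ok = sim_read_id_locked(&c);
	assert(!ok);
	assert(c.racy_reads == 1);

	/* CPU 0 finishes; now the locked read goes through cleanly. */
	c.busy = false;
	sim_unlock(&c);
	ok = sim_read_id_locked(&c);
	assert(ok);
	assert(c.racy_reads == 1);
}
```

A barrier placed inside `sim_touch_id_register()` would not change the outcome of the unlocked case, since it constrains only the ordering of the caller's own accesses; the overlap is prevented by the lock alone.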
* Re: [PATCH v2] usb: dwc2: call dwc2_is_controller_alive() under spinlock 2015-01-14 21:41 ` Alan Stern @ 2015-01-14 21:46 ` Felipe Balbi 2015-01-14 22:28 ` Paul Zimmerman 0 siblings, 1 reply; 15+ messages in thread From: Felipe Balbi @ 2015-01-14 21:46 UTC (permalink / raw) To: Alan Stern Cc: Felipe Balbi, Robert Baldyga, paulz, gregkh, linux-usb, linux-kernel, dinguyen, yousaf.kaukab, m.szyprowski [-- Attachment #1: Type: text/plain, Size: 1827 bytes --] Hi, On Wed, Jan 14, 2015 at 04:41:23PM -0500, Alan Stern wrote: > > > > This is really, really odd. Register accesses are atomic, so the lock > > > > isn't really doing anything. Besides, you're calling > > > > dwc2_is_controller_alive() from within the IRQ handler, so IRQs are > > > > already disabled. > > > > > > Spinlocks sometimes do more than you think. For instance, here the > > > lock prevents the register access from happening while some other CPU > > > is holding the lock. If a silicon quirk causes the register access to > > > interfere with other activities, this could be important. > > > > readl() (which is used by dwc2_is_controller_alive()) adds a memory > > barrier to the register accesses, that should force all register > > accesses the be correctly ordered. > > Memory barriers will order accesses that are all made on the same CPU > with respect to each other. They do not order these accesses against > accesses made from another CPU -- that's why we have spinlocks. :-) a fair point :-) The register is still read-only, so that shouldn't matter either :-) > > I fail to see how a silicon quirk > > could cause this and if, indeed, it does, I'd be more comfortable with a > > proper STARS tickect number from synopsys :-s > > Maybe accessing this register somehow resets something else. I don't > know. It seems unlikely, but at least it explains how adding a > spinlock could fix the problem. I would really need Paul (or someone at Synopsys) to confirm this somehow. 
Maybe it has something to do with how the register is implemented, dunno.

Paul, do you have any idea what could cause this? Could the HW get into
some weird state if we read GSNPSID at random locations or when data is
being transferred, or anything like that?

--
balbi
* RE: [PATCH v2] usb: dwc2: call dwc2_is_controller_alive() under spinlock 2015-01-14 21:46 ` Felipe Balbi @ 2015-01-14 22:28 ` Paul Zimmerman 2015-01-14 22:39 ` Felipe Balbi 0 siblings, 1 reply; 15+ messages in thread From: Paul Zimmerman @ 2015-01-14 22:28 UTC (permalink / raw) To: balbi@ti.com, Alan Stern Cc: Robert Baldyga, gregkh@linuxfoundation.org, linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org, dinguyen@opensource.altera.com, yousaf.kaukab@intel.com, m.szyprowski@samsung.com > From: Felipe Balbi [mailto:balbi@ti.com] > Sent: Wednesday, January 14, 2015 1:46 PM > > On Wed, Jan 14, 2015 at 04:41:23PM -0500, Alan Stern wrote: > > > > > This is really, really odd. Register accesses are atomic, so the lock > > > > > isn't really doing anything. Besides, you're calling > > > > > dwc2_is_controller_alive() from within the IRQ handler, so IRQs are > > > > > already disabled. > > > > > > > > Spinlocks sometimes do more than you think. For instance, here the > > > > lock prevents the register access from happening while some other CPU > > > > is holding the lock. If a silicon quirk causes the register access to > > > > interfere with other activities, this could be important. > > > > > > readl() (which is used by dwc2_is_controller_alive()) adds a memory > > > barrier to the register accesses, that should force all register > > > accesses the be correctly ordered. > > > > Memory barriers will order accesses that are all made on the same CPU > > with respect to each other. They do not order these accesses against > > accesses made from another CPU -- that's why we have spinlocks. :-) > > a fair point :-) The register is still read-only, so that shouldn't > matter either :-) > > > > I fail to see how a silicon quirk > > > could cause this and if, indeed, it does, I'd be more comfortable with a > > > proper STARS tickect number from synopsys :-s > > > > Maybe accessing this register somehow resets something else. I don't > > know. 
It seems unlikely, but at least it explains how adding a > > spinlock could fix the problem. > > I would really need Paul (or someone at Synopsys) to confirm this > somehow. Maybe it has something to do with how the register is > implemented, dunno. > > Paul, do you have any idea what could cause this ? Could the HW into > some weird state if we read GSNPSID at random locations or when data is > being transferred, or anything like that ? Only thing I can think of is that there is some silicon bug in Robert's platform. But I am not aware of any STARs that mention accesses to the GSNPSID register as being problematic. Funny thing is, this code has been basically the same since at least November 2013. So I think some other recent change must have modified the timing of the register accesses, or something like that. But that's just handwaving, really. -- Paul ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2] usb: dwc2: call dwc2_is_controller_alive() under spinlock 2015-01-14 22:28 ` Paul Zimmerman @ 2015-01-14 22:39 ` Felipe Balbi 2015-01-14 22:40 ` Felipe Balbi 2015-01-14 22:45 ` Paul Zimmerman 0 siblings, 2 replies; 15+ messages in thread From: Felipe Balbi @ 2015-01-14 22:39 UTC (permalink / raw) To: Paul Zimmerman Cc: balbi@ti.com, Alan Stern, Robert Baldyga, gregkh@linuxfoundation.org, linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org, dinguyen@opensource.altera.com, yousaf.kaukab@intel.com, m.szyprowski@samsung.com [-- Attachment #1: Type: text/plain, Size: 2762 bytes --] On Wed, Jan 14, 2015 at 10:28:54PM +0000, Paul Zimmerman wrote: > > From: Felipe Balbi [mailto:balbi@ti.com] > > Sent: Wednesday, January 14, 2015 1:46 PM > > > > On Wed, Jan 14, 2015 at 04:41:23PM -0500, Alan Stern wrote: > > > > > > This is really, really odd. Register accesses are atomic, so the lock > > > > > > isn't really doing anything. Besides, you're calling > > > > > > dwc2_is_controller_alive() from within the IRQ handler, so IRQs are > > > > > > already disabled. > > > > > > > > > > Spinlocks sometimes do more than you think. For instance, here the > > > > > lock prevents the register access from happening while some other CPU > > > > > is holding the lock. If a silicon quirk causes the register access to > > > > > interfere with other activities, this could be important. > > > > > > > > readl() (which is used by dwc2_is_controller_alive()) adds a memory > > > > barrier to the register accesses, that should force all register > > > > accesses the be correctly ordered. > > > > > > Memory barriers will order accesses that are all made on the same CPU > > > with respect to each other. They do not order these accesses against > > > accesses made from another CPU -- that's why we have spinlocks. 
:-) > > > > a fair point :-) The register is still read-only, so that shouldn't > > matter either :-) > > > > > > I fail to see how a silicon quirk > > > > could cause this and if, indeed, it does, I'd be more comfortable with a > > > > proper STARS tickect number from synopsys :-s > > > > > > Maybe accessing this register somehow resets something else. I don't > > > know. It seems unlikely, but at least it explains how adding a > > > spinlock could fix the problem. > > > > I would really need Paul (or someone at Synopsys) to confirm this > > somehow. Maybe it has something to do with how the register is > > implemented, dunno. > > > > Paul, do you have any idea what could cause this ? Could the HW into > > some weird state if we read GSNPSID at random locations or when data is > > being transferred, or anything like that ? > > Only thing I can think of is that there is some silicon bug in Robert's > platform. But I am not aware of any STARs that mention accesses to the > GSNPSID register as being problematic. > > Funny thing is, this code has been basically the same since at least > November 2013. So I think some other recent change must have modified > the timing of the register accesses, or something like that. But that's > just handwaving, really. Alright, I'll apply this patch but for 3.20 with a stable tag as I have already sent my last pull request to Greg. Unless someone has a really big complaint about doing things as such. -- balbi [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 819 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2] usb: dwc2: call dwc2_is_controller_alive() under spinlock 2015-01-14 22:39 ` Felipe Balbi @ 2015-01-14 22:40 ` Felipe Balbi 2015-01-14 22:45 ` Paul Zimmerman 1 sibling, 0 replies; 15+ messages in thread From: Felipe Balbi @ 2015-01-14 22:40 UTC (permalink / raw) To: Felipe Balbi Cc: Paul Zimmerman, Alan Stern, Robert Baldyga, gregkh@linuxfoundation.org, linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org, dinguyen@opensource.altera.com, yousaf.kaukab@intel.com, m.szyprowski@samsung.com [-- Attachment #1: Type: text/plain, Size: 2981 bytes --] On Wed, Jan 14, 2015 at 04:39:41PM -0600, Felipe Balbi wrote: > On Wed, Jan 14, 2015 at 10:28:54PM +0000, Paul Zimmerman wrote: > > > From: Felipe Balbi [mailto:balbi@ti.com] > > > Sent: Wednesday, January 14, 2015 1:46 PM > > > > > > On Wed, Jan 14, 2015 at 04:41:23PM -0500, Alan Stern wrote: > > > > > > > This is really, really odd. Register accesses are atomic, so the lock > > > > > > > isn't really doing anything. Besides, you're calling > > > > > > > dwc2_is_controller_alive() from within the IRQ handler, so IRQs are > > > > > > > already disabled. > > > > > > > > > > > > Spinlocks sometimes do more than you think. For instance, here the > > > > > > lock prevents the register access from happening while some other CPU > > > > > > is holding the lock. If a silicon quirk causes the register access to > > > > > > interfere with other activities, this could be important. > > > > > > > > > > readl() (which is used by dwc2_is_controller_alive()) adds a memory > > > > > barrier to the register accesses, that should force all register > > > > > accesses the be correctly ordered. > > > > > > > > Memory barriers will order accesses that are all made on the same CPU > > > > with respect to each other. They do not order these accesses against > > > > accesses made from another CPU -- that's why we have spinlocks. 
:-) > > > > > > a fair point :-) The register is still read-only, so that shouldn't > > > matter either :-) > > > > > > > > I fail to see how a silicon quirk > > > > > could cause this and if, indeed, it does, I'd be more comfortable with a > > > > > proper STARS tickect number from synopsys :-s > > > > > > > > Maybe accessing this register somehow resets something else. I don't > > > > know. It seems unlikely, but at least it explains how adding a > > > > spinlock could fix the problem. > > > > > > I would really need Paul (or someone at Synopsys) to confirm this > > > somehow. Maybe it has something to do with how the register is > > > implemented, dunno. > > > > > > Paul, do you have any idea what could cause this ? Could the HW into > > > some weird state if we read GSNPSID at random locations or when data is > > > being transferred, or anything like that ? > > > > Only thing I can think of is that there is some silicon bug in Robert's > > platform. But I am not aware of any STARs that mention accesses to the > > GSNPSID register as being problematic. > > > > Funny thing is, this code has been basically the same since at least > > November 2013. So I think some other recent change must have modified > > the timing of the register accesses, or something like that. But that's > > just handwaving, really. > > Alright, I'll apply this patch but for 3.20 with a stable tag as I have > already sent my last pull request to Greg. Unless someone has a really > big complaint about doing things as such. But of course, I need a better changelog :-) -- balbi [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 819 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: [PATCH v2] usb: dwc2: call dwc2_is_controller_alive() under spinlock 2015-01-14 22:39 ` Felipe Balbi 2015-01-14 22:40 ` Felipe Balbi @ 2015-01-14 22:45 ` Paul Zimmerman 2015-01-14 22:49 ` Felipe Balbi 1 sibling, 1 reply; 15+ messages in thread From: Paul Zimmerman @ 2015-01-14 22:45 UTC (permalink / raw) To: balbi@ti.com Cc: Alan Stern, Robert Baldyga, gregkh@linuxfoundation.org, linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org, dinguyen@opensource.altera.com, yousaf.kaukab@intel.com, m.szyprowski@samsung.com > From: Felipe Balbi [mailto:balbi@ti.com] > Sent: Wednesday, January 14, 2015 2:40 PM > > On Wed, Jan 14, 2015 at 10:28:54PM +0000, Paul Zimmerman wrote: > > > From: Felipe Balbi [mailto:balbi@ti.com] > > > Sent: Wednesday, January 14, 2015 1:46 PM > > > > > > On Wed, Jan 14, 2015 at 04:41:23PM -0500, Alan Stern wrote: > > > > > > > This is really, really odd. Register accesses are atomic, so the lock > > > > > > > isn't really doing anything. Besides, you're calling > > > > > > > dwc2_is_controller_alive() from within the IRQ handler, so IRQs are > > > > > > > already disabled. > > > > > > > > > > > > Spinlocks sometimes do more than you think. For instance, here the > > > > > > lock prevents the register access from happening while some other CPU > > > > > > is holding the lock. If a silicon quirk causes the register access to > > > > > > interfere with other activities, this could be important. > > > > > > > > > > readl() (which is used by dwc2_is_controller_alive()) adds a memory > > > > > barrier to the register accesses, that should force all register > > > > > accesses the be correctly ordered. > > > > > > > > Memory barriers will order accesses that are all made on the same CPU > > > > with respect to each other. They do not order these accesses against > > > > accesses made from another CPU -- that's why we have spinlocks. 
:-) > > > > > > a fair point :-) The register is still read-only, so that shouldn't > > > matter either :-) > > > > > > > > I fail to see how a silicon quirk > > > > > could cause this and if, indeed, it does, I'd be more comfortable with a > > > > > proper STARS tickect number from synopsys :-s > > > > > > > > Maybe accessing this register somehow resets something else. I don't > > > > know. It seems unlikely, but at least it explains how adding a > > > > spinlock could fix the problem. > > > > > > I would really need Paul (or someone at Synopsys) to confirm this > > > somehow. Maybe it has something to do with how the register is > > > implemented, dunno. > > > > > > Paul, do you have any idea what could cause this ? Could the HW into > > > some weird state if we read GSNPSID at random locations or when data is > > > being transferred, or anything like that ? > > > > Only thing I can think of is that there is some silicon bug in Robert's > > platform. But I am not aware of any STARs that mention accesses to the > > GSNPSID register as being problematic. > > > > Funny thing is, this code has been basically the same since at least > > November 2013. So I think some other recent change must have modified > > the timing of the register accesses, or something like that. But that's > > just handwaving, really. > > Alright, I'll apply this patch but for 3.20 with a stable tag as I have > already sent my last pull request to Greg. Unless someone has a really > big complaint about doing things as such. It should go to 3.19-rc shouldn't it? It's a fix, and Robert's platform is broken without it, IIUC. -- Paul ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2] usb: dwc2: call dwc2_is_controller_alive() under spinlock 2015-01-14 22:45 ` Paul Zimmerman @ 2015-01-14 22:49 ` Felipe Balbi 2015-01-14 23:04 ` Paul Zimmerman 0 siblings, 1 reply; 15+ messages in thread From: Felipe Balbi @ 2015-01-14 22:49 UTC (permalink / raw) To: Paul Zimmerman Cc: balbi@ti.com, Alan Stern, Robert Baldyga, gregkh@linuxfoundation.org, linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org, dinguyen@opensource.altera.com, yousaf.kaukab@intel.com, m.szyprowski@samsung.com [-- Attachment #1: Type: text/plain, Size: 3857 bytes --] Hi, On Wed, Jan 14, 2015 at 10:45:26PM +0000, Paul Zimmerman wrote: > > From: Felipe Balbi [mailto:balbi@ti.com] > > Sent: Wednesday, January 14, 2015 2:40 PM > > > > On Wed, Jan 14, 2015 at 10:28:54PM +0000, Paul Zimmerman wrote: > > > > From: Felipe Balbi [mailto:balbi@ti.com] > > > > Sent: Wednesday, January 14, 2015 1:46 PM > > > > > > > > On Wed, Jan 14, 2015 at 04:41:23PM -0500, Alan Stern wrote: > > > > > > > > This is really, really odd. Register accesses are atomic, so the lock > > > > > > > > isn't really doing anything. Besides, you're calling > > > > > > > > dwc2_is_controller_alive() from within the IRQ handler, so IRQs are > > > > > > > > already disabled. > > > > > > > > > > > > > > Spinlocks sometimes do more than you think. For instance, here the > > > > > > > lock prevents the register access from happening while some other CPU > > > > > > > is holding the lock. If a silicon quirk causes the register access to > > > > > > > interfere with other activities, this could be important. > > > > > > > > > > > > readl() (which is used by dwc2_is_controller_alive()) adds a memory > > > > > > barrier to the register accesses, that should force all register > > > > > > accesses the be correctly ordered. > > > > > > > > > > Memory barriers will order accesses that are all made on the same CPU > > > > > with respect to each other. 
They do not order these accesses against > > > > > accesses made from another CPU -- that's why we have spinlocks. :-) > > > > > > > > a fair point :-) The register is still read-only, so that shouldn't > > > > matter either :-) > > > > > > > > > > I fail to see how a silicon quirk > > > > > > could cause this and if, indeed, it does, I'd be more comfortable with a > > > > > > proper STARS tickect number from synopsys :-s > > > > > > > > > > Maybe accessing this register somehow resets something else. I don't > > > > > know. It seems unlikely, but at least it explains how adding a > > > > > spinlock could fix the problem. > > > > > > > > I would really need Paul (or someone at Synopsys) to confirm this > > > > somehow. Maybe it has something to do with how the register is > > > > implemented, dunno. > > > > > > > > Paul, do you have any idea what could cause this ? Could the HW into > > > > some weird state if we read GSNPSID at random locations or when data is > > > > being transferred, or anything like that ? > > > > > > Only thing I can think of is that there is some silicon bug in Robert's > > > platform. But I am not aware of any STARs that mention accesses to the > > > GSNPSID register as being problematic. > > > > > > Funny thing is, this code has been basically the same since at least > > > November 2013. So I think some other recent change must have modified > > > the timing of the register accesses, or something like that. But that's > > > just handwaving, really. > > > > Alright, I'll apply this patch but for 3.20 with a stable tag as I have > > already sent my last pull request to Greg. Unless someone has a really > > big complaint about doing things as such. > > It should go to 3.19-rc shouldn't it? It's a fix, and Robert's platform > is broken without it, IIUC. It can also be categorized as "has-never-worked-before" before the code has been like this forever. 
Since we don't really have a git bisect result pointing to a commit that
went in during the v3.19 merge window, I'm not sure how I can convince
myself that this absolutely needs to be in v3.19.

At a minimum, I need a proper bisection with a proper commit being
blamed (even if it's a commit from months ago). From my point of view,
debugging of this "regression" has not been finalized and we're just
"assuming" it's caused by GSNPSID because moving that inside the
spin_lock seems to fix the problem.

--
balbi
* RE: [PATCH v2] usb: dwc2: call dwc2_is_controller_alive() under spinlock 2015-01-14 22:49 ` Felipe Balbi @ 2015-01-14 23:04 ` Paul Zimmerman 2015-01-15 6:24 ` Felipe Balbi 0 siblings, 1 reply; 15+ messages in thread From: Paul Zimmerman @ 2015-01-14 23:04 UTC (permalink / raw) To: balbi@ti.com Cc: Alan Stern, Robert Baldyga, gregkh@linuxfoundation.org, linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org, dinguyen@opensource.altera.com, yousaf.kaukab@intel.com, m.szyprowski@samsung.com > From: Felipe Balbi [mailto:balbi@ti.com] > Sent: Wednesday, January 14, 2015 2:50 PM > > On Wed, Jan 14, 2015 at 10:45:26PM +0000, Paul Zimmerman wrote: > > > From: Felipe Balbi [mailto:balbi@ti.com] > > > Sent: Wednesday, January 14, 2015 2:40 PM > > > > > > On Wed, Jan 14, 2015 at 10:28:54PM +0000, Paul Zimmerman wrote: > > > > > From: Felipe Balbi [mailto:balbi@ti.com] > > > > > Sent: Wednesday, January 14, 2015 1:46 PM > > > > > > > > > > On Wed, Jan 14, 2015 at 04:41:23PM -0500, Alan Stern wrote: > > > > > > > > > This is really, really odd. Register accesses are atomic, so the lock > > > > > > > > > isn't really doing anything. Besides, you're calling > > > > > > > > > dwc2_is_controller_alive() from within the IRQ handler, so IRQs are > > > > > > > > > already disabled. > > > > > > > > > > > > > > > > Spinlocks sometimes do more than you think. For instance, here the > > > > > > > > lock prevents the register access from happening while some other CPU > > > > > > > > is holding the lock. If a silicon quirk causes the register access to > > > > > > > > interfere with other activities, this could be important. > > > > > > > > > > > > > > readl() (which is used by dwc2_is_controller_alive()) adds a memory > > > > > > > barrier to the register accesses, that should force all register > > > > > > > accesses the be correctly ordered. > > > > > > > > > > > > Memory barriers will order accesses that are all made on the same CPU > > > > > > with respect to each other. 
They do not order these accesses against > > > > > > accesses made from another CPU -- that's why we have spinlocks. :-) > > > > > > > > > > a fair point :-) The register is still read-only, so that shouldn't > > > > > matter either :-) > > > > > > > > > > > > I fail to see how a silicon quirk > > > > > > > could cause this and if, indeed, it does, I'd be more comfortable with a > > > > > > > proper STARS tickect number from synopsys :-s > > > > > > > > > > > > Maybe accessing this register somehow resets something else. I don't > > > > > > know. It seems unlikely, but at least it explains how adding a > > > > > > spinlock could fix the problem. > > > > > > > > > > I would really need Paul (or someone at Synopsys) to confirm this > > > > > somehow. Maybe it has something to do with how the register is > > > > > implemented, dunno. > > > > > > > > > > Paul, do you have any idea what could cause this ? Could the HW into > > > > > some weird state if we read GSNPSID at random locations or when data is > > > > > being transferred, or anything like that ? > > > > > > > > Only thing I can think of is that there is some silicon bug in Robert's > > > > platform. But I am not aware of any STARs that mention accesses to the > > > > GSNPSID register as being problematic. > > > > > > > > Funny thing is, this code has been basically the same since at least > > > > November 2013. So I think some other recent change must have modified > > > > the timing of the register accesses, or something like that. But that's > > > > just handwaving, really. > > > > > > Alright, I'll apply this patch but for 3.20 with a stable tag as I have > > > already sent my last pull request to Greg. Unless someone has a really > > > big complaint about doing things as such. > > > > It should go to 3.19-rc shouldn't it? It's a fix, and Robert's platform > > is broken without it, IIUC. > > It can also be categorized as "has-never-worked-before" before the code > has been like this forever. 
Since we don't really have a git bisect > > result pointing to a commit that went in v3.19 merge window, I'm not > > sure how I can convince myself that this absolutely needs to be in > > v3.19. > > At a minimum, I need a proper bisection with a proper commit being > blamed (even if it's a commit from months ago). From my point of view, > debugging of this "regression" has not been finalized and we're just > "assuming" it's caused by GSNPSID because moving that inside the > spin_lock seems to fix the problem. On further investigation, I was wrong about "this code has been basically the same since at least November 2013". Prior to commit db8178c33db "usb: dwc2: Update common interrupt handler to call gadget interrupt handler" from November 2014, the gadget interrupt handler did not read from the GSNPSID register. So likely the bug in Robert's hardware has been there all along, and that commit just caused it to manifest itself. -- Paul
* Re: [PATCH v2] usb: dwc2: call dwc2_is_controller_alive() under spinlock 2015-01-14 23:04 ` Paul Zimmerman @ 2015-01-15 6:24 ` Felipe Balbi 2015-01-15 10:23 ` Robert Baldyga 0 siblings, 1 reply; 15+ messages in thread From: Felipe Balbi @ 2015-01-15 6:24 UTC (permalink / raw) To: Paul Zimmerman Cc: balbi@ti.com, Alan Stern, Robert Baldyga, gregkh@linuxfoundation.org, linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org, dinguyen@opensource.altera.com, yousaf.kaukab@intel.com, m.szyprowski@samsung.com [-- Attachment #1: Type: text/plain, Size: 4591 bytes --] Hi, On Wed, Jan 14, 2015 at 11:04:27PM +0000, Paul Zimmerman wrote: > > > > > > > > > > This is really, really odd. Register accesses are atomic, so the lock > > > > > > > > > > isn't really doing anything. Besides, you're calling > > > > > > > > > > dwc2_is_controller_alive() from within the IRQ handler, so IRQs are > > > > > > > > > > already disabled. > > > > > > > > > > > > > > > > > > Spinlocks sometimes do more than you think. For instance, here the > > > > > > > > > lock prevents the register access from happening while some other CPU > > > > > > > > > is holding the lock. If a silicon quirk causes the register access to > > > > > > > > > interfere with other activities, this could be important. > > > > > > > > > > > > > > > > readl() (which is used by dwc2_is_controller_alive()) adds a memory > > > > > > > > barrier to the register accesses, that should force all register > > > > > > > > accesses the be correctly ordered. > > > > > > > > > > > > > > Memory barriers will order accesses that are all made on the same CPU > > > > > > > with respect to each other. They do not order these accesses against > > > > > > > accesses made from another CPU -- that's why we have spinlocks. 
:-) > > > > > > > > > > > > a fair point :-) The register is still read-only, so that shouldn't > > > > > > matter either :-) > > > > > > > > > > > > > > I fail to see how a silicon quirk > > > > > > > > could cause this and if, indeed, it does, I'd be more comfortable with a > > > > > > > > proper STARS tickect number from synopsys :-s > > > > > > > > > > > > > > Maybe accessing this register somehow resets something else. I don't > > > > > > > know. It seems unlikely, but at least it explains how adding a > > > > > > > spinlock could fix the problem. > > > > > > > > > > > > I would really need Paul (or someone at Synopsys) to confirm this > > > > > > somehow. Maybe it has something to do with how the register is > > > > > > implemented, dunno. > > > > > > > > > > > > Paul, do you have any idea what could cause this ? Could the HW into > > > > > > some weird state if we read GSNPSID at random locations or when data is > > > > > > being transferred, or anything like that ? > > > > > > > > > > Only thing I can think of is that there is some silicon bug in Robert's > > > > > platform. But I am not aware of any STARs that mention accesses to the > > > > > GSNPSID register as being problematic. > > > > > > > > > > Funny thing is, this code has been basically the same since at least > > > > > November 2013. So I think some other recent change must have modified > > > > > the timing of the register accesses, or something like that. But that's > > > > > just handwaving, really. > > > > > > > > Alright, I'll apply this patch but for 3.20 with a stable tag as I have > > > > already sent my last pull request to Greg. Unless someone has a really > > > > big complaint about doing things as such. > > > > > > It should go to 3.19-rc shouldn't it? It's a fix, and Robert's platform > > > is broken without it, IIUC. > > > > It can also be categorized as "has-never-worked-before" before the code > > has been like this forever. 
Since we don't really have a git bisect > > result pointing to a commit that went in v3.19 merge window, I'm not > > sure how I can convince myself that this absolutely needs to be in > > v3.19. > > > > At a minimum, I need a proper bisection with a proper commit being > > blamed (even if it's a commit from months ago). From my point of view, > > debugging of this "regression" has not been finalized and we're just > > "assuming" it's caused by GSNPSID because moving that inside the > > spin_lock seems to fix the problem. > > On further investigation, I was wrong about "this code has been > basically the same since at least November 2013". Prior to commit > db8178c33db "usb: dwc2: Update common interrupt handler to call gadget > interrupt handler" from November 2014, the gadget interrupt handler > did not read from the GSNPSID register. Right, but the common IRQ handler always did. So if Robert's SoC has only ever been used as a peripheral, then I agree with you that the behavior did, in fact, change. > So likely the bug in Robert's hardware has been there all along, and > that commit just caused it to manifest itself. Robert, out of curiosity, which SoC are you using? Is it UP or SMP? I guess we need a mention in the commit log that at least SoC XYZ is known to break unless the register access is done with locks held. -- balbi
* Re: [PATCH v2] usb: dwc2: call dwc2_is_controller_alive() under spinlock 2015-01-15 6:24 ` Felipe Balbi @ 2015-01-15 10:23 ` Robert Baldyga 0 siblings, 0 replies; 15+ messages in thread From: Robert Baldyga @ 2015-01-15 10:23 UTC (permalink / raw) To: balbi, Paul Zimmerman Cc: Alan Stern, gregkh@linuxfoundation.org, linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org, dinguyen@opensource.altera.com, yousaf.kaukab@intel.com, m.szyprowski@samsung.com Hi, On 01/15/2015 07:24 AM, Felipe Balbi wrote: >>>>>>>>>>> This is really, really odd. Register accesses are atomic, so the lock >>>>>>>>>>> isn't really doing anything. Besides, you're calling >>>>>>>>>>> dwc2_is_controller_alive() from within the IRQ handler, so IRQs are >>>>>>>>>>> already disabled. >>>>>>>>>> >>>>>>>>>> Spinlocks sometimes do more than you think. For instance, here the >>>>>>>>>> lock prevents the register access from happening while some other CPU >>>>>>>>>> is holding the lock. If a silicon quirk causes the register access to >>>>>>>>>> interfere with other activities, this could be important. >>>>>>>>> >>>>>>>>> readl() (which is used by dwc2_is_controller_alive()) adds a memory >>>>>>>>> barrier to the register accesses, that should force all register >>>>>>>>> accesses the be correctly ordered. >>>>>>>> >>>>>>>> Memory barriers will order accesses that are all made on the same CPU >>>>>>>> with respect to each other. They do not order these accesses against >>>>>>>> accesses made from another CPU -- that's why we have spinlocks. :-) >>>>>>> >>>>>>> a fair point :-) The register is still read-only, so that shouldn't >>>>>>> matter either :-) >>>>>>> >>>>>>>>> I fail to see how a silicon quirk >>>>>>>>> could cause this and if, indeed, it does, I'd be more comfortable with a >>>>>>>>> proper STARS tickect number from synopsys :-s >>>>>>>> >>>>>>>> Maybe accessing this register somehow resets something else. I don't >>>>>>>> know. 
It seems unlikely, but at least it explains how adding a >>>>>>>> spinlock could fix the problem. >>>>>>> >>>>>>> I would really need Paul (or someone at Synopsys) to confirm this >>>>>>> somehow. Maybe it has something to do with how the register is >>>>>>> implemented, dunno. >>>>>>> >>>>>>> Paul, do you have any idea what could cause this ? Could the HW into >>>>>>> some weird state if we read GSNPSID at random locations or when data is >>>>>>> being transferred, or anything like that ? >>>>>> >>>>>> Only thing I can think of is that there is some silicon bug in Robert's >>>>>> platform. But I am not aware of any STARs that mention accesses to the >>>>>> GSNPSID register as being problematic. >>>>>> >>>>>> Funny thing is, this code has been basically the same since at least >>>>>> November 2013. So I think some other recent change must have modified >>>>>> the timing of the register accesses, or something like that. But that's >>>>>> just handwaving, really. >>>>> >>>>> Alright, I'll apply this patch but for 3.20 with a stable tag as I have >>>>> already sent my last pull request to Greg. Unless someone has a really >>>>> big complaint about doing things as such. >>>> >>>> It should go to 3.19-rc shouldn't it? It's a fix, and Robert's platform >>>> is broken without it, IIUC. >>> >>> It can also be categorized as "has-never-worked-before" before the code >>> has been like this forever. Since we don't really have a git bisect >>> result pointing to a commit that went in v3.19 merge window, I'm not >>> sure how I can convince myself that this absolutely needs to be in >>> v3.19. >>> >>> At a minimum, I need a proper bisection with a proper commit being >>> blamed (even if it's a commit from months ago). From my point of view, >>> debugging of this "regression" has not been finalized and we're just >>> "assuming" it's caused by GSNPSID because moving that inside the >>> spin_lock seems to fix the problem. 
>> >> On further investigation, I was wrong about "this code has been >> basically the same since at least November 2013". Prior to commit >> db8178c33db "usb: dwc2: Update common interrupt handler to call gadget >> interrupt handler" from November 2014, the gadget interrupt handler >> did not read from the GSNPSID register. > > right, but the common IRQ always did. So unless Robert's SoC has always > been used only for peripheral, then I agree with you that behavior did, > in fact, change. As far as I know, DWC2 on this platform has always been used as a peripheral. Exynos SoCs have EHCI USB controllers, so in 99% of cases there is simply no need to use DWC2 as a host. > >> So likely the bug in Robert's hardware has been there all along, and >> that commit just caused it to manifest itself. > > Robert, out of curiosity, which SoC are you using ? Is it UP or SMP ? > > I guess we need a mention on commit log that at least SoC XYZ is known > to break unless the register access is done with locks held. > I'm using an Exynos4412 (Odroid U3). The revision number of my DWC2 is 2.81a. I will update the commit message and send patch v3. Thanks, Robert Baldyga
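For readers following the thread, the ordering change being debated can be modeled in plain userspace C roughly as below. This is only a sketch under stated assumptions: every `model_*` name is a hypothetical stand-in rather than a real dwc2 type or function, the "lock" is a flag that merely records whether it is held, and the comparison against 0xffffffff is an assumption about how the liveness check interprets GSNPSID. The real handler also masks GINTSTS before deciding on IRQ_HANDLED, which is left out here. The model counts register reads made without the lock, which is the only thing the patch changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are the real dwc2 definitions. */
struct model_lock {
	bool held;
};

struct model_hsotg {
	struct model_lock lock;
	uint32_t gsnpsid;            /* value the fake ID register returns */
	uint32_t gintsts;            /* value the fake interrupt status returns */
	unsigned int unlocked_reads; /* register reads made without the lock */
};

enum model_irqreturn { MODEL_IRQ_NONE = 0, MODEL_IRQ_HANDLED = 1 };

static void model_lock_acquire(struct model_lock *l)
{
	assert(!l->held); /* no recursive locking in this model */
	l->held = true;
}

static void model_lock_release(struct model_lock *l)
{
	assert(l->held); /* catches an unlock without a matching lock */
	l->held = false;
}

/* Every fake register read records whether the lock was held. */
static uint32_t model_read_reg(struct model_hsotg *h, uint32_t value)
{
	if (!h->lock.held)
		h->unlocked_reads++;
	return value;
}

/* Assumption: an all-ones ID register means the controller is dead. */
static bool model_is_controller_alive(struct model_hsotg *h)
{
	return model_read_reg(h, h->gsnpsid) != 0xffffffffu;
}

/* Ordering before the patch: liveness check runs outside the lock. */
enum model_irqreturn model_handle_common_intr_old(struct model_hsotg *h)
{
	enum model_irqreturn retval = MODEL_IRQ_NONE;

	if (!model_is_controller_alive(h))
		goto out;

	model_lock_acquire(&h->lock);
	if (model_read_reg(h, h->gintsts) != 0)
		retval = MODEL_IRQ_HANDLED;
	model_lock_release(&h->lock);
out:
	return retval;
}

/* Ordering after the patch: lock first, single unlock on the exit path. */
enum model_irqreturn model_handle_common_intr(struct model_hsotg *h)
{
	enum model_irqreturn retval = MODEL_IRQ_NONE;

	model_lock_acquire(&h->lock);

	if (!model_is_controller_alive(h))
		goto out;

	if (model_read_reg(h, h->gintsts) != 0)
		retval = MODEL_IRQ_HANDLED;
out:
	model_lock_release(&h->lock);
	return retval;
}
```

Moving the `out:` label above the unlock matters as much as moving the lock: once the lock is taken before the liveness check, the early-exit path must release it too, otherwise a dead controller would leave the lock held. The model says nothing about why the hardware misbehaves, which the thread itself leaves open.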