linux-omap.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHv2] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
@ 2017-01-13 18:01 Tony Lindgren
       [not found] ` <20170113180132.9188-1-tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Tony Lindgren @ 2017-01-13 18:01 UTC (permalink / raw)
  To: Dan Williams, Vinod Koul
  Cc: Bin Liu, Daniel Mack, Felipe Balbi, George Cherian, Johan Hovold,
	Peter Ujfalusi, Sekhar Nori, Sebastian Andrzej Siewior,
	dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko,
	Grygorii Strashko, Kevin Hilman, Patrick Titiano, Sergei Shtylyov

Commit fdea2d09b997 ("dmaengine: cppi41: Add basic PM runtime support")
together with recent MUSB changes allowed USB and DMA on BeagleBone to idle
when no cable is connected. But looks like few corner case issues still
remain.

Looks like just by re-plugging USB cable about ten or so times on BeagleBone
when configured in USB peripheral mode we can get warnings and eventually
trigger an oops in cppi41 DMA:

WARNING: CPU: 0 PID: 14 at drivers/dma/cppi41.c:1154 cppi41_runtime_suspend+
x28/0x38 [cppi41]
...

WARNING: CPU: 0 PID: 14 at drivers/dma/cppi41.c:452
push_desc_queue+0x94/0x9c [cppi41]
...

Unable to handle kernel NULL pointer dereference at virtual
address 00000104
pgd = c0004000
[00000104] *pgd=00000000
Internal error: Oops: 805 [#1] SMP ARM
...
[<bf0d92cc>] (cppi41_runtime_resume [cppi41]) from [<c0589838>]
(__rpm_callback+0xc0/0x214)
[<c0589838>] (__rpm_callback) from [<c05899ac>] (rpm_callback+0x20/0x80)
[<c05899ac>] (rpm_callback) from [<c0589460>] (rpm_resume+0x504/0x78c)
[<c0589460>] (rpm_resume) from [<c058a1a0>] (pm_runtime_work+0x60/0xa8)
[<c058a1a0>] (pm_runtime_work) from [<c0156120>] (process_one_work+0x2b4/0x808)

This is because of a race with runtime PM and cppi41_dma_issue_pending()
as reported by Alexandre Bailon <abailon-rdvid1DuHRBWk0Htik3J/w@public.gmane.org> in earlier
set of patches. Based on mailing list discussions we however came to the
conclusion that a different fix from Alexandre's fix is needed in order
to guarantee that DMA is really active when we try to use it.

To fix the issue, we need to add a driver specific flag as we otherwise
can have -EINPROGRESS state set by runtime PM and can't rely on
pm_runtime_active() to tell us when we can use the DMA.

And we need to make sure the DMA transfers get triggered in the queued
order. So let's always queue the transfers, then flush the queue
from both cppi41_dma_issue_pending() and cppi41_runtime_resume()
as suggested by Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> in an
earlier example patch.

For reference, this is also documented in Documentation/power/runtime_pm.txt
in the example at the end of the file as pointed out by Grygorii Strashko
<grygorii.strashko-l0cyMroinI0@public.gmane.org>.

Based on earlier patches from Alexandre Bailon <abailon-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
and Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> modified based on
testing and what was discussed on the mailing lists.

Fixes: fdea2d09b997 ("dmaengine: cppi41: Add basic PM runtime support")
Cc: Andy Shevchenko <andy.shevchenko-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: Bin Liu <b-liu-l0cyMroinI0@public.gmane.org>
Cc: Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org>
Cc: Kevin Hilman <khilman-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
Cc: Patrick Titiano <ptitiano-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
Cc: Sergei Shtylyov <sergei.shtylyov-M4DtvfQ/ZS1MRgGoP+s0PdBPR1lH4CV8@public.gmane.org>
Reported-by: Alexandre Bailon <abailon-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
Signed-off-by: Tony Lindgren <tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
---

Alexandre & Grygorii, can you guys please review and test?

Then if OK, I suggest you both reply with Tested-by/Reviewed-by plus
Signed-off-by as this patch contains contributions from both of you.

---
 drivers/dma/cppi41.c | 45 +++++++++++++++++++++++++++++----------------
 1 file changed, 29 insertions(+), 16 deletions(-)

diff --git a/drivers/dma/cppi41.c b/drivers/dma/cppi41.c
--- a/drivers/dma/cppi41.c
+++ b/drivers/dma/cppi41.c
@@ -153,6 +153,8 @@ struct cppi41_dd {
 
 	/* context for suspend/resume */
 	unsigned int dma_tdfdq;
+
+	bool is_suspended;
 };
 
 #define FIST_COMPLETION_QUEUE	93
@@ -320,10 +322,13 @@ static irqreturn_t cppi41_irq(int irq, void *data)
 			int error;
 
 			error = pm_runtime_get(cdd->ddev.dev);
-			if (error < 0)
+			if ((error != -EINPROGRESS) && (error < 0))
 				dev_err(cdd->ddev.dev, "%s pm runtime get: %i\n",
 					__func__, error);
 
+			/* This warning should never trigger */
+			WARN_ON(cdd->is_suspended);
+
 			q_num = __fls(val);
 			val &= ~(1 << q_num);
 			q_num += 32 * i;
@@ -457,20 +462,26 @@ static void push_desc_queue(struct cppi41_channel *c)
 	cppi_writel(reg, cdd->qmgr_mem + QMGR_QUEUE_D(c->q_num));
 }
 
-static void pending_desc(struct cppi41_channel *c)
+/*
+ * Caller must hold cdd->lock to prevent push_desc_queue()
+ * getting called out of order. We have both cppi41_dma_issue_pending()
+ * and cppi41_runtime_resume() call this function.
+ */
+static void cppi41_run_queue(struct cppi41_dd *cdd)
 {
-	struct cppi41_dd *cdd = c->cdd;
-	unsigned long flags;
+	struct cppi41_channel *c, *_c;
 
-	spin_lock_irqsave(&cdd->lock, flags);
-	list_add_tail(&c->node, &cdd->pending);
-	spin_unlock_irqrestore(&cdd->lock, flags);
+	list_for_each_entry_safe(c, _c, &cdd->pending, node) {
+		push_desc_queue(c);
+		list_del(&c->node);
+	}
 }
 
 static void cppi41_dma_issue_pending(struct dma_chan *chan)
 {
 	struct cppi41_channel *c = to_cpp41_chan(chan);
 	struct cppi41_dd *cdd = c->cdd;
+	unsigned long flags;
 	int error;
 
 	error = pm_runtime_get(cdd->ddev.dev);
@@ -482,10 +493,11 @@ static void cppi41_dma_issue_pending(struct dma_chan *chan)
 		return;
 	}
 
-	if (likely(pm_runtime_active(cdd->ddev.dev)))
-		push_desc_queue(c);
-	else
-		pending_desc(c);
+	spin_lock_irqsave(&cdd->lock, flags);
+	list_add_tail(&c->node, &cdd->pending);
+	if (!cdd->is_suspended)
+		cppi41_run_queue(cdd);
+	spin_unlock_irqrestore(&cdd->lock, flags);
 
 	pm_runtime_mark_last_busy(cdd->ddev.dev);
 	pm_runtime_put_autosuspend(cdd->ddev.dev);
@@ -1150,8 +1162,12 @@ static int __maybe_unused cppi41_resume(struct device *dev)
 static int __maybe_unused cppi41_runtime_suspend(struct device *dev)
 {
 	struct cppi41_dd *cdd = dev_get_drvdata(dev);
+	unsigned long flags;
 
+	spin_lock_irqsave(&cdd->lock, flags);
+	cdd->is_suspended = true;
 	WARN_ON(!list_empty(&cdd->pending));
+	spin_unlock_irqrestore(&cdd->lock, flags);
 
 	return 0;
 }
@@ -1159,14 +1175,11 @@ static int __maybe_unused cppi41_runtime_suspend(struct device *dev)
 static int __maybe_unused cppi41_runtime_resume(struct device *dev)
 {
 	struct cppi41_dd *cdd = dev_get_drvdata(dev);
-	struct cppi41_channel *c, *_c;
 	unsigned long flags;
 
 	spin_lock_irqsave(&cdd->lock, flags);
-	list_for_each_entry_safe(c, _c, &cdd->pending, node) {
-		push_desc_queue(c);
-		list_del(&c->node);
-	}
+	cdd->is_suspended = false;
+	cppi41_run_queue(cdd);
 	spin_unlock_irqrestore(&cdd->lock, flags);
 
 	return 0;
-- 
2.11.0
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHv2] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
       [not found] ` <20170113180132.9188-1-tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
@ 2017-01-13 18:36   ` Grygorii Strashko
       [not found]     ` <d6fddff7-3d00-1de1-ebbf-ee2a949a6284-l0cyMroinI0@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Grygorii Strashko @ 2017-01-13 18:36 UTC (permalink / raw)
  To: Tony Lindgren, Dan Williams, Vinod Koul
  Cc: Bin Liu, Daniel Mack, Felipe Balbi, George Cherian, Johan Hovold,
	Peter Ujfalusi, Sekhar Nori, Sebastian Andrzej Siewior,
	dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov



On 01/13/2017 12:01 PM, Tony Lindgren wrote:
> Commit fdea2d09b997 ("dmaengine: cppi41: Add basic PM runtime support")
> together with recent MUSB changes allowed USB and DMA on BeagleBone to idle
> when no cable is connected. But looks like few corner case issues still
> remain.
> 
> Looks like just by re-plugging USB cable about ten or so times on BeagleBone
> when configured in USB peripheral mode we can get warnings and eventually
> trigger an oops in cppi41 DMA:
> 
> WARNING: CPU: 0 PID: 14 at drivers/dma/cppi41.c:1154 cppi41_runtime_suspend+
> x28/0x38 [cppi41]
> ...
> 
> WARNING: CPU: 0 PID: 14 at drivers/dma/cppi41.c:452
> push_desc_queue+0x94/0x9c [cppi41]
> ...
> 
> Unable to handle kernel NULL pointer dereference at virtual
> address 00000104
> pgd = c0004000
> [00000104] *pgd=00000000
> Internal error: Oops: 805 [#1] SMP ARM
> ...
> [<bf0d92cc>] (cppi41_runtime_resume [cppi41]) from [<c0589838>]
> (__rpm_callback+0xc0/0x214)
> [<c0589838>] (__rpm_callback) from [<c05899ac>] (rpm_callback+0x20/0x80)
> [<c05899ac>] (rpm_callback) from [<c0589460>] (rpm_resume+0x504/0x78c)
> [<c0589460>] (rpm_resume) from [<c058a1a0>] (pm_runtime_work+0x60/0xa8)
> [<c058a1a0>] (pm_runtime_work) from [<c0156120>] (process_one_work+0x2b4/0x808)
> 
> This is because of a race with runtime PM and cppi41_dma_issue_pending()
> as reported by Alexandre Bailon <abailon-rdvid1DuHRBWk0Htik3J/w@public.gmane.org> in earlier
> set of patches. Based on mailing list discussions we however came to the
> conclusion that a different fix from Alexandre's fix is needed in order
> to guarantee that DMA is really active when we try to use it.
> 
> To fix the issue, we need to add a driver specific flag as we otherwise
> can have -EINPROGRESS state set by runtime PM and can't rely on
> pm_runtime_active() to tell us when we can use the DMA.
> 
> And we need to make sure the DMA transfers get triggered in the queued
> order. So let's always queue the transfers, then flush the queue
> from both cppi41_dma_issue_pending() and cppi41_runtime_resume()
> as suggested by Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> in an
> earlier example patch.
> 
> For reference, this is also documented in Documentation/power/runtime_pm.txt
> in the example at the end of the file as pointed out by Grygorii Strashko
> <grygorii.strashko-l0cyMroinI0@public.gmane.org>.
> 
> Based on earlier patches from Alexandre Bailon <abailon-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
> and Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> modified based on
> testing and what was discussed on the mailing lists.
> 
> Fixes: fdea2d09b997 ("dmaengine: cppi41: Add basic PM runtime support")
> Cc: Andy Shevchenko <andy.shevchenko-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Cc: Bin Liu <b-liu-l0cyMroinI0@public.gmane.org>
> Cc: Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org>
> Cc: Kevin Hilman <khilman-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
> Cc: Patrick Titiano <ptitiano-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
> Cc: Sergei Shtylyov <sergei.shtylyov-M4DtvfQ/ZS1MRgGoP+s0PdBPR1lH4CV8@public.gmane.org>
> Reported-by: Alexandre Bailon <abailon-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Tony Lindgren <tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
> ---
> 
> Alexandre & Grygorii, can you guys please review and test?

I've just tested it on BBB. And it triggers warning from IRQ

root@am335x-evm:~# [  242.280546] ------------[ cut here ]------------
[  242.285532] WARNING: CPU: 0 PID: 0 at drivers/dma/cppi41.c:330 cppi41_irq+0x228/0x260 [cppi41]
[  242.335164] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W       4.10.0-rc3-00560-g51afc29 #133
[  242.344601] Hardware name: Generic AM33XX (Flattened Device Tree)
[  242.351084] [<c011013c>] (unwind_backtrace) from [<c010c300>] (show_stack+0x10/0x14)
[  242.359269] [<c010c300>] (show_stack) from [<c04a0278>] (dump_stack+0xac/0xe0)
[  242.366919] [<c04a0278>] (dump_stack) from [<c013700c>] (__warn+0xd8/0x104)
[  242.374282] [<c013700c>] (__warn) from [<c01370e4>] (warn_slowpath_null+0x20/0x28)
[  242.382299] [<c01370e4>] (warn_slowpath_null) from [<bf1d0384>] (cppi41_irq+0x228/0x260 [cppi41])
[  242.391714] [<bf1d0384>] (cppi41_irq [cppi41]) from [<c01a6888>] (__handle_irq_event_percpu+0x48/0x3b4)
[  242.401727] [<c01a6888>] (__handle_irq_event_percpu) from [<c01a6c10>] (handle_irq_event_percpu+0x1c/0x58)
[  242.412017] [<c01a6c10>] (handle_irq_event_percpu) from [<c01a6c84>] (handle_irq_event+0x38/0x5c)
[  242.421477] [<c01a6c84>] (handle_irq_event) from [<c01aa004>] (handle_level_irq+0xc0/0x154)
[  242.430378] [<c01aa004>] (handle_level_irq) from [<c01a5b9c>] (generic_handle_irq+0x20/0x34)
[  242.439378] [<c01a5b9c>] (generic_handle_irq) from [<c01a60f0>] (__handle_domain_irq+0x64/0xe0)
[  242.448660] [<c01a60f0>] (__handle_domain_irq) from [<c07f92f0>] (__irq_svc+0x70/0x98)
[  242.457117] [<c07f92f0>] (__irq_svc) from [<c010836c>] (arch_cpu_idle+0x20/0x3c)
[  242.464935] [<c010836c>] (arch_cpu_idle) from [<c018bf7c>] (do_idle+0x164/0x218)
[  242.472748] [<c018bf7c>] (do_idle) from [<c018c39c>] (cpu_startup_entry+0x18/0x1c)
[  242.480758] [<c018c39c>] (cpu_startup_entry) from [<c0b00c30>] (start_kernel+0x344/0x3bc)
[  242.489376] ---[ end trace 12c5b6488c1e8c75 ]---
[  242.496525] usb 1-1.3: USB disconnect, device number 4

test sequence:
- plug usb hub + cd card reader
- plug usb stick in hub
  37  mount /dev/sdb /media
   38  ls /media/
   39  rm /media/data 
   40  time dd if=/dev/zero of=/media/data bs=128k count=100
   41  umount /media
- unplug USB stick - > Warning

^ The same sequence without Hub -- > no warnings

When I plug-unplug USB stick from Hub I can reproduce it pretty often :(
but it can be different issue :(

Patch by itself looks good


root@am335x-evm:~# [  590.948093] usb 1-1: new high-speed USB device number 6 using musb-hdrc
[  591.119731] usb 1-1: New USB device found, idVendor=058f, idProduct=6254
[  591.126841] usb 1-1: New USB device strings: Mfr=0, Product=1, SerialNumber=0
[  591.134565] usb 1-1: Product: USB2.0Hub
[  591.146776] hub 1-1:1.0: USB hub found
[  591.151375] hub 1-1:1.0: 4 ports detected
[  591.487947] usb 1-1.4: new high-speed USB device number 7 using musb-hdrc
[  591.631141] usb 1-1.4: New USB device found, idVendor=058f, idProduct=6366
[  591.638474] usb 1-1.4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[  591.646140] usb 1-1.4: Product: Mass Storage Device
[  591.651336] usb 1-1.4: Manufacturer: Generic
[  591.655817] usb 1-1.4: SerialNumber: 058F0O1111B1
[  591.665677] usb-storage 1-1.4:1.0: USB Mass Storage device detected
[  591.679987] scsi host0: usb-storage 1-1.4:1.0
[  592.731385] scsi 0:0:0:0: Direct-Access     Multi    Flash Reader     1.00 PQ: 0 ANSI: 0
[  592.769590] sd 0:0:0:0: [sda] Attached SCSI removable disk
[  593.948402] ------------[ cut here ]------------
[  593.953396] WARNING: CPU: 0 PID: 0 at drivers/dma/cppi41.c:330 cppi41_irq+0x228/0x260 [cppi41]
[  593.962470] Modules linked in: usb_storage evdev musb_dsps musb_hdrc udc_core cppi41 phy_am335x usbcore phy_generic usb_common phy_am335x_control snd_soc_simple_card snd_soc_simple_card_utils snd_soc_davinci_mcasp tps65218_pn
[  594.003038] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W       4.10.0-rc3-00560-g51afc29 #133
[  594.012473] Hardware name: Generic AM33XX (Flattened Device Tree)
[  594.018958] [<c011013c>] (unwind_backtrace) from [<c010c300>] (show_stack+0x10/0x14)
[  594.027143] [<c010c300>] (show_stack) from [<c04a0278>] (dump_stack+0xac/0xe0)
[  594.034793] [<c04a0278>] (dump_stack) from [<c013700c>] (__warn+0xd8/0x104)
[  594.042158] [<c013700c>] (__warn) from [<c01370e4>] (warn_slowpath_null+0x20/0x28)
[  594.050174] [<c01370e4>] (warn_slowpath_null) from [<bf1d0384>] (cppi41_irq+0x228/0x260 [cppi41])
[  594.059590] [<bf1d0384>] (cppi41_irq [cppi41]) from [<c01a6888>] (__handle_irq_event_percpu+0x48/0x3b4)
[  594.069602] [<c01a6888>] (__handle_irq_event_percpu) from [<c01a6c10>] (handle_irq_event_percpu+0x1c/0x58)
[  594.079892] [<c01a6c10>] (handle_irq_event_percpu) from [<c01a6c84>] (handle_irq_event+0x38/0x5c)
[  594.089344] [<c01a6c84>] (handle_irq_event) from [<c01aa004>] (handle_level_irq+0xc0/0x154)
[  594.098245] [<c01aa004>] (handle_level_irq) from [<c01a5b9c>] (generic_handle_irq+0x20/0x34)
[  594.107243] [<c01a5b9c>] (generic_handle_irq) from [<c01a60f0>] (__handle_domain_irq+0x64/0xe0)
[  594.116525] [<c01a60f0>] (__handle_domain_irq) from [<c07f92f0>] (__irq_svc+0x70/0x98)
[  594.124980] [<c07f92f0>] (__irq_svc) from [<c010836c>] (arch_cpu_idle+0x20/0x3c)
[  594.132798] [<c010836c>] (arch_cpu_idle) from [<c018bf7c>] (do_idle+0x164/0x218)
[  594.140610] [<c018bf7c>] (do_idle) from [<c018c39c>] (cpu_startup_entry+0x18/0x1c)
[  594.148619] [<c018c39c>] (cpu_startup_entry) from [<c0b00c30>] (start_kernel+0x344/0x3bc)
[  594.157238] ---[ end trace 12c5b6488c1e8c76 ]---
[  594.467931] usb 1-1.3: new high-speed USB device number 8 using musb-hdrc
[  594.672753] usb 1-1.3: New USB device found, idVendor=058f, idProduct=6387
[  594.680179] usb 1-1.3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[  594.688032] usb 1-1.3: Product: Mass Storage Device
[  594.693192] usb 1-1.3: Manufacturer: Generic
[  594.697708] usb 1-1.3: SerialNumber: OU2ACT49
[  594.710778] usb-storage 1-1.3:1.0: USB Mass Storage device detected
[  594.730376] scsi host1: usb-storage 1-1.3:1.0
[  595.771756] scsi 1:0:0:0: Direct-Access     JetFlash TS128MJF2A       8.07 PQ: 0 ANSI: 2
[  595.799177] sd 1:0:0:0: [sdb] 255998 512-byte logical blocks: (131 MB/125 MiB)
[  595.818767] sd 1:0:0:0: [sdb] Write Protect is off
[  595.838684] sd 1:0:0:0: [sdb] No Caching mode page found
[  595.844350] sd 1:0:0:0: [sdb] Assuming drive cache: write through
[  595.905633]  sdb:
[  595.921145] sd 1:0:0:0: [sdb] Attached SCSI removable disk
[  596.556411] FAT-fs (sdb): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
[  615.452300] ------------[ cut here ]------------
[  615.457291] WARNING: CPU: 0 PID: 0 at drivers/dma/cppi41.c:330 cppi41_irq+0x228/0x260 [cppi41]
[  615.466367] Modules linked in: usb_storage evdev musb_dsps musb_hdrc udc_core cppi41 phy_am335x usbcore phy_generic usb_common phy_am335x_control snd_soc_simple_card snd_soc_simple_card_utils snd_soc_davinci_mcasp tps65218_pn
[  615.506931] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W       4.10.0-rc3-00560-g51afc29 #133
[  615.516366] Hardware name: Generic AM33XX (Flattened Device Tree)
[  615.522850] [<c011013c>] (unwind_backtrace) from [<c010c300>] (show_stack+0x10/0x14)
[  615.531034] [<c010c300>] (show_stack) from [<c04a0278>] (dump_stack+0xac/0xe0)
[  615.538685] [<c04a0278>] (dump_stack) from [<c013700c>] (__warn+0xd8/0x104)
[  615.546049] [<c013700c>] (__warn) from [<c01370e4>] (warn_slowpath_null+0x20/0x28)
[  615.554066] [<c01370e4>] (warn_slowpath_null) from [<bf1d0384>] (cppi41_irq+0x228/0x260 [cppi41])
[  615.563481] [<bf1d0384>] (cppi41_irq [cppi41]) from [<c01a6888>] (__handle_irq_event_percpu+0x48/0x3b4)
[  615.573500] [<c01a6888>] (__handle_irq_event_percpu) from [<c01a6c10>] (handle_irq_event_percpu+0x1c/0x58)
[  615.583789] [<c01a6c10>] (handle_irq_event_percpu) from [<c01a6c84>] (handle_irq_event+0x38/0x5c)
[  615.593251] [<c01a6c84>] (handle_irq_event) from [<c01aa004>] (handle_level_irq+0xc0/0x154)
[  615.602151] [<c01aa004>] (handle_level_irq) from [<c01a5b9c>] (generic_handle_irq+0x20/0x34)
[  615.611144] [<c01a5b9c>] (generic_handle_irq) from [<c01a60f0>] (__handle_domain_irq+0x64/0xe0)
[  615.620432] [<c01a60f0>] (__handle_domain_irq) from [<c07f92f0>] (__irq_svc+0x70/0x98)
[  615.628893] [<c07f92f0>] (__irq_svc) from [<c010836c>] (arch_cpu_idle+0x20/0x3c)
[  615.636786] [<c010836c>] (arch_cpu_idle) from [<c018bf7c>] (do_idle+0x164/0x218)
[  615.644677] [<c018bf7c>] (do_idle) from [<c018c39c>] (cpu_startup_entry+0x18/0x1c)
[  615.652766] [<c018c39c>] (cpu_startup_entry) from [<c0b00c30>] (start_kernel+0x344/0x3bc)
[  615.661470] ---[ end trace 12c5b6488c1e8c77 ]---
[  615.676487] usb 1-1.3: USB disconnect, device number 8

root@am335x-evm:~# 
root@am335x-evm:~# [  626.008636] usb 1-1: USB disconnect, device number 6
[  626.013945] usb 1-1.4: USB disconnect, device number 7
[  630.538101] usb 1-1: new high-speed USB device number 9 using musb-hdrc
[  630.709737] usb 1-1: New USB device found, idVendor=058f, idProduct=6254
[  630.716845] usb 1-1: New USB device strings: Mfr=0, Product=1, SerialNumber=0
[  630.724564] usb 1-1: Product: USB2.0Hub
[  630.736885] hub 1-1:1.0: USB hub found
[  630.750975] hub 1-1:1.0: 4 ports detected
[  631.077965] usb 1-1.4: new high-speed USB device number 10 using musb-hdrc
[  631.211138] usb 1-1.4: New USB device found, idVendor=058f, idProduct=6366
[  631.218452] usb 1-1.4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[  631.226116] usb 1-1.4: Product: Mass Storage Device
[  631.231311] usb 1-1.4: Manufacturer: Generic
[  631.235793] usb 1-1.4: SerialNumber: 058F0O1111B1
[  631.245578] usb-storage 1-1.4:1.0: USB Mass Storage device detected
[  631.259530] scsi host0: usb-storage 1-1.4:1.0
[  632.333419] scsi 0:0:0:0: Direct-Access     Multi    Flash Reader     1.00 PQ: 0 ANSI: 0
[  632.389945] sd 0:0:0:0: [sda] Attached SCSI removable disk
[  637.378367] ------------[ cut here ]------------
[  637.383358] WARNING: CPU: 0 PID: 0 at drivers/dma/cppi41.c:330 cppi41_irq+0x228/0x260 [cppi41]
[  637.392431] Modules linked in: usb_storage evdev musb_dsps musb_hdrc udc_core cppi41 phy_am335x usbcore phy_generic usb_common phy_am335x_control snd_soc_simple_card snd_soc_simple_card_utils snd_soc_davinci_mcasp tps65218_pn
[  637.432996] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W       4.10.0-rc3-00560-g51afc29 #133
[  637.442432] Hardware name: Generic AM33XX (Flattened Device Tree)
[  637.448914] [<c011013c>] (unwind_backtrace) from [<c010c300>] (show_stack+0x10/0x14)
[  637.457099] [<c010c300>] (show_stack) from [<c04a0278>] (dump_stack+0xac/0xe0)
[  637.464748] [<c04a0278>] (dump_stack) from [<c013700c>] (__warn+0xd8/0x104)
[  637.472112] [<c013700c>] (__warn) from [<c01370e4>] (warn_slowpath_null+0x20/0x28)
[  637.480130] [<c01370e4>] (warn_slowpath_null) from [<bf1d0384>] (cppi41_irq+0x228/0x260 [cppi41])
[  637.489545] [<bf1d0384>] (cppi41_irq [cppi41]) from [<c01a6888>] (__handle_irq_event_percpu+0x48/0x3b4)
[  637.499565] [<c01a6888>] (__handle_irq_event_percpu) from [<c01a6c10>] (handle_irq_event_percpu+0x1c/0x58)
[  637.509852] [<c01a6c10>] (handle_irq_event_percpu) from [<c01a6c84>] (handle_irq_event+0x38/0x5c)
[  637.519315] [<c01a6c84>] (handle_irq_event) from [<c01aa004>] (handle_level_irq+0xc0/0x154)
[  637.528216] [<c01aa004>] (handle_level_irq) from [<c01a5b9c>] (generic_handle_irq+0x20/0x34)
[  637.537217] [<c01a5b9c>] (generic_handle_irq) from [<c01a60f0>] (__handle_domain_irq+0x64/0xe0)
[  637.546495] [<c01a60f0>] (__handle_domain_irq) from [<c07f92f0>] (__irq_svc+0x70/0x98)
[  637.554954] [<c07f92f0>] (__irq_svc) from [<c010836c>] (arch_cpu_idle+0x20/0x3c)
[  637.562772] [<c010836c>] (arch_cpu_idle) from [<c018bf7c>] (do_idle+0x164/0x218)
[  637.570585] [<c018bf7c>] (do_idle) from [<c018c39c>] (cpu_startup_entry+0x18/0x1c)
[  637.578594] [<c018c39c>] (cpu_startup_entry) from [<c0b00c30>] (start_kernel+0x344/0x3bc)
[  637.587211] ---[ end trace 12c5b6488c1e8c78 ]---
[  637.897902] usb 1-1.3: new high-speed USB device number 11 using musb-hdrc
[  638.032611] usb 1-1.3: New USB device found, idVendor=058f, idProduct=6387
[  638.040050] usb 1-1.3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[  638.047911] usb 1-1.3: Product: Mass Storage Device
[  638.053072] usb 1-1.3: Manufacturer: Generic
[  638.057587] usb 1-1.3: SerialNumber: OU2ACT49
[  638.070774] usb-storage 1-1.3:1.0: USB Mass Storage device detected
[  638.089869] scsi host1: usb-storage 1-1.3:1.0
[  639.133691] scsi 1:0:0:0: Direct-Access     JetFlash TS128MJF2A       8.07 PQ: 0 ANSI: 2
[  639.167142] sd 1:0:0:0: [sdb] 255998 512-byte logical blocks: (131 MB/125 MiB)
[  639.189067] sd 1:0:0:0: [sdb] Write Protect is off
[  639.200963] sd 1:0:0:0: [sdb] No Caching mode page found
[  639.206621] sd 1:0:0:0: [sdb] Assuming drive cache: write through
[  639.260360]  sdb:
[  639.295633] sd 1:0:0:0: [sdb] Attached SCSI removable disk
[  639.937425] FAT-fs (sdb): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
[  643.522357] ------------[ cut here ]------------
[  643.527345] WARNING: CPU: 0 PID: 0 at drivers/dma/cppi41.c:330 cppi41_irq+0x228/0x260 [cppi41]
[  643.536418] Modules linked in: usb_storage evdev musb_dsps musb_hdrc udc_core cppi41 phy_am335x usbcore phy_generic usb_common phy_am335x_control snd_soc_simple_card snd_soc_simple_card_utils snd_soc_davinci_mcasp tps65218_pn
[  643.576981] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W       4.10.0-rc3-00560-g51afc29 #133
[  643.586416] Hardware name: Generic AM33XX (Flattened Device Tree)
[  643.592898] [<c011013c>] (unwind_backtrace) from [<c010c300>] (show_stack+0x10/0x14)
[  643.601082] [<c010c300>] (show_stack) from [<c04a0278>] (dump_stack+0xac/0xe0)
[  643.608733] [<c04a0278>] (dump_stack) from [<c013700c>] (__warn+0xd8/0x104)
[  643.616095] [<c013700c>] (__warn) from [<c01370e4>] (warn_slowpath_null+0x20/0x28)
[  643.624111] [<c01370e4>] (warn_slowpath_null) from [<bf1d0384>] (cppi41_irq+0x228/0x260 [cppi41])
[  643.633526] [<bf1d0384>] (cppi41_irq [cppi41]) from [<c01a6888>] (__handle_irq_event_percpu+0x48/0x3b4)
[  643.643547] [<c01a6888>] (__handle_irq_event_percpu) from [<c01a6c10>] (handle_irq_event_percpu+0x1c/0x58)
[  643.653839] [<c01a6c10>] (handle_irq_event_percpu) from [<c01a6c84>] (handle_irq_event+0x38/0x5c)
[  643.663294] [<c01a6c84>] (handle_irq_event) from [<c01aa004>] (handle_level_irq+0xc0/0x154)
[  643.672200] [<c01aa004>] (handle_level_irq) from [<c01a5b9c>] (generic_handle_irq+0x20/0x34)
[  643.681196] [<c01a5b9c>] (generic_handle_irq) from [<c01a60f0>] (__handle_domain_irq+0x64/0xe0)
[  643.690479] [<c01a60f0>] (__handle_domain_irq) from [<c07f92f0>] (__irq_svc+0x70/0x98)
[  643.698937] [<c07f92f0>] (__irq_svc) from [<c010836c>] (arch_cpu_idle+0x20/0x3c)
[  643.706753] [<c010836c>] (arch_cpu_idle) from [<c018bf7c>] (do_idle+0x164/0x218)
[  643.714566] [<c018bf7c>] (do_idle) from [<c018c39c>] (cpu_startup_entry+0x18/0x1c)
[  643.722573] [<c018c39c>] (cpu_startup_entry) from [<c0b00c30>] (start_kernel+0x344/0x3bc)
[  643.731190] ---[ end trace 12c5b6488c1e8c79 ]---
[  643.745259] usb 1-1.3: USB disconnect, device number 11
[  647.703381] usb 1-1: USB disconnect, device number 9
[  647.708865] usb 1-1.4: USB disconnect, device number 10
[  654.308094] usb 1-1: new high-speed USB device number 12 using musb-hdrc
[  654.482563] usb 1-1: New USB device found, idVendor=058f, idProduct=6387
[  654.489807] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[  654.497345] usb 1-1: Product: Mass Storage Device
[  654.502452] usb 1-1: Manufacturer: Generic
[  654.506791] usb 1-1: SerialNumber: OU2ACT49
[  654.519849] usb-storage 1-1:1.0: USB Mass Storage device detected
[  654.539319] scsi host0: usb-storage 1-1:1.0
[  655.613494] scsi 0:0:0:0: Direct-Access     JetFlash TS128MJF2A       8.07 PQ: 0 ANSI: 2
[  655.646875] sd 0:0:0:0: [sda] 255998 512-byte logical blocks: (131 MB/125 MiB)
[  655.668688] sd 0:0:0:0: [sda] Write Protect is off
[  655.680249] sd 0:0:0:0: [sda] No Caching mode page found
[  655.685906] sd 0:0:0:0: [sda] Assuming drive cache: write through
[  655.754170]  sda:
[  655.774799] sd 0:0:0:0: [sda] Attached SCSI removable disk
[  656.382509] FAT-fs (sda): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
[  661.348430] usb 1-1: USB disconnect, device number 12
[  667.908145] usb 1-1: new high-speed USB device number 13 using musb-hdrc
[  668.082589] usb 1-1: New USB device found, idVendor=058f, idProduct=6387
[  668.089834] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[  668.097373] usb 1-1: Product: Mass Storage Device
[  668.102528] usb 1-1: Manufacturer: Generic
[  668.106868] usb 1-1: SerialNumber: OU2ACT49
[  668.119949] usb-storage 1-1:1.0: USB Mass Storage device detected
[  668.139462] scsi host0: usb-storage 1-1:1.0
[  669.211804] scsi 0:0:0:0: Direct-Access     JetFlash TS128MJF2A       8.07 PQ: 0 ANSI: 2
[  669.238458] sd 0:0:0:0: [sda] 255998 512-byte logical blocks: (131 MB/125 MiB)
[  669.258577] sd 0:0:0:0: [sda] Write Protect is off
[  669.270862] sd 0:0:0:0: [sda] No Caching mode page found
[  669.276468] sd 0:0:0:0: [sda] Assuming drive cache: write through
[  669.337468]  sda:
[  669.380588] sd 0:0:0:0: [sda] Attached SCSI removable disk
[  670.003597] FAT-fs (sda): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
[  674.253935] usb 1-1: USB disconnect, device number 13
[  678.888173] usb 1-1: new high-speed USB device number 14 using musb-hdrc
[  679.062609] usb 1-1: New USB device found, idVendor=058f, idProduct=6387
[  679.069866] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[  679.077403] usb 1-1: Product: Mass Storage Device
[  679.082516] usb 1-1: Manufacturer: Generic
[  679.086855] usb 1-1: SerialNumber: OU2ACT49
[  679.100234] usb-storage 1-1:1.0: USB Mass Storage device detected
[  679.119805] scsi host0: usb-storage 1-1:1.0
[  680.181609] scsi 0:0:0:0: Direct-Access     JetFlash TS128MJF2A       8.07 PQ: 0 ANSI: 2
[  680.216574] sd 0:0:0:0: [sda] 255998 512-byte logical blocks: (131 MB/125 MiB)
[  680.239868] sd 0:0:0:0: [sda] Write Protect is off
[  680.250587] sd 0:0:0:0: [sda] No Caching mode page found
[  680.256243] sd 0:0:0:0: [sda] Assuming drive cache: write through
[  680.313569]  sda:
[  680.334123] sd 0:0:0:0: [sda] Attached SCSI removable disk
[  680.949436] FAT-fs (sda): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
[  683.787309] usb 1-1: USB disconnect, device number 14


-- 
regards,
-grygorii
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHv2] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
       [not found]     ` <d6fddff7-3d00-1de1-ebbf-ee2a949a6284-l0cyMroinI0@public.gmane.org>
@ 2017-01-13 19:01       ` Tony Lindgren
       [not found]         ` <20170113190126.GE2630-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Tony Lindgren @ 2017-01-13 19:01 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Dan Williams, Vinod Koul, Bin Liu, Daniel Mack, Felipe Balbi,
	George Cherian, Johan Hovold, Peter Ujfalusi, Sekhar Nori,
	Sebastian Andrzej Siewior, dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov

* Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> [170113 10:37]:
> 
> 
> On 01/13/2017 12:01 PM, Tony Lindgren wrote:
> > Commit fdea2d09b997 ("dmaengine: cppi41: Add basic PM runtime support")
> > together with recent MUSB changes allowed USB and DMA on BeagleBone to idle
> > when no cable is connected. But looks like few corner case issues still
> > remain.
> > 
> > Looks like just by re-plugging USB cable about ten or so times on BeagleBone
> > when configured in USB peripheral mode we can get warnings and eventually
> > trigger an oops in cppi41 DMA:
> > 
> > WARNING: CPU: 0 PID: 14 at drivers/dma/cppi41.c:1154 cppi41_runtime_suspend+
> > x28/0x38 [cppi41]
> > ...
> > 
> > WARNING: CPU: 0 PID: 14 at drivers/dma/cppi41.c:452
> > push_desc_queue+0x94/0x9c [cppi41]
> > ...
> > 
> > Unable to handle kernel NULL pointer dereference at virtual
> > address 00000104
> > pgd = c0004000
> > [00000104] *pgd=00000000
> > Internal error: Oops: 805 [#1] SMP ARM
> > ...
> > [<bf0d92cc>] (cppi41_runtime_resume [cppi41]) from [<c0589838>]
> > (__rpm_callback+0xc0/0x214)
> > [<c0589838>] (__rpm_callback) from [<c05899ac>] (rpm_callback+0x20/0x80)
> > [<c05899ac>] (rpm_callback) from [<c0589460>] (rpm_resume+0x504/0x78c)
> > [<c0589460>] (rpm_resume) from [<c058a1a0>] (pm_runtime_work+0x60/0xa8)
> > [<c058a1a0>] (pm_runtime_work) from [<c0156120>] (process_one_work+0x2b4/0x808)
> > 
> > This is because of a race with runtime PM and cppi41_dma_issue_pending()
> > as reported by Alexandre Bailon <abailon-rdvid1DuHRBWk0Htik3J/w@public.gmane.org> in earlier
> > set of patches. Based on mailing list discussions we however came to the
> > conclusion that a different fix from Alexandre's fix is needed in order
> > to guarantee that DMA is really active when we try to use it.
> > 
> > To fix the issue, we need to add a driver specific flag as we otherwise
> > can have -EINPROGRESS state set by runtime PM and can't rely on
> > pm_runtime_active() to tell us when we can use the DMA.
> > 
> > And we need to make sure the DMA transfers get triggered in the queued
> > order. So let's always queue the transfers, then flush the queue
> > from both cppi41_dma_issue_pending() and cppi41_runtime_resume()
> > as suggested by Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> in an
> > earlier example patch.
> > 
> > For reference, this is also documented in Documentation/power/runtime_pm.txt
> > in the example at the end of the file as pointed out by Grygorii Strashko
> > <grygorii.strashko-l0cyMroinI0@public.gmane.org>.
> > 
> > Based on earlier patches from Alexandre Bailon <abailon-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
> > and Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> modified based on
> > testing and what was discussed on the mailing lists.
> > 
> > Fixes: fdea2d09b997 ("dmaengine: cppi41: Add basic PM runtime support")
> > Cc: Andy Shevchenko <andy.shevchenko-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> > Cc: Bin Liu <b-liu-l0cyMroinI0@public.gmane.org>
> > Cc: Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org>
> > Cc: Kevin Hilman <khilman-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
> > Cc: Patrick Titiano <ptitiano-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
> > Cc: Sergei Shtylyov <sergei.shtylyov-M4DtvfQ/ZS1MRgGoP+s0PdBPR1lH4CV8@public.gmane.org>
> > Reported-by: Alexandre Bailon <abailon-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
> > Signed-off-by: Tony Lindgren <tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
> > ---
> > 
> > Alexandre & Grygorii, can you guys please review and test?
> 
> I've just tested it on BBB. And it triggers warning from IRQ
> 
> root@am335x-evm:~# [  242.280546] ------------[ cut here ]------------
> [  242.285532] WARNING: CPU: 0 PID: 0 at drivers/dma/cppi41.c:330 cppi41_irq+0x228/0x260 [cppi41]
> [  242.335164] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W       4.10.0-rc3-00560-g51afc29 #133
> [  242.344601] Hardware name: Generic AM33XX (Flattened Device Tree)
> [  242.351084] [<c011013c>] (unwind_backtrace) from [<c010c300>] (show_stack+0x10/0x14)
> [  242.359269] [<c010c300>] (show_stack) from [<c04a0278>] (dump_stack+0xac/0xe0)
> [  242.366919] [<c04a0278>] (dump_stack) from [<c013700c>] (__warn+0xd8/0x104)
> [  242.374282] [<c013700c>] (__warn) from [<c01370e4>] (warn_slowpath_null+0x20/0x28)
> [  242.382299] [<c01370e4>] (warn_slowpath_null) from [<bf1d0384>] (cppi41_irq+0x228/0x260 [cppi41])
> [  242.391714] [<bf1d0384>] (cppi41_irq [cppi41]) from [<c01a6888>] (__handle_irq_event_percpu+0x48/0x3b4)
> [  242.401727] [<c01a6888>] (__handle_irq_event_percpu) from [<c01a6c10>] (handle_irq_event_percpu+0x1c/0x58)
> [  242.412017] [<c01a6c10>] (handle_irq_event_percpu) from [<c01a6c84>] (handle_irq_event+0x38/0x5c)
> [  242.421477] [<c01a6c84>] (handle_irq_event) from [<c01aa004>] (handle_level_irq+0xc0/0x154)
> [  242.430378] [<c01aa004>] (handle_level_irq) from [<c01a5b9c>] (generic_handle_irq+0x20/0x34)
> [  242.439378] [<c01a5b9c>] (generic_handle_irq) from [<c01a60f0>] (__handle_domain_irq+0x64/0xe0)
> [  242.448660] [<c01a60f0>] (__handle_domain_irq) from [<c07f92f0>] (__irq_svc+0x70/0x98)
> [  242.457117] [<c07f92f0>] (__irq_svc) from [<c010836c>] (arch_cpu_idle+0x20/0x3c)
> [  242.464935] [<c010836c>] (arch_cpu_idle) from [<c018bf7c>] (do_idle+0x164/0x218)
> [  242.472748] [<c018bf7c>] (do_idle) from [<c018c39c>] (cpu_startup_entry+0x18/0x1c)
> [  242.480758] [<c018c39c>] (cpu_startup_entry) from [<c0b00c30>] (start_kernel+0x344/0x3bc)
> [  242.489376] ---[ end trace 12c5b6488c1e8c75 ]---
> [  242.496525] usb 1-1.3: USB disconnect, device number 4
> 
> test sequence:
> - plug usb hub + cd card reader
> - plug usb stick in hub
>   37  mount /dev/sdb /media
>    38  ls /media/
>    39  rm /media/data 
>    40  time dd if=/dev/zero of=/media/data bs=128k count=100
>    41  umount /media
> - unplug USB stick - > Warning
> 
> ^ The same sequence without Hub -- > no warnings
> 
> When I plug-unplug USB stick from Hub I can reproduce it pretty often :(
> but it can be different issue :(

Hmm interesting, I'll try it here too. I wonder if adding spin_lock_irqsave/
restore around the WARN_ON() in the interrupt handler is enough to make it
go away?

> Patch by itself looks good

OK

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHv2] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
       [not found]         ` <20170113190126.GE2630-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
@ 2017-01-13 19:53           ` Grygorii Strashko
       [not found]             ` <c80b51a2-1acd-914c-17b8-32754ea9ce3e-l0cyMroinI0@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Grygorii Strashko @ 2017-01-13 19:53 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Dan Williams, Vinod Koul, Bin Liu, Daniel Mack, Felipe Balbi,
	George Cherian, Johan Hovold, Peter Ujfalusi, Sekhar Nori,
	Sebastian Andrzej Siewior, dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov



On 01/13/2017 01:01 PM, Tony Lindgren wrote:
> * Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> [170113 10:37]:
>>
>>
>> On 01/13/2017 12:01 PM, Tony Lindgren wrote:
>>> Commit fdea2d09b997 ("dmaengine: cppi41: Add basic PM runtime support")
>>> together with recent MUSB changes allowed USB and DMA on BeagleBone to idle
>>> when no cable is connected. But looks like few corner case issues still
>>> remain.
>>>
>>> Looks like just by re-plugging USB cable about ten or so times on BeagleBone
>>> when configured in USB peripheral mode we can get warnings and eventually
>>> trigger an oops in cppi41 DMA:
>>>
>>> WARNING: CPU: 0 PID: 14 at drivers/dma/cppi41.c:1154 cppi41_runtime_suspend+
>>> x28/0x38 [cppi41]
>>> ...
>>>
>>> WARNING: CPU: 0 PID: 14 at drivers/dma/cppi41.c:452
>>> push_desc_queue+0x94/0x9c [cppi41]
>>> ...
>>>
>>> Unable to handle kernel NULL pointer dereference at virtual
>>> address 00000104
>>> pgd = c0004000
>>> [00000104] *pgd=00000000
>>> Internal error: Oops: 805 [#1] SMP ARM
>>> ...
>>> [<bf0d92cc>] (cppi41_runtime_resume [cppi41]) from [<c0589838>]
>>> (__rpm_callback+0xc0/0x214)
>>> [<c0589838>] (__rpm_callback) from [<c05899ac>] (rpm_callback+0x20/0x80)
>>> [<c05899ac>] (rpm_callback) from [<c0589460>] (rpm_resume+0x504/0x78c)
>>> [<c0589460>] (rpm_resume) from [<c058a1a0>] (pm_runtime_work+0x60/0xa8)
>>> [<c058a1a0>] (pm_runtime_work) from [<c0156120>] (process_one_work+0x2b4/0x808)
>>>
>>> This is because of a race with runtime PM and cppi41_dma_issue_pending()
>>> as reported by Alexandre Bailon <abailon-rdvid1DuHRBWk0Htik3J/w@public.gmane.org> in earlier
>>> set of patches. Based on mailing list discussions we however came to the
>>> conclusion that a different fix from Alexandre's fix is needed in order
>>> to guarantee that DMA is really active when we try to use it.
>>>
>>> To fix the issue, we need to add a driver specific flag as we otherwise
>>> can have -EINPROGRESS state set by runtime PM and can't rely on
>>> pm_runtime_active() to tell us when we can use the DMA.
>>>
>>> And we need to make sure the DMA transfers get triggered in the queued
>>> order. So let's always queue the transfers, then flush the queue
>>> from both cppi41_dma_issue_pending() and cppi41_runtime_resume()
>>> as suggested by Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> in an
>>> earlier example patch.
>>>
>>> For reference, this is also documented in Documentation/power/runtime_pm.txt
>>> in the example at the end of the file as pointed out by Grygorii Strashko
>>> <grygorii.strashko-l0cyMroinI0@public.gmane.org>.
>>>
>>> Based on earlier patches from Alexandre Bailon <abailon-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
>>> and Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> modified based on
>>> testing and what was discussed on the mailing lists.
>>>
>>> Fixes: fdea2d09b997 ("dmaengine: cppi41: Add basic PM runtime support")
>>> Cc: Andy Shevchenko <andy.shevchenko-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>>> Cc: Bin Liu <b-liu-l0cyMroinI0@public.gmane.org>
>>> Cc: Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org>
>>> Cc: Kevin Hilman <khilman-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
>>> Cc: Patrick Titiano <ptitiano-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
>>> Cc: Sergei Shtylyov <sergei.shtylyov-M4DtvfQ/ZS1MRgGoP+s0PdBPR1lH4CV8@public.gmane.org>
>>> Reported-by: Alexandre Bailon <abailon-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
>>> Signed-off-by: Tony Lindgren <tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
>>> ---
>>>
>>> Alexandre & Grygorii, can you guys please review and test?
>>
>> I've just tested it on BBB. And it triggers warning from IRQ
>>
>> root@am335x-evm:~# [  242.280546] ------------[ cut here ]------------
>> [  242.285532] WARNING: CPU: 0 PID: 0 at drivers/dma/cppi41.c:330 cppi41_irq+0x228/0x260 [cppi41]
>> [  242.335164] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W       4.10.0-rc3-00560-g51afc29 #133
>> [  242.344601] Hardware name: Generic AM33XX (Flattened Device Tree)
>> [  242.351084] [<c011013c>] (unwind_backtrace) from [<c010c300>] (show_stack+0x10/0x14)
>> [  242.359269] [<c010c300>] (show_stack) from [<c04a0278>] (dump_stack+0xac/0xe0)
>> [  242.366919] [<c04a0278>] (dump_stack) from [<c013700c>] (__warn+0xd8/0x104)
>> [  242.374282] [<c013700c>] (__warn) from [<c01370e4>] (warn_slowpath_null+0x20/0x28)
>> [  242.382299] [<c01370e4>] (warn_slowpath_null) from [<bf1d0384>] (cppi41_irq+0x228/0x260 [cppi41])
>> [  242.391714] [<bf1d0384>] (cppi41_irq [cppi41]) from [<c01a6888>] (__handle_irq_event_percpu+0x48/0x3b4)
>> [  242.401727] [<c01a6888>] (__handle_irq_event_percpu) from [<c01a6c10>] (handle_irq_event_percpu+0x1c/0x58)
>> [  242.412017] [<c01a6c10>] (handle_irq_event_percpu) from [<c01a6c84>] (handle_irq_event+0x38/0x5c)
>> [  242.421477] [<c01a6c84>] (handle_irq_event) from [<c01aa004>] (handle_level_irq+0xc0/0x154)
>> [  242.430378] [<c01aa004>] (handle_level_irq) from [<c01a5b9c>] (generic_handle_irq+0x20/0x34)
>> [  242.439378] [<c01a5b9c>] (generic_handle_irq) from [<c01a60f0>] (__handle_domain_irq+0x64/0xe0)
>> [  242.448660] [<c01a60f0>] (__handle_domain_irq) from [<c07f92f0>] (__irq_svc+0x70/0x98)
>> [  242.457117] [<c07f92f0>] (__irq_svc) from [<c010836c>] (arch_cpu_idle+0x20/0x3c)
>> [  242.464935] [<c010836c>] (arch_cpu_idle) from [<c018bf7c>] (do_idle+0x164/0x218)
>> [  242.472748] [<c018bf7c>] (do_idle) from [<c018c39c>] (cpu_startup_entry+0x18/0x1c)
>> [  242.480758] [<c018c39c>] (cpu_startup_entry) from [<c0b00c30>] (start_kernel+0x344/0x3bc)
>> [  242.489376] ---[ end trace 12c5b6488c1e8c75 ]---
>> [  242.496525] usb 1-1.3: USB disconnect, device number 4
>>
>> test sequence:
>> - plug usb hub + cd card reader
>> - plug usb stick in hub
>>   37  mount /dev/sdb /media
>>    38  ls /media/
>>    39  rm /media/data 
>>    40  time dd if=/dev/zero of=/media/data bs=128k count=100
>>    41  umount /media
>> - unplug USB stick - > Warning
>>
>> ^ The same sequence without Hub -- > no warnings
>>
>> When I plug-unplug USB stick from Hub I can reproduce it pretty often :(
>> but it can be different issue :(
> 
> Hmm interesting, I'll try it here too. I wonder if adding spin_lock_irqsave/
> restore around the WARN_ON() in the interrupt handler is enough to make it
> go away?
> 

I think not :) Most probably the PM runtime autosuspend is too small for the 
USB hub case.

I've applied below diff and can see that PM runtime suspend happens before
IRQ is triggered and there is one pending descs always.

Then instead of trying to adjust autosuspend delay I tried to
maintain proper PM runtime counter:
 - push desc -> pm_runtime_get -> counter ++
 - irq -> pop desc -> pm_runtime_put_autosuspend -> counter--
which means "Keep cppi active until as least on desc is in progress"/

As result with above change I do not see any warning any more - i've tried two different hubs.

I saw your 098de42 "dmaengine: cppi41: Fix unpaired pm runtime when only a USB hub is connected",
but, to be honest, I did not get how is it possible to get unpaired get/put from your description:
- descriptor pushed in to the Q should be returned back as part of normal operation
 or as part of cppi41_tear_down_chan. May be you've hit an issue with tear_down?


diff --git a/drivers/dma/cppi41.c b/drivers/dma/cppi41.c
index ce37a1a..5b9d027 100644
--- a/drivers/dma/cppi41.c
+++ b/drivers/dma/cppi41.c
@@ -1,3 +1,4 @@
+#include <linux/atomic.h>
 #include <linux/delay.h>
 #include <linux/dmaengine.h>
 #include <linux/dma-mapping.h>
@@ -155,6 +156,7 @@ struct cppi41_dd {
 	unsigned int dma_tdfdq;
 
 	bool is_suspended;
+	atomic_t push_descs;
 };
 
 #define FIST_COMPLETION_QUEUE	93
@@ -321,13 +323,8 @@ static irqreturn_t cppi41_irq(int irq, void *data)
 			u32 desc, len;
 			int error;
 
-			error = pm_runtime_get(cdd->ddev.dev);
-			if ((error != -EINPROGRESS) && (error < 0))
-				dev_err(cdd->ddev.dev, "%s pm runtime get: %i\n",
-					__func__, error);
-
 			/* This warning should never trigger */
-			WARN_ON(cdd->is_suspended);
+			WARN(cdd->is_suspended, "=== push_descs %d\n", atomic_read(&cdd->push_descs));
 
 			q_num = __fls(val);
 			val &= ~(1 << q_num);
@@ -351,6 +348,7 @@ static irqreturn_t cppi41_irq(int irq, void *data)
 
 			pm_runtime_mark_last_busy(cdd->ddev.dev);
 			pm_runtime_put_autosuspend(cdd->ddev.dev);
+			atomic_dec(&cdd->push_descs);
 		}
 	}
 	return IRQ_HANDLED;
@@ -460,6 +458,7 @@ static void push_desc_queue(struct cppi41_channel *c)
 	reg = (sizeof(struct cppi41_desc) - 24) / 4;
 	reg |= desc_phys;
 	cppi_writel(reg, cdd->qmgr_mem + QMGR_QUEUE_D(c->q_num));
+	atomic_inc(&cdd->push_descs);
 }
 
 /*
@@ -500,7 +499,6 @@ static void cppi41_dma_issue_pending(struct dma_chan *chan)
 	spin_unlock_irqrestore(&cdd->lock, flags);
 
 	pm_runtime_mark_last_busy(cdd->ddev.dev);
-	pm_runtime_put_autosuspend(cdd->ddev.dev);
 }
 
 static u32 get_host_pd0(u32 length)
@@ -1163,12 +1161,17 @@ static int __maybe_unused cppi41_runtime_suspend(struct device *dev)
 {
 	struct cppi41_dd *cdd = dev_get_drvdata(dev);
 	unsigned long flags;
+	int x;
 
 	spin_lock_irqsave(&cdd->lock, flags);
 	cdd->is_suspended = true;
 	WARN_ON(!list_empty(&cdd->pending));
+	x = atomic_read(&cdd->push_descs);
+	WARN(x, "=== push_descs %d", x);
+
 	spin_unlock_irqrestore(&cdd->lock, flags);
 
+
 	return 0;
 }
 



-- 
regards,
-grygorii
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCHv2] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
       [not found]             ` <c80b51a2-1acd-914c-17b8-32754ea9ce3e-l0cyMroinI0@public.gmane.org>
@ 2017-01-13 19:57               ` Grygorii Strashko
       [not found]                 ` <c58cebdf-5752-098a-8b3e-de99f6af14af-l0cyMroinI0@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Grygorii Strashko @ 2017-01-13 19:57 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Dan Williams, Vinod Koul, Bin Liu, Daniel Mack, Felipe Balbi,
	Johan Hovold, Peter Ujfalusi, Sekhar Nori,
	Sebastian Andrzej Siewior, dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov



On 01/13/2017 01:53 PM, Grygorii Strashko wrote:
> 
> 
> On 01/13/2017 01:01 PM, Tony Lindgren wrote:
>> * Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> [170113 10:37]:
>>>
>>>
>>> On 01/13/2017 12:01 PM, Tony Lindgren wrote:
>>>> Commit fdea2d09b997 ("dmaengine: cppi41: Add basic PM runtime support")
>>>> together with recent MUSB changes allowed USB and DMA on BeagleBone to idle
>>>> when no cable is connected. But looks like few corner case issues still
>>>> remain.
>>>>
>>>> Looks like just by re-plugging USB cable about ten or so times on BeagleBone
>>>> when configured in USB peripheral mode we can get warnings and eventually
>>>> trigger an oops in cppi41 DMA:
>>>>
>>>> WARNING: CPU: 0 PID: 14 at drivers/dma/cppi41.c:1154 cppi41_runtime_suspend+
>>>> x28/0x38 [cppi41]
>>>> ...
>>>>
>>>> WARNING: CPU: 0 PID: 14 at drivers/dma/cppi41.c:452
>>>> push_desc_queue+0x94/0x9c [cppi41]
>>>> ...
>>>>
>>>> Unable to handle kernel NULL pointer dereference at virtual
>>>> address 00000104
>>>> pgd = c0004000
>>>> [00000104] *pgd=00000000
>>>> Internal error: Oops: 805 [#1] SMP ARM
>>>> ...
>>>> [<bf0d92cc>] (cppi41_runtime_resume [cppi41]) from [<c0589838>]
>>>> (__rpm_callback+0xc0/0x214)
>>>> [<c0589838>] (__rpm_callback) from [<c05899ac>] (rpm_callback+0x20/0x80)
>>>> [<c05899ac>] (rpm_callback) from [<c0589460>] (rpm_resume+0x504/0x78c)
>>>> [<c0589460>] (rpm_resume) from [<c058a1a0>] (pm_runtime_work+0x60/0xa8)
>>>> [<c058a1a0>] (pm_runtime_work) from [<c0156120>] (process_one_work+0x2b4/0x808)
>>>>
>>>> This is because of a race with runtime PM and cppi41_dma_issue_pending()
>>>> as reported by Alexandre Bailon <abailon-rdvid1DuHRBWk0Htik3J/w@public.gmane.org> in earlier
>>>> set of patches. Based on mailing list discussions we however came to the
>>>> conclusion that a different fix from Alexandre's fix is needed in order
>>>> to guarantee that DMA is really active when we try to use it.
>>>>
>>>> To fix the issue, we need to add a driver specific flag as we otherwise
>>>> can have -EINPROGRESS state set by runtime PM and can't rely on
>>>> pm_runtime_active() to tell us when we can use the DMA.
>>>>
>>>> And we need to make sure the DMA transfers get triggered in the queued
>>>> order. So let's always queue the transfers, then flush the queue
>>>> from both cppi41_dma_issue_pending() and cppi41_runtime_resume()
>>>> as suggested by Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> in an
>>>> earlier example patch.
>>>>
>>>> For reference, this is also documented in Documentation/power/runtime_pm.txt
>>>> in the example at the end of the file as pointed out by Grygorii Strashko
>>>> <grygorii.strashko-l0cyMroinI0@public.gmane.org>.
>>>>
>>>> Based on earlier patches from Alexandre Bailon <abailon-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
>>>> and Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> modified based on
>>>> testing and what was discussed on the mailing lists.
>>>>
>>>> Fixes: fdea2d09b997 ("dmaengine: cppi41: Add basic PM runtime support")
>>>> Cc: Andy Shevchenko <andy.shevchenko-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>>>> Cc: Bin Liu <b-liu-l0cyMroinI0@public.gmane.org>
>>>> Cc: Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org>
>>>> Cc: Kevin Hilman <khilman-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
>>>> Cc: Patrick Titiano <ptitiano-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
>>>> Cc: Sergei Shtylyov <sergei.shtylyov-M4DtvfQ/ZS1MRgGoP+s0PdBPR1lH4CV8@public.gmane.org>
>>>> Reported-by: Alexandre Bailon <abailon-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
>>>> Signed-off-by: Tony Lindgren <tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
>>>> ---
>>>>
>>>> Alexandre & Grygorii, can you guys please review and test?
>>>
>>> I've just tested it on BBB. And it triggers warning from IRQ
>>>
>>> root@am335x-evm:~# [  242.280546] ------------[ cut here ]------------
>>> [  242.285532] WARNING: CPU: 0 PID: 0 at drivers/dma/cppi41.c:330 cppi41_irq+0x228/0x260 [cppi41]
>>> [  242.335164] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W       4.10.0-rc3-00560-g51afc29 #133
>>> [  242.344601] Hardware name: Generic AM33XX (Flattened Device Tree)
>>> [  242.351084] [<c011013c>] (unwind_backtrace) from [<c010c300>] (show_stack+0x10/0x14)
>>> [  242.359269] [<c010c300>] (show_stack) from [<c04a0278>] (dump_stack+0xac/0xe0)
>>> [  242.366919] [<c04a0278>] (dump_stack) from [<c013700c>] (__warn+0xd8/0x104)
>>> [  242.374282] [<c013700c>] (__warn) from [<c01370e4>] (warn_slowpath_null+0x20/0x28)
>>> [  242.382299] [<c01370e4>] (warn_slowpath_null) from [<bf1d0384>] (cppi41_irq+0x228/0x260 [cppi41])
>>> [  242.391714] [<bf1d0384>] (cppi41_irq [cppi41]) from [<c01a6888>] (__handle_irq_event_percpu+0x48/0x3b4)
>>> [  242.401727] [<c01a6888>] (__handle_irq_event_percpu) from [<c01a6c10>] (handle_irq_event_percpu+0x1c/0x58)
>>> [  242.412017] [<c01a6c10>] (handle_irq_event_percpu) from [<c01a6c84>] (handle_irq_event+0x38/0x5c)
>>> [  242.421477] [<c01a6c84>] (handle_irq_event) from [<c01aa004>] (handle_level_irq+0xc0/0x154)
>>> [  242.430378] [<c01aa004>] (handle_level_irq) from [<c01a5b9c>] (generic_handle_irq+0x20/0x34)
>>> [  242.439378] [<c01a5b9c>] (generic_handle_irq) from [<c01a60f0>] (__handle_domain_irq+0x64/0xe0)
>>> [  242.448660] [<c01a60f0>] (__handle_domain_irq) from [<c07f92f0>] (__irq_svc+0x70/0x98)
>>> [  242.457117] [<c07f92f0>] (__irq_svc) from [<c010836c>] (arch_cpu_idle+0x20/0x3c)
>>> [  242.464935] [<c010836c>] (arch_cpu_idle) from [<c018bf7c>] (do_idle+0x164/0x218)
>>> [  242.472748] [<c018bf7c>] (do_idle) from [<c018c39c>] (cpu_startup_entry+0x18/0x1c)
>>> [  242.480758] [<c018c39c>] (cpu_startup_entry) from [<c0b00c30>] (start_kernel+0x344/0x3bc)
>>> [  242.489376] ---[ end trace 12c5b6488c1e8c75 ]---
>>> [  242.496525] usb 1-1.3: USB disconnect, device number 4
>>>
>>> test sequence:
>>> - plug usb hub + cd card reader
>>> - plug usb stick in hub
>>>   37  mount /dev/sdb /media
>>>    38  ls /media/
>>>    39  rm /media/data 
>>>    40  time dd if=/dev/zero of=/media/data bs=128k count=100
>>>    41  umount /media
>>> - unplug USB stick - > Warning
>>>
>>> ^ The same sequence without Hub -- > no warnings
>>>
>>> When I plug-unplug USB stick from Hub I can reproduce it pretty often :(
>>> but it can be different issue :(
>>
>> Hmm interesting, I'll try it here too. I wonder if adding spin_lock_irqsave/
>> restore around the WARN_ON() in the interrupt handler is enough to make it
>> go away?
>>
> 
> I think not :) Most probably the PM runtime autosuspend is too small for the 
> USB hub case.
> 
> I've applied below diff and can see that PM runtime suspend happens before
> IRQ is triggered and there is one pending descs always.
> 
> Then instead of trying to adjust autosuspend delay I tried to
> maintain proper PM runtime counter:
>  - push desc -> pm_runtime_get -> counter ++
>  - irq -> pop desc -> pm_runtime_put_autosuspend -> counter--
> which means "Keep cppi active until as least on desc is in progress"/
> 
> As result with above change I do not see any warning any more - i've tried two different hubs.
> 
> I saw your 098de42 "dmaengine: cppi41: Fix unpaired pm runtime when only a USB hub is connected",
> but, to be honest, I did not get how is it possible to get unpaired get/put from your description:
> - descriptor pushed in to the Q should be returned back as part of normal operation
>  or as part of cppi41_tear_down_chan. May be you've hit an issue with tear_down?
> 
> 

Simplified diff with fix on top of your patch:

diff --git a/drivers/dma/cppi41.c b/drivers/dma/cppi41.c
index ce37a1a..9e9403a 100644
--- a/drivers/dma/cppi41.c
+++ b/drivers/dma/cppi41.c
@@ -319,12 +319,6 @@ static irqreturn_t cppi41_irq(int irq, void *data)
 
 		while (val) {
 			u32 desc, len;
-			int error;

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCHv2] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
       [not found]                 ` <c58cebdf-5752-098a-8b3e-de99f6af14af-l0cyMroinI0@public.gmane.org>
@ 2017-01-13 21:36                   ` Grygorii Strashko
       [not found]                     ` <c1bbb731-940e-6d04-f127-222050d831b8-l0cyMroinI0@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Grygorii Strashko @ 2017-01-13 21:36 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Dan Williams, Vinod Koul, Bin Liu, Daniel Mack, Felipe Balbi,
	Johan Hovold, Peter Ujfalusi, Sekhar Nori,
	Sebastian Andrzej Siewior, dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov



On 01/13/2017 01:57 PM, Grygorii Strashko wrote:
> 
> 
> On 01/13/2017 01:53 PM, Grygorii Strashko wrote:
>>
>>
>> On 01/13/2017 01:01 PM, Tony Lindgren wrote:
>>> * Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> [170113 10:37]:
>>>>
>>>>
>>>> On 01/13/2017 12:01 PM, Tony Lindgren wrote:
>>>>> Commit fdea2d09b997 ("dmaengine: cppi41: Add basic PM runtime support")
>>>>> together with recent MUSB changes allowed USB and DMA on BeagleBone to idle
>>>>> when no cable is connected. But looks like few corner case issues still
>>>>> remain.
>>>>>
>>>>> Looks like just by re-plugging USB cable about ten or so times on BeagleBone
>>>>> when configured in USB peripheral mode we can get warnings and eventually
>>>>> trigger an oops in cppi41 DMA:
>>>>>
>>>>> WARNING: CPU: 0 PID: 14 at drivers/dma/cppi41.c:1154 cppi41_runtime_suspend+
>>>>> x28/0x38 [cppi41]
>>>>> ...
>>>>>
>>>>> WARNING: CPU: 0 PID: 14 at drivers/dma/cppi41.c:452
>>>>> push_desc_queue+0x94/0x9c [cppi41]
>>>>> ...
>>>>>
>>>>> Unable to handle kernel NULL pointer dereference at virtual
>>>>> address 00000104
>>>>> pgd = c0004000
>>>>> [00000104] *pgd=00000000
>>>>> Internal error: Oops: 805 [#1] SMP ARM
>>>>> ...
>>>>> [<bf0d92cc>] (cppi41_runtime_resume [cppi41]) from [<c0589838>]
>>>>> (__rpm_callback+0xc0/0x214)
>>>>> [<c0589838>] (__rpm_callback) from [<c05899ac>] (rpm_callback+0x20/0x80)
>>>>> [<c05899ac>] (rpm_callback) from [<c0589460>] (rpm_resume+0x504/0x78c)
>>>>> [<c0589460>] (rpm_resume) from [<c058a1a0>] (pm_runtime_work+0x60/0xa8)
>>>>> [<c058a1a0>] (pm_runtime_work) from [<c0156120>] (process_one_work+0x2b4/0x808)
>>>>>
>>>>> This is because of a race with runtime PM and cppi41_dma_issue_pending()
>>>>> as reported by Alexandre Bailon <abailon-rdvid1DuHRBWk0Htik3J/w@public.gmane.org> in earlier
>>>>> set of patches. Based on mailing list discussions we however came to the
>>>>> conclusion that a different fix from Alexandre's fix is needed in order
>>>>> to guarantee that DMA is really active when we try to use it.
>>>>>
>>>>> To fix the issue, we need to add a driver specific flag as we otherwise
>>>>> can have -EINPROGRESS state set by runtime PM and can't rely on
>>>>> pm_runtime_active() to tell us when we can use the DMA.
>>>>>
>>>>> And we need to make sure the DMA transfers get triggered in the queued
>>>>> order. So let's always queue the transfers, then flush the queue
>>>>> from both cppi41_dma_issue_pending() and cppi41_runtime_resume()
>>>>> as suggested by Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> in an
>>>>> earlier example patch.
>>>>>
>>>>> For reference, this is also documented in Documentation/power/runtime_pm.txt
>>>>> in the example at the end of the file as pointed out by Grygorii Strashko
>>>>> <grygorii.strashko-l0cyMroinI0@public.gmane.org>.
>>>>>
>>>>> Based on earlier patches from Alexandre Bailon <abailon-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
>>>>> and Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> modified based on
>>>>> testing and what was discussed on the mailing lists.
>>>>>
>>>>> Fixes: fdea2d09b997 ("dmaengine: cppi41: Add basic PM runtime support")
>>>>> Cc: Andy Shevchenko <andy.shevchenko-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>>>>> Cc: Bin Liu <b-liu-l0cyMroinI0@public.gmane.org>
>>>>> Cc: Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org>
>>>>> Cc: Kevin Hilman <khilman-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
>>>>> Cc: Patrick Titiano <ptitiano-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
>>>>> Cc: Sergei Shtylyov <sergei.shtylyov-M4DtvfQ/ZS1MRgGoP+s0PdBPR1lH4CV8@public.gmane.org>
>>>>> Reported-by: Alexandre Bailon <abailon-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
>>>>> Signed-off-by: Tony Lindgren <tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
>>>>> ---
>>>>>
>>>>> Alexandre & Grygorii, can you guys please review and test?
>>>>
>>>> I've just tested it on BBB. And it triggers warning from IRQ
>>>>
>>>> root@am335x-evm:~# [  242.280546] ------------[ cut here ]------------
>>>> [  242.285532] WARNING: CPU: 0 PID: 0 at drivers/dma/cppi41.c:330 cppi41_irq+0x228/0x260 [cppi41]
>>>> [  242.335164] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W       4.10.0-rc3-00560-g51afc29 #133
>>>> [  242.344601] Hardware name: Generic AM33XX (Flattened Device Tree)
>>>> [  242.351084] [<c011013c>] (unwind_backtrace) from [<c010c300>] (show_stack+0x10/0x14)
>>>> [  242.359269] [<c010c300>] (show_stack) from [<c04a0278>] (dump_stack+0xac/0xe0)
>>>> [  242.366919] [<c04a0278>] (dump_stack) from [<c013700c>] (__warn+0xd8/0x104)
>>>> [  242.374282] [<c013700c>] (__warn) from [<c01370e4>] (warn_slowpath_null+0x20/0x28)
>>>> [  242.382299] [<c01370e4>] (warn_slowpath_null) from [<bf1d0384>] (cppi41_irq+0x228/0x260 [cppi41])
>>>> [  242.391714] [<bf1d0384>] (cppi41_irq [cppi41]) from [<c01a6888>] (__handle_irq_event_percpu+0x48/0x3b4)
>>>> [  242.401727] [<c01a6888>] (__handle_irq_event_percpu) from [<c01a6c10>] (handle_irq_event_percpu+0x1c/0x58)
>>>> [  242.412017] [<c01a6c10>] (handle_irq_event_percpu) from [<c01a6c84>] (handle_irq_event+0x38/0x5c)
>>>> [  242.421477] [<c01a6c84>] (handle_irq_event) from [<c01aa004>] (handle_level_irq+0xc0/0x154)
>>>> [  242.430378] [<c01aa004>] (handle_level_irq) from [<c01a5b9c>] (generic_handle_irq+0x20/0x34)
>>>> [  242.439378] [<c01a5b9c>] (generic_handle_irq) from [<c01a60f0>] (__handle_domain_irq+0x64/0xe0)
>>>> [  242.448660] [<c01a60f0>] (__handle_domain_irq) from [<c07f92f0>] (__irq_svc+0x70/0x98)
>>>> [  242.457117] [<c07f92f0>] (__irq_svc) from [<c010836c>] (arch_cpu_idle+0x20/0x3c)
>>>> [  242.464935] [<c010836c>] (arch_cpu_idle) from [<c018bf7c>] (do_idle+0x164/0x218)
>>>> [  242.472748] [<c018bf7c>] (do_idle) from [<c018c39c>] (cpu_startup_entry+0x18/0x1c)
>>>> [  242.480758] [<c018c39c>] (cpu_startup_entry) from [<c0b00c30>] (start_kernel+0x344/0x3bc)
>>>> [  242.489376] ---[ end trace 12c5b6488c1e8c75 ]---
>>>> [  242.496525] usb 1-1.3: USB disconnect, device number 4
>>>>
>>>> test sequence:
>>>> - plug usb hub + cd card reader
>>>> - plug usb stick in hub
>>>>   37  mount /dev/sdb /media
>>>>    38  ls /media/
>>>>    39  rm /media/data 
>>>>    40  time dd if=/dev/zero of=/media/data bs=128k count=100
>>>>    41  umount /media
>>>> - unplug USB stick - > Warning
>>>>
>>>> ^ The same sequence without Hub -- > no warnings
>>>>
>>>> When I plug-unplug USB stick from Hub I can reproduce it pretty often :(
>>>> but it can be different issue :(
>>>
>>> Hmm interesting, I'll try it here too. I wonder if adding spin_lock_irqsave/
>>> restore around the WARN_ON() in the interrupt handler is enough to make it
>>> go away?
>>>
>>
>> I think not :) Most probably the PM runtime autosuspend is too small for the 
>> USB hub case.
>>
>> I've applied below diff and can see that PM runtime suspend happens before
>> IRQ is triggered and there is one pending descs always.
>>
>> Then instead of trying to adjust autosuspend delay I tried to
>> maintain proper PM runtime counter:
>>  - push desc -> pm_runtime_get -> counter ++
>>  - irq -> pop desc -> pm_runtime_put_autosuspend -> counter--
>> which means "Keep cppi active until as least on desc is in progress"/
>>
>> As result with above change I do not see any warning any more - i've tried two different hubs.
>>
>> I saw your 098de42 "dmaengine: cppi41: Fix unpaired pm runtime when only a USB hub is connected",
>> but, to be honest, I did not get how is it possible to get unpaired get/put from your description:
>> - descriptor pushed in to the Q should be returned back as part of normal operation
>>  or as part of cppi41_tear_down_chan. May be you've hit an issue with tear_down?
>>
>>
> 
> Simplified diff with fix on top of your patch:
> 
> diff --git a/drivers/dma/cppi41.c b/drivers/dma/cppi41.c
> index ce37a1a..9e9403a 100644
> --- a/drivers/dma/cppi41.c
> +++ b/drivers/dma/cppi41.c
> @@ -319,12 +319,6 @@ static irqreturn_t cppi41_irq(int irq, void *data)
>  
>  		while (val) {
>  			u32 desc, len;
> -			int error;
> -
> -			error = pm_runtime_get(cdd->ddev.dev);
> -			if ((error != -EINPROGRESS) && (error < 0))
> -				dev_err(cdd->ddev.dev, "%s pm runtime get: %i\n",
> -					__func__, error);
>  
>  			/* This warning should never trigger */
>  			WARN_ON(cdd->is_suspended);
> @@ -500,7 +494,6 @@ static void cppi41_dma_issue_pending(struct dma_chan *chan)
>  	spin_unlock_irqrestore(&cdd->lock, flags);
>  
>  	pm_runtime_mark_last_busy(cdd->ddev.dev);
> -	pm_runtime_put_autosuspend(cdd->ddev.dev);
>  }
>  
>  static u32 get_host_pd0(u32 length)
> 

Ok. this is incorrect in case of USB hub and will just hide the problem
by incrementing PM runtime usage counter every time USB host is connected :(

Once USB hub is connected the PM runtime usage counter will became 1 and stay
1 after disconnection. Which means that some descriptor was pushed in Q, but IRQ
was not triggered.

Don't know how to proceed as I'm not so familiar with musb :(

-- 
regards,
-grygorii
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHv2] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
       [not found]                     ` <c1bbb731-940e-6d04-f127-222050d831b8-l0cyMroinI0@public.gmane.org>
@ 2017-01-13 21:59                       ` Tony Lindgren
       [not found]                         ` <20170113215936.GF2630-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Tony Lindgren @ 2017-01-13 21:59 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Dan Williams, Vinod Koul, Bin Liu, Daniel Mack, Felipe Balbi,
	Johan Hovold, Peter Ujfalusi, Sekhar Nori,
	Sebastian Andrzej Siewior, dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov

* Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> [170113 13:37]:
> > Simplified diff with fix on top of your patch:
> > 
> > diff --git a/drivers/dma/cppi41.c b/drivers/dma/cppi41.c
> > index ce37a1a..9e9403a 100644
> > --- a/drivers/dma/cppi41.c
> > +++ b/drivers/dma/cppi41.c
> > @@ -319,12 +319,6 @@ static irqreturn_t cppi41_irq(int irq, void *data)
> >  
> >  		while (val) {
> >  			u32 desc, len;
> > -			int error;
> > -
> > -			error = pm_runtime_get(cdd->ddev.dev);
> > -			if ((error != -EINPROGRESS) && (error < 0))
> > -				dev_err(cdd->ddev.dev, "%s pm runtime get: %i\n",
> > -					__func__, error);
> >  
> >  			/* This warning should never trigger */
> >  			WARN_ON(cdd->is_suspended);
> > @@ -500,7 +494,6 @@ static void cppi41_dma_issue_pending(struct dma_chan *chan)
> >  	spin_unlock_irqrestore(&cdd->lock, flags);
> >  
> >  	pm_runtime_mark_last_busy(cdd->ddev.dev);
> > -	pm_runtime_put_autosuspend(cdd->ddev.dev);
> >  }
> >  
> >  static u32 get_host_pd0(u32 length)
> > 
> 
> Ok. this is incorrect in case of USB hub and will just hide the problem
> by incrementing PM runtime usage counter every time USB host is connected :(

Yeah if anything changes in those two nested loops, the pm_runtime counts
get unbalanced.

> Once USB hub is connected the PM runtime usage counter will became 1 and stay
> 1 after disconnection. Which means that some descriptor was pushed in Q, but IRQ
> was not triggered.
> 
> Don't know how to proceed as I'm not so familiar with musb :(

I'm now playing with saving the queue manager value and kicking the
PM runtime if we have transfers in progress. Looks like the dma
registers are zero while there are transfers in progress, or at least
I have not yet found any hardware registers that would tell that.

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHv2] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
       [not found]                         ` <20170113215936.GF2630-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
@ 2017-01-16 23:33                           ` Tony Lindgren
       [not found]                             ` <20170116233329.GF7403-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Tony Lindgren @ 2017-01-16 23:33 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Dan Williams, Vinod Koul, Bin Liu, Daniel Mack, Felipe Balbi,
	Johan Hovold, Peter Ujfalusi, Sekhar Nori,
	Sebastian Andrzej Siewior, dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov

* Tony Lindgren <tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org> [170113 14:00]:
> * Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> [170113 13:37]:
> > > Simplified diff with fix on top of your patch:
> > > 
> > > diff --git a/drivers/dma/cppi41.c b/drivers/dma/cppi41.c
> > > index ce37a1a..9e9403a 100644
> > > --- a/drivers/dma/cppi41.c
> > > +++ b/drivers/dma/cppi41.c
> > > @@ -319,12 +319,6 @@ static irqreturn_t cppi41_irq(int irq, void *data)
> > >  
> > >  		while (val) {
> > >  			u32 desc, len;
> > > -			int error;
> > > -
> > > -			error = pm_runtime_get(cdd->ddev.dev);
> > > -			if ((error != -EINPROGRESS) && (error < 0))
> > > -				dev_err(cdd->ddev.dev, "%s pm runtime get: %i\n",
> > > -					__func__, error);
> > >  
> > >  			/* This warning should never trigger */
> > >  			WARN_ON(cdd->is_suspended);
> > > @@ -500,7 +494,6 @@ static void cppi41_dma_issue_pending(struct dma_chan *chan)
> > >  	spin_unlock_irqrestore(&cdd->lock, flags);
> > >  
> > >  	pm_runtime_mark_last_busy(cdd->ddev.dev);
> > > -	pm_runtime_put_autosuspend(cdd->ddev.dev);
> > >  }
> > >  
> > >  static u32 get_host_pd0(u32 length)
> > > 
> > 
> > Ok. this is incorrect in case of USB hub and will just hide the problem
> > by incrementing PM runtime usage counter every time USB host is connected :(
> 
> Yeah if anything changes in those two nested loops, the pm_runtime counts
> get unbalanced.
> 
> > Once USB hub is connected the PM runtime usage counter will became 1 and stay
> > 1 after disconnection. Which means that some descriptor was pushed in Q, but IRQ
> > was not triggered.
> > 
> > Don't know how to proceed as I'm not so familiar with musb :(
> 
> I'm now playing with saving the queue manager value and kicking the
> PM runtime if we have transfers in progress. Looks like the dma
> registers are zero while there are transfers in progress, or at least
> I have not yet found any hardware registers that would tell that.

Looks like drivers/usb/musb/musb_cppi41.c is not properly terminating
dma transactions.. The following additional patch makes things behave
without warnings for me.

I'll fold the drivers/dma/cppi41.c changes to $subject patch and repost,
then will post a separate musb fix for drivers/usb/musb/musb_cppi41.c
that avoids the warning after some more investigating.

Luckily the warning is harmless in this case as musb and cppi41 are
in the same device so the common parent is powered and cppi41 is
getting clocked.

Regards,

Tony

8< ---------------------------
diff --git a/drivers/dma/cppi41.c b/drivers/dma/cppi41.c
--- a/drivers/dma/cppi41.c
+++ b/drivers/dma/cppi41.c
@@ -705,19 +705,31 @@ static int cppi41_stop_chan(struct dma_chan *chan)
 	u32 desc_phys;
 	int ret;
 
+	ret = pm_runtime_get(cdd->ddev.dev);
+	if ((ret != -EINPROGRESS) && (ret < 0))
+		dev_err(cdd->ddev.dev, "%s pm runtime get: %i\n",
+			__func__, ret);
+	WARN_ON(cdd->is_suspended);
+
 	desc_phys = lower_32_bits(c->desc_phys);
 	desc_num = (desc_phys - cdd->descs_phys) / sizeof(struct cppi41_desc);
-	if (!cdd->chan_busy[desc_num])
-		return 0;
+	if (!cdd->chan_busy[desc_num]) {
+		ret = 0;
+		goto out;
+	}
 
 	ret = cppi41_tear_down_chan(c);
 	if (ret)
-		return ret;
+		goto out;
 
 	WARN_ON(!cdd->chan_busy[desc_num]);
 	cdd->chan_busy[desc_num] = NULL;
 
-	return 0;
+out:
+	pm_runtime_mark_last_busy(cdd->ddev.dev);
+	pm_runtime_put_autosuspend(cdd->ddev.dev);
+
+	return ret;
 }
 
 static void cleanup_chans(struct cppi41_dd *cdd)
diff --git a/drivers/usb/musb/musb_cppi41.c b/drivers/usb/musb/musb_cppi41.c
--- a/drivers/usb/musb/musb_cppi41.c
+++ b/drivers/usb/musb/musb_cppi41.c
@@ -445,9 +445,16 @@ static struct dma_channel *cppi41_dma_channel_allocate(struct dma_controller *c,
 static void cppi41_dma_channel_release(struct dma_channel *channel)
 {
 	struct cppi41_dma_channel *cppi41_channel = channel->private_data;
+	struct musb_hw_ep *hw_ep = cppi41_channel->hw_ep;
+	int error;
 
 	trace_musb_cppi41_free(cppi41_channel);
 	if (cppi41_channel->is_allocated) {
+		error = dmaengine_terminate_all(cppi41_channel->dc);
+		if (error)
+			dev_err(hw_ep->musb->controller,
+				"dma terminate failed: %i\n",
+				error);
 		cppi41_channel->is_allocated = 0;
 		channel->status = MUSB_DMA_STATUS_FREE;
 		channel->actual_len = 0;
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHv2] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
       [not found]                             ` <20170116233329.GF7403-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
@ 2017-01-16 23:54                               ` Tony Lindgren
       [not found]                                 ` <20170116235428.GG7403-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Tony Lindgren @ 2017-01-16 23:54 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Dan Williams, Vinod Koul, Bin Liu, Daniel Mack, Felipe Balbi,
	Johan Hovold, Peter Ujfalusi, Sekhar Nori,
	Sebastian Andrzej Siewior, dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov

* Tony Lindgren <tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org> [170116 15:36]:
> * Tony Lindgren <tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org> [170113 14:00]:
> > * Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> [170113 13:37]:
> > > > Simplified diff with fix on top of your patch:
> > > > 
> > > > diff --git a/drivers/dma/cppi41.c b/drivers/dma/cppi41.c
> > > > index ce37a1a..9e9403a 100644
> > > > --- a/drivers/dma/cppi41.c
> > > > +++ b/drivers/dma/cppi41.c
> > > > @@ -319,12 +319,6 @@ static irqreturn_t cppi41_irq(int irq, void *data)
> > > >  
> > > >  		while (val) {
> > > >  			u32 desc, len;
> > > > -			int error;
> > > > -
> > > > -			error = pm_runtime_get(cdd->ddev.dev);
> > > > -			if ((error != -EINPROGRESS) && (error < 0))
> > > > -				dev_err(cdd->ddev.dev, "%s pm runtime get: %i\n",
> > > > -					__func__, error);
> > > >  
> > > >  			/* This warning should never trigger */
> > > >  			WARN_ON(cdd->is_suspended);
> > > > @@ -500,7 +494,6 @@ static void cppi41_dma_issue_pending(struct dma_chan *chan)
> > > >  	spin_unlock_irqrestore(&cdd->lock, flags);
> > > >  
> > > >  	pm_runtime_mark_last_busy(cdd->ddev.dev);
> > > > -	pm_runtime_put_autosuspend(cdd->ddev.dev);
> > > >  }
> > > >  
> > > >  static u32 get_host_pd0(u32 length)
> > > > 
> > > 
> > > Ok. this is incorrect in case of USB hub and will just hide the problem
> > > by incrementing PM runtime usage counter every time USB host is connected :(
> > 
> > Yeah if anything changes in those two nested loops, the pm_runtime counts
> > get unbalanced.
> > 
> > > Once USB hub is connected the PM runtime usage counter will became 1 and stay
> > > 1 after disconnection. Which means that some descriptor was pushed in Q, but IRQ
> > > was not triggered.
> > > 
> > > Don't know how to proceed as I'm not so familiar with musb :(
> > 
> > I'm now playing with saving the queue manager value and kicking the
> > PM runtime if we have transfers in progress. Looks like the dma
> > registers are zero while there are transfers in progress, or at least
> > I have not yet found any hardware registers that would tell that.
> 
> Looks like drivers/usb/musb/musb_cppi41.c is not properly terminating
> dma transactions.. The following additional patch makes things behave
> without warnings for me.
> 
> I'll fold the drivers/dma/cppi41.c changes to $subject patch and repost,
> then will post a separate musb fix for drivers/usb/musb/musb_cppi41.c
> that avoids the warning after some more investigating.
> 
> Luckily the warning is harmless in this case as musb and cppi41 are
> in the same device so the common parent is powered and cppi41 is
> getting clocked.

Sorry actually it's after these fixes when we still need to implement
something for cppi41 PM runtime usecount that makes sense as the
calls will finally be paired. For testing, reverting 098de42ad670
("dmaengine: cppi41: Fix unpaired pm runtime when only a USB hub
is connected") can be done. But I don't like that as it's still
unpaired pm_runtime_calls potentially if something goes wrong.

Anyways, for the -rc series oops, we can just leave out the WARN_ON
parts for now until drivers/usb/musb/musb_cppi41.c is fixed too.

Regards,

Tony


> 8< ---------------------------
> diff --git a/drivers/dma/cppi41.c b/drivers/dma/cppi41.c
> --- a/drivers/dma/cppi41.c
> +++ b/drivers/dma/cppi41.c
> @@ -705,19 +705,31 @@ static int cppi41_stop_chan(struct dma_chan *chan)
>  	u32 desc_phys;
>  	int ret;
>  
> +	ret = pm_runtime_get(cdd->ddev.dev);
> +	if ((ret != -EINPROGRESS) && (ret < 0))
> +		dev_err(cdd->ddev.dev, "%s pm runtime get: %i\n",
> +			__func__, ret);
> +	WARN_ON(cdd->is_suspended);
> +
>  	desc_phys = lower_32_bits(c->desc_phys);
>  	desc_num = (desc_phys - cdd->descs_phys) / sizeof(struct cppi41_desc);
> -	if (!cdd->chan_busy[desc_num])
> -		return 0;
> +	if (!cdd->chan_busy[desc_num]) {
> +		ret = 0;
> +		goto out;
> +	}
>  
>  	ret = cppi41_tear_down_chan(c);
>  	if (ret)
> -		return ret;
> +		goto out;
>  
>  	WARN_ON(!cdd->chan_busy[desc_num]);
>  	cdd->chan_busy[desc_num] = NULL;
>  
> -	return 0;
> +out:
> +	pm_runtime_mark_last_busy(cdd->ddev.dev);
> +	pm_runtime_put_autosuspend(cdd->ddev.dev);
> +
> +	return ret;
>  }
>  
>  static void cleanup_chans(struct cppi41_dd *cdd)
> diff --git a/drivers/usb/musb/musb_cppi41.c b/drivers/usb/musb/musb_cppi41.c
> --- a/drivers/usb/musb/musb_cppi41.c
> +++ b/drivers/usb/musb/musb_cppi41.c
> @@ -445,9 +445,16 @@ static struct dma_channel *cppi41_dma_channel_allocate(struct dma_controller *c,
>  static void cppi41_dma_channel_release(struct dma_channel *channel)
>  {
>  	struct cppi41_dma_channel *cppi41_channel = channel->private_data;
> +	struct musb_hw_ep *hw_ep = cppi41_channel->hw_ep;
> +	int error;
>  
>  	trace_musb_cppi41_free(cppi41_channel);
>  	if (cppi41_channel->is_allocated) {
> +		error = dmaengine_terminate_all(cppi41_channel->dc);
> +		if (error)
> +			dev_err(hw_ep->musb->controller,
> +				"dma terminate failed: %i\n",
> +				error);
>  		cppi41_channel->is_allocated = 0;
>  		channel->status = MUSB_DMA_STATUS_FREE;
>  		channel->actual_len = 0;
> --
> To unsubscribe from this list: send the line "unsubscribe linux-omap" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHv2] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
       [not found]                                 ` <20170116235428.GG7403-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
@ 2017-01-17 12:59                                   ` Bin Liu
  2017-01-17 16:11                                     ` Tony Lindgren
  0 siblings, 1 reply; 17+ messages in thread
From: Bin Liu @ 2017-01-17 12:59 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Grygorii Strashko, Dan Williams, Vinod Koul, Daniel Mack,
	Felipe Balbi, Johan Hovold, Peter Ujfalusi, Sekhar Nori,
	Sebastian Andrzej Siewior, dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov

Tony,

On Mon, Jan 16, 2017 at 03:54:29PM -0800, Tony Lindgren wrote:
> * Tony Lindgren <tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org> [170116 15:36]:
> > * Tony Lindgren <tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org> [170113 14:00]:
> > > * Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> [170113 13:37]:
> > > > > Simplified diff with fix on top of your patch:
> > > > > 
> > > > > diff --git a/drivers/dma/cppi41.c b/drivers/dma/cppi41.c
> > > > > index ce37a1a..9e9403a 100644
> > > > > --- a/drivers/dma/cppi41.c
> > > > > +++ b/drivers/dma/cppi41.c
> > > > > @@ -319,12 +319,6 @@ static irqreturn_t cppi41_irq(int irq, void *data)
> > > > >  
> > > > >  		while (val) {
> > > > >  			u32 desc, len;
> > > > > -			int error;
> > > > > -
> > > > > -			error = pm_runtime_get(cdd->ddev.dev);
> > > > > -			if ((error != -EINPROGRESS) && (error < 0))
> > > > > -				dev_err(cdd->ddev.dev, "%s pm runtime get: %i\n",
> > > > > -					__func__, error);
> > > > >  
> > > > >  			/* This warning should never trigger */
> > > > >  			WARN_ON(cdd->is_suspended);
> > > > > @@ -500,7 +494,6 @@ static void cppi41_dma_issue_pending(struct dma_chan *chan)
> > > > >  	spin_unlock_irqrestore(&cdd->lock, flags);
> > > > >  
> > > > >  	pm_runtime_mark_last_busy(cdd->ddev.dev);
> > > > > -	pm_runtime_put_autosuspend(cdd->ddev.dev);
> > > > >  }
> > > > >  
> > > > >  static u32 get_host_pd0(u32 length)
> > > > > 
> > > > 
> > > > Ok. this is incorrect in case of USB hub and will just hide the problem
> > > > by incrementing PM runtime usage counter every time USB host is connected :(
> > > 
> > > Yeah if anything changes in those two nested loops, the pm_runtime counts
> > > get unbalanced.
> > > 
> > > > Once USB hub is connected the PM runtime usage counter will became 1 and stay
> > > > 1 after disconnection. Which means that some descriptor was pushed in Q, but IRQ
> > > > was not triggered.
> > > > 
> > > > Don't know how to proceed as I'm not so familiar with musb :(
> > > 
> > > I'm now playing with saving the queue manager value and kicking the
> > > PM runtime if we have transfers in progress. Looks like the dma
> > > registers are zero while there are transfers in progress, or at least
> > > I have not yet found any hardware registers that would tell that.
> > 
> > Looks like drivers/usb/musb/musb_cppi41.c is not properly terminating
> > dma transactions.. The following additional patch makes things behave
> > without warnings for me.
> > 
> > I'll fold the drivers/dma/cppi41.c changes to $subject patch and repost,
> > then will post a separate musb fix for drivers/usb/musb/musb_cppi41.c
> > that avoids the warning after some more investigating.
> > 
> > Luckily the warning is harmless in this case as musb and cppi41 are
> > in the same device so the common parent is powered and cppi41 is
> > getting clocked.
> 
> Sorry actually it's after these fixes when we still need to implement
> something for cppi41 PM runtime usecount that makes sense as the
> calls will finally be paired. For testing, reverting 098de42ad670
> ("dmaengine: cppi41: Fix unpaired pm runtime when only a USB hub
> is connected") can be done. But I don't like that as it's still
> unpaired pm_runtime_calls potentially if something goes wrong.
> 
> Anyways, for the -rc series oops, we can just leave out the WARN_ON
> parts for now until drivers/usb/musb/musb_cppi41.c is fixed too.

Giving that cppi is a submodule inside the usb subsysytem and it does't
have separate power rail or clock, what is the benefit to adding runtime
PM in the cppi driver?

Thanks,
-Bin.
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHv2] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
  2017-01-17 12:59                                   ` Bin Liu
@ 2017-01-17 16:11                                     ` Tony Lindgren
       [not found]                                       ` <20170117161138.GJ7403-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Tony Lindgren @ 2017-01-17 16:11 UTC (permalink / raw)
  To: Bin Liu, Grygorii Strashko, Dan Williams, Vinod Koul, Daniel Mack,
	Felipe Balbi, Johan Hovold, Peter Ujfalusi, Sekhar Nori,
	Sebastian Andrzej Siewior, dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov

* Bin Liu <b-liu-l0cyMroinI0@public.gmane.org> [170117 05:00]:
> On Mon, Jan 16, 2017 at 03:54:29PM -0800, Tony Lindgren wrote:
> > Anyways, for the -rc series oops, we can just leave out the WARN_ON
> > parts for now until drivers/usb/musb/musb_cppi41.c is fixed too.
> 
> Giving that cppi is a submodule inside the usb subsysytem and it does't
> have separate power rail or clock, what is the benefit to adding runtime
> PM in the cppi driver?

Good question. We need at least minimal support to enable things for
probe and then idle cppi41 properly if only cppi41.ko is loaded with no
USB modules.

But yeah now that musb does runtime PM based on the cable detection, we
pretty much guarantee that cppi41 is always enabled when USB is in use.

And if there are no other devices using cppi41 dma on davinci, we can
simplify the PM runtime a bit for cppi41.

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHv2] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
       [not found]                                       ` <20170117161138.GJ7403-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
@ 2017-01-17 16:21                                         ` Bin Liu
  2017-01-17 16:31                                           ` Tony Lindgren
  0 siblings, 1 reply; 17+ messages in thread
From: Bin Liu @ 2017-01-17 16:21 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Grygorii Strashko, Dan Williams, Vinod Koul, Daniel Mack,
	Felipe Balbi, Johan Hovold, Peter Ujfalusi, Sekhar Nori,
	Sebastian Andrzej Siewior, dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov

On Tue, Jan 17, 2017 at 08:11:39AM -0800, Tony Lindgren wrote:
> * Bin Liu <b-liu-l0cyMroinI0@public.gmane.org> [170117 05:00]:
> > On Mon, Jan 16, 2017 at 03:54:29PM -0800, Tony Lindgren wrote:
> > > Anyways, for the -rc series oops, we can just leave out the WARN_ON
> > > parts for now until drivers/usb/musb/musb_cppi41.c is fixed too.
> > 
> > Giving that cppi is a submodule inside the usb subsysytem and it does't
> > have separate power rail or clock, what is the benefit to adding runtime
> > PM in the cppi driver?
> 
> Good question. We need at least minimal support to enable things for
> probe and then idle cppi41 properly if only cppi41.ko is loaded with no
> USB modules.
> 
> But yeah now that musb does runtime PM based on the cable detection, we
> pretty much guarantee that cppi41 is always enabled when USB is in use.
> 
> And if there are no other devices using cppi41 dma on davinci, we can
> simplify the PM runtime a bit for cppi41.

This might be a good idea. I didn't have much time to play with this
cppi41 runtime PM, but it seems I am having more issues than you and
others seeing.

Regards,
-Bin.
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHv2] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
  2017-01-17 16:21                                         ` Bin Liu
@ 2017-01-17 16:31                                           ` Tony Lindgren
       [not found]                                             ` <20170117163103.GO7403-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Tony Lindgren @ 2017-01-17 16:31 UTC (permalink / raw)
  To: Bin Liu, Grygorii Strashko, Dan Williams, Vinod Koul, Daniel Mack,
	Felipe Balbi, Johan Hovold, Peter Ujfalusi, Sekhar Nori,
	Sebastian Andrzej Siewior, dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov

* Bin Liu <b-liu-l0cyMroinI0@public.gmane.org> [170117 08:22]:
> On Tue, Jan 17, 2017 at 08:11:39AM -0800, Tony Lindgren wrote:
> > * Bin Liu <b-liu-l0cyMroinI0@public.gmane.org> [170117 05:00]:
> > > On Mon, Jan 16, 2017 at 03:54:29PM -0800, Tony Lindgren wrote:
> > > > Anyways, for the -rc series oops, we can just leave out the WARN_ON
> > > > parts for now until drivers/usb/musb/musb_cppi41.c is fixed too.
> > > 
> > > Giving that cppi is a submodule inside the usb subsysytem and it does't
> > > have separate power rail or clock, what is the benefit to adding runtime
> > > PM in the cppi driver?
> > 
> > Good question. We need at least minimal support to enable things for
> > probe and then idle cppi41 properly if only cppi41.ko is loaded with no
> > USB modules.
> > 
> > But yeah now that musb does runtime PM based on the cable detection, we
> > pretty much guarantee that cppi41 is always enabled when USB is in use.
> > 
> > And if there are no other devices using cppi41 dma on davinci, we can
> > simplify the PM runtime a bit for cppi41.
> 
> This might be a good idea. I didn't have much time to play with this
> cppi41 runtime PM, but it seems I am having more issues than you and
> others seeing.

What kind of additional issues are you seeing not described in the $subject
patch?

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHv2] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
       [not found]                                             ` <20170117163103.GO7403-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
@ 2017-01-17 16:48                                               ` Bin Liu
  2017-01-17 17:39                                                 ` Tony Lindgren
  0 siblings, 1 reply; 17+ messages in thread
From: Bin Liu @ 2017-01-17 16:48 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Grygorii Strashko, Dan Williams, Vinod Koul, Daniel Mack,
	Felipe Balbi, Johan Hovold, Peter Ujfalusi, Sekhar Nori,
	Sebastian Andrzej Siewior, dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov

On Tue, Jan 17, 2017 at 08:31:03AM -0800, Tony Lindgren wrote:
> * Bin Liu <b-liu-l0cyMroinI0@public.gmane.org> [170117 08:22]:
> > On Tue, Jan 17, 2017 at 08:11:39AM -0800, Tony Lindgren wrote:
> > > * Bin Liu <b-liu-l0cyMroinI0@public.gmane.org> [170117 05:00]:
> > > > On Mon, Jan 16, 2017 at 03:54:29PM -0800, Tony Lindgren wrote:
> > > > > Anyways, for the -rc series oops, we can just leave out the WARN_ON
> > > > > parts for now until drivers/usb/musb/musb_cppi41.c is fixed too.
> > > > 
> > > > Giving that cppi is a submodule inside the usb subsysytem and it does't
> > > > have separate power rail or clock, what is the benefit to adding runtime
> > > > PM in the cppi driver?
> > > 
> > > Good question. We need at least minimal support to enable things for
> > > probe and then idle cppi41 properly if only cppi41.ko is loaded with no
> > > USB modules.
> > > 
> > > But yeah now that musb does runtime PM based on the cable detection, we
> > > pretty much guarantee that cppi41 is always enabled when USB is in use.
> > > 
> > > And if there are no other devices using cppi41 dma on davinci, we can
> > > simplify the PM runtime a bit for cppi41.
> > 
> > This might be a good idea. I didn't have much time to play with this
> > cppi41 runtime PM, but it seems I am having more issues than you and
> > others seeing.
> 
> What kind of additional issues are you seeing not described in the $subject
> patch?

I didn't take a note and don't remember if those are in the $subject
patch. But

- enumeration begining with a reset that the device doesn't accept
  address X, error code -71; or

- console fooding with cppi error code -115 after thumb drive enumeration.

Again, I only tried for a few minutes and didn't take a note, since I
don't have time to look at this ATM.

Regards,
-Bin.
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHv2] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
  2017-01-17 16:48                                               ` Bin Liu
@ 2017-01-17 17:39                                                 ` Tony Lindgren
       [not found]                                                   ` <20170117173922.GR7403-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Tony Lindgren @ 2017-01-17 17:39 UTC (permalink / raw)
  To: Bin Liu, Grygorii Strashko, Dan Williams, Vinod Koul, Daniel Mack,
	Felipe Balbi, Johan Hovold, Peter Ujfalusi, Sekhar Nori,
	Sebastian Andrzej Siewior, dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov

* Bin Liu <b-liu-l0cyMroinI0@public.gmane.org> [170117 08:49]:
> On Tue, Jan 17, 2017 at 08:31:03AM -0800, Tony Lindgren wrote:
> > * Bin Liu <b-liu-l0cyMroinI0@public.gmane.org> [170117 08:22]:
> > > On Tue, Jan 17, 2017 at 08:11:39AM -0800, Tony Lindgren wrote:
> > > > * Bin Liu <b-liu-l0cyMroinI0@public.gmane.org> [170117 05:00]:
> > > > > On Mon, Jan 16, 2017 at 03:54:29PM -0800, Tony Lindgren wrote:
> > > > > > Anyways, for the -rc series oops, we can just leave out the WARN_ON
> > > > > > parts for now until drivers/usb/musb/musb_cppi41.c is fixed too.
> > > > > 
> > > > > Giving that cppi is a submodule inside the usb subsysytem and it does't
> > > > > have separate power rail or clock, what is the benefit to adding runtime
> > > > > PM in the cppi driver?
> > > > 
> > > > Good question. We need at least minimal support to enable things for
> > > > probe and then idle cppi41 properly if only cppi41.ko is loaded with no
> > > > USB modules.
> > > > 
> > > > But yeah now that musb does runtime PM based on the cable detection, we
> > > > pretty much guarantee that cppi41 is always enabled when USB is in use.
> > > > 
> > > > And if there are no other devices using cppi41 dma on davinci, we can
> > > > simplify the PM runtime a bit for cppi41.
> > > 
> > > This might be a good idea. I didn't have much time to play with this
> > > cppi41 runtime PM, but it seems I am having more issues than you and
> > > others seeing.
> > 
> > What kind of additional issues are you seeing not described in the $subject
> > patch?
> 
> I didn't take a note and don't remember if those are in the $subject
> patch. But
> 
> - enumeration begining with a reset that the device doesn't accept
>   address X, error code -71; or

Some of this could be fixed by $subject patch if caused by a race.
Some of it I'm suspecting might be a different issue where cppi41 dma
will just hang until the device is disconnected on 1 sized in dma transfer.
See the experimental patch below.

> - console fooding with cppi error code -115 after thumb drive enumeration.

This should get fixed with the $subject patch if we additionally set
the autosuspend_delay to something sufficient, like 1000.

> Again, I only tried for a few minutes and didn't take a note, since I
> don't have time to look at this ATM.

Well I'll post what I think we should fix for the -rc series for cppi41.
If you can then please test that a bit and see if it works. Assuming
things work, then all the other changes can be done later on with no
rush.

Regards,

Tony

8< ----------------------------------------
diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c
--- a/drivers/usb/musb/musb_host.c
+++ b/drivers/usb/musb/musb_host.c
@@ -751,6 +751,15 @@ static void musb_ep_program(struct musb *musb, u8 epnum,
 		hw_ep->tx_channel = NULL;
 	}
 
+	/*
+	 * At least cppi41 in dma will just hang with size of 1 until the
+	 * device is connected. For sizes less than 8 it seems to take a
+	 * long time to complete. Let's use minimum size of 16 to avoid
+	 * tiny in DMA transfers.
+	 */
+	if (!is_out && (len < 16))
+		use_dma = 0;
+
 	/* candidate for DMA? */
 	dma_controller = musb->dma_controller;
 	if (use_dma && is_dma_capable() && epnum && dma_controller) {
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHv2] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
       [not found]                                                   ` <20170117173922.GR7403-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
@ 2017-01-17 19:46                                                     ` Bin Liu
  2017-01-17 20:40                                                       ` Tony Lindgren
  0 siblings, 1 reply; 17+ messages in thread
From: Bin Liu @ 2017-01-17 19:46 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Grygorii Strashko, Dan Williams, Vinod Koul, Daniel Mack,
	Felipe Balbi, Johan Hovold, Peter Ujfalusi, Sekhar Nori,
	Sebastian Andrzej Siewior, dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov

On Tue, Jan 17, 2017 at 09:39:23AM -0800, Tony Lindgren wrote:
> * Bin Liu <b-liu-l0cyMroinI0@public.gmane.org> [170117 08:49]:
> > On Tue, Jan 17, 2017 at 08:31:03AM -0800, Tony Lindgren wrote:
> > > * Bin Liu <b-liu-l0cyMroinI0@public.gmane.org> [170117 08:22]:
> > > > On Tue, Jan 17, 2017 at 08:11:39AM -0800, Tony Lindgren wrote:
> > > > > * Bin Liu <b-liu-l0cyMroinI0@public.gmane.org> [170117 05:00]:
> > > > > > On Mon, Jan 16, 2017 at 03:54:29PM -0800, Tony Lindgren wrote:
> > > > > > > Anyways, for the -rc series oops, we can just leave out the WARN_ON
> > > > > > > parts for now until drivers/usb/musb/musb_cppi41.c is fixed too.
> > > > > > 
> > > > > > Giving that cppi is a submodule inside the usb subsysytem and it does't
> > > > > > have separate power rail or clock, what is the benefit to adding runtime
> > > > > > PM in the cppi driver?
> > > > > 
> > > > > Good question. We need at least minimal support to enable things for
> > > > > probe and then idle cppi41 properly if only cppi41.ko is loaded with no
> > > > > USB modules.
> > > > > 
> > > > > But yeah now that musb does runtime PM based on the cable detection, we
> > > > > pretty much guarantee that cppi41 is always enabled when USB is in use.
> > > > > 
> > > > > And if there are no other devices using cppi41 dma on davinci, we can
> > > > > simplify the PM runtime a bit for cppi41.
> > > > 
> > > > This might be a good idea. I didn't have much time to play with this
> > > > cppi41 runtime PM, but it seems I am having more issues than you and
> > > > others seeing.
> > > 
> > > What kind of additional issues are you seeing not described in the $subject
> > > patch?
> > 
> > I didn't take a note and don't remember if those are in the $subject
> > patch. But
> > 
> > - enumeration begining with a reset that the device doesn't accept
> >   address X, error code -71; or
> 
> Some of this could be fixed by $subject patch if caused by a race.
> Some of it I'm suspecting might be a different issue where cppi41 dma
> will just hang until the device is disconnected on 1 sized in dma transfer.
> See the experimental patch below.
> 
> > - console fooding with cppi error code -115 after thumb drive enumeration.
> 
> This should get fixed with the $subject patch if we additionally set
> the autosuspend_delay to something sufficient, like 1000.
> 
> > Again, I only tried for a few minutes and didn't take a note, since I
> > don't have time to look at this ATM.
> 
> Well I'll post what I think we should fix for the -rc series for cppi41.
> If you can then please test that a bit and see if it works. Assuming
> things work, then all the other changes can be done later on with no
> rush.

Sure, I will spend some time on this tomorrow morning, and let you know.

> 
> Regards,
> 
> Tony
> 
> 8< ----------------------------------------
> diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c
> --- a/drivers/usb/musb/musb_host.c
> +++ b/drivers/usb/musb/musb_host.c
> @@ -751,6 +751,15 @@ static void musb_ep_program(struct musb *musb, u8 epnum,
>  		hw_ep->tx_channel = NULL;
>  	}
>  
> +	/*
> +	 * At least cppi41 in dma will just hang with size of 1 until the
> +	 * device is connected. For sizes less than 8 it seems to take a
> +	 * long time to complete. Let's use minimum size of 16 to avoid
> +	 * tiny in DMA transfers.
> +	 */
> +	if (!is_out && (len < 16))
> +		use_dma = 0;

I think you also need to add more cleanup in here when not use dma,
as in musb_ep_program():

746         if (is_out && !len) {                                                   
747                 use_dma = 0;                                                    
748                 csr = musb_readw(epio, MUSB_TXCSR);                             
749                 csr &= ~MUSB_TXCSR_DMAENAB;                                     
750                 musb_writew(epio, MUSB_TXCSR, csr);                             
751                 hw_ep->tx_channel = NULL;                                       
752         }

Regards,
-Bin.
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCHv2] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
  2017-01-17 19:46                                                     ` Bin Liu
@ 2017-01-17 20:40                                                       ` Tony Lindgren
  0 siblings, 0 replies; 17+ messages in thread
From: Tony Lindgren @ 2017-01-17 20:40 UTC (permalink / raw)
  To: Bin Liu, Grygorii Strashko, Dan Williams, Vinod Koul, Daniel Mack,
	Felipe Balbi, Johan Hovold, Peter Ujfalusi, Sekhar Nori,
	Sebastian Andrzej Siewior, dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov

* Bin Liu <b-liu-l0cyMroinI0@public.gmane.org> [170117 11:47]:
> On Tue, Jan 17, 2017 at 09:39:23AM -0800, Tony Lindgren wrote:
> > * Bin Liu <b-liu-l0cyMroinI0@public.gmane.org> [170117 08:49]:
> > > On Tue, Jan 17, 2017 at 08:31:03AM -0800, Tony Lindgren wrote:
> > > > * Bin Liu <b-liu-l0cyMroinI0@public.gmane.org> [170117 08:22]:
> > > > > On Tue, Jan 17, 2017 at 08:11:39AM -0800, Tony Lindgren wrote:
> > > > > > * Bin Liu <b-liu-l0cyMroinI0@public.gmane.org> [170117 05:00]:
> > > > > > > On Mon, Jan 16, 2017 at 03:54:29PM -0800, Tony Lindgren wrote:
> > > > > > > > Anyways, for the -rc series oops, we can just leave out the WARN_ON
> > > > > > > > parts for now until drivers/usb/musb/musb_cppi41.c is fixed too.
> > > > > > > 
> > > > > > > Giving that cppi is a submodule inside the usb subsysytem and it does't
> > > > > > > have separate power rail or clock, what is the benefit to adding runtime
> > > > > > > PM in the cppi driver?
> > > > > > 
> > > > > > Good question. We need at least minimal support to enable things for
> > > > > > probe and then idle cppi41 properly if only cppi41.ko is loaded with no
> > > > > > USB modules.
> > > > > > 
> > > > > > But yeah now that musb does runtime PM based on the cable detection, we
> > > > > > pretty much guarantee that cppi41 is always enabled when USB is in use.
> > > > > > 
> > > > > > And if there are no other devices using cppi41 dma on davinci, we can
> > > > > > simplify the PM runtime a bit for cppi41.
> > > > > 
> > > > > This might be a good idea. I didn't have much time to play with this
> > > > > cppi41 runtime PM, but it seems I am having more issues than you and
> > > > > others seeing.
> > > > 
> > > > What kind of additional issues are you seeing not described in the $subject
> > > > patch?
> > > 
> > > I didn't take a note and don't remember if those are in the $subject
> > > patch. But
> > > 
> > > - enumeration begining with a reset that the device doesn't accept
> > >   address X, error code -71; or
> > 
> > Some of this could be fixed by $subject patch if caused by a race.
> > Some of it I'm suspecting might be a different issue where cppi41 dma
> > will just hang until the device is disconnected on 1 sized in dma transfer.
> > See the experimental patch below.
> > 
> > > - console fooding with cppi error code -115 after thumb drive enumeration.
> > 
> > This should get fixed with the $subject patch if we additionally set
> > the autosuspend_delay to something sufficient, like 1000.
> > 
> > > Again, I only tried for a few minutes and didn't take a note, since I
> > > don't have time to look at this ATM.
> > 
> > Well I'll post what I think we should fix for the -rc series for cppi41.
> > If you can then please test that a bit and see if it works. Assuming
> > things work, then all the other changes can be done later on with no
> > rush.
> 
> Sure, I will spend some time on this tomorrow morning, and let you know.
> 
> > 
> > Regards,
> > 
> > Tony
> > 
> > 8< ----------------------------------------
> > diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c
> > --- a/drivers/usb/musb/musb_host.c
> > +++ b/drivers/usb/musb/musb_host.c
> > @@ -751,6 +751,15 @@ static void musb_ep_program(struct musb *musb, u8 epnum,
> >  		hw_ep->tx_channel = NULL;
> >  	}
> >  
> > +	/*
> > +	 * At least cppi41 in dma will just hang with size of 1 until the
> > +	 * device is connected. For sizes less than 8 it seems to take a
> > +	 * long time to complete. Let's use minimum size of 16 to avoid
> > +	 * tiny in DMA transfers.
> > +	 */
> > +	if (!is_out && (len < 16))
> > +		use_dma = 0;
> 
> I think you also need to add more cleanup in here when not use dma,
> as in musb_ep_program():
> 
> 746         if (is_out && !len) {                                                   
> 747                 use_dma = 0;                                                    
> 748                 csr = musb_readw(epio, MUSB_TXCSR);                             
> 749                 csr &= ~MUSB_TXCSR_DMAENAB;                                     
> 750                 musb_writew(epio, MUSB_TXCSR, csr);                             
> 751                 hw_ep->tx_channel = NULL;                                       
> 752         }

Oh OK. Yeah this one is in the mysteries to unravel category. And I noticed
the -71 resets with another USB mass storage device I tried with this patch
so clearly not the right fix for that one.

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2017-01-17 20:40 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-01-13 18:01 [PATCHv2] dmaengine: cppi41: Fix oops in cppi41_runtime_resume Tony Lindgren
     [not found] ` <20170113180132.9188-1-tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
2017-01-13 18:36   ` Grygorii Strashko
     [not found]     ` <d6fddff7-3d00-1de1-ebbf-ee2a949a6284-l0cyMroinI0@public.gmane.org>
2017-01-13 19:01       ` Tony Lindgren
     [not found]         ` <20170113190126.GE2630-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
2017-01-13 19:53           ` Grygorii Strashko
     [not found]             ` <c80b51a2-1acd-914c-17b8-32754ea9ce3e-l0cyMroinI0@public.gmane.org>
2017-01-13 19:57               ` Grygorii Strashko
     [not found]                 ` <c58cebdf-5752-098a-8b3e-de99f6af14af-l0cyMroinI0@public.gmane.org>
2017-01-13 21:36                   ` Grygorii Strashko
     [not found]                     ` <c1bbb731-940e-6d04-f127-222050d831b8-l0cyMroinI0@public.gmane.org>
2017-01-13 21:59                       ` Tony Lindgren
     [not found]                         ` <20170113215936.GF2630-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
2017-01-16 23:33                           ` Tony Lindgren
     [not found]                             ` <20170116233329.GF7403-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
2017-01-16 23:54                               ` Tony Lindgren
     [not found]                                 ` <20170116235428.GG7403-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
2017-01-17 12:59                                   ` Bin Liu
2017-01-17 16:11                                     ` Tony Lindgren
     [not found]                                       ` <20170117161138.GJ7403-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
2017-01-17 16:21                                         ` Bin Liu
2017-01-17 16:31                                           ` Tony Lindgren
     [not found]                                             ` <20170117163103.GO7403-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
2017-01-17 16:48                                               ` Bin Liu
2017-01-17 17:39                                                 ` Tony Lindgren
     [not found]                                                   ` <20170117173922.GR7403-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
2017-01-17 19:46                                                     ` Bin Liu
2017-01-17 20:40                                                       ` Tony Lindgren

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).