linux-usb.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* dvb usb issues since kernel 4.9
@ 2018-01-08  9:43 Mauro Carvalho Chehab
  0 siblings, 0 replies; 11+ messages in thread
From: Mauro Carvalho Chehab @ 2018-01-08  9:43 UTC (permalink / raw)
  To: Alan Stern
  Cc: Josef Griebichler, Greg Kroah-Hartman, linux-usb, Eric Dumazet,
	Rik van Riel, Paolo Abeni, Hannes Frederic Sowa,
	Jesper Dangaard Brouer, linux-kernel, netdev, Jonathan Corbet,
	LMML, Peter Zijlstra, David Miller, torvalds

Em Sun, 7 Jan 2018 10:41:37 -0500 (EST)
Alan Stern <stern@rowland.harvard.edu> escreveu:

> On Sun, 7 Jan 2018, Mauro Carvalho Chehab wrote:
> 
> > > > It seems that the original patch were designed to solve some IRQ issues
> > > > with network cards with causes data losses on high traffic. However,
> > > > it is also causing bad effects on sustained high bandwidth demands
> > > > required by DVB cards, at least on some USB host drivers.
> > > > 
> > > > Alan/Greg/Eric/David:
> > > > 
> > > > Any ideas about how to fix it without causing regressions to
> > > > network?    
> > > 
> > > It would be good to know what hardware was involved on the x86 system
> > > and to have some timing data.  Can we see the output from lsusb and
> > > usbmon, running on a vanilla kernel that gets plenty of video glitches?  
> > 
> > From Josef's report, and from the BZ, the affected hardware seems
> > to be based on Montage Technology M88DS3103/M88TS2022 chipset.  
> 
> What type of USB host controller does the x86_64 system use?  EHCI or
> xHCI?

I'll let Josef answer this.

> 
> > The driver it uses is at drivers/media/usb/dvb-usb-v2/dvbsky.c,
> > with shares a USB implementation that is used by a lot more drivers.
> > The URB handling code is at:
> > 
> > 	drivers/media/usb/dvb-usb-v2/usb_urb.c
> > 
> > This particular driver allocates 8 buffers with 4096 bytes each
> > for bulk transfers, using transfer_flags = URB_NO_TRANSFER_DMA_MAP.
> > 
> > This become a popular USB hardware nowadays. I have one S960c
> > myself, so I can send you the lsusb from it. You should notice, however,
> > that a DVB-C/DVB-S2 channel can easily provide very high sustained bit
> > rates. Here, on my DVB-S2 provider, a typical transponder produces 58 Mpps
> > of payload after removing URB headers.  
> 
> You mentioned earlier that the driver uses bulk transfers.  In USB-2.0,
> the maximum possible payload data transfer rate using bulk transfers is
> 53248 bytes/ms, which is 53.248 MB/s (i.e., lower than 58 MB/s).  And
> even this is possible only if almost nothing else is using the bus at
> the same time.

No, I said 58 Mbits/s (not bytes).

On DVB-C and DVB-S2 specs, AFAIKT, there's no hard limit for the maximum
payload data rate, although industry seems to limit it to be around
60 Mbits/s. On those standards, the maximal bit rate is defined by the
modulation type and by the channel symbol rate.

To give you a practical example, my DVB-S2 provider modulates each
transponder with 8/PSK (3 bits/symbol), and define channels with a
symbol rate of 30 Mbauds/s. So, it could, theoretically, transport
a MPEG-TS stream up to 90 Mbits/s (minus headers and guard intervals).
In practice, the streams there are transmitted with 58,026.5 Kbits/s.

> > A 10 minutes record with the
> > entire data (with typically contains 5-10 channels) can easily go
> > above 4 GB, just to reproduce 1-2 glitches. So, I'm not sure if
> > a usbmon dump would be useful.  
> 
> It might not be helpful at all.  However, I'm not interested in the 
> payload data (which would be unintelligible to me anyway) but rather 
> the timing of URB submissions and completions.  A usbmon trace which 
> didn't keep much of the payload data would only require on the order of 
> 50 MB per minute -- and Josef said that glitches usually would show up 
> within a minute or so.

Yeah, this could help.

Josef,

You can get it with wireshark/tshark or tcpdump. See:
	https://technolinchpin.wordpress.com/2015/10/23/usb-bus-sniffers-for-linux-system/
	https://wiki.wireshark.org/CaptureSetup/USB

> > I'm enclosing the lsusb from a S960C device, with is based on those
> > Montage chipsets:  
> 
> What I wanted to see was the output from "lsusb" on the affected
> system, not the output from "lsusb -v -s B:D" on your system.
> 
> > > Overall, this may be a very difficult problem to solve.  The
> > > 4cd13c21b207 commit was intended to improve throughput at the cost of
> > > increased latency.  But then what do you do when the latency becomes
> > > too high for the video subsystem to handle?  
> > 
> > Latency can't be too high, otherwise frames will be dropped.  
> 
> Yes, that's the whole point.
> 
> > Even if the Kernel itself doesn't drop, if the delay goes higher
> > than a certain threshold, userspace will need to drop, as it
> > should be presenting audio and video on real time. Yet, typically,
> > userspace will delay it by one or two seconds, with would mean
> > 1500-3500 buffers, with I suspect it is a lot more than the hardware
> > limits. So I suspect that the hardware starves free buffers a way 
> > before userspace, as media hardware don't have unlimited buffers
> > inside them, as they assume that the Kernel/userspace will be fast
> > enough to sustain bit rates up to 66 Mbps of payload.  
> 
> The timing information would tell us how large the latency is.
> 
> In any case, you might be able to attack the problem simply by using
> more than 8 buffers.  With just eight 4096-byte buffers, the total
> pipeline capacity is only about 0.62 ms (at the maximum possible
> transfer rate).  Increasing the number of buffers to 65 would give a
> capacity of 5 ms, which is probably a lot better suited for situations
> where completions are handled by the ksoftirqd thread.

Increasing it to 65 shouldn't be hard. Not sure, however, if the hardware
will actually fill the 65 buffers, but it is worth to try.

> > Perhaps media drivers could pass some quirk similar to URB_ISO_ASAP,
> > in order to revert the kernel logic to prioritize latency instead of
> > throughput.  
> 
> It can't be done without pervasive changes to the USB subsystem, which
> I would greatly prefer to avoid.  Besides, this wouldn't really solve
> the problem.  Decreasing the latency for one device will cause it to be
> increased for others.

If there is a TV streaming traffic at a USB bus, it means that the
user wants to either watch and/or record a TV program. On such
usecase scenario, a low latency is highly desired for the TV capture
(and display, if the GPU is USB), even it means a higher latency for
other traffic.

Josef,

Could you please try the following patch on Kernel 4.14.10 (without
reverting any changesets), and see if it fixes the issue?


media: dvbsky: Increase the number of buffers

Right now, This driver expects a 0.62 ms delay with 8 buffers on an USB 2.0
high speed bus. Increase it to 65 buffers, in order to give more time for
the top half of the USB transfer handler to complete its task.

Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>



> 



Thanks,
Mauro
---
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

diff --git a/drivers/media/usb/dvb-usb-v2/dvbsky.c b/drivers/media/usb/dvb-usb-v2/dvbsky.c
index 131b6c08e199..d3f5ffc54b25 100644
--- a/drivers/media/usb/dvb-usb-v2/dvbsky.c
+++ b/drivers/media/usb/dvb-usb-v2/dvbsky.c
@@ -740,7 +740,7 @@ static struct dvb_usb_device_properties dvbsky_s960_props = {
 	.num_adapters = 1,
 	.adapter = {
 		{
-			.stream = DVB_USB_STREAM_BULK(0x82, 8, 4096),
+			.stream = DVB_USB_STREAM_BULK(0x82, 65, 4096),
 		}
 	}
 };

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* dvb usb issues since kernel 4.9
@ 2018-01-08 16:10 Alan Stern
  0 siblings, 0 replies; 11+ messages in thread
From: Alan Stern @ 2018-01-08 16:10 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Josef Griebichler, Greg Kroah-Hartman, linux-usb, Eric Dumazet,
	Rik van Riel, Paolo Abeni, Hannes Frederic Sowa,
	Jesper Dangaard Brouer, linux-kernel, netdev, Jonathan Corbet,
	LMML, Peter Zijlstra, David Miller, torvalds

On Mon, 8 Jan 2018, Mauro Carvalho Chehab wrote:

> Em Sun, 7 Jan 2018 10:41:37 -0500 (EST)
> Alan Stern <stern@rowland.harvard.edu> escreveu:
> 
> > On Sun, 7 Jan 2018, Mauro Carvalho Chehab wrote:
> > 
> > > > > It seems that the original patch were designed to solve some IRQ issues
> > > > > with network cards with causes data losses on high traffic. However,
> > > > > it is also causing bad effects on sustained high bandwidth demands
> > > > > required by DVB cards, at least on some USB host drivers.
> > > > > 
> > > > > Alan/Greg/Eric/David:
> > > > > 
> > > > > Any ideas about how to fix it without causing regressions to
> > > > > network?    
> > > > 
> > > > It would be good to know what hardware was involved on the x86 system
> > > > and to have some timing data.  Can we see the output from lsusb and
> > > > usbmon, running on a vanilla kernel that gets plenty of video glitches?  
> > > 
> > > From Josef's report, and from the BZ, the affected hardware seems
> > > to be based on Montage Technology M88DS3103/M88TS2022 chipset.  
> > 
> > What type of USB host controller does the x86_64 system use?  EHCI or
> > xHCI?
> 
> I'll let Josef answer this.
> 
> > 
> > > The driver it uses is at drivers/media/usb/dvb-usb-v2/dvbsky.c,
> > > with shares a USB implementation that is used by a lot more drivers.
> > > The URB handling code is at:
> > > 
> > > 	drivers/media/usb/dvb-usb-v2/usb_urb.c
> > > 
> > > This particular driver allocates 8 buffers with 4096 bytes each
> > > for bulk transfers, using transfer_flags = URB_NO_TRANSFER_DMA_MAP.
> > > 
> > > This become a popular USB hardware nowadays. I have one S960c
> > > myself, so I can send you the lsusb from it. You should notice, however,
> > > that a DVB-C/DVB-S2 channel can easily provide very high sustained bit
> > > rates. Here, on my DVB-S2 provider, a typical transponder produces 58 Mpps
> > > of payload after removing URB headers.  
> > 
> > You mentioned earlier that the driver uses bulk transfers.  In USB-2.0,
> > the maximum possible payload data transfer rate using bulk transfers is
> > 53248 bytes/ms, which is 53.248 MB/s (i.e., lower than 58 MB/s).  And
> > even this is possible only if almost nothing else is using the bus at
> > the same time.
> 
> No, I said 58 Mbits/s (not bytes).

Well, what you actually _wrote_ was "58 Mpps of payload" (see above),
and I couldn't tell how to interpret that.  :-)

58 Mb/s is obviously almost 8 times less than the full USB bus 
bandwidth.

> On DVB-C and DVB-S2 specs, AFAIKT, there's no hard limit for the maximum
> payload data rate, although industry seems to limit it to be around
> 60 Mbits/s. On those standards, the maximal bit rate is defined by the
> modulation type and by the channel symbol rate.
> 
> To give you a practical example, my DVB-S2 provider modulates each
> transponder with 8/PSK (3 bits/symbol), and define channels with a
> symbol rate of 30 Mbauds/s. So, it could, theoretically, transport
> a MPEG-TS stream up to 90 Mbits/s (minus headers and guard intervals).
> In practice, the streams there are transmitted with 58,026.5 Kbits/s.

Okay.  This is 58 Kb/ms or 7.25 KB/ms.  So your scheme of eight 4-KB 
buffers gives a latency of 0.57 ms with a total capacity of 4.5 ms, 
which is a lot better than what I was thinking.

> > In any case, you might be able to attack the problem simply by using
> > more than 8 buffers.  With just eight 4096-byte buffers, the total
> > pipeline capacity is only about 0.62 ms (at the maximum possible
> > transfer rate).  Increasing the number of buffers to 65 would give a
> > capacity of 5 ms, which is probably a lot better suited for situations
> > where completions are handled by the ksoftirqd thread.
> 
> Increasing it to 65 shouldn't be hard. Not sure, however, if the hardware
> will actually fill the 65 buffers, but it is worth to try.

Given the new information, 65 would be overkill.  But going from 8 to 
16 might help.

> > > Perhaps media drivers could pass some quirk similar to URB_ISO_ASAP,
> > > in order to revert the kernel logic to prioritize latency instead of
> > > throughput.  
> > 
> > It can't be done without pervasive changes to the USB subsystem, which
> > I would greatly prefer to avoid.  Besides, this wouldn't really solve
> > the problem.  Decreasing the latency for one device will cause it to be
> > increased for others.
> 
> If there is a TV streaming traffic at a USB bus, it means that the
> user wants to either watch and/or record a TV program. On such
> usecase scenario, a low latency is highly desired for the TV capture
> (and display, if the GPU is USB), even it means a higher latency for
> other traffic.

Not if the other traffic is also a TV capture.  :-)

It might make sense to classify softirq sources as "high priority" or 
"low priority", and only defer the "low priority" work to ksoftirqd.

Alan Stern
---
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* dvb usb issues since kernel 4.9
@ 2018-01-08 19:51 Linus Torvalds
  0 siblings, 0 replies; 11+ messages in thread
From: Linus Torvalds @ 2018-01-08 19:51 UTC (permalink / raw)
  To: Alan Stern
  Cc: Ingo Molnar, Mauro Carvalho Chehab, Josef Griebichler,
	Greg Kroah-Hartman, USB list, Eric Dumazet, Rik van Riel,
	Paolo Abeni, Hannes Frederic Sowa, Jesper Dangaard Brouer,
	linux-kernel, netdev, Jonathan Corbet, LMML, Peter Zijlstra,
	David Miller

On Mon, Jan 8, 2018 at 11:15 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
>
> Both dwc2_hsotg and ehci-hcd use the tasklets embedded in the
> giveback_urb_bh member of struct usb_hcd.  See usb_hcd_giveback_urb()
> in drivers/usb/core/hcd.c; the calls are
>
>         else if (high_prio_bh)
>                 tasklet_hi_schedule(&bh->bh);
>         else
>                 tasklet_schedule(&bh->bh);
>
> As it turns out, high_prio_bh gets set for interrupt and isochronous
> URBs but not for bulk and control URBs.  The DVB driver in question
> uses bulk transfers.

Ok, so we could try out something like the appended?

NOTE! I have not tested this at all. It LooksObvious(tm), but...

                    Linus
kernel/softirq.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 2f5e87f1bae2..97b080956fea 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -79,12 +79,16 @@ static void wakeup_softirqd(void)
 
 /*
  * If ksoftirqd is scheduled, we do not want to process pending softirqs
- * right now. Let ksoftirqd handle this at its own rate, to get fairness.
+ * right now. Let ksoftirqd handle this at its own rate, to get fairness,
+ * unless we're doing some of the synchronous softirqs.
  */
-static bool ksoftirqd_running(void)
+#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
+static bool ksoftirqd_running(unsigned long pending)
 {
 	struct task_struct *tsk = __this_cpu_read(ksoftirqd);
 
+	if (pending & SOFTIRQ_NOW_MASK)
+		return false;
 	return tsk && (tsk->state == TASK_RUNNING);
 }
 
@@ -325,7 +329,7 @@ asmlinkage __visible void do_softirq(void)
 
 	pending = local_softirq_pending();
 
-	if (pending && !ksoftirqd_running())
+	if (pending && !ksoftirqd_running(pending))
 		do_softirq_own_stack();
 
 	local_irq_restore(flags);
@@ -352,7 +356,7 @@ void irq_enter(void)
 
 static inline void invoke_softirq(void)
 {
-	if (ksoftirqd_running())
+	if (ksoftirqd_running(local_softirq_pending()))
 		return;
 
 	if (!force_irqthreads) {

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* dvb usb issues since kernel 4.9
@ 2018-01-09 17:42 Mauro Carvalho Chehab
  0 siblings, 0 replies; 11+ messages in thread
From: Mauro Carvalho Chehab @ 2018-01-09 17:42 UTC (permalink / raw)
  To: Linus Torvalds, Peter Zijlstra
  Cc: Alan Stern, Ingo Molnar, Josef Griebichler, Greg Kroah-Hartman,
	USB list, Eric Dumazet, Rik van Riel, Paolo Abeni,
	Hannes Frederic Sowa, Jesper Dangaard Brouer, linux-kernel,
	netdev, Jonathan Corbet, LMML, David Miller

Em Mon, 8 Jan 2018 11:51:04 -0800
Linus Torvalds <torvalds@linux-foundation.org> escreveu:

> On Mon, Jan 8, 2018 at 11:15 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
> >
> > Both dwc2_hsotg and ehci-hcd use the tasklets embedded in the
> > giveback_urb_bh member of struct usb_hcd.  See usb_hcd_giveback_urb()
> > in drivers/usb/core/hcd.c; the calls are
> >
> >         else if (high_prio_bh)
> >                 tasklet_hi_schedule(&bh->bh);
> >         else
> >                 tasklet_schedule(&bh->bh);
> >
> > As it turns out, high_prio_bh gets set for interrupt and isochronous
> > URBs but not for bulk and control URBs.  The DVB driver in question
> > uses bulk transfers.  
> 
> Ok, so we could try out something like the appended?
> 
> NOTE! I have not tested this at all. It LooksObvious(tm), but...
> 
>                     Linus



>  kernel/softirq.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index 2f5e87f1bae2..97b080956fea 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -79,12 +79,16 @@ static void wakeup_softirqd(void)
>  
>  /*
>   * If ksoftirqd is scheduled, we do not want to process pending softirqs
> - * right now. Let ksoftirqd handle this at its own rate, to get fairness.
> + * right now. Let ksoftirqd handle this at its own rate, to get fairness,
> + * unless we're doing some of the synchronous softirqs.
>   */
> -static bool ksoftirqd_running(void)
> +#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
> +static bool ksoftirqd_running(unsigned long pending)
>  {
>  	struct task_struct *tsk = __this_cpu_read(ksoftirqd);
>  
> +	if (pending & SOFTIRQ_NOW_MASK)
> +		return false;
>  	return tsk && (tsk->state == TASK_RUNNING);
>  }
>  
> @@ -325,7 +329,7 @@ asmlinkage __visible void do_softirq(void)
>  
>  	pending = local_softirq_pending();
>  
> -	if (pending && !ksoftirqd_running())
> +	if (pending && !ksoftirqd_running(pending))
>  		do_softirq_own_stack();
>  
>  	local_irq_restore(flags);
> @@ -352,7 +356,7 @@ void irq_enter(void)
>  
>  static inline void invoke_softirq(void)
>  {
> -	if (ksoftirqd_running())
> +	if (ksoftirqd_running(local_softirq_pending()))
>  		return;
>  
>  	if (!force_irqthreads) {


Hi Linus,

Patch makes sense to me, although I was not able to test it myself.

I set a RPi3 machine here with vanilla Kernel 4.14.11 running a standard
raspbian distribution (with elevator=deadline). Right now, I'm trying to
reproduce the bug with dvbv5-zap. I may eventually do more tests on 
some other slow machines.

Usually, applications like tvheadend records just one channel. So, instead
of a ~58 Mbits/s payload, it uses, typically, ~11 Mbits/s for a HD channel.
This is usually filtered by hardware. Here, I'm forcing to record the
entire TS, in order to make easier to reproduce the issue. So, I'm forcing
a condition that it is usually worse than real usecases (at last for HD - I
I don't have any DVB stream here with a 4K channel).

From what I checked so far, with vanila upstream Kernel on RPi3, just
receiving a DVB stream - or receiving it and writing to /dev/null works
with or without your patch.

The problem starts to happen when there are concurrency with writes.

On my preliminar tests, writing to a file on an ext4 partition at a
USB stick loses data up to the point to make it useless (1/4 of the data
is lost!). However, writing to a class 10 microSD card is doable.

If you're curious enough, this is what I'm doing (that are the results
while using class 10 microSD card):

$ FILE=/tmp/out.ts; for i in $(seq 1 6); do echo "step $i"; rm $FILE 2>/dev/null; dvbv5-zap -l universal -c ~/vivo-channels.conf NBR -o $FILE -P -t60 2>&1|grep -E "(buffer|received)"; du $FILE 2>/dev/null; done 
step 1
  Setting buffer length to 7250000
buffer overrun
buffer overrun
buffer overrun
buffer overrun
buffer overrun
buffer overrun
buffer overrun
received 347504652 bytes (5656 Kbytes/sec)
339368	/tmp/out.ts
step 2
  Setting buffer length to 7250000
buffer overrun
received 408995880 bytes (6656 Kbytes/sec)
399416	/tmp/out.ts
step 3
  Setting buffer length to 7250000
received 412999716 bytes (6722 Kbytes/sec)
403328	/tmp/out.ts
step 4
  Setting buffer length to 7250000
buffer overrun
received 415564788 bytes (6763 Kbytes/sec)
405832	/tmp/out.ts
step 5
  Setting buffer length to 7250000
received 412999716 bytes (6722 Kbytes/sec)
403324	/tmp/out.ts
step 6
  Setting buffer length to 7250000
received 408366080 bytes (6646 Kbytes/sec)
398796	/tmp/out.ts

My plan is to do more tests along this week, and try to tweak a little
bit both userspace and kernelspace, in order to see if I can get better
results.

Thanks,
Mauro
---
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* dvb usb issues since kernel 4.9
@ 2018-01-09 17:55 Linus Torvalds
  0 siblings, 0 replies; 11+ messages in thread
From: Linus Torvalds @ 2018-01-09 17:55 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Peter Zijlstra, Alan Stern, Ingo Molnar, Josef Griebichler,
	Greg Kroah-Hartman, USB list, Eric Dumazet, Rik van Riel,
	Paolo Abeni, Hannes Frederic Sowa, Jesper Dangaard Brouer,
	linux-kernel, netdev, Jonathan Corbet, LMML, David Miller

On Tue, Jan 9, 2018 at 9:42 AM, Mauro Carvalho Chehab
<mchehab@s-opensource.com> wrote:
>
> On my preliminar tests, writing to a file on an ext4 partition at a
> USB stick loses data up to the point to make it useless (1/4 of the data
> is lost!). However, writing to a class 10 microSD card is doable.

Note that most USB sticks are horrible crap. They can have write
latencies counted in _seconds_.

You can cause VM issues and various other non-hardware stalls with
them, simply because something gets stuck waiting for a page writeout
that should take a few ms on any reasonable hardware, but ends up
talking half a second or more.

For example, even really well-written software that tries to do things
like threaded write-behind to smooth out the IO will be _totally_
screwed by the USB stick behavior (where you might write a few MB at
high speeds, and then the next write - however small - takes a second
because the stupid USB stick does a synchronous garbage collection.
Suddenly all that clever software that tried to keep things moving
along smoothly without any hiccups, and tried hard to make the USB bus
have a nice constant loadm can't do anything at all about the crap
hardware.

So when testing writes to USB sticks, I'm not convinced you're
actually testing any USB bus limitations or even really any other
hardware limitations than the USB stick itself.

                  Linus
---
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* dvb usb issues since kernel 4.9
@ 2018-01-09 21:26 Jesper Dangaard Brouer
  0 siblings, 0 replies; 11+ messages in thread
From: Jesper Dangaard Brouer @ 2018-01-09 21:26 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linus Torvalds, Peter Zijlstra, Alan Stern, Ingo Molnar,
	Josef Griebichler, Greg Kroah-Hartman, USB list, Eric Dumazet,
	Rik van Riel, Paolo Abeni, Hannes Frederic Sowa, linux-kernel,
	netdev, Jonathan Corbet, LMML, David Miller

On Tue, 9 Jan 2018 15:42:35 -0200 Mauro Carvalho Chehab <mchehab@s-opensource.com> wrote:
> Em Mon, 8 Jan 2018 11:51:04 -0800 Linus Torvalds <torvalds@linux-foundation.org> escreveu:
> 
[...]
> Patch makes sense to me, although I was not able to test it myself.
 
The patch also make sense to me.  I've done some basic testing with it
on my high-end Broadwell system (that I use for 100Gbit/s testing). As
expected the network overload case still works, as NET_RX_SOFTIRQ is
not matched. 

> I set a RPi3 machine here with vanilla Kernel 4.14.11 running a
> standard raspbian distribution (with elevator=deadline).

I found a Raspberry Pi Model B+ (I think, BCM2835), that I loaded the
LibreELEC distro on.  One of the guys even created an image for me with
a specific kernel[1] (that I just upgraded the system with).

[1] https://forum.libreelec.tv/thread/4235-dvb-issue-since-le-switched-to-kernel-4-9-x/?postID=77031#post77031
 
> My plan is to do more tests along this week, and try to tweak a little
> bit both userspace and kernelspace, in order to see if I can get
> better results.
 
I've previously experienced that you can be affected by the scheduler
granularity, which is adjustable (with CONFIG_SCHED_DEBUG=y):

 $ grep -H . /proc/sys/kernel/sched_*_granularity_ns
 /proc/sys/kernel/sched_min_granularity_ns:2250000
 /proc/sys/kernel/sched_wakeup_granularity_ns:3000000

The above numbers were confirmed on the RPi2 (see[2]). With commit
4cd13c21b207 ("softirq: Let ksoftirqd do its job"), I expect/assume that
softirq processing latency is bounded by the sched_wakeup_granularity_ns,
which with 3 ms is not good enough for their use-case.

Thus, if you manage to reproduce the case, try to see if adjusting this
can mitigate the issue...


Their system have non-preempt kernel, should they use PREEMPT?

 LibreELEC:~ # uname -a
 Linux LibreELEC 4.14.10 #1 SMP Tue Jan 9 17:35:03 GMT 2018 armv7l GNU/Linux

[2] https://forum.libreelec.tv/thread/4235-dvb-issue-since-le-switched-to-kernel-4-9-x/?postID=76999#post76999

^ permalink raw reply	[flat|nested] 11+ messages in thread

* dvb usb issues since kernel 4.9
@ 2018-01-10  3:02 Mike Galbraith
  0 siblings, 0 replies; 11+ messages in thread
From: Mike Galbraith @ 2018-01-10  3:02 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Mauro Carvalho Chehab
  Cc: Linus Torvalds, Peter Zijlstra, Alan Stern, Ingo Molnar,
	Josef Griebichler, Greg Kroah-Hartman, USB list, Eric Dumazet,
	Rik van Riel, Paolo Abeni, Hannes Frederic Sowa, linux-kernel,
	netdev, Jonathan Corbet, LMML, David Miller

On Tue, 2018-01-09 at 22:26 +0100, Jesper Dangaard Brouer wrote:
> 
> I've previously experienced that you can be affected by the scheduler
> granularity, which is adjustable (with CONFIG_SCHED_DEBUG=y):
> 
>  $ grep -H . /proc/sys/kernel/sched_*_granularity_ns
>  /proc/sys/kernel/sched_min_granularity_ns:2250000
>  /proc/sys/kernel/sched_wakeup_granularity_ns:3000000
> 
> The above numbers were confirmed on the RPi2 (see[2]). With commit
> 4cd13c21b207 ("softirq: Let ksoftirqd do its job"), I expect/assume that
> softirq processing latency is bounded by the sched_wakeup_granularity_ns,
> which with 3 ms is not good enough for their use-case.

Note of caution wrt twiddling sched_wakeup_granularity_ns: it must
remain < sched_latency_ns/2 else you effectively disable wakeup
preemption completely, turning CFS into a tick granularity scheduler.

	-Mike
---
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* dvb usb issues since kernel 4.9
@ 2018-07-17 11:54 hannah
  0 siblings, 0 replies; 11+ messages in thread
From: hannah @ 2018-07-17 11:54 UTC (permalink / raw)
  To: torvalds
  Cc: corbet, davem, edumazet, gregkh, griebichler.josef, hannes,
	jbrouer, linux-kernel, linux-media, linux-usb, mchehab, mingo,
	netdev, pabeni, peterz, riel, stern, dmaengine, vkoul,
	dan.j.williams, nadavh, thomas.petazzoni, Omri Itach

Hi,

I'm a software developer working in Marvell SoC team.
I'm facing kernel panic issue while running raid 5 on sata disks 
connected to Macchiatobin (Marvell community board with Armada-8040 SoC 
with 4 ARMv8 cores of CA72)
Raid 5 built with Marvell DMA engine and async_tx mechanism 
(ASYNC_TX_DMA [=y]); the DMA driver (mv_xor_v2) uses a tasklet to clean 
the done descriptors from the queue.

The panic (see below) occurs while building the RAID-5 (mdadm) or while 
writing/reading to the raid partition.

After some debug/bisect/diff, found that patch "softirq: Let ksoftirqd 
do its job" is problematic patch.

- Using v4.14.0 and problematic patch reverted - no timout issue.
- Using v4.14.0 (including softirq patch) and the additional fix 
proposed by Linus - no timeout issue.

As others have reported in this thread, the softirq change is causing 
some regression.
Would it be possible to either revert the patch or apply a fix such as 
the one proposed by Linus ?

Below panic message:
[   25.371495] mv_xor_v2 f0400000.xor: dma_sync_wait: timeout!
[   25.377101] Kernel panic - not syncing: async_tx_quiesce: DMA error 
waiting for transaction
[   25.377101]
[   25.386973] CPU: 0 PID: 1417 Comm: md0_raid5 Not tainted 4.14.0 #16
[   25.393264] Hardware name: Marvell Armada 8040 DB board (DT)
[   25.398946] Call trace:
[   25.401410] [<ffff000008089310>] dump_backtrace+0x0/0x380
[   25.406831] [<ffff0000080896a4>] show_stack+0x14/0x20
[   25.411904] [<ffff00000890fa78>] dump_stack+0x98/0xb8
[   25.416976] [<ffff0000080c8ef0>] panic+0x118/0x280
[   25.421788] [<ffff000008386a44>] async_tx_quiesce+0x74/0x78
[   25.427382] [<ffff000008386ca4>] async_memcpy+0x1a4/0x2a0
[   25.432806] [<ffff000008747f9c>] async_copy_data.isra.16+0x1b4/0x280
[   25.439186] [<ffff00000874b6fc>] raid_run_ops+0x514/0x1320
[   25.444694] [<ffff000008751550>] handle_stripe+0x1040/0x2848
[   25.450377] [<ffff000008752f98>] 
handle_active_stripes.isra.28+0x240/0x460
[   25.457279] [<ffff000008753468>] raid5d+0x2b0/0x450
[   25.462177] [<ffff00000875ead4>] md_thread+0x104/0x160
[   25.467338] [<ffff0000080e638c>] kthread+0xfc/0x128
[   25.472234] [<ffff000008085354>] ret_from_fork+0x10/0x1c
[   25.477571] Kernel Offset: disabled
[   25.481073] CPU features: 0x002000
[   25.484487] Memory Limit: none
[   25.487556] ---[ end Kernel panic - not syncing: async_tx_quiesce: 
DMA error waiting for transaction
[   25.487556]

Thanks,
Hanna
---
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* dvb usb issues since kernel 4.9
@ 2018-07-17 17:09 Linus Torvalds
  0 siblings, 0 replies; 11+ messages in thread
From: Linus Torvalds @ 2018-07-17 17:09 UTC (permalink / raw)
  To: hannah
  Cc: Jonathan Corbet, David Miller, Eric Dumazet, Greg Kroah-Hartman,
	Josef Griebichler, Hannes Frederic Sowa, Jesper Dangaard Brouer,
	Linux Kernel Mailing List, Linux Media Mailing List, USB list,
	Mauro Carvalho Chehab, Ingo Molnar, Network Development,
	Paolo Abeni, Peter Zijlstra, Rik van Riel, Alan Stern, dma, vkoul,
	Dan Williams, nadavh, thomas.petazzoni, omrii

On Tue, Jul 17, 2018 at 4:58 AM Hanna Hawa <hannah@marvell.com> wrote:
>
> After some debug/bisect/diff, found that patch "softirq: Let ksoftirqd
> do its job" is problematic patch.

Ok, this thread died down without any resolution.

>- Using v4.14.0 (including softirq patch) and the additional fix
> proposed by Linus - no timeout issue.

Are you talking about the patch that made HI_SOFTIRQ and
TASKLET_SOFTIRQ special, and had this:

  #define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))

in it?

I think I'll just commit the damn thing. It's hacky, but it's simple,
and it never got applied because we had smarter suggestions. But the
smarter suggestions never ended up being applied either, so..

                       Linus
---
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* dvb usb issues since kernel 4.9
@ 2018-07-17 18:07 hannah
  0 siblings, 0 replies; 11+ messages in thread
From: hannah @ 2018-07-17 18:07 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jonathan Corbet, David Miller, Eric Dumazet, Greg Kroah-Hartman,
	Josef Griebichler, Hannes Frederic Sowa, Jesper Dangaard Brouer,
	Linux Kernel Mailing List, Linux Media Mailing List, USB list,
	Mauro Carvalho Chehab, Ingo Molnar, Network Development,
	Paolo Abeni, Peter Zijlstra, Rik van Riel, Alan Stern, dma, vkoul,
	Dan Williams, nadavh, thomas.petazzoni, omrii

Hi Linus,

On 07/17/2018 08:09 PM, Linus Torvalds wrote:
> On Tue, Jul 17, 2018 at 4:58 AM Hanna Hawa <hannah@marvell.com> wrote:
>>
>> After some debug/bisect/diff, found that patch "softirq: Let ksoftirqd
>> do its job" is problematic patch.
>
> Ok, this thread died down without any resolution.
>
>> - Using v4.14.0 (including softirq patch) and the additional fix
>> proposed by Linus - no timeout issue.
>
> Are you talking about the patch that made HI_SOFTIRQ and
> TASKLET_SOFTIRQ special, and had this:
>
>   #define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
>
> in it?
yes, exactly..

Link to the patch:
https://git.linuxtv.org/mchehab/experimental.git/commit/?h=v4.15%2bmedia%2bdwc2&id=ccf833fd4a5b99c3d3cf2c09c065670f74a230a7

Thanks,
Hanna

>
> I think I'll just commit the damn thing. It's hacky, but it's simple,
> and it never got applied because we had smarter suggestions. But the
> smarter suggestions never ended up being applied either, so..
>
>                        Linus
>
---
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* dvb usb issues since kernel 4.9
@ 2018-07-17 22:21 Mauro Carvalho Chehab
  0 siblings, 0 replies; 11+ messages in thread
From: Mauro Carvalho Chehab @ 2018-07-17 22:21 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: hannah, Jonathan Corbet, David Miller, Eric Dumazet,
	Greg Kroah-Hartman, Josef Griebichler, Hannes Frederic Sowa,
	Jesper Dangaard Brouer, Linux Kernel Mailing List,
	Linux Media Mailing List, USB list, Mauro Carvalho Chehab,
	Ingo Molnar, Network Development, Paolo Abeni, Peter Zijlstra,
	Rik van Riel, Alan Stern, dma, vkoul, Dan Williams, nadavh,
	thomas.petazzoni, omrii

Hi Linus,

Em Tue, 17 Jul 2018 10:09:28 -0700
Linus Torvalds <torvalds@linux-foundation.org> escreveu:

> On Tue, Jul 17, 2018 at 4:58 AM Hanna Hawa <hannah@marvell.com> wrote:
> >
> > After some debug/bisect/diff, found that patch "softirq: Let ksoftirqd
> > do its job" is problematic patch.  
> 
> Ok, this thread died down without any resolution.
> 
> >- Using v4.14.0 (including softirq patch) and the additional fix
> > proposed by Linus - no timeout issue.  
> 
> Are you talking about the patch that made HI_SOFTIRQ and
> TASKLET_SOFTIRQ special, and had this:
> 
>   #define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
> 
> in it?
> 
> I think I'll just commit the damn thing. It's hacky, but it's simple,
> and it never got applied because we had smarter suggestions. But the
> smarter suggestions never ended up being applied either, so..

Yeah, IMHO the best would be to apply your patch[1], c/c stable up to
4.9. Nothing prevents applying a better/smarter solution once we
have it. From my side, I can keep testing whatever smart suggestions
people propose. Yet, better to have one fix on our hand than two
fixes flying around.

[1] e. g. 
	https://git.linuxtv.org/mchehab/experimental.git/commit/?h=v4.15%2bmedia%2bdwc2&id=ccf833fd4a5b99c3d3cf2c09c065670f74a230a7

Regards,
Mauro
---
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2018-07-17 22:21 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-01-09 17:42 dvb usb issues since kernel 4.9 Mauro Carvalho Chehab
  -- strict thread matches above, loose matches on Subject: below --
2018-07-17 22:21 Mauro Carvalho Chehab
2018-07-17 18:07 hannah
2018-07-17 17:09 Linus Torvalds
2018-07-17 11:54 hannah
2018-01-10  3:02 Mike Galbraith
2018-01-09 21:26 Jesper Dangaard Brouer
2018-01-09 17:55 Linus Torvalds
2018-01-08 19:51 Linus Torvalds
2018-01-08 16:10 Alan Stern
2018-01-08  9:43 Mauro Carvalho Chehab

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).