Beginner's questions on userspace latency

public inbox for linux-rt-users@vger.kernel.org

* Beginner's questions on userspace latency
@ 2013-03-22 19:35 Oliver Nittka
  2013-03-22 20:06 ` Staffan Tjernstrom
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Oliver Nittka @ 2013-03-22 19:35 UTC (permalink / raw)
  To: RT

Hello,

I need to get data from a data acquisition board into user space at
least every 2 milliseconds.

My setup: There's a kernel module that gets measurement data from a data
acquisition board via USB. We're talking about 6 channels @ 100kHz each,
so 512 bytes every 500us. Kernel timing is fine, I checked this using
get_cycles() and computing the delta to the previous urb callback.

The kernel module puts the data into a ring buffer, which the user space
application accesses via mmap() (set up with remap_vmalloc_range()). I
suppose this has better timing than copy_to_user()?
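
A minimal sketch of how a reader might consume such a shared ring buffer.
The layout is hypothetical (a producer-written `head` counter followed by
fixed-size slots); the thread does not describe the real driver's layout,
and NUM_SLOTS, ring_push() and ring_pop() are names made up for this
illustration. In the setup above the struct would live in the mmap()ed
region; here it is plain memory so the logic can be tried without hardware.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define SLOT_SIZE 512u   /* one transfer, per the figures in the message */
#define NUM_SLOTS 64u    /* hypothetical depth; must be a power of two */

static_assert((NUM_SLOTS & (NUM_SLOTS - 1u)) == 0, "NUM_SLOTS must be 2^n");

/* Hypothetical layout of the shared region; the real driver may differ. */
struct ring {
    _Atomic uint32_t head;                 /* slots written so far */
    uint8_t slots[NUM_SLOTS][SLOT_SIZE];
};

/* Producer side, included only so the sketch can be exercised.
 * In the setup described above this role belongs to the urb callback. */
static void ring_push(struct ring *r, const uint8_t *data)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    memcpy(r->slots[head % NUM_SLOTS], data, SLOT_SIZE);
    /* Release: slot contents become visible before the new head value. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
}

/* Consumer side: copy out the next unread slot.
 * Returns 1 if a slot was read, 0 if nothing is pending, and -1 if the
 * producer has lapped the reader (data was lost; the reader resyncs). */
static int ring_pop(struct ring *r, uint32_t *tail, uint8_t *out)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == *tail)
        return 0;
    if (head - *tail > NUM_SLOTS) {
        *tail = head;
        return -1;
    }
    memcpy(out, r->slots[*tail % NUM_SLOTS], SLOT_SIZE);
    (*tail)++;
    return 1;
}
```

The reader keeps its own `tail` counter, so the only shared variable is
`head`. This sketch does not guard against the producer overwriting a slot
while it is being copied; a deep enough ring makes that unlikely, but it is
not a guarantee.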

Now the user space application needs to be notified at least every 2
milliseconds (so it can evaluate the data and trigger an action in due
time). Currently, I'm using an ioctl which calls
wait_event_interruptible(); the module then wakes up the wait queue from
its urb callback using wake_up_interruptible().

To minimize latency in user space, the application does the following
right at the beginning of its main():

  mlockall(MCL_FUTURE);

  schparm.sched_priority = 99;
  sched_setscheduler(0, SCHED_FIFO, &schparm);

  CPU_SET(1, &cpuset);
  sched_setaffinity(0, sizeof(cpuset), &cpuset);
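
For reference, a fuller sketch of the same three calls with declarations
and error checking. The function name rt_setup() and the RT_ERR_* flags are
made up for this illustration. One difference from the calls above:
MCL_FUTURE on its own only covers pages mapped later, so the sketch passes
MCL_CURRENT as well to lock what is already mapped. It also clears the CPU
set with CPU_ZERO() before setting a bit in it.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Bit flags reporting which step failed (hypothetical names). */
enum { RT_ERR_MLOCK = 1, RT_ERR_SCHED = 2, RT_ERR_AFFINITY = 4 };

/* Lock memory, switch to SCHED_FIFO at `priority`, pin to `cpu`.
 * Returns 0 on success or a bitmask of RT_ERR_* for the failed steps. */
static int rt_setup(int priority, int cpu)
{
    int failed = 0;
    struct sched_param schparm;
    cpu_set_t cpuset;

    assert(cpu >= 0 && cpu < CPU_SETSIZE);

    /* Lock pages that are mapped now as well as future mappings. */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall");
        failed |= RT_ERR_MLOCK;
    }

    memset(&schparm, 0, sizeof(schparm));
    schparm.sched_priority = priority;
    if (sched_setscheduler(0, SCHED_FIFO, &schparm) != 0) {
        perror("sched_setscheduler");
        failed |= RT_ERR_SCHED;
    }

    /* Start from an empty set so only the requested CPU is allowed. */
    CPU_ZERO(&cpuset);
    CPU_SET(cpu, &cpuset);
    if (sched_setaffinity(0, sizeof(cpuset), &cpuset) != 0) {
        perror("sched_setaffinity");
        failed |= RT_ERR_AFFINITY;
    }
    return failed;
}
```

Calling rt_setup(99, 1) at the top of main() would mirror the values in the
message. Without root privileges (or CAP_SYS_NICE and a sufficient
RLIMIT_MEMLOCK) the first two steps are expected to fail, which is why each
return value is checked.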

I'm using a vanilla kernel 2.6.33.9 with patch-2.6.33.9-rt31.

I've also tuned the scheduler using
echo 500000 > /proc/sys/kernel/sched_latency_ns
echo 200000 > /proc/sys/kernel/sched_wakeup_granularity_ns
echo 100000 > /proc/sys/kernel/sched_min_granularity_ns

The timing is fine most of the time, but putting the system under heavy
CPU and USB load leads to occasional glitches where the time between two
ioctl calls can be as high as 50ms.
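
One way to quantify such glitches from user space is to timestamp every
return from the blocking ioctl and track the interval to the previous one.
A minimal sketch follows; struct wake_stats, wake_record() and DEADLINE_NS
are hypothetical names, and the timestamps are assumed to come from
clock_gettime(CLOCK_MONOTONIC, ...) taken right after the ioctl returns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define DEADLINE_NS 2000000LL   /* the 2 ms requirement stated above */

struct wake_stats {
    struct timespec last;   /* time of the previous wakeup */
    int64_t max_ns;         /* largest interval seen so far */
    unsigned long misses;   /* intervals longer than DEADLINE_NS */
    int primed;             /* 0 until the first wakeup is recorded */
};

/* Difference a - b in nanoseconds. */
static int64_t ts_diff_ns(const struct timespec *a, const struct timespec *b)
{
    return (int64_t)(a->tv_sec - b->tv_sec) * 1000000000LL
         + (int64_t)(a->tv_nsec - b->tv_nsec);
}

/* Record one wakeup at time `now`. Returns the interval since the
 * previous wakeup in nanoseconds, or 0 for the very first call. */
static int64_t wake_record(struct wake_stats *s, const struct timespec *now)
{
    int64_t delta = 0;

    assert(s != NULL && now != NULL);
    if (s->primed) {
        delta = ts_diff_ns(now, &s->last);
        if (delta > s->max_ns)
            s->max_ns = delta;
        if (delta > DEADLINE_NS)
            s->misses++;
    }
    s->last = *now;
    s->primed = 1;
    return delta;
}
```

The bookkeeping does no allocation or I/O, so it can sit in the real-time
loop; printing the statistics is best left until after the loop exits.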

I'm an experienced developer on Linux, but mostly on the application
side. Writing this device driver is as close to the hardware as I have
ever got so far ;-)

I read through the RT wiki, but I'm still unsure how to proceed from
here. Is there any way to determine what's causing the glitches? Is there
anything I could do to improve the realtime behavior of my application?

Thank you all very much in advance!

O. Nittka


* RE: Beginner's questions on userspace latency
  2013-03-22 19:35 Beginner's questions on userspace latency Oliver Nittka
@ 2013-03-22 20:06 ` Staffan Tjernstrom
  2013-03-22 20:21 ` Tim Sander
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Staffan Tjernstrom @ 2013-03-22 20:06 UTC (permalink / raw)
  To: Oliver Nittka, RT

Have you turned off all CPU power saving / turbo mode settings in the BIOS/UEFI?


* Re: Beginner's questions on userspace latency
  2013-03-22 19:35 Beginner's questions on userspace latency Oliver Nittka
  2013-03-22 20:06 ` Staffan Tjernstrom
@ 2013-03-22 20:21 ` Tim Sander
  2013-03-22 20:23 ` Thomas Gleixner
  2013-03-24 20:09 ` Oliver Nittka
  3 siblings, 0 replies; 5+ messages in thread
From: Tim Sander @ 2013-03-22 20:21 UTC (permalink / raw)
  To: Oliver Nittka, RT

Hi Oliver
> I read through the RT wiki, but I'm still unsure on how to proceed from
> here. Is there any way to determine what's causing the glitches?
> Anything I could do to improve the realtime behavior of my application?
Realtime behavior depends heavily on your computing hardware and its
drivers: the worst driver determines your maximum latency. Since you have
not said what hardware you are running on, it is hard to give hints other
than to do latency tracing, if your hardware is powerful enough. You might
also look at the latency charts from osadl.org: find a system similar to
yours and look at the worst latencies recorded.

Also keep in mind that latency tracing takes some time. There are
recordings of talks available, for example by Frank Rowand on latency
tracing, but they are probably a little dated, as they don't account much
for the much-improved latency tracing infrastructure within the kernel.

Good luck
Tim


* Re: Beginner's questions on userspace latency
  2013-03-22 19:35 Beginner's questions on userspace latency Oliver Nittka
  2013-03-22 20:06 ` Staffan Tjernstrom
  2013-03-22 20:21 ` Tim Sander
@ 2013-03-22 20:23 ` Thomas Gleixner
  2013-03-24 20:09 ` Oliver Nittka
  3 siblings, 0 replies; 5+ messages in thread
From: Thomas Gleixner @ 2013-03-22 20:23 UTC (permalink / raw)
  To: Oliver Nittka; +Cc: RT

On Fri, 22 Mar 2013, Oliver Nittka wrote:
> The timing is fine most of the time, but putting the system under heavy
> CPU and USB load leads to occasional glitches where the time between two
> ioctl calls can be as high as 50ms.

The kernel tracer is your friend. Enable the kernel function tracer
and add a check to your application which detects the issue and then
stops the trace. See the cyclictest source code for an example of how
to do that.
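
A minimal sketch of such a check, not taken from cyclictest. It assumes
ftrace is controlled through a `tracing_on` file under debugfs and that
debugfs is mounted at /sys/kernel/debug; the path may differ on a given
system. The helper names trace_open() and trace_stop_if_late() are made up
for this illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Assumed location; adjust if debugfs is mounted elsewhere. */
#define TRACING_ON_PATH "/sys/kernel/debug/tracing/tracing_on"

/* Open the control file once, before entering the real-time loop,
 * so that no open() is needed at the moment the glitch is detected.
 * Returns a file descriptor, or -1 on error. */
static int trace_open(const char *path)
{
    assert(path != NULL);
    return open(path, O_WRONLY);
}

/* Write "0" to freeze the trace buffer if the measured interval
 * exceeded the threshold. Returns 1 if the trace was stopped,
 * 0 if the interval was within bounds, -1 if the write failed. */
static int trace_stop_if_late(int fd, long long interval_ns,
                              long long threshold_ns)
{
    if (interval_ns <= threshold_ns)
        return 0;
    return write(fd, "0", 1) == 1 ? 1 : -1;
}
```

The application would call trace_open(TRACING_ON_PATH) at startup and
trace_stop_if_late() after each wakeup with the measured interval. Once the
trace has been stopped, the buffer holds the events leading up to the late
wakeup and can be read from the `trace` file in the same directory.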

Thanks,

	tglx


* Re: Beginner's questions on userspace latency
  2013-03-22 19:35 Beginner's questions on userspace latency Oliver Nittka
                   ` (2 preceding siblings ...)
  2013-03-22 20:23 ` Thomas Gleixner
@ 2013-03-24 20:09 ` Oliver Nittka
  3 siblings, 0 replies; 5+ messages in thread
From: Oliver Nittka @ 2013-03-24 20:09 UTC (permalink / raw)
  To: RT; +Cc: Staffan Tjernstrom, tstone, tglx

Thank you all very much for your insights!

I experimented some more over the weekend, and before I could
investigate everything I learned from you, CPU isolation did the trick
for me (using cset, just as shown in the wiki).

I originally thought sched_setaffinity() would suffice, but giving the
application its own CPU helped a lot. I haven't seen anything over
1.735 ms for two days now.

FYI: this is running on a Kontron pITX-SP:
http://de.kontron.com/products/boards+and+mezzanines/embedded+sbc/pitx+25+sbc/pitxsp.html

Thanks again!

  -- o
