Measuring scheduling latency for RT threads

linux-rt-users.vger.kernel.org archive mirror

* Measuring scheduling latency for RT threads
From: "Jürgen Lanner" @ 2014-11-19 12:43 UTC (permalink / raw)
  To: linux-rt-users

I have a complex automation project which makes heavy use of real-time priority threads working with external hardware connected by e.g. TCP/IP.
IMHO most of my latencies simply come from the fact that there are multiple threads running at the same RT priority and scheduled FIFO. (Poor design, yes, but that's what I want to fix.)
The expected response time is in the low-millisecond range, so no microsecond juggling here.
The process is running (alone, no other processes there) on one CPU of an i5 quad core (kernel 3.13.0-35, Ubuntu 14.04).

My first goal is to find out about the worst case latency:
Is there a way to find out how long (worst case) an RT thread that is ready to run waits to be dispatched?
I tried /proc/pid/sched and looked for se.statistics.wait_max, but it was always 0.
Going through the kernel sources I found that this value (like many of its neighbors) is not calculated for real-time threads (rt.c).
Is there a reason for this?
Is there another, maybe better, way to achieve my goal?

My second goal would be to identify the tasks (which ones?) consuming so much time (how much?) in this situation and thus causing the delay.
Is there something built in to track down these bad guys?
I am thinking of something that keeps capturing the PIDs and execution times of all dispatched threads while a thread is waiting.
If the wait turns out to be a worst-case latency, the collected data could give a hint where to dig.
In /proc/pid/sched there is se.statistics.exec_max; a thread with a large number there can be a candidate, yes, but it is not necessarily the one causing my latency.
I suspect it is more a bad situation where a lot of tasks need to work at the same time, bloating the run queue and thus causing the latency.
I feel my solution is more "intelligent workload distribution" by identifying not so urgent work and shifting it to a later point in time.
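
As a rough illustration of the exec_max idea above, the per-thread files can be read and ranked with a few lines of Python. This is only a sketch under assumptions: it assumes the kernel exposes /proc/<pid>/task/<tid>/sched with "name : value" lines and that the field is called se.statistics.exec_max as on the 3.13 kernel mentioned here (other kernels may name it differently). The helper names parse_sched and rank_threads are made up for this example.

```python
import os
import re

# Matches numeric lines such as "se.statistics.exec_max   :   1.234567".
_FIELD = re.compile(r"^(\S+)\s*:\s*(-?\d+(?:\.\d+)?)\s*$")


def parse_sched(text):
    """Return {field_name: float} for the numeric 'name : value' lines."""
    fields = {}
    for line in text.splitlines():
        m = _FIELD.match(line.strip())
        if m:
            fields[m.group(1)] = float(m.group(2))
    return fields


def rank_threads(pid, field="se.statistics.exec_max"):
    """List (value, tid) for every thread of pid, largest value first.

    The default field name is an assumption taken from the 3.13-era
    /proc output discussed in this thread.
    """
    ranked = []
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/sched") as f:
                stats = parse_sched(f.read())
        except OSError:
            continue  # thread exited while we were scanning
        if field in stats:
            ranked.append((stats[field], int(tid)))
    return sorted(ranked, reverse=True)
```

A large exec_max only says that a thread once ran for a long stretch; as noted above, it does not prove that this thread delayed the one being investigated.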

Thanks in advance

jue
--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Measuring scheduling latency for RT threads
From: Bernhard Schiffner @ 2014-11-19 13:11 UTC (permalink / raw)
  To: linux-rt-users

On Wednesday, 19 November 2014, 13:43:36, you wrote:
> I have a complex automation project which makes heavy use of real time prio
> threads working with external hardware connected by e.g. TCP/IP. IMHO most
> of my latencies simply come from the fact that there are multiple threads
> running on the same RT prio being scheduled FIFO. (poor design, yes, but
> that's what I want to fix) Expected response time is in low millisecond
> area, so no microseconds juggling here. The process is running (alone, no
> other processes there) on one CPU of an i5 quad core (using 3.13.0-35
> ubuntu 14.04)

The expected behavior is "in general" only achievable with patched kernels.
The RT patches are available here:
https://www.kernel.org/pub/linux/kernel/projects/rt/
(Note that they are available only for even kernel numbers.)

So, first of all, please get a kernel running that is suited for "real
realtime". This step should be quite simple :-)

Good luck!


Bernhard



* Re: Measuring scheduling latency for RT threads
From: Stanislav Meduna @ 2014-11-19 13:19 UTC (permalink / raw)
  To: Jürgen Lanner, linux-rt-users

On 19.11.2014 13:43, "Jürgen Lanner" wrote:

> My first goal is to find out about the worst case latency:
> Is there a way I can find out how long (worst case) a RT thread
> being ready to run is just waiting to be dispatched?  

ftrace, trace-cmd, kernelshark

The latency for the highest prio runnable task is available
right away with the wakeup tracer. For other tasks you can
trace the scheduling functions and interpret the results
using a script (or look at them graphically using kernelshark).

Note that the function tracing is not free and will skew
the results a bit, but should be good enough to identify
the offenders.
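
A minimal sketch of such a script, in Python. It measures the time from a sched_wakeup event to the sched_switch that puts the task on the CPU, and records which tasks were switched in during the wait. The line shapes in the comment are an assumption about the text output of the sched_wakeup and sched_switch trace events and may differ between kernel versions; the function name wakeup_latencies is made up for this example. It ignores the CPU number, which fits a process pinned to one CPU but would need refining otherwise.

```python
import re
from collections import defaultdict

# Assumed line shapes (text output of the two sched events):
#   <idle>-0 [001] d.h. 100.000100: sched_wakeup: comm=ctl pid=812 prio=9 target_cpu=001
#   io-700   [001] d... 100.000350: sched_switch: prev_comm=io prev_pid=700 ... ==> next_comm=ctl next_pid=812 next_prio=9
_EVENT = re.compile(r"\s(\d+\.\d+):\s+(sched_wakeup|sched_switch):\s+(.*)$")
_WAKE_PID = re.compile(r"\bpid=(\d+)")
_NEXT_PID = re.compile(r"\bnext_pid=(\d+)")


def wakeup_latencies(lines):
    """Return {pid: [(latency_seconds, [pids switched in while waiting])]}."""
    pending = {}                 # pid -> (wakeup timestamp, pids run meanwhile)
    results = defaultdict(list)
    for line in lines:
        ev = _EVENT.search(line)
        if not ev:
            continue
        ts, name, rest = float(ev.group(1)), ev.group(2), ev.group(3)
        if name == "sched_wakeup":
            m = _WAKE_PID.search(rest)
            if m:
                # Keep the first wakeup if the task is already waiting.
                pending.setdefault(int(m.group(1)), (ts, []))
            continue
        m = _NEXT_PID.search(rest)
        if not m:
            continue
        nxt = int(m.group(1))
        if nxt in pending:
            woke, ran = pending.pop(nxt)
            results[nxt].append((ts - woke, ran))
        # Every task still waiting saw `nxt` get the CPU first.
        for _, ran in pending.values():
            ran.append(nxt)
    return dict(results)
```

Taking the maximum latency per PID gives the worst case seen during the trace, and the recorded PID list for that sample points at the tasks that ran in the meantime. Only wakeups are covered; a thread that was preempted and is waiting on the run queue again is not counted by this sketch.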

-- 
                                 Stano


* Re: Measuring scheduling latency for RT threads
From: Clark Williams @ 2014-11-19 17:59 UTC (permalink / raw)
  To: Jürgen Lanner; +Cc: Stanislav Meduna, linux-rt-users

On Wed, 19 Nov 2014 14:19:49 +0100
Stanislav Meduna <stano@meduna.org> wrote:

> On 19.11.2014 13:43, "Jürgen Lanner" wrote:
> 
> > My first goal is to find out about the worst case latency:
> > Is there a way I can find out how long (worst case) a RT thread
> > being ready to run is just waiting to be dispatched?  
> 
> ftrace, trace-cmd, kernelshark
> 
> The latency for the highest prio runnable task is available
> right away with the wakeup tracer. For other tasks you can
> trace the scheduling functions and interpret the results
> using a script (or look at them graphically using kernelshark).
> 
> Note that the function tracing is not free and will skew
> the results a bit, but should be good enough to identify
> the offenders.
> 

You might want to establish a baseline of what the expected latency on
your hardware will be. Try the 'cyclictest' application from the
rt-tests package. Run that for some number of hours to see if you have
any hardware issues that may cause delays. 

When run this way:

	$ sudo cyclictest --numa -p95 -mu

The program will kick off a measurement thread at fifo:95 on each
online core. It will then loop until you kill it with ^C, sleeping for
an interval and then measuring the delta between the ideal wakeup time
and the actual wakeup time of the measurement thread, something like this:

	t1 = gettime()
	loop:
		sleep(interval)
		t2 = gettime()
		delta = t2 - (t1 + interval)
		print delta
		t1 = gettime()
		goto loop

That will show you the actual scheduling latency seen on each core. 
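
For illustration only, the same loop can be written as a small runnable Python sketch. This is not how cyclictest is implemented; it only mirrors the arithmetic of the pseudocode above, and the interpreter adds overhead of its own, so the absolute numbers will be worse than what cyclictest reports. The function name and its default parameters are made up for this example.

```python
import time


def measure_wakeup_latency(interval_us=1000, loops=1000):
    """Return the wakeup overshoot of each iteration, in microseconds."""
    interval_ns = interval_us * 1000
    deltas = []
    t1 = time.monotonic_ns()
    for _ in range(loops):
        time.sleep(interval_us / 1_000_000)
        t2 = time.monotonic_ns()
        # Actual wakeup time minus ideal wakeup time (t1 + interval).
        deltas.append((t2 - (t1 + interval_ns)) / 1000)
        t1 = time.monotonic_ns()
    return deltas


if __name__ == "__main__":
    d = measure_wakeup_latency()
    print(f"min {min(d):.1f} us  avg {sum(d) / len(d):.1f} us  max {max(d):.1f} us")
```

To get closer to the cyclictest setup, the script could be started under SCHED_FIFO, for example with "sudo chrt -f 95 python3 <script>", where <script> stands for wherever the sketch was saved.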

Clark
