public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Darren Hart <dvhltc@us.ibm.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Lee Revell <rlrevell@joe-job.com>,
	lkml <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Mike Galbraith <efault@gmx.de>,
	Steven Rostedt <rostedt@goodmis.org>,
	Florian Schmidt <mista.tapas@gmx.net>
Subject: Re: rt20 scheduling latency testcase and failure data
Date: Mon, 15 May 2006 18:30:22 -0700	[thread overview]
Message-ID: <200605151830.23544.dvhltc@us.ibm.com> (raw)
In-Reply-To: <20060515081341.GB24523@elte.hu>

[-- Attachment #1: Type: text/plain, Size: 3437 bytes --]

On Monday 15 May 2006 01:13, Ingo Molnar wrote:
>
> have you tried to use the latency tracer to capture this latency? It is
> programmable to a high degree. (I've attached trace-it.c that shows how
> to use it)
> ...
> looking at the highlevel traces should give you a quick idea of what's
> going on, and then you can zoom into the time period that triggers the
> long latency. (but feel free to also send these traces to us, preferably
> in bzip2 -9 format.)

Following Ingo's example I have included the modified test case (please see 
the original mail for librt.h) that starts the trace before each sleep and 
disables it after we wake up.  If we have missed a period, we print the 
trace.

My initial trace (lt_01.log.bz2) contained entries like the following:

  <idle>-0     2...1   24us!: default_idle (cpu_idle)
  <idle>-0     2D..1 1609us : smp_apic_timer_interrupt (c0100c80 0 0)

  <idle>-0     2...1 1635us!: default_idle (cpu_idle)
  <idle>-0     2D..1 4756us : smp_apic_timer_interrupt (c0100c80 0 0)

I noticed the that there were long latencies (as high as 3ms) all related to 
irq handling in the idle loop.  I thought it might have something to do with 
the CPU going into some kind of a sleep state and taking a long time to wake 
up.  John Stultz suggested I boot with idle=poll.

The idle=poll trace (lt_02.log.bz2) had much smaller maximum latencies, like 
the following:

  <idle>-0     0D..1 1341us+: write_watchdog_counter (nmi_watchdog_tick)
  <idle>-0     0D..1 1441us : do_nmi (poll_idle)

  <idle>-0     0D..1 1445us+: write_watchdog_counter (nmi_watchdog_tick)
  <idle>-0     0D..1 1545us : do_nmi (poll_idle)

All the longest latencies were related to the nmi watchdogs.  Booting with 
idle=poll nmi_watchdog=0 unfortunately seemed to return to the original 
behavior - so perhaps the second test was just a fluke.  This trace 
(lt_03.log.bz2) had entries like the following:

  <idle>-0     1...1   19us!: poll_idle (cpu_idle)
  <idle>-0     1D..1 1089us : smp_apic_timer_interrupt (c0100ccf 0 0)

  <idle>-0     1...1 1113us!: poll_idle (cpu_idle)
  <idle>-0     1D..1 4765us : smp_apic_timer_interrupt (c0100ccf 0 0)

sched_la-2978  1D... 4798us!: init_fpu (math_state_restore)
sched_la-2978  1D... 5088us : smp_apic_timer_interrupt (4b3b98e8 0 0)

sched_la-2978  1D.h. 5103us!: irq_exit (smp_apic_timer_interrupt)
sched_la-2978  1.... 5719us > sys_clock_gettime (00000001 b7f65378 0000007b)

Finally I booted with idle=poll and nmi_watchdog=0.  The very first run 
failed, (I've noticed that the first iteration seems to always hit PERIOD 
MISSED! while the second usually passes fine).  It's still running a 1M 
iteration run with no more failures so far (100K so far).

The latency tracer is a very interesting tool.  I have a few 
questions/assumptions I'd like to run by you to make sure I understand the 
output of the latency trace:

o ! in the delay column means there is a long latency here?
o + in the delay column means there is a > 1us latency here?
o > means entering the kernel from a sys_call?
o < means returning from the sys_call?
o : is not < or >
o Why do I only see ~5ms of trace when it appears that I am seeing more like
  ~25ms between the time when I start and stop the trace?
o What are the maximum latencies that get printed to /var/log/messages?  What
  exacly do they measure? From what point to what point?
-- 
Darren Hart
IBM Linux Technology Center
Realtime Linux Team


[-- Attachment #2: lt_01.log.bz2 --]
[-- Type: application/x-bzip2, Size: 4351 bytes --]

[-- Attachment #3: lt_02.log.bz2 --]
[-- Type: application/x-bzip2, Size: 5710 bytes --]

[-- Attachment #4: lt_03.log.bz2 --]
[-- Type: application/x-bzip2, Size: 3828 bytes --]

[-- Attachment #5: sched_latency_lkml.c --]
[-- Type: text/x-csrc, Size: 5306 bytes --]

/*    Filename: sched_latency_lkml.c
 *      Author: Darren Hart <dvhltc@us.ibm.com>
 * Description: Measure the latency involved with periodic scheduling.
 * Compilation: gcc -O2 -g -D_GNU_SOURCE -I/usr/include/nptl -I -L/usr/lib/nptl -lpthread -lrt -lm sched_latency_lkml.c -o sched_latency_lkml
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
 *
 * Copyright (C) IBM Corporation, 2006
 *
 * 2006-May-10:	Initial version by Darren Hart <dvhltc@us.ibm.com>
 */

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include "librt.h"

#define PRIO 98
//#define PERIOD 17*NS_PER_MS
//#define ITERATIONS 100
#define ITERATIONS 10000
#define PERIOD 5*NS_PER_MS
#define PASS_US 100

#define LATENCY_TRACE 1
#ifdef LATENCY_TRACE
#define latency_trace_enable() \
system("echo 0 > /proc/sys/kernel/trace_all_cpus"); \
system("echo 1 > /proc/sys/kernel/trace_enabled"); \
system("echo 0 > /proc/sys/kernel/trace_freerunning"); \
system("echo 0 > /proc/sys/kernel/trace_print_at_crash"); \
system("echo 1 > /proc/sys/kernel/trace_user_triggered"); \
system("echo 0 > /proc/sys/kernel/trace_verbose"); \
system("echo 0 > /proc/sys/kernel/preempt_max_latency"); \
system("echo 0 > /proc/sys/kernel/preempt_thresh"); \
system("[ -e /proc/sys/kernel/wakeup_timing ] && echo 1 > /proc/sys/kernel/wakeup_timing"); \
system("echo 1 > /proc/sys/kernel/mcount_enabled"); 
#define latency_trace_start() gettimeofday(0,1)
#define latency_trace_stop() gettimeofday(0,0)
#define latency_trace_print() system("cat /proc/latency_trace");
#else
#define latency_trace_enable() do { } while (0)
#define latency_trace_start() do { } while (0)
#define latency_trace_stop() do { } while (0)
#define latency_trace_print() do { } while (0)
#endif

nsec_t start;
int retval;

void *periodic_thread(void *arg)
{
	struct thread *t = (struct thread *)arg;
	int i;
	int delay, avg_delay = 0, start_delay, min_delay = 0x7FFFFFF, max_delay = 0;
	int failures = 0;
	nsec_t next=0, now=0, sched_delta=0, delta=0, prev=0;

	retval = 0;

	prev = start;
	next = start;
	now = rt_gettime();
	start_delay = (now - start)/NS_PER_US;

	printf("ITERATION DELAY(US) MAX_DELAY(US) FAILURES\n");
	printf("--------- --------- ------------- --------\n");
	for (i = 1; i <= ITERATIONS; i++) {
		/* wait for the period to start */
		next += PERIOD;
		prev = now;
		now = rt_gettime();

		if (i > 1) latency_trace_stop();

		if (next < now) {
			printf("\n\nPERIOD MISSED!\n");
			printf("     scheduled delta: %8llu us\n", sched_delta/1000);
			printf("        actual delta: %8llu us\n", delta/1000);
			printf("             latency: %8llu us\n", (delta-sched_delta)/1000);
			printf("---------------------------------------\n");
			printf("      previous start: %8llu us\n", (prev-start)/1000);
			printf("                 now: %8llu us\n", (now-start)/1000);
			printf("     scheduled start: %8llu us\n", (next-start)/1000);
			printf("next scheduled start is in the past!\n");
			retval = 1;
			latency_trace_print();
			break;
		}

		latency_trace_start();
		sched_delta = next - now; /* how long we should sleep */
		delta = 0;
		do {
			rt_nanosleep(next - now);
			delta += rt_gettime() - now; /* how long we did sleep */
			now = rt_gettime();
		} while (now < next);

		/* start of period */
		now = rt_gettime();
		delay = (now - start - (nsec_t)i*PERIOD)/NS_PER_US;
		if (delay < min_delay)
			min_delay = delay;
		if (delay > max_delay)
			max_delay = delay;
		if (delay > PASS_US)
			failures++;
		avg_delay += delay;
		

		/* continuous status ticker */
		//printf("%9i %9i %13i %8i\r", i, delay, max_delay, failures);
		//fflush(stdout);
		
		busy_work_ms(1);
	}

	avg_delay /= ITERATIONS;
	printf("\n\n");
	printf("Start Latency: %4d us: %s\n", start_delay, 
		start_delay < PASS_US ? "PASS" : "FAIL");
	printf("Min Latency:   %4d us: %s\n", min_delay,
		min_delay < PASS_US ? "PASS" : "FAIL");
	printf("Avg Latency:   %4d us: %s\n", avg_delay,
		avg_delay < PASS_US ? "PASS" : "FAIL");
	printf("Max Latency:   %4d us: %s\n", max_delay,
		max_delay < PASS_US ? "PASS" : "FAIL");
	printf("Failed Iterations: %d\n\n", failures);

	return NULL;
}

int main(int argc, char *argv[])
{
	int per_id, mas_id;

	printf("-------------------------------\n");
	printf("Scheduling Latency\n");
	printf("-------------------------------\n\n");
	printf("Running %d iterations with a period of %d ms\n", ITERATIONS, PERIOD/NS_PER_MS);
	printf("Expected running time: %d s\n\n", (nsec_t)ITERATIONS*PERIOD/NS_PER_SEC);

	latency_trace_enable();

	start = rt_gettime();
	per_id = create_fifo_thread(periodic_thread, (void*)0, PRIO);

	join_thread(per_id);
	join_threads();

	return retval;
}

  reply	other threads:[~2006-05-16  1:30 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-05-13  2:24 rt20 scheduling latency testcase and failure data Darren Hart
2006-05-13  9:20 ` Florian Paul Schmidt
2006-05-13 11:55   ` Mike Galbraith
2006-05-13 15:39     ` Steven Rostedt
2006-05-13 16:36       ` Mike Galbraith
2006-05-15  8:04       ` Ingo Molnar
2006-05-13 18:06   ` Darren Hart
2006-05-13 18:21     ` Lee Revell
2006-05-13 23:01       ` Darren Hart
2006-05-14  3:46         ` Mike Galbraith
2006-05-14  5:48           ` Mike Galbraith
2006-05-14  7:04             ` Darren Hart
2006-05-14  7:38               ` Mike Galbraith
2006-05-15  8:13         ` Ingo Molnar
2006-05-16  1:30           ` Darren Hart [this message]
2006-05-16  7:22             ` Sébastien Dugué
2006-05-18  9:14               ` Darren Hart
2006-05-18 11:24                 ` Ingo Molnar
2006-05-18  8:44             ` Sébastien Dugué
2006-05-18  8:47               ` Ingo Molnar
2006-05-18  8:58                 ` Sébastien Dugué
2006-05-18  8:56                   ` Ingo Molnar
2006-05-18  9:18                     ` Sébastien Dugué
2006-05-18  9:38                   ` Darren Hart
2006-05-18  9:58                     ` Sébastien Dugué
2006-05-19  5:48                     ` Mike Galbraith
2006-05-19  5:58                       ` Mike Galbraith
2006-05-15  5:43     ` Mike Galbraith
2006-05-15 16:52       ` Lee Revell
2006-05-15 11:20     ` Sébastien Dugué
2006-05-15 21:49       ` Darren Hart
2006-05-15 11:15 ` Sébastien Dugué
2006-05-15 14:34   ` Darren Hart

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200605151830.23544.dvhltc@us.ibm.com \
    --to=dvhltc@us.ibm.com \
    --cc=efault@gmx.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=mista.tapas@gmx.net \
    --cc=rlrevell@joe-job.com \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox