public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@elte.hu>
To: Clark Williams <williams@redhat.com>
Cc: "Jon Masters" <jcm@redhat.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Arnaldo Carvalho de Melo" <acme@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	"Frédéric Weisbecker" <fweisbec@gmail.com>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"Thomas Gleixner" <tglx@linutronix.de>
Subject: Re: [PATCH 3/3] perf latency builtin command
Date: Wed, 4 Nov 2009 13:41:18 +0100	[thread overview]
Message-ID: <20091104124118.GC11968@elte.hu> (raw)
In-Reply-To: <20091103160053.0f0dd357@torg>


* Clark Williams <williams@redhat.com> wrote:

> > Basically hwlat_detector is using stop_machine_run() plus a tight 
> > rdtsc based loop to sample what is happening in the system. Much of 
> > hwlat_detector.c deals with getting that information (and 
> > parameters) back and forth between user space and kernel space.
> > 
> > Couldnt we move that functionality a bit closer to perf by creating 
> > special events in a tight loop that generate a stream of perf 
> > events, and let the rest of perf events take over the details, and 
> > do the analysis in the user-space builtin-latency.c code?
> >  
> > Also, do we need stop_machine_run() - couldnt we do the measurement 
> > on a specific CPU with irqs (and NMIs) disabled [but other CPUs 
> > still running]?
> 
> So what would the source of the event's be and how confident would we 
> be that they're accurate? Jon used stop_machine() so that *nothing* 
> under the control of Linux is going to happen during the test; no 
> C-state changes, no interrupts, nada. The intent is that if there's a 
> gap seen in the TSC values, it's because something happened that's out 
> of our control.

Yes - that goal is sensible. It could be achieved by running a long 
sampling period on all online CPUs, using SCHED_FIFO:99 priority, right?

What we need to enable this is a way to start a measurement period, 
which outputs its result(s) to a perf event channel.

A first hack could be an ioctl extension in 
kernel/perf_event.c:perf_ioctl(), PERF_EVENT_IOC_INJECT or so.

PERF_EVENT_IOC_INJECT would inject an artificial trace event into the 
kernel, with its parameters defined by user-space.

Initially (for a prototype) you could hardcode it to be purely hwlat 
specific - i.e. the parameters would directly turn into 'disable irqs, 
run a tight TSC loop, report results'.

See commit 8968f9d how to define a new event via TRACE_EVENT(). Once a 
hwlat tracepoint is defined, it could be triggered via:

  trace_hwlat_result(loops, max_delay, min_delay, sum_delay)

I.e. it would look like this (in the prototype), in 
kernel/perf_event.c:perf_ioctl():

	case PERF_EVENT_IOC_INJECT: {

		unigned long hwlat_loops = arg;

		local_irq_disable();
		t0 = get_cycles();
		for (i = 0; i < hwlat_loops; i++) {
			t1 = get_cycles();
			hwlat_delay = t1 - t0;
			hwlat_delay_max = max(hwlat_delay_max, hwlat_delay);
			...
		}

		local_irq_enable();

		trace_hwlat_report(hwlat_loops, hwlat_delay_max);
	}

Note how simple the structure is - one event, one callback that does the 
hwlat measurement loop - and a line to report the resuls. All the rest 
(buffering, enumeration, post-processing, etc.) can be done via using 
'perf latency' and existing perf facilities.

Note at the advantages: we'd have hwlat _and_ the tooling in the kernel 
as one package in essence. Makes for an easy pull request: it fits 
nicely into the existing scheme of things and anyone can run 'perf 
latency' to see what it is good for.

Can you see any fundamental problems with such an approach?

	Ingo

      reply	other threads:[~2009-11-04 12:41 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-11-01 21:55 [PATCH 0/3] perf latency command Clark Williams
2009-11-01 21:56 ` [PATCH 1/3] debugfs utility routines for perf Clark Williams
2009-11-08 17:06   ` [tip:perf/core] perf tools: Add " tip-bot for Clark Williams
2009-11-01 21:57 ` [PATCH 2/3] modify perf routines to use new debugfs routines Clark Williams
2009-11-08 17:06   ` [tip:perf/core] perf tools: Modify " tip-bot for Clark Williams
2009-11-01 21:58 ` [PATCH 3/3] perf latency builtin command Clark Williams
2009-11-03 19:28   ` Ingo Molnar
2009-11-03 22:00     ` Clark Williams
2009-11-04 12:41       ` Ingo Molnar [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20091104124118.GC11968@elte.hu \
    --to=mingo@elte.hu \
    --cc=acme@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=fweisbec@gmail.com \
    --cc=jcm@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=williams@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox