Re: [PATCH RFC] sched: add notifier for process migration


From: Jason Baron <jbaron@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>,
	Jeremy Fitzhardinge <jeremy@goop.org>,
	Avi Kivity <avi@redhat.com>, Ingo Molnar <mingo@redhat.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Andi Kleen <ak@linux.intel.com>, "H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH RFC] sched: add notifier for process migration
Date: Wed, 14 Oct 2009 10:41:15 -0400
Message-ID: <20091014144115.GA2657@redhat.com>
In-Reply-To: <1255512370.8392.373.camel@twins>

On Wed, Oct 14, 2009 at 11:26:10AM +0200, Peter Zijlstra wrote:
> On Wed, 2009-10-14 at 09:05 +0200, Ingo Molnar wrote:
> > * Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> > 
> > > @@ -1981,6 +1989,12 @@ void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
> > >  #endif
> > >  		perf_swcounter_event(PERF_COUNT_SW_CPU_MIGRATIONS,
> > >  				     1, 1, NULL, 0);
> > > +
> > > +		tmn.task = p;
> > > +		tmn.from_cpu = old_cpu;
> > > +		tmn.to_cpu = new_cpu;
> > > +
> > > +		atomic_notifier_call_chain(&task_migration_notifier, 0, &tmn);
> > 
> > We already have one event notifier there - look at the 
> > perf_swcounter_event() callback. Why add a second one for essentially 
> > the same thing?
> > 
> > We should only put a single callback there - a tracepoint defined via 
> > TRACE_EVENT() - and any secondary users can register a callback to the 
> > tracepoint itself.
> > 
> > There's many similar places in the kernel - with notifier chains and 
> > also with a need to get tracepoints there. The fastest (and most 
> > consistent) solution is to add just a single event callback facility.
> 
> But that would basically mandate tracepoints to be always enabled, do we
> want to go there?
> 
> I don't think the overhead of tracepoints is understood well enough,
> Jason you poked at that, do you have anything solid on that?
> 

Currently, the cost of a disabled tracepoint is a global memory read, a
compare, and then a jump. On the x86 systems that I've tested, this
averages anywhere between 40 and 100 cycles per tracepoint. Plus, there
is the icache overhead of the extra instructions that we skip over. I'm
not sure how to measure that beyond looking at their size.
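
To make the "read, compare, jump" sequence concrete, here is a minimal
user-space sketch of a flag-guarded tracepoint with one registered
probe. All names in it (tp_migrate_enabled, trace_migrate, count_probe,
and so on) are made up for illustration and are not the kernel's
tracepoint API; unlikely() is defined locally via the GCC/Clang
__builtin_expect builtin.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout: a sketch, not the kernel tracepoint API. */
#define unlikely(x) __builtin_expect(!!(x), 0)

struct migration_event {
	int pid;
	int from_cpu;
	int to_cpu;
};

typedef void (*probe_fn)(const struct migration_event *ev);

/* Global flag that every pass through the tracepoint has to load. */
static int tp_migrate_enabled;
static probe_fn tp_migrate_probe;

/* Attach a probe and switch the tracepoint on. */
static void tp_migrate_register(probe_fn fn)
{
	assert(fn != NULL);
	tp_migrate_probe = fn;
	tp_migrate_enabled = 1;
}

/* Switch the tracepoint off and drop the probe. */
static void tp_migrate_unregister(void)
{
	tp_migrate_enabled = 0;
	tp_migrate_probe = NULL;
}

static inline void trace_migrate(int pid, int from_cpu, int to_cpu)
{
	/*
	 * Disabled case: load the flag, compare it with zero, and
	 * branch over the block below. The skipped instructions still
	 * occupy space in the caller's text.
	 */
	if (unlikely(tp_migrate_enabled)) {
		struct migration_event ev = { pid, from_cpu, to_cpu };

		tp_migrate_probe(&ev);
	}
}

/* Example probe: count the migrations it is told about. */
static int migrations_seen;

static void count_probe(const struct migration_event *ev)
{
	(void)ev;
	migrations_seen++;
}
```

A caller would invoke trace_migrate() at the migration site; a secondary
user attaches with tp_migrate_register(count_probe) and detaches with
tp_migrate_unregister(). While no probe is registered, each call costs
only the flag check.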

I've proposed a 'jump label' set of patches, which essentially hard-codes
a jump around the disabled code (avoiding the memory reference).
However, this introduces a high 'write' cost, in that we have to patch
the code, turning the jmp into a 'jmp 0', to enable it.

Along with this optimization I'm also looking into a method for moving
the disabled text to a 'cold' text section, to reduce the icache
overhead. Using these techniques we can reduce the disabled case to
essentially a couple of cycles per tracepoint.
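
As a rough compiler-level approximation of the 'cold' text idea (not
the jump label patches themselves, which patch code at run time), the
rarely taken body can be moved out of line with the GCC/Clang `cold`
and `noinline` function attributes. The names below (tp_enabled,
tp_slow_path, trace_migrate_cold) are hypothetical, and the flag is
still read from memory here, so this sketch only addresses the icache
side of the cost.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical names: a user-space approximation, not the kernel patches. */
#define unlikely(x) __builtin_expect(!!(x), 0)

static int tp_enabled;
static unsigned long slow_hits;

/*
 * cold + noinline asks the compiler to keep this body out of the
 * caller; GCC typically emits cold functions into a separate
 * .text.unlikely section, away from the hot text.
 */
static void __attribute__((cold, noinline))
tp_slow_path(int from_cpu, int to_cpu)
{
	assert(from_cpu != to_cpu);	/* a migration always changes CPU */
	slow_hits++;
	fprintf(stderr, "migrate: cpu%d -> cpu%d\n", from_cpu, to_cpu);
}

static inline void trace_migrate_cold(int from_cpu, int to_cpu)
{
	/* Hot path keeps only the check and a call that is rarely taken. */
	if (unlikely(tp_enabled))
		tp_slow_path(from_cpu, to_cpu);
}
```

With tp_enabled left at 0, trace_migrate_cold() falls straight through;
setting it to 1 routes each call through the out-of-line slow path.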

In this case, where the tracepoint is always on, we wouldn't want to
move the tracepoint text to a cold section. Thus, I could introduce a
default enabled/disabled bias to the tracepoint.

However, in introducing such a feature, we are essentially forcing an
always-on or always-off usage pattern, since the switch cost is high.
So I want to be careful not to limit the usefulness of tracepoints with
such an optimization.

thanks,

-Jason


Thread overview: 14+ messages
2009-10-09 21:01 [PATCH RFC] sched: add notifier for process migration Jeremy Fitzhardinge
2009-10-09 22:02 ` Peter Zijlstra
2009-10-09 22:43   ` Jeremy Fitzhardinge
2009-10-10  7:14     ` Peter Zijlstra
2009-10-10  9:05       ` Avi Kivity
2009-10-10  9:24         ` Peter Zijlstra
2009-10-10  9:36           ` Jeremy Fitzhardinge
2009-10-10 10:12             ` Peter Zijlstra
2009-10-13 21:25               ` Jeremy Fitzhardinge
2009-10-14  7:05                 ` Ingo Molnar
2009-10-14  9:26                   ` Peter Zijlstra
2009-10-14 10:37                     ` Avi Kivity
2009-10-14 14:41                     ` Jason Baron [this message]
2009-10-14 16:15                   ` Jeremy Fitzhardinge
