From: Ingo Molnar <mingo@elte.hu>
To: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
	linux-kernel@vger.kernel.org,
	Corey Ashford <cjashfor@linux.vnet.ibm.com>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH 2/2] perf_counter: optimize context switch between identical inherited contexts
Date: Fri, 22 May 2009 12:11:28 +0200
Message-ID: <20090522101128.GC13482@elte.hu>
In-Reply-To: <18966.10666.517218.332164@cargo.ozlabs.ibm.com>


* Paul Mackerras <paulus@samba.org> wrote:

> When monitoring a process and its descendants with a set of 
> inherited counters, we can often get the situation in a context 
> switch where both the old (outgoing) and new (incoming) process 
> have the same set of counters, and their values are ultimately 
> going to be added together. In that situation it doesn't matter 
> which set of counters are used to count the activity for the new 
> process, so there is really no need to go through the process of 
> reading the hardware counters and updating the old task's counters 
> and then setting up the PMU for the new task.
> 
> This optimizes the context switch in this situation.  Instead of 
> scheduling out the perf_counter_context for the old task and 
> scheduling in the new context, we simply transfer the old context 
> to the new task and keep using it without interruption.  The new 
> context gets transferred to the old task.  This means that both 
> tasks still have a valid perf_counter_context, so no special case 
> is introduced when the old task gets scheduled in again, either on 
> this CPU or another CPU.
> 
> The equivalence of contexts is detected by keeping a pointer in 
> each cloned context pointing to the context it was cloned from. To 
> cope with the situation where a context is changed by adding or 
> removing counters after it has been cloned, we also keep a 
> generation number on each context which is incremented every time 
> a context is changed.  When a context is cloned we take a copy of 
> the parent's generation number, and two cloned contexts are 
> equivalent only if they have the same parent and the same 
> generation number.  In order that the parent context pointer 
> remains valid (and is not reused), we increment the parent 
> context's reference count for each context cloned from it.
> 
> Since we don't have individual fds for the counters in a cloned
> context, the only thing that can make two clones of a given parent
> different after they have been cloned is enabling or disabling all
> counters with prctl.  To account for this, we keep a count of the
> number of enabled counters in each context.  Two contexts must have
> the same number of enabled counters to be considered equivalent.
> 
> Here are some measurements of the context switch time as measured with
> the lat_ctx benchmark from lmbench, comparing the times obtained with
> and without this patch series:
> 
>                 ----- Unmodified -----    With this patch series
> Counters:       none    2 HW    4H+4S     none    2 HW    4H+4S
>
> 2 processes:
> Average         3.44    6.45    11.24     3.12    3.39    3.60
> St dev          0.04    0.04    0.13      0.05    0.17    0.19
>
> 8 processes:
> Average         6.45    8.79    14.00     5.57    6.23    7.57
> St dev          1.27    1.04    0.88      1.42    1.46    1.42
>
> 32 processes:
> Average         5.56    8.43    13.78     5.28    5.55    7.15
> St dev          0.41    0.47    0.53      0.54    0.57    0.81
> 
> The numbers are the mean and standard deviation of 20 runs of 
> lat_ctx.  The "none" columns are lat_ctx run directly without any 
> counters.  The "2 HW" columns are with lat_ctx run under perfstat, 
> counting cycles and instructions.  The "4H+4S" columns are lat_ctx 
> run under perfstat with 4 hardware counters and 4 software 
> counters (cycles, instructions, cache references, cache misses, 
> task clock, context switch, cpu migrations, and page faults).
> 
> Signed-off-by: Paul Mackerras <paulus@samba.org>
> ---
>  include/linux/perf_counter.h |   12 ++++-
>  kernel/perf_counter.c        |  109 ++++++++++++++++++++++++++++++++++++-----
>  kernel/sched.c               |    2 +-
>  3 files changed, 107 insertions(+), 16 deletions(-)

Impressive!

I'm wondering where the sensitivity of lat_ctx to the number of 
counters comes from. I'd expect the overhead to be constant (and 
very low). It could be measurement noise: lat_ctx is very sensitive 
to L2 layout and memory allocation patterns. Those effects are very 
hard to eliminate and are not captured by the stddev numbers. (Many 
of them are specific to a given bootup, and bootups don't randomize 
them, so there's no easy way to measure their statistical impact.)

	Ingo
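
The equivalence rule in the quoted patch description can be pictured
with a small user-space sketch. It is not the code from the patch:
struct ctx, struct task, ctx_changed(), ctx_clone(), ctx_equiv() and
ctx_switch() are hypothetical names made up for illustration, and
locking, RCU and the actual PMU programming are left out. It only
assumes what the description states: a clone records its parent and
the parent's generation number, pins the parent with a reference, and
two clones are interchangeable when parent, generation and number of
enabled counters all match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a per-task counter context. */
struct ctx {
	struct ctx *parent;        /* context this one was cloned from, or NULL */
	unsigned long generation;  /* bumped when counters are added or removed */
	unsigned long parent_gen;  /* parent's generation at clone time */
	int nr_enabled;            /* number of currently enabled counters */
	int refcount;              /* a parent stays alive while clones point at it */
};

/* Hypothetical stand-in for a task that owns one context. */
struct task {
	struct ctx *ctx;
};

/* Adding or removing a counter changes the context: bump its generation. */
void ctx_changed(struct ctx *c)
{
	c->generation++;
}

/* Initialise 'child' as a clone of 'parent'. */
void ctx_clone(struct ctx *child, struct ctx *parent)
{
	assert(child != NULL && parent != NULL);
	child->parent = parent;
	child->parent_gen = parent->generation;
	child->generation = 0;
	child->nr_enabled = parent->nr_enabled;
	child->refcount = 1;
	parent->refcount++;	/* pin the parent so its address is not reused */
}

/* Two clones are equivalent only if all three recorded values match. */
bool ctx_equiv(const struct ctx *a, const struct ctx *b)
{
	return a->parent != NULL &&
	       a->parent == b->parent &&
	       a->parent_gen == b->parent_gen &&
	       a->nr_enabled == b->nr_enabled;
}

/*
 * Context-switch fast path: if the outgoing and incoming contexts are
 * equivalent, exchange them between the tasks and report success.
 * Otherwise return false; a real implementation would then schedule
 * the old context out and the new one in.
 */
bool ctx_switch(struct task *prev, struct task *next)
{
	if (prev->ctx && next->ctx && ctx_equiv(prev->ctx, next->ctx)) {
		struct ctx *tmp = prev->ctx;

		prev->ctx = next->ctx;
		next->ctx = tmp;
		return true;
	}
	return false;
}
```

In this sketch, two clones taken from the same parent before any
change swap on a switch, while a clone taken after ctx_changed() on
the parent, or one whose nr_enabled has diverged, falls back to the
slow path.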
