From: Namhyung Kim <namhyung@kernel.org>
To: Jiri Olsa <jolsa@redhat.com>
Cc: linux-kernel@vger.kernel.org,
	Corey Ashford <cjashfor@linux.vnet.ibm.com>,
	David Ahern <dsahern@gmail.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Ingo Molnar <mingo@kernel.org>, Paul Mackerras <paulus@samba.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Arnaldo Carvalho de Melo <acme@ghostprotocols.net>,
	Jean Pihet <jean.pihet@linaro.org>
Subject: Re: [PATCH 1/3] perf tools: Cache register accesses for unwind processing
Date: Sun, 27 Apr 2014 23:29:21 +0900
Message-ID: <1398608961.1689.9.camel@leonhard>
In-Reply-To: <1397756352-26694-2-git-send-email-jolsa@redhat.com>

Hi Jiri,

2014-04-17 (Thu), 19:39 +0200, Jiri Olsa:
> Cache register values in an array. This gives about a 4% speedup
> of the perf_reg_value function when the report command processes
> DWARF unwind stacks.

I'm not familiar with this code, so these are probably silly questions:
where does the speedup come from?  IOW, I don't understand the
difference between regs->regs and regs->cache_regs.  And does cache_regs
contain the correct register values for each frame?
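
To make the question concrete, here is how I read the two lookups in
the patch below, as a standalone sketch.  All sketch_* names and
SKETCH_REGS_MAX are hypothetical and are not the perf code.  The
counting loop is my assumption, since the diff context elides most of
it.  The sketch shifts 1ULL rather than the plain 1 used in the patch,
because the masks are 64-bit.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_REGS_MAX 64	/* hypothetical; stands in for PERF_REGS_MAX */

/* Hypothetical stand-in for struct regs_dump from the patch. */
struct sketch_regs_dump {
	uint64_t mask;		/* bit N set: register N was sampled */
	const uint64_t *regs;	/* packed: one slot per set bit, in bit order */
	uint64_t cache_regs[SKETCH_REGS_MAX];	/* indexed directly by id */
	uint64_t cache_mask;	/* bit N set: cache_regs[N] is valid */
};

/* Uncached lookup: count the set mask bits below id to find the slot. */
static int sketch_reg_value_slow(uint64_t *valp,
				 const struct sketch_regs_dump *regs, int id)
{
	int i, idx = 0;

	assert(id >= 0 && id < SKETCH_REGS_MAX);
	if (!(regs->mask & (1ULL << id)))
		return -EINVAL;

	for (i = 0; i < id; i++) {
		if (regs->mask & (1ULL << i))
			idx++;
	}

	*valp = regs->regs[idx];
	return 0;
}

/* Cached lookup: the first access pays for the loop, later ones do not. */
static int sketch_reg_value_cached(uint64_t *valp,
				   struct sketch_regs_dump *regs, int id)
{
	assert(id >= 0 && id < SKETCH_REGS_MAX);
	if (!(regs->cache_mask & (1ULL << id))) {
		uint64_t val;
		int err = sketch_reg_value_slow(&val, regs, id);

		if (err)
			return err;
		regs->cache_regs[id] = val;
		regs->cache_mask |= 1ULL << id;
	}

	*valp = regs->cache_regs[id];
	return 0;
}
```

If that reading is right, the saving is only the bit-counting loop on
repeated accesses to the same register within one sample, and the cache
would have to start out zeroed for every new sample.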

Thanks,
Namhyung

> 
> Output from report over 1.5 GB data with DWARF unwind stacks:
> (TODO fix perf diff)
> 
>   current code:
>    6.81%  perf.old  perf.old                   [.] perf_reg_value
> 
>   change:
>    2.24%  perf      perf                       [.] perf_reg_value
> 
> And a little bit of overall speedup:
> 
>  Performance counter stats for './perf.old report -i perf-test.data --stdio':
> 
>    134,664,011,577      cycles:u                  #    2.472 GHz
>    189,677,227,475      instructions:u            #    1.41  insns per cycle
>       54465.096050      task-clock (msec)         #    0.998 CPUs utilized
> 
>       54.598339009 seconds time elapsed
> 
>  Performance counter stats for './perf report -i perf-test.data --stdio':
> 
>    124,478,681,672      cycles:u                  #    2.466 GHz
>    168,998,379,866      instructions:u            #    1.36  insns per cycle
>       50487.110482      task-clock (msec)         #    0.997 CPUs utilized
> 
>       50.635824229 seconds time elapsed
> 
> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
> Cc: Jean Pihet <jean.pihet@linaro.org>
> Signed-off-by: Jiri Olsa <jolsa@redhat.com>
> ---
>  tools/perf/util/event.h     |  5 +++++
>  tools/perf/util/perf_regs.c | 10 +++++++++-
>  tools/perf/util/perf_regs.h |  4 +++-
>  3 files changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
> index 38457d4..970d4eb 100644
> --- a/tools/perf/util/event.h
> +++ b/tools/perf/util/event.h
> @@ -7,6 +7,7 @@
>  #include "../perf.h"
>  #include "map.h"
>  #include "build-id.h"
> +#include "perf_regs.h"
>  
>  struct mmap_event {
>  	struct perf_event_header header;
> @@ -87,6 +88,10 @@ struct regs_dump {
>  	u64 abi;
>  	u64 mask;
>  	u64 *regs;
> +
> +	/* Cached values/mask filled by first register access. */
> +	u64 cache_regs[PERF_REGS_MAX];
> +	u64 cache_mask;
>  };
>  
>  struct stack_dump {
> diff --git a/tools/perf/util/perf_regs.c b/tools/perf/util/perf_regs.c
> index a3539ef..43168fb 100644
> --- a/tools/perf/util/perf_regs.c
> +++ b/tools/perf/util/perf_regs.c
> @@ -1,11 +1,15 @@
>  #include <errno.h>
>  #include "perf_regs.h"
> +#include "event.h"
>  
>  int perf_reg_value(u64 *valp, struct regs_dump *regs, int id)
>  {
>  	int i, idx = 0;
>  	u64 mask = regs->mask;
>  
> +	if (regs->cache_mask & (1 << id))
> +		goto out;
> +
>  	if (!(mask & (1 << id)))
>  		return -EINVAL;
>  
> @@ -14,6 +18,10 @@ int perf_reg_value(u64 *valp, struct regs_dump *regs, int id)
>  			idx++;
>  	}
>  
> -	*valp = regs->regs[idx];
> +	regs->cache_mask |= (1 << id);
> +	regs->cache_regs[id] = regs->regs[idx];
> +
> +out:
> +	*valp = regs->cache_regs[id];
>  	return 0;
>  }
> diff --git a/tools/perf/util/perf_regs.h b/tools/perf/util/perf_regs.h
> index d6e8b6a..80d8ab1 100644
> --- a/tools/perf/util/perf_regs.h
> +++ b/tools/perf/util/perf_regs.h
> @@ -2,15 +2,17 @@
>  #define __PERF_REGS_H
>  
>  #include "types.h"
> -#include "event.h"
>  
>  #ifdef HAVE_PERF_REGS_SUPPORT
>  #include <perf_regs.h>
>  
> +struct regs_dump;
> +
>  int perf_reg_value(u64 *valp, struct regs_dump *regs, int id);
>  
>  #else
>  #define PERF_REGS_MASK	0
> +#define PERF_REGS_MAX	0
>  
>  static inline const char *perf_reg_name(int id __maybe_unused)
>  {




