Re: [PATCH 7/7] perf: Add load latency monitoring on Intel Nehalem/Westmere v2

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Lin Ming <ming.m.lin@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>, Andi Kleen <andi@firstfloor.org>,
	Stephane Eranian <eranian@google.com>,
	lkml <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 7/7] perf: Add load latency monitoring on Intel Nehalem/Westmere v2
Date: Tue, 04 Jan 2011 12:59:21 +0100	[thread overview]
Message-ID: <1294142361.2016.131.camel@laptop> (raw)
In-Reply-To: <1293464397.2695.109.camel@localhost>

On Mon, 2010-12-27 at 23:39 +0800, Lin Ming wrote:

> +/*
> + * Load latency data source encoding
> + */
> +
> +/* Bits(0-1) {L1, L2, L3, RAM} or {unknown, IO, uncached} */
> +#define LD_LAT_L1			0x00
> +#define LD_LAT_L2			0x01
> +#define LD_LAT_L3			0x02
> +#define LD_LAT_RAM			0x03
> +#define LD_LAT_UNKNOWN			0x00
> +#define LD_LAT_IO			0x01
> +#define LD_LAT_UNCACHED			0x02
> +
> +/* Bits(2-3) {not-used, snoop, local, remote} */
> +#define LD_LAT_NOT_USED			(0x00 << 2)
> +#define LD_LAT_SNOOP			(0x01 << 2)
> +#define LD_LAT_LOCAL			(0x02 << 2)
> +#define LD_LAT_REMOTE			(0x03 << 2)
> +
> +/* Bits(4-5) {modified, exclusive, shared, invalid} */
> +#define LD_LAT_MODIFIED			(0x00 << 4)
> +#define LD_LAT_EXCLUSIVE		(0x01 << 4)
> +#define LD_LAT_SHARED			(0x02 << 4)
> +#define LD_LAT_INVALID			(0x03 << 4)
> +
> +#define LD_LAT_RESERVED			0x3F

Hmm, why is that RESREVED? From what I can see that ends up being:
  RAM-remote-invalid

Which brings us to the NOT_USED thing, from what I can see its the value
that toggles between {L1, L2, L3, RAM} and {unknown, IO, uncached},
that's not NOT_USED.

There's still room for a {reserved} value in that alternative set,
making it: {unknown, IO, uncached, reserved}

Which would make

#define EXTRA_SRC_RESERVED 0x03


> @@ -384,6 +387,10 @@ static int x86_pmu_extra_regs(u64 config, struct perf_event *event)
>  		if (extra & ~er->valid_mask)
>  			return -EINVAL;
>  		event->hw.extra_config = extra;
> +
> +		/* The minimum value that may be programmed into MSR_PEBS_LD_LAT is 3 */
> +		if (er->msr == MSR_PEBS_LD_LAT_THRESHOLD && extra < 3)
> +			event->hw.extra_config = 3;
>  		break;
>  	}
>  	return 0;

I'm not really sure about this, should we silently fix up or bail on
invalid values?

> +/* Indexed by Intel load latency data source encoding value */
> +
> +static u64 intel_load_latency_data[] = {
> +	LD_LAT_UNKNOWN,					/* 0x00: Unknown L3 */
> +	LD_LAT_L1 | LD_LAT_LOCAL,			/* 0x01: L1-local */
> +	LD_LAT_L2 | LD_LAT_SNOOP,			/* 0x02: L2-snoop */

I bet Stephane wants to argue more on this ;-)

> +	LD_LAT_L2 | LD_LAT_LOCAL,			/* 0x03: L2-local */
> +	LD_LAT_L3 | LD_LAT_SNOOP | LD_LAT_INVALID,	/* 0x04: L3-snoop, no coherency actions */
> +	LD_LAT_L3 | LD_LAT_SNOOP | LD_LAT_SHARED,	/* 0x05: L3-snoop, found no M */
> +	LD_LAT_L3 | LD_LAT_SNOOP | LD_LAT_MODIFIED,	/* 0x06: L3-snoop, found M */
> +	LD_LAT_RESERVED,				/* 0x07: reserved */
> +	LD_LAT_RAM | LD_LAT_SNOOP | LD_LAT_SHARED,	/* 0x08: L3-miss, snoop, shared */
> +	LD_LAT_RESERVED,				/* 0x09: reserved */
> +	LD_LAT_RAM | LD_LAT_LOCAL | LD_LAT_SHARED,	/* 0x0A: L3-miss, local, shared */
> +	LD_LAT_RAM | LD_LAT_REMOTE | LD_LAT_SHARED,	/* 0x0B: L3-miss, remote, shared */
> +	LD_LAT_RAM | LD_LAT_LOCAL | LD_LAT_EXCLUSIVE,	/* 0x0C: L3-miss, local, exclusive */
> +	LD_LAT_RAM | LD_LAT_REMOTE | LD_LAT_EXCLUSIVE,	/* 0x0D: L3-miss, remote, exclusive */
> +	LD_LAT_IO,					/* 0x0E: IO */
> +	LD_LAT_UNCACHED,				/* 0x0F: Uncached */
> +};


> @@ -541,14 +567,10 @@ static int intel_pmu_save_and_restart(struct perf_event *event);
>  static void __intel_pmu_pebs_event(struct perf_event *event,
>  				   struct pt_regs *iregs, void *__pebs)
>  {
> -	/*
> -	 * We cast to pebs_record_core since that is a subset of
> -	 * both formats and we don't use the other fields in this
> -	 * routine.
> -	 */
> -	struct pebs_record_core *pebs = __pebs;
> +	struct pebs_record_nhm *pebs = __pebs;


So you cast to the biggest and make sure you don't peek past the end of
the actual data.. that might want to have a comment.


> @@ -556,6 +578,19 @@ static void __intel_pmu_pebs_event(struct perf_event *event,
>  	perf_sample_data_init(&data, 0);
>  	data.period = event->hw.last_period;
>  
> +	if (event->hw.extra_reg == MSR_PEBS_LD_LAT_THRESHOLD) {

This is what stops us from peeking past the end of the pebs record,
since core would never end up having the LD_LAT extra_reg.

> +		sample_type = event->attr.sample_type;
> +
> +		if (sample_type & PERF_SAMPLE_ADDR)
> +			data.addr = pebs->dla;
> +
> +		if (sample_type & PERF_SAMPLE_LATENCY)
> +			data.latency = pebs->lat;
> +
> +		if (sample_type & PERF_SAMPLE_LATENCY_DATA)
> +			data.latency_data = intel_load_latency_data[pebs->dse];


> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index d24d9ab..9428a1b 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -124,9 +124,11 @@ enum perf_event_sample_format {
>  	PERF_SAMPLE_CPU				= 1U << 7,
>  	PERF_SAMPLE_PERIOD			= 1U << 8,
>  	PERF_SAMPLE_STREAM_ID			= 1U << 9,
> -	PERF_SAMPLE_RAW				= 1U << 10,
> +	PERF_SAMPLE_LATENCY			= 1U << 10,
> +	PERF_SAMPLE_LATENCY_DATA		= 1U << 11,
> +	PERF_SAMPLE_RAW				= 1U << 12,
>  
> -	PERF_SAMPLE_MAX = 1U << 11,		/* non-ABI */
> +	PERF_SAMPLE_MAX = 1U << 13,		/* non-ABI */
>  };
>  
>  /*
> @@ -958,6 +960,8 @@ struct perf_sample_data {
>  	}				tid_entry;
>  	u64				time;
>  	u64				addr;
> +	u64				latency;
> +	u64				latency_data;
>  	u64				id;
>  	u64				stream_id;
>  	struct {

Right, so I think I want this in 3 patches, one adding the load-latency
extra reg thing and PERF_SAMPLE_ADDR usage, one adding
PERF_SAMPLE_LATENCY and one adding PERF_SAMPLE_EXTRA, I really dislike
the LATENCY_DATA name since that ties the extra data to the latency
thing, which isn't at all true for other platforms.

next prev parent reply	other threads:[~2011-01-04 11:59 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-12-27 15:39 [PATCH 7/7] perf: Add load latency monitoring on Intel Nehalem/Westmere v2 Lin Ming
2011-01-04 11:59 ` Peter Zijlstra [this message]
2011-01-04 14:03   ` Peter Zijlstra
2011-01-06  7:49     ` Lin Ming
2011-01-05  7:12   ` Lin Ming
2011-01-05  9:50 ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1294142361.2016.131.camel@laptop \
    --to=a.p.zijlstra@chello.nl \
    --cc=andi@firstfloor.org \
    --cc=eranian@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ming.m.lin@intel.com \
    --cc=mingo@elte.hu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox