* [PATCH 1/6] perf, core: Add generic transaction flags v5
2013-09-20 14:40 perf, x86: Add last TSX PMU code for Haswell v2 Andi Kleen
@ 2013-09-20 14:40 ` Andi Kleen
2013-10-04 17:32 ` [tip:perf/core] perf: Add generic transaction flags tip-bot for Andi Kleen
2013-09-20 14:40 ` [PATCH 2/6] perf, x86: Add Haswell specific transaction flag reporting v5 Andi Kleen
` (6 subsequent siblings)
7 siblings, 1 reply; 22+ messages in thread
From: Andi Kleen @ 2013-09-20 14:40 UTC (permalink / raw)
To: linux-kernel; +Cc: acme, mingo, peterz, eranian, jolsa, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Add a generic qualifier for transaction events, as a new sample
type that returns a flag word. This is particularly useful
for qualifying aborts: to distinguish aborts which happen
due to asynchronous events (like conflicts caused by another
CPU) versus instructions that lead to an abort.
The tuning strategies are very different for those cases,
so it's important to distinguish them easily and early.
Since it's inconvenient and inflexible to filter for this
in the kernel we report all the events out and allow
some post processing in user space.
The flags are based on the Intel TSX events, but should be fairly
generic and mostly applicable to other HTM architectures too. In addition
to various flag bits there's also reserved space to report a
program supplied abort code. For TSX this is used to distinguish specific
classes of aborts, like a lock busy abort when doing lock elision.
Flags:
Elision and generic transactions (ELISION vs TRANSACTION)
(HLE vs RTM on TSX; IBM etc. would likely only use TRANSACTION)
Aborts caused by current thread vs aborts caused by others (SYNC vs ASYNC)
Retryable transaction (RETRY)
Conflicts with other threads (CONFLICT)
Transaction write capacity overflow (CAPACITY WRITE)
Transaction read capacity overflow (CAPACITY READ)
Explicitly aborted transactions can also return an abort code.
This can be used to signal specific events to the profiler. A common
case is abort on lock busy in an RTM eliding library (code 0xff).
To handle this case we include the TSX abort code.
Common example aborts in TSX would be:
- Data conflict with another thread on memory read.
Flags: TRANSACTION|ASYNC|CONFLICT
- executing a WRMSR in a transaction. Flags: TRANSACTION|SYNC
- HLE transaction in user space is too large
Flags: ELISION|SYNC|CAPACITY-WRITE
The only flag that is somewhat TSX specific is ELISION.
This adds the perf core glue needed for reporting the new flag word out.
v2: Add MEM/MISC
v3: Move transaction to the end
v4: Separate capacity-read/write and remove misc
v5: Remove _SAMPLE. Move abort flags to 32bit. Rename
transaction to txn
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
include/linux/perf_event.h | 5 +++++
include/uapi/linux/perf_event.h | 25 ++++++++++++++++++++++++-
kernel/events/core.c | 6 ++++++
3 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 866e85c..ee96093 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -562,6 +562,10 @@ struct perf_sample_data {
struct perf_regs_user regs_user;
u64 stack_user_size;
u64 weight;
+ /*
+ * Transaction flags for abort events:
+ */
+ u64 txn;
};
static inline void perf_sample_data_init(struct perf_sample_data *data,
@@ -577,6 +581,7 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
data->stack_user_size = 0;
data->weight = 0;
data->data_src.val = 0;
+ data->txn = 0;
}
extern void perf_output_sample(struct perf_output_handle *handle,
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index ca1d90b..fee1264 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -136,8 +136,9 @@ enum perf_event_sample_format {
PERF_SAMPLE_WEIGHT = 1U << 14,
PERF_SAMPLE_DATA_SRC = 1U << 15,
PERF_SAMPLE_IDENTIFIER = 1U << 16,
+ PERF_SAMPLE_TRANSACTION = 1U << 17,
- PERF_SAMPLE_MAX = 1U << 17, /* non-ABI */
+ PERF_SAMPLE_MAX = 1U << 18, /* non-ABI */
};
/*
@@ -181,6 +182,28 @@ enum perf_sample_regs_abi {
};
/*
+ * Values for the memory transaction event qualifier, mostly for
+ * abort events. Multiple bits can be set.
+ */
+enum {
+ PERF_TXN_ELISION = (1 << 0), /* From elision */
+ PERF_TXN_TRANSACTION = (1 << 1), /* From transaction */
+ PERF_TXN_SYNC = (1 << 2), /* Instruction is related */
+ PERF_TXN_ASYNC = (1 << 3), /* Instruction not related */
+ PERF_TXN_RETRY = (1 << 4), /* Retry possible */
+ PERF_TXN_CONFLICT = (1 << 5), /* Conflict abort */
+ PERF_TXN_CAPACITY_WRITE = (1 << 6), /* Capacity write abort */
+ PERF_TXN_CAPACITY_READ = (1 << 7), /* Capacity read abort */
+
+ PERF_TXN_MAX = (1 << 8), /* non-ABI */
+
+ /* bits 32..63 are reserved for the abort code */
+
+ PERF_TXN_ABORT_MASK = (0xffffffffULL << 32),
+ PERF_TXN_ABORT_SHIFT = 32,
+};
+
+/*
* The format of the data returned by read() on a perf event fd,
* as specified by attr.read_format:
*
diff --git a/kernel/events/core.c b/kernel/events/core.c
index dd236b6..fe2d7c8 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1201,6 +1201,9 @@ static void perf_event__header_size(struct perf_event *event)
if (sample_type & PERF_SAMPLE_DATA_SRC)
size += sizeof(data->data_src.val);
+ if (sample_type & PERF_SAMPLE_TRANSACTION)
+ size += sizeof(data->txn);
+
event->header_size = size;
}
@@ -4551,6 +4554,9 @@ void perf_output_sample(struct perf_output_handle *handle,
if (sample_type & PERF_SAMPLE_DATA_SRC)
perf_output_put(handle, data->data_src.val);
+ if (sample_type & PERF_SAMPLE_TRANSACTION)
+ perf_output_put(handle, data->txn);
+
if (!event->attr.watermark) {
int wakeup_events = event->attr.wakeup_events;
--
1.8.3.1
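As an illustration of the user-space post-processing the changelog mentions, here is a minimal sketch of decoding the flag word. The helpers txn_abort_code() and txn_flags_str() are hypothetical and not part of this patch. The constants mirror the enum added to include/uapi/linux/perf_event.h, and the printed names follow the spelling used in the changelog.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Bit values mirror the enum added to include/uapi/linux/perf_event.h. */
#define PERF_TXN_ELISION	(1ULL << 0)
#define PERF_TXN_TRANSACTION	(1ULL << 1)
#define PERF_TXN_SYNC		(1ULL << 2)
#define PERF_TXN_ASYNC		(1ULL << 3)
#define PERF_TXN_RETRY		(1ULL << 4)
#define PERF_TXN_CONFLICT	(1ULL << 5)
#define PERF_TXN_CAPACITY_WRITE	(1ULL << 6)
#define PERF_TXN_CAPACITY_READ	(1ULL << 7)
#define PERF_TXN_ABORT_MASK	(0xffffffffULL << 32)
#define PERF_TXN_ABORT_SHIFT	32

/* Hypothetical helper: extract the abort code kept in bits 32..63. */
static uint32_t txn_abort_code(uint64_t txn)
{
	return (uint32_t)((txn & PERF_TXN_ABORT_MASK) >> PERF_TXN_ABORT_SHIFT);
}

/* Hypothetical helper: write the set flags into buf as "A|B|C". */
static const char *txn_flags_str(uint64_t txn, char *buf, size_t len)
{
	static const struct {
		uint64_t bit;
		const char *name;
	} names[] = {
		{ PERF_TXN_ELISION,        "ELISION" },
		{ PERF_TXN_TRANSACTION,    "TRANSACTION" },
		{ PERF_TXN_SYNC,           "SYNC" },
		{ PERF_TXN_ASYNC,          "ASYNC" },
		{ PERF_TXN_RETRY,          "RETRY" },
		{ PERF_TXN_CONFLICT,       "CONFLICT" },
		{ PERF_TXN_CAPACITY_WRITE, "CAPACITY-WRITE" },
		{ PERF_TXN_CAPACITY_READ,  "CAPACITY-READ" },
	};
	size_t i;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		size_t used = strlen(buf);

		if (!(txn & names[i].bit))
			continue;
		/* snprintf() truncates and NUL-terminates if buf is too small. */
		snprintf(buf + used, len - used, "%s%s",
			 used ? "|" : "", names[i].name);
	}
	return buf;
}
```

A consumer would call txn_flags_str() on the u64 taken from the sample and txn_abort_code() on the same word. The low bits and the abort code are independent, so either can be empty.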
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [tip:perf/core] perf: Add generic transaction flags
2013-09-20 14:40 ` [PATCH 1/6] perf, core: Add generic transaction flags v5 Andi Kleen
@ 2013-10-04 17:32 ` tip-bot for Andi Kleen
2013-12-13 20:31 ` Vince Weaver
0 siblings, 1 reply; 22+ messages in thread
From: tip-bot for Andi Kleen @ 2013-10-04 17:32 UTC (permalink / raw)
To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, peterz, ak, tglx
Commit-ID: fdfbbd07e91f8fe387140776f3fd94605f0c89e5
Gitweb: http://git.kernel.org/tip/fdfbbd07e91f8fe387140776f3fd94605f0c89e5
Author: Andi Kleen <ak@linux.intel.com>
AuthorDate: Fri, 20 Sep 2013 07:40:39 -0700
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 4 Oct 2013 10:06:08 +0200
perf: Add generic transaction flags
Add a generic qualifier for transaction events, as a new sample
type that returns a flag word. This is particularly useful
for qualifying aborts: to distinguish aborts which happen
due to asynchronous events (like conflicts caused by another
CPU) versus instructions that lead to an abort.
The tuning strategies are very different for those cases,
so it's important to distinguish them easily and early.
Since it's inconvenient and inflexible to filter for this
in the kernel we report all the events out and allow
some post processing in user space.
The flags are based on the Intel TSX events, but should be fairly
generic and mostly applicable to other HTM architectures too. In addition
to various flag bits there's also reserved space to report a
program supplied abort code. For TSX this is used to distinguish specific
classes of aborts, like a lock busy abort when doing lock elision.
Flags:
Elision and generic transactions (ELISION vs TRANSACTION)
(HLE vs RTM on TSX; IBM etc. would likely only use TRANSACTION)
Aborts caused by current thread vs aborts caused by others (SYNC vs ASYNC)
Retryable transaction (RETRY)
Conflicts with other threads (CONFLICT)
Transaction write capacity overflow (CAPACITY WRITE)
Transaction read capacity overflow (CAPACITY READ)
Explicitly aborted transactions can also return an abort code.
This can be used to signal specific events to the profiler. A common
case is abort on lock busy in an RTM eliding library (code 0xff).
To handle this case we include the TSX abort code.
Common example aborts in TSX would be:
- Data conflict with another thread on memory read.
Flags: TRANSACTION|ASYNC|CONFLICT
- executing a WRMSR in a transaction. Flags: TRANSACTION|SYNC
- HLE transaction in user space is too large
Flags: ELISION|SYNC|CAPACITY-WRITE
The only flag that is somewhat TSX specific is ELISION.
This adds the perf core glue needed for reporting the new flag word out.
v2: Add MEM/MISC
v3: Move transaction to the end
v4: Separate capacity-read/write and remove misc
v5: Remove _SAMPLE. Move abort flags to 32bit. Rename
transaction to txn
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379688044-14173-2-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
include/linux/perf_event.h | 5 +++++
include/uapi/linux/perf_event.h | 25 ++++++++++++++++++++++++-
kernel/events/core.c | 6 ++++++
3 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index c8ba627..2e069d1 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -584,6 +584,10 @@ struct perf_sample_data {
struct perf_regs_user regs_user;
u64 stack_user_size;
u64 weight;
+ /*
+ * Transaction flags for abort events:
+ */
+ u64 txn;
};
static inline void perf_sample_data_init(struct perf_sample_data *data,
@@ -599,6 +603,7 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
data->stack_user_size = 0;
data->weight = 0;
data->data_src.val = 0;
+ data->txn = 0;
}
extern void perf_output_sample(struct perf_output_handle *handle,
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 009a655..da48837 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -136,8 +136,9 @@ enum perf_event_sample_format {
PERF_SAMPLE_WEIGHT = 1U << 14,
PERF_SAMPLE_DATA_SRC = 1U << 15,
PERF_SAMPLE_IDENTIFIER = 1U << 16,
+ PERF_SAMPLE_TRANSACTION = 1U << 17,
- PERF_SAMPLE_MAX = 1U << 17, /* non-ABI */
+ PERF_SAMPLE_MAX = 1U << 18, /* non-ABI */
};
/*
@@ -181,6 +182,28 @@ enum perf_sample_regs_abi {
};
/*
+ * Values for the memory transaction event qualifier, mostly for
+ * abort events. Multiple bits can be set.
+ */
+enum {
+ PERF_TXN_ELISION = (1 << 0), /* From elision */
+ PERF_TXN_TRANSACTION = (1 << 1), /* From transaction */
+ PERF_TXN_SYNC = (1 << 2), /* Instruction is related */
+ PERF_TXN_ASYNC = (1 << 3), /* Instruction not related */
+ PERF_TXN_RETRY = (1 << 4), /* Retry possible */
+ PERF_TXN_CONFLICT = (1 << 5), /* Conflict abort */
+ PERF_TXN_CAPACITY_WRITE = (1 << 6), /* Capacity write abort */
+ PERF_TXN_CAPACITY_READ = (1 << 7), /* Capacity read abort */
+
+ PERF_TXN_MAX = (1 << 8), /* non-ABI */
+
+ /* bits 32..63 are reserved for the abort code */
+
+ PERF_TXN_ABORT_MASK = (0xffffffffULL << 32),
+ PERF_TXN_ABORT_SHIFT = 32,
+};
+
+/*
* The format of the data returned by read() on a perf event fd,
* as specified by attr.read_format:
*
diff --git a/kernel/events/core.c b/kernel/events/core.c
index b25d65c..c716385 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1201,6 +1201,9 @@ static void perf_event__header_size(struct perf_event *event)
if (sample_type & PERF_SAMPLE_DATA_SRC)
size += sizeof(data->data_src.val);
+ if (sample_type & PERF_SAMPLE_TRANSACTION)
+ size += sizeof(data->txn);
+
event->header_size = size;
}
@@ -4572,6 +4575,9 @@ void perf_output_sample(struct perf_output_handle *handle,
if (sample_type & PERF_SAMPLE_DATA_SRC)
perf_output_put(handle, data->data_src.val);
+ if (sample_type & PERF_SAMPLE_TRANSACTION)
+ perf_output_put(handle, data->txn);
+
if (!event->attr.watermark) {
int wakeup_events = event->attr.wakeup_events;
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [tip:perf/core] perf: Add generic transaction flags
2013-10-04 17:32 ` [tip:perf/core] perf: Add generic transaction flags tip-bot for Andi Kleen
@ 2013-12-13 20:31 ` Vince Weaver
2013-12-13 20:38 ` Andi Kleen
0 siblings, 1 reply; 22+ messages in thread
From: Vince Weaver @ 2013-12-13 20:31 UTC (permalink / raw)
To: mingo, hpa, linux-kernel, peterz, ak, tglx
On Fri, 4 Oct 2013, tip-bot for Andi Kleen wrote:
> Commit-ID: fdfbbd07e91f8fe387140776f3fd94605f0c89e5
> Gitweb: http://git.kernel.org/tip/fdfbbd07e91f8fe387140776f3fd94605f0c89e5
> Author: Andi Kleen <ak@linux.intel.com>
> AuthorDate: Fri, 20 Sep 2013 07:40:39 -0700
> Committer: Ingo Molnar <mingo@kernel.org>
> CommitDate: Fri, 4 Oct 2013 10:06:08 +0200
>
> perf: Add generic transaction flags
...
> extern void perf_output_sample(struct perf_output_handle *handle,
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 009a655..da48837 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -136,8 +136,9 @@ enum perf_event_sample_format {
> PERF_SAMPLE_WEIGHT = 1U << 14,
> PERF_SAMPLE_DATA_SRC = 1U << 15,
> PERF_SAMPLE_IDENTIFIER = 1U << 16,
> + PERF_SAMPLE_TRANSACTION = 1U << 17,
>
> - PERF_SAMPLE_MAX = 1U << 17, /* non-ABI */
> + PERF_SAMPLE_MAX = 1U << 18, /* non-ABI */
> };
I know this is a bit late, but isn't this patch missing something like
(not a real patch):
* { u64 weight; } && PERF_SAMPLE_WEIGHT
* { u64 data_src; } && PERF_SAMPLE_DATA_SRC
+ * { u64 transaction; } && PERF_SAMPLE_TRANSACTION
* };
*/
> /*
> + * Values for the memory transaction event qualifier, mostly for
> + * abort events. Multiple bits can be set.
> + */
> +enum {
> + PERF_TXN_ELISION = (1 << 0), /* From elision */
> + PERF_TXN_TRANSACTION = (1 << 1), /* From transaction */
> + PERF_TXN_SYNC = (1 << 2), /* Instruction is related */
> + PERF_TXN_ASYNC = (1 << 3), /* Instruction not related */
> + PERF_TXN_RETRY = (1 << 4), /* Retry possible */
> + PERF_TXN_CONFLICT = (1 << 5), /* Conflict abort */
> + PERF_TXN_CAPACITY_WRITE = (1 << 6), /* Capacity write abort */
> + PERF_TXN_CAPACITY_READ = (1 << 7), /* Capacity read abort */
> +
> + PERF_TXN_MAX = (1 << 8), /* non-ABI */
> +
> + /* bits 32..63 are reserved for the abort code */
> +
> + PERF_TXN_ABORT_MASK = (0xffffffffULL << 32),
> + PERF_TXN_ABORT_SHIFT = 32,
> +};
> +
> +/*
> * The format of the data returned by read() on a perf event fd,
> * as specified by attr.read_format:
> *
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index b25d65c..c716385 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -1201,6 +1201,9 @@ static void perf_event__header_size(struct perf_event *event)
> if (sample_type & PERF_SAMPLE_DATA_SRC)
> size += sizeof(data->data_src.val);
>
> + if (sample_type & PERF_SAMPLE_TRANSACTION)
> + size += sizeof(data->txn);
> +
> event->header_size = size;
> }
>
> @@ -4572,6 +4575,9 @@ void perf_output_sample(struct perf_output_handle *handle,
> if (sample_type & PERF_SAMPLE_DATA_SRC)
> perf_output_put(handle, data->data_src.val);
>
> + if (sample_type & PERF_SAMPLE_TRANSACTION)
> + perf_output_put(handle, data->txn);
> +
> if (!event->attr.watermark) {
> int wakeup_events = event->attr.wakeup_events;
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [tip:perf/core] perf: Add generic transaction flags
2013-12-13 20:31 ` Vince Weaver
@ 2013-12-13 20:38 ` Andi Kleen
2013-12-13 20:52 ` [patch] perf properly document the new transaction sample type Vince Weaver
0 siblings, 1 reply; 22+ messages in thread
From: Andi Kleen @ 2013-12-13 20:38 UTC (permalink / raw)
To: Vince Weaver; +Cc: mingo, hpa, linux-kernel, peterz, tglx
> I know this is a bit late, but isn't this patch missing something like
> (not a real patch):
>
>
> * { u64 weight; } && PERF_SAMPLE_WEIGHT
> * { u64 data_src; } && PERF_SAMPLE_DATA_SRC
> + * { u64 transaction; } && PERF_SAMPLE_TRANSACTION
> * };
> */
Yes it should have that. Thanks for catching. Please send a real patch.
-Andi
--
ak@linux.intel.com -- Speaking for myself only
^ permalink raw reply [flat|nested] 22+ messages in thread
* [patch] perf properly document the new transaction sample type
2013-12-13 20:38 ` Andi Kleen
@ 2013-12-13 20:52 ` Vince Weaver
2013-12-13 21:04 ` Peter Zijlstra
2013-12-18 10:31 ` [tip:perf/urgent] perf: Document " tip-bot for Vince Weaver
0 siblings, 2 replies; 22+ messages in thread
From: Vince Weaver @ 2013-12-13 20:52 UTC (permalink / raw)
To: Andi Kleen; +Cc: mingo, hpa, linux-kernel, peterz, tglx
Commit fdfbbd07e91f8fe3871 "perf: Add generic transaction flags"
added support for PERF_SAMPLE_TRANSACTION but forgot to add documentation
for the sample type to include/uapi/linux/perf_event.h
Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index e1802d6..959d454 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -679,6 +679,7 @@ enum perf_event_type {
*
* { u64 weight; } && PERF_SAMPLE_WEIGHT
* { u64 data_src; } && PERF_SAMPLE_DATA_SRC
+ * { u64 transaction; } && PERF_SAMPLE_TRANSACTION
* };
*/
PERF_RECORD_SAMPLE = 9,
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [patch] perf properly document the new transaction sample type
2013-12-13 20:52 ` [patch] perf properly document the new transaction sample type Vince Weaver
@ 2013-12-13 21:04 ` Peter Zijlstra
2013-12-18 10:31 ` [tip:perf/urgent] perf: Document " tip-bot for Vince Weaver
1 sibling, 0 replies; 22+ messages in thread
From: Peter Zijlstra @ 2013-12-13 21:04 UTC (permalink / raw)
To: Vince Weaver; +Cc: Andi Kleen, mingo, hpa, linux-kernel, tglx
On Fri, Dec 13, 2013 at 03:52:25PM -0500, Vince Weaver wrote:
>
> Commit fdfbbd07e91f8fe3871 "perf: Add generic transaction flags"
> added support for PERF_SAMPLE_TRANSACTION but forgot to add documentation
> for the sample type to include/uapi/linux/perf_event.h
>
> Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Thanks!
^ permalink raw reply [flat|nested] 22+ messages in thread
* [tip:perf/urgent] perf: Document the new transaction sample type
2013-12-13 20:52 ` [patch] perf properly document the new transaction sample type Vince Weaver
2013-12-13 21:04 ` Peter Zijlstra
@ 2013-12-18 10:31 ` tip-bot for Vince Weaver
1 sibling, 0 replies; 22+ messages in thread
From: tip-bot for Vince Weaver @ 2013-12-18 10:31 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, hpa, mingo, peterz, vince, vincent.weaver, ak, tglx
Commit-ID: 189b84fb54490ae24111124346a8e63f8e019385
Gitweb: http://git.kernel.org/tip/189b84fb54490ae24111124346a8e63f8e019385
Author: Vince Weaver <vince@deater.net>
AuthorDate: Fri, 13 Dec 2013 15:52:25 -0500
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 17 Dec 2013 15:04:01 +0100
perf: Document the new transaction sample type
Commit fdfbbd07e91f8fe3871 ("perf: Add generic transaction flags")
added support for PERF_SAMPLE_TRANSACTION but forgot to add documentation
for the sample type to include/uapi/linux/perf_event.h
Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1312131548450.10372@pianoman.cluster.toy
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
include/uapi/linux/perf_event.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index e1802d6..959d454 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -679,6 +679,7 @@ enum perf_event_type {
*
* { u64 weight; } && PERF_SAMPLE_WEIGHT
* { u64 data_src; } && PERF_SAMPLE_DATA_SRC
+ * { u64 transaction; } && PERF_SAMPLE_TRANSACTION
* };
*/
PERF_RECORD_SAMPLE = 9,
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 2/6] perf, x86: Add Haswell specific transaction flag reporting v5
2013-09-20 14:40 perf, x86: Add last TSX PMU code for Haswell v2 Andi Kleen
2013-09-20 14:40 ` [PATCH 1/6] perf, core: Add generic transaction flags v5 Andi Kleen
@ 2013-09-20 14:40 ` Andi Kleen
2013-10-04 17:32 ` [tip:perf/core] perf/x86: Add Haswell specific transaction flag reporting tip-bot for Andi Kleen
2013-09-20 14:40 ` [PATCH 3/6] perf, tools: Support sorting by in_tx, abort branch flags v3 Andi Kleen
` (5 subsequent siblings)
7 siblings, 1 reply; 22+ messages in thread
From: Andi Kleen @ 2013-09-20 14:40 UTC (permalink / raw)
To: linux-kernel; +Cc: acme, mingo, peterz, eranian, jolsa, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
In the PEBS handler report the transaction flags using the new
generic transaction flags facility. Most of them come from
the "tsx_tuning" field in PEBSv2, but the abort code is derived
from the RAX register reported in the PEBS record.
v2: Fix interaction with precise-loads
v3: Mask out reserved bits. More comments.
v4: Adjust white space
v5: Refactor PEBS handler code to collapse one if
and move parts into new inline.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
arch/x86/kernel/cpu/perf_event_intel_ds.c | 25 ++++++++++++++++++++-----
1 file changed, 20 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index 104cbba..fd2dbac 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -207,6 +207,8 @@ union hsw_tsx_tuning {
u64 value;
};
+#define PEBS_HSW_TSX_FLAGS 0xff00000000ULL
+
void init_debug_store_on_cpu(int cpu)
{
struct debug_store *ds = per_cpu(cpu_hw_events, cpu).ds;
@@ -807,6 +809,16 @@ static inline u64 intel_hsw_weight(struct pebs_record_hsw *pebs)
return 0;
}
+static inline u64 intel_hsw_transaction(struct pebs_record_hsw *pebs)
+{
+ u64 txn = (pebs->tsx_tuning & PEBS_HSW_TSX_FLAGS) >> 32;
+
+ /* For RTM XABORTs also log the abort code from AX */
+ if ((txn & PERF_TXN_TRANSACTION) && (pebs->ax & 1))
+ txn |= ((pebs->ax >> 24) & 0xff) << PERF_TXN_ABORT_SHIFT;
+ return txn;
+}
+
static void __intel_pmu_pebs_event(struct perf_event *event,
struct pt_regs *iregs, void *__pebs)
{
@@ -887,11 +899,14 @@ static void __intel_pmu_pebs_event(struct perf_event *event,
x86_pmu.intel_cap.pebs_format >= 1)
data.addr = pebs->dla;
- /* Only set the TSX weight when no memory weight was requested. */
- if ((event->attr.sample_type & PERF_SAMPLE_WEIGHT) &&
- !fll &&
- (x86_pmu.intel_cap.pebs_format >= 2))
- data.weight = intel_hsw_weight(pebs);
+ if (x86_pmu.intel_cap.pebs_format >= 2) {
+ /* Only set the TSX weight when no memory weight. */
+ if ((event->attr.sample_type & PERF_SAMPLE_WEIGHT) && !fll)
+ data.weight = intel_hsw_weight(pebs);
+
+ if (event->attr.sample_type & PERF_SAMPLE_TRANSACTION)
+ data.txn = intel_hsw_transaction(pebs);
+ }
if (has_branch_stack(event))
data.br_stack = &cpuc->lbr_stack;
--
1.8.3.1
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [tip:perf/core] perf/x86: Add Haswell specific transaction flag reporting
2013-09-20 14:40 ` [PATCH 2/6] perf, x86: Add Haswell specific transaction flag reporting v5 Andi Kleen
@ 2013-10-04 17:32 ` tip-bot for Andi Kleen
0 siblings, 0 replies; 22+ messages in thread
From: tip-bot for Andi Kleen @ 2013-10-04 17:32 UTC (permalink / raw)
To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, peterz, ak, tglx
Commit-ID: a405bad5ad2086766ce320b16a56952e013327f8
Gitweb: http://git.kernel.org/tip/a405bad5ad2086766ce320b16a56952e013327f8
Author: Andi Kleen <ak@linux.intel.com>
AuthorDate: Fri, 20 Sep 2013 07:40:40 -0700
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 4 Oct 2013 10:06:09 +0200
perf/x86: Add Haswell specific transaction flag reporting
In the PEBS handler report the transaction flags using the new
generic transaction flags facility. Most of them come from
the "tsx_tuning" field in PEBSv2, but the abort code is derived
from the RAX register reported in the PEBS record.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379688044-14173-3-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/cpu/perf_event_intel_ds.c | 24 ++++++++++++++++++++----
1 file changed, 20 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index 07d9a05..32e9ed8 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -206,6 +206,8 @@ union hsw_tsx_tuning {
u64 value;
};
+#define PEBS_HSW_TSX_FLAGS 0xff00000000ULL
+
void init_debug_store_on_cpu(int cpu)
{
struct debug_store *ds = per_cpu(cpu_hw_events, cpu).ds;
@@ -807,6 +809,16 @@ static inline u64 intel_hsw_weight(struct pebs_record_hsw *pebs)
return 0;
}
+static inline u64 intel_hsw_transaction(struct pebs_record_hsw *pebs)
+{
+ u64 txn = (pebs->tsx_tuning & PEBS_HSW_TSX_FLAGS) >> 32;
+
+ /* For RTM XABORTs also log the abort code from AX */
+ if ((txn & PERF_TXN_TRANSACTION) && (pebs->ax & 1))
+ txn |= ((pebs->ax >> 24) & 0xff) << PERF_TXN_ABORT_SHIFT;
+ return txn;
+}
+
static void __intel_pmu_pebs_event(struct perf_event *event,
struct pt_regs *iregs, void *__pebs)
{
@@ -885,10 +897,14 @@ static void __intel_pmu_pebs_event(struct perf_event *event,
x86_pmu.intel_cap.pebs_format >= 1)
data.addr = pebs->dla;
- /* Only set the TSX weight when no memory weight was requested. */
- if ((event->attr.sample_type & PERF_SAMPLE_WEIGHT) && !fll &&
- (x86_pmu.intel_cap.pebs_format >= 2))
- data.weight = intel_hsw_weight(pebs);
+ if (x86_pmu.intel_cap.pebs_format >= 2) {
+ /* Only set the TSX weight when no memory weight. */
+ if ((event->attr.sample_type & PERF_SAMPLE_WEIGHT) && !fll)
+ data.weight = intel_hsw_weight(pebs);
+
+ if (event->attr.sample_type & PERF_SAMPLE_TRANSACTION)
+ data.txn = intel_hsw_transaction(pebs);
+ }
if (has_branch_stack(event))
data.br_stack = &cpuc->lbr_stack;
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 3/6] perf, tools: Support sorting by in_tx, abort branch flags v3
2013-09-20 14:40 perf, x86: Add last TSX PMU code for Haswell v2 Andi Kleen
2013-09-20 14:40 ` [PATCH 1/6] perf, core: Add generic transaction flags v5 Andi Kleen
2013-09-20 14:40 ` [PATCH 2/6] perf, x86: Add Haswell specific transaction flag reporting v5 Andi Kleen
@ 2013-09-20 14:40 ` Andi Kleen
2013-10-04 17:32 ` [tip:perf/core] tools/perf: Support sorting by in_tx or abort branch flags tip-bot for Andi Kleen
2013-09-20 14:40 ` [PATCH 4/6] perf, tools: Add abort_tx,no_tx,in_tx branch filter options to perf record -j v3 Andi Kleen
` (4 subsequent siblings)
7 siblings, 1 reply; 22+ messages in thread
From: Andi Kleen @ 2013-09-20 14:40 UTC (permalink / raw)
To: linux-kernel; +Cc: acme, mingo, peterz, eranian, jolsa, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Extend the perf branch sorting code to support sorting by in_tx
or abort_tx qualifiers. Also print out those qualifiers.
This also fixes up some of the existing sort key documentation.
We do not support no_tx here, because it's simply not showing
the in_tx flag.
v2: Readd flags to man pages
v3: Rename intx
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
tools/perf/Documentation/perf-report.txt | 4 ++-
tools/perf/Documentation/perf-top.txt | 3 +-
tools/perf/builtin-report.c | 2 +-
tools/perf/builtin-top.c | 3 +-
tools/perf/perf.h | 4 ++-
tools/perf/util/hist.h | 2 ++
tools/perf/util/sort.c | 51 ++++++++++++++++++++++++++++++++
tools/perf/util/sort.h | 2 ++
8 files changed, 66 insertions(+), 5 deletions(-)
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 2b8097e..ae337e3 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -71,7 +71,7 @@ OPTIONS
entries are displayed as "[other]".
- cpu: cpu number the task ran at the time of sample
- srcline: filename and line number executed at the time of sample. The
- DWARF debuggin info must be provided.
+ DWARF debugging info must be provided.
By default, comm, dso and symbol keys are used.
(i.e. --sort comm,dso,symbol)
@@ -85,6 +85,8 @@ OPTIONS
- symbol_from: name of function branched from
- symbol_to: name of function branched to
- mispredict: "N" for predicted branch, "Y" for mispredicted branch
+ - in_tx: branch in TSX transaction
+ - abort: TSX transaction abort.
And default sort keys are changed to comm, dso_from, symbol_from, dso_to
and symbol_to, see '--branch-stack'.
diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt
index 58d6598..f852eb5 100644
--- a/tools/perf/Documentation/perf-top.txt
+++ b/tools/perf/Documentation/perf-top.txt
@@ -112,7 +112,8 @@ Default is to monitor all CPUS.
-s::
--sort::
- Sort by key(s): pid, comm, dso, symbol, parent, srcline, weight, local_weight.
+ Sort by key(s): pid, comm, dso, symbol, parent, srcline, weight,
+ local_weight, abort, in_tx
-n::
--show-nr-samples::
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 8e50d8d..1e84103 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -786,7 +786,7 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
"sort by key(s): pid, comm, dso, symbol, parent, cpu, srcline,"
" dso_to, dso_from, symbol_to, symbol_from, mispredict,"
" weight, local_weight, mem, symbol_daddr, dso_daddr, tlb, "
- "snoop, locked"),
+ "snoop, locked, abort, in_tx"),
OPT_BOOLEAN(0, "showcpuutilization", &symbol_conf.show_cpu_utilization,
"Show sample percentage for different cpu modes"),
OPT_STRING('p', "parent", &parent_pattern, "regex",
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 2122141..6534a37 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1103,7 +1103,8 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_INCR('v', "verbose", &verbose,
"be more verbose (show counter open errors, etc)"),
OPT_STRING('s', "sort", &sort_order, "key[,key2...]",
- "sort by key(s): pid, comm, dso, symbol, parent, weight, local_weight"),
+ "sort by key(s): pid, comm, dso, symbol, parent, weight, local_weight,"
+ " abort, in_tx"),
OPT_BOOLEAN('n', "show-nr-samples", &symbol_conf.show_nr_samples,
"Show a column with the number of samples"),
OPT_CALLBACK_DEFAULT('G', "call-graph", &top.record_opts,
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index cf20187..acf3d66 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -182,7 +182,9 @@ struct ip_callchain {
struct branch_flags {
u64 mispred:1;
u64 predicted:1;
- u64 reserved:62;
+ u64 in_tx:1;
+ u64 abort:1;
+ u64 reserved:60;
};
struct branch_entry {
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 1329b6b..f743e96 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -45,6 +45,8 @@ enum hist_column {
HISTC_CPU,
HISTC_SRCLINE,
HISTC_MISPREDICT,
+ HISTC_IN_TX,
+ HISTC_ABORT,
HISTC_SYMBOL_FROM,
HISTC_SYMBOL_TO,
HISTC_DSO_FROM,
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 5f118a0..1771566 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -858,6 +858,55 @@ struct sort_entry sort_mem_snoop = {
.se_width_idx = HISTC_MEM_SNOOP,
};
+static int64_t
+sort__abort_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+ return left->branch_info->flags.abort !=
+ right->branch_info->flags.abort;
+}
+
+static int hist_entry__abort_snprintf(struct hist_entry *self, char *bf,
+ size_t size, unsigned int width)
+{
+ const char *out = ".";
+
+ if (self->branch_info->flags.abort)
+ out = "A";
+ return repsep_snprintf(bf, size, "%-*s", width, out);
+}
+
+struct sort_entry sort_abort = {
+ .se_header = "Transaction abort",
+ .se_cmp = sort__abort_cmp,
+ .se_snprintf = hist_entry__abort_snprintf,
+ .se_width_idx = HISTC_ABORT,
+};
+
+static int64_t
+sort__in_tx_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+ return left->branch_info->flags.in_tx !=
+ right->branch_info->flags.in_tx;
+}
+
+static int hist_entry__in_tx_snprintf(struct hist_entry *self, char *bf,
+ size_t size, unsigned int width)
+{
+ static const char *out = ".";
+
+ if (self->branch_info->flags.in_tx)
+ out = "T";
+
+ return repsep_snprintf(bf, size, "%-*s", width, out);
+}
+
+struct sort_entry sort_in_tx = {
+ .se_header = "Branch in transaction",
+ .se_cmp = sort__in_tx_cmp,
+ .se_snprintf = hist_entry__in_tx_snprintf,
+ .se_width_idx = HISTC_IN_TX,
+};
+
struct sort_dimension {
const char *name;
struct sort_entry *entry;
@@ -888,6 +937,8 @@ static struct sort_dimension bstack_sort_dimensions[] = {
DIM(SORT_SYM_FROM, "symbol_from", sort_sym_from),
DIM(SORT_SYM_TO, "symbol_to", sort_sym_to),
DIM(SORT_MISPREDICT, "mispredict", sort_mispredict),
+ DIM(SORT_IN_TX, "in_tx", sort_in_tx),
+ DIM(SORT_ABORT, "abort", sort_abort),
};
#undef DIM
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 4e80dbd..9dad3a0 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -153,6 +153,8 @@ enum sort_type {
SORT_SYM_FROM,
SORT_SYM_TO,
SORT_MISPREDICT,
+ SORT_ABORT,
+ SORT_IN_TX,
/* memory mode specific sort keys */
__SORT_MEMORY_MODE,
--
1.8.3.1
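
A note on the two snprintf helpers in the sort.c hunk above: `out` is declared `static`, so by ordinary C rules it is initialised once and, after the first entry with the flag set, stays at "A" (or "T") for later entries as well. The comparators also return only 0 or 1. The standalone sketch below shows one possible shape that avoids both. It is not the code that was merged; `struct tx_flags`, `flag_snprintf` and `flag_cmp` are hypothetical names, and plain snprintf stands in for repsep_snprintf.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the two new bits in struct branch_flags. */
struct tx_flags {
	unsigned int in_tx:1;
	unsigned int abort:1;
};

/*
 * Format one flag column. `out` is an ordinary local here, so it is
 * reset to "." on every call and depends only on the current entry.
 */
static int flag_snprintf(char *bf, size_t size, int width, int set,
			 const char *mark)
{
	const char *out = ".";

	assert(bf != NULL && size > 0);
	if (set)
		out = mark;
	return snprintf(bf, size, "%-*s", width, out);
}

/* Three-way comparison of a single flag bit: negative, zero or positive. */
static int64_t flag_cmp(int left, int right)
{
	return (int64_t)(left != 0) - (int64_t)(right != 0);
}
```

Calling flag_snprintf first for an aborted entry and then for a non-aborted one yields "A" followed by ".", which is the per-entry behaviour the column is meant to show.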
* [tip:perf/core] tools/perf: Support sorting by in_tx or abort branch flags
2013-09-20 14:40 ` [PATCH 3/6] perf, tools: Support sorting by in_tx, abort branch flags v3 Andi Kleen
@ 2013-10-04 17:32 ` tip-bot for Andi Kleen
0 siblings, 0 replies; 22+ messages in thread
From: tip-bot for Andi Kleen @ 2013-10-04 17:32 UTC (permalink / raw)
To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, peterz, ak, tglx, jolsa
Commit-ID: f5d05bcec409aec2c41727077ad818f7c4db005b
Gitweb: http://git.kernel.org/tip/f5d05bcec409aec2c41727077ad818f7c4db005b
Author: Andi Kleen <ak@linux.intel.com>
AuthorDate: Fri, 20 Sep 2013 07:40:41 -0700
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 4 Oct 2013 10:06:09 +0200
tools/perf: Support sorting by in_tx or abort branch flags
Extend the perf branch sorting code to support sorting by in_tx
or abort_tx qualifiers. Also print out those qualifiers.
This also fixes up some of the existing sort key documentation.
no_tx is not supported as a sort key here, because it is simply
the absence of the in_tx flag.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379688044-14173-4-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
tools/perf/Documentation/perf-report.txt | 4 ++-
tools/perf/Documentation/perf-top.txt | 3 +-
tools/perf/builtin-report.c | 2 +-
tools/perf/builtin-top.c | 3 +-
tools/perf/perf.h | 4 ++-
tools/perf/util/hist.h | 2 ++
tools/perf/util/sort.c | 51 ++++++++++++++++++++++++++++++++
tools/perf/util/sort.h | 2 ++
8 files changed, 66 insertions(+), 5 deletions(-)
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 2b8097e..ae337e3 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -71,7 +71,7 @@ OPTIONS
entries are displayed as "[other]".
- cpu: cpu number the task ran at the time of sample
- srcline: filename and line number executed at the time of sample. The
- DWARF debuggin info must be provided.
+ DWARF debugging info must be provided.
By default, comm, dso and symbol keys are used.
(i.e. --sort comm,dso,symbol)
@@ -85,6 +85,8 @@ OPTIONS
- symbol_from: name of function branched from
- symbol_to: name of function branched to
- mispredict: "N" for predicted branch, "Y" for mispredicted branch
+ - in_tx: branch in TSX transaction
+ - abort: TSX transaction abort.
And default sort keys are changed to comm, dso_from, symbol_from, dso_to
and symbol_to, see '--branch-stack'.
diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt
index 58d6598..f852eb5 100644
--- a/tools/perf/Documentation/perf-top.txt
+++ b/tools/perf/Documentation/perf-top.txt
@@ -112,7 +112,8 @@ Default is to monitor all CPUS.
-s::
--sort::
- Sort by key(s): pid, comm, dso, symbol, parent, srcline, weight, local_weight.
+ Sort by key(s): pid, comm, dso, symbol, parent, srcline, weight,
+ local_weight, abort, in_tx
-n::
--show-nr-samples::
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 72eae74..89b188d 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -787,7 +787,7 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
"sort by key(s): pid, comm, dso, symbol, parent, cpu, srcline,"
" dso_to, dso_from, symbol_to, symbol_from, mispredict,"
" weight, local_weight, mem, symbol_daddr, dso_daddr, tlb, "
- "snoop, locked"),
+ "snoop, locked, abort, in_tx"),
OPT_BOOLEAN(0, "showcpuutilization", &symbol_conf.show_cpu_utilization,
"Show sample percentage for different cpu modes"),
OPT_STRING('p', "parent", &parent_pattern, "regex",
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 2122141..6534a37 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1103,7 +1103,8 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_INCR('v', "verbose", &verbose,
"be more verbose (show counter open errors, etc)"),
OPT_STRING('s', "sort", &sort_order, "key[,key2...]",
- "sort by key(s): pid, comm, dso, symbol, parent, weight, local_weight"),
+ "sort by key(s): pid, comm, dso, symbol, parent, weight, local_weight,"
+ " abort, in_tx"),
OPT_BOOLEAN('n', "show-nr-samples", &symbol_conf.show_nr_samples,
"Show a column with the number of samples"),
OPT_CALLBACK_DEFAULT('G', "call-graph", &top.record_opts,
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index cf20187..acf3d66 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -182,7 +182,9 @@ struct ip_callchain {
struct branch_flags {
u64 mispred:1;
u64 predicted:1;
- u64 reserved:62;
+ u64 in_tx:1;
+ u64 abort:1;
+ u64 reserved:60;
};
struct branch_entry {
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 1329b6b..f743e96 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -45,6 +45,8 @@ enum hist_column {
HISTC_CPU,
HISTC_SRCLINE,
HISTC_MISPREDICT,
+ HISTC_IN_TX,
+ HISTC_ABORT,
HISTC_SYMBOL_FROM,
HISTC_SYMBOL_TO,
HISTC_DSO_FROM,
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 5f118a0..1771566 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -858,6 +858,55 @@ struct sort_entry sort_mem_snoop = {
.se_width_idx = HISTC_MEM_SNOOP,
};
+static int64_t
+sort__abort_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+ return left->branch_info->flags.abort !=
+ right->branch_info->flags.abort;
+}
+
+static int hist_entry__abort_snprintf(struct hist_entry *self, char *bf,
+ size_t size, unsigned int width)
+{
+ static const char *out = ".";
+
+ if (self->branch_info->flags.abort)
+ out = "A";
+ return repsep_snprintf(bf, size, "%-*s", width, out);
+}
+
+struct sort_entry sort_abort = {
+ .se_header = "Transaction abort",
+ .se_cmp = sort__abort_cmp,
+ .se_snprintf = hist_entry__abort_snprintf,
+ .se_width_idx = HISTC_ABORT,
+};
+
+static int64_t
+sort__in_tx_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+ return left->branch_info->flags.in_tx !=
+ right->branch_info->flags.in_tx;
+}
+
+static int hist_entry__in_tx_snprintf(struct hist_entry *self, char *bf,
+ size_t size, unsigned int width)
+{
+ static const char *out = ".";
+
+ if (self->branch_info->flags.in_tx)
+ out = "T";
+
+ return repsep_snprintf(bf, size, "%-*s", width, out);
+}
+
+struct sort_entry sort_in_tx = {
+ .se_header = "Branch in transaction",
+ .se_cmp = sort__in_tx_cmp,
+ .se_snprintf = hist_entry__in_tx_snprintf,
+ .se_width_idx = HISTC_IN_TX,
+};
+
struct sort_dimension {
const char *name;
struct sort_entry *entry;
@@ -888,6 +937,8 @@ static struct sort_dimension bstack_sort_dimensions[] = {
DIM(SORT_SYM_FROM, "symbol_from", sort_sym_from),
DIM(SORT_SYM_TO, "symbol_to", sort_sym_to),
DIM(SORT_MISPREDICT, "mispredict", sort_mispredict),
+ DIM(SORT_IN_TX, "in_tx", sort_in_tx),
+ DIM(SORT_ABORT, "abort", sort_abort),
};
#undef DIM
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 4e80dbd..9dad3a0 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -153,6 +153,8 @@ enum sort_type {
SORT_SYM_FROM,
SORT_SYM_TO,
SORT_MISPREDICT,
+ SORT_ABORT,
+ SORT_IN_TX,
/* memory mode specific sort keys */
__SORT_MEMORY_MODE,
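
The perf.h hunk above carves the two new bits out of the reserved field (62 becomes 60), so the flag word should remain exactly 64 bits wide. A small sketch of that arithmetic, assuming a C11 compiler that accepts 64-bit bit-fields (GCC and Clang do); `struct branch_flags_sketch` and `make_flags` are hypothetical names, and uint64_t stands in for the kernel's u64.

```c
#include <assert.h>
#include <stdint.h>

/* Mirror of the layout in the perf.h hunk, with uint64_t in place of u64. */
struct branch_flags_sketch {
	uint64_t mispred:1;
	uint64_t predicted:1;
	uint64_t in_tx:1;
	uint64_t abort:1;
	uint64_t reserved:60;	/* 62 minus the two new bits */
};

/* 1 + 1 + 1 + 1 + 60 == 64, so the struct should still fit one 64-bit word. */
static_assert(sizeof(struct branch_flags_sketch) == sizeof(uint64_t),
	      "branch flags no longer fit in one 64-bit word");

/* Build a flag word with only the transaction bits set as requested. */
static struct branch_flags_sketch make_flags(int in_tx, int abort_tx)
{
	struct branch_flags_sketch f = { 0 };

	f.in_tx = in_tx ? 1 : 0;
	f.abort = abort_tx ? 1 : 0;
	return f;
}
```

The static_assert turns an accidental change of the field widths into a compile-time error instead of a silent layout change.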
* [PATCH 4/6] perf, tools: Add abort_tx,no_tx,in_tx branch filter options to perf record -j v3
2013-09-20 14:40 perf, x86: Add last TSX PMU code for Haswell v2 Andi Kleen
` (2 preceding siblings ...)
2013-09-20 14:40 ` [PATCH 3/6] perf, tools: Support sorting by in_tx, abort branch flags v3 Andi Kleen
@ 2013-09-20 14:40 ` Andi Kleen
2013-10-04 17:32 ` [tip:perf/core] tools/perf/record: Add abort_tx,no_tx, in_tx branch filter options to perf record -j tip-bot for Andi Kleen
2013-09-20 14:40 ` [PATCH 5/6] perf, tools: Add support for record transaction flags v5 Andi Kleen
` (3 subsequent siblings)
7 siblings, 1 reply; 22+ messages in thread
From: Andi Kleen @ 2013-09-20 14:40 UTC (permalink / raw)
To: linux-kernel; +Cc: acme, mingo, peterz, eranian, jolsa, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Make perf record -j aware of the new in_tx,no_tx,abort_tx branch qualifiers.
v2: ABORT -> ABORTTX
v3: Add more _
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
tools/perf/Documentation/perf-record.txt | 3 +++
tools/perf/builtin-record.c | 3 +++
2 files changed, 6 insertions(+)
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index e297b74..6bec1c9 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -166,6 +166,9 @@ following filters are defined:
- u: only when the branch target is at the user level
- k: only when the branch target is in the kernel
- hv: only when the target is at the hypervisor level
+ - in_tx: only when the target is in a hardware transaction
+ - no_tx: only when the target is not in a hardware transaction
+ - abort_tx: only when the target is a hardware transaction abort
+
The option requires at least one branch type among any, any_call, any_ret, ind_call.
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index a41ac415..8384b54 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -618,6 +618,9 @@ static const struct branch_mode branch_modes[] = {
BRANCH_OPT("any_call", PERF_SAMPLE_BRANCH_ANY_CALL),
BRANCH_OPT("any_ret", PERF_SAMPLE_BRANCH_ANY_RETURN),
BRANCH_OPT("ind_call", PERF_SAMPLE_BRANCH_IND_CALL),
+ BRANCH_OPT("abort_tx", PERF_SAMPLE_BRANCH_ABORT_TX),
+ BRANCH_OPT("in_tx", PERF_SAMPLE_BRANCH_IN_TX),
+ BRANCH_OPT("no_tx", PERF_SAMPLE_BRANCH_NO_TX),
BRANCH_END
};
--
1.8.3.1
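
The branch_modes table above maps each name accepted by `perf record -j` to a PERF_SAMPLE_BRANCH_* bit. A minimal sketch of how such a table can turn a comma separated list like "any,in_tx" into a mask follows. It is not perf's parse_branch_stack; the BR_* bit values, `struct branch_mode_sketch` and `parse_branch_filter` are made up for illustration, and the real constants come from the perf_event uapi header.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative bit values only, not the real PERF_SAMPLE_BRANCH_* ones. */
enum {
	BR_ANY      = 1u << 0,
	BR_ABORT_TX = 1u << 1,
	BR_IN_TX    = 1u << 2,
	BR_NO_TX    = 1u << 3,
};

struct branch_mode_sketch {
	const char *name;
	uint64_t mode;
};

static const struct branch_mode_sketch modes[] = {
	{ "any",      BR_ANY },
	{ "abort_tx", BR_ABORT_TX },
	{ "in_tx",    BR_IN_TX },
	{ "no_tx",    BR_NO_TX },
	{ NULL, 0 },	/* terminator, playing the role of BRANCH_END */
};

/* Parse "any,in_tx" into a mask. Returns 0 on success, -1 on an unknown name. */
static int parse_branch_filter(const char *str, uint64_t *mask)
{
	assert(str != NULL && mask != NULL);
	*mask = 0;
	while (*str) {
		size_t len = strcspn(str, ",");	/* length of the current token */
		const struct branch_mode_sketch *m;

		for (m = modes; m->name; m++)
			if (strlen(m->name) == len && !strncmp(m->name, str, len))
				break;
		if (!m->name)
			return -1;	/* unknown or empty token */
		*mask |= m->mode;
		str += len;
		if (*str == ',')
			str++;
	}
	return 0;
}
```

With this table, "any,in_tx" yields BR_ANY | BR_IN_TX, while a list containing an unrecognised name is rejected as a whole.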
* [tip:perf/core] tools/perf/record: Add abort_tx,no_tx, in_tx branch filter options to perf record -j
2013-09-20 14:40 ` [PATCH 4/6] perf, tools: Add abort_tx,no_tx,in_tx branch filter options to perf record -j v3 Andi Kleen
@ 2013-10-04 17:32 ` tip-bot for Andi Kleen
0 siblings, 0 replies; 22+ messages in thread
From: tip-bot for Andi Kleen @ 2013-10-04 17:32 UTC (permalink / raw)
To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, peterz, ak, tglx, jolsa
Commit-ID: 0126d493b62e1306db09e1019c05e0bfe84ae8e7
Gitweb: http://git.kernel.org/tip/0126d493b62e1306db09e1019c05e0bfe84ae8e7
Author: Andi Kleen <ak@linux.intel.com>
AuthorDate: Fri, 20 Sep 2013 07:40:42 -0700
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 4 Oct 2013 10:06:10 +0200
tools/perf/record: Add abort_tx,no_tx,in_tx branch filter options to perf record -j
Make perf record -j aware of the new in_tx,no_tx,abort_tx branch qualifiers.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379688044-14173-5-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
tools/perf/Documentation/perf-record.txt | 3 +++
tools/perf/builtin-record.c | 3 +++
2 files changed, 6 insertions(+)
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index e297b74..6bec1c9 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -166,6 +166,9 @@ following filters are defined:
- u: only when the branch target is at the user level
- k: only when the branch target is in the kernel
- hv: only when the target is at the hypervisor level
+ - in_tx: only when the target is in a hardware transaction
+ - no_tx: only when the target is not in a hardware transaction
+ - abort_tx: only when the target is a hardware transaction abort
+
The option requires at least one branch type among any, any_call, any_ret, ind_call.
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index a41ac415..8384b54 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -618,6 +618,9 @@ static const struct branch_mode branch_modes[] = {
BRANCH_OPT("any_call", PERF_SAMPLE_BRANCH_ANY_CALL),
BRANCH_OPT("any_ret", PERF_SAMPLE_BRANCH_ANY_RETURN),
BRANCH_OPT("ind_call", PERF_SAMPLE_BRANCH_IND_CALL),
+ BRANCH_OPT("abort_tx", PERF_SAMPLE_BRANCH_ABORT_TX),
+ BRANCH_OPT("in_tx", PERF_SAMPLE_BRANCH_IN_TX),
+ BRANCH_OPT("no_tx", PERF_SAMPLE_BRANCH_NO_TX),
BRANCH_END
};
* [PATCH 5/6] perf, tools: Add support for record transaction flags v5
2013-09-20 14:40 perf, x86: Add last TSX PMU code for Haswell v2 Andi Kleen
` (3 preceding siblings ...)
2013-09-20 14:40 ` [PATCH 4/6] perf, tools: Add abort_tx,no_tx,in_tx branch filter options to perf record -j v3 Andi Kleen
@ 2013-09-20 14:40 ` Andi Kleen
2013-10-04 17:33 ` [tip:perf/core] tools/perf: Add support for record transaction flags tip-bot for Andi Kleen
2013-09-20 14:40 ` [PATCH 6/6] perf, x86: Suppress duplicated abort LBR records v2 Andi Kleen
` (2 subsequent siblings)
7 siblings, 1 reply; 22+ messages in thread
From: Andi Kleen @ 2013-09-20 14:40 UTC (permalink / raw)
To: linux-kernel; +Cc: acme, mingo, peterz, eranian, jolsa, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Add support for recording and displaying the transaction flags.
They are exposed as a new sort key and printed in a
human-readable form.
v2: Fix manpage
v3: Move transaction to the end
v4: Handle capacity-read/write
v5: Adjust for new names
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
tools/perf/Documentation/perf-record.txt | 4 +-
tools/perf/Documentation/perf-report.txt | 4 ++
tools/perf/Documentation/perf-top.txt | 2 +-
tools/perf/builtin-annotate.c | 2 +-
tools/perf/builtin-diff.c | 8 ++--
tools/perf/builtin-record.c | 2 +
tools/perf/builtin-report.c | 4 +-
tools/perf/builtin-top.c | 5 +--
tools/perf/perf.h | 1 +
tools/perf/tests/hists_link.c | 6 ++-
tools/perf/util/event.h | 1 +
tools/perf/util/evsel.c | 9 ++++
tools/perf/util/hist.c | 7 ++-
tools/perf/util/hist.h | 4 +-
tools/perf/util/session.c | 3 ++
tools/perf/util/sort.c | 73 ++++++++++++++++++++++++++++++++
tools/perf/util/sort.h | 2 +
17 files changed, 122 insertions(+), 15 deletions(-)
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 6bec1c9..f732eaa 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -179,12 +179,14 @@ is enabled for all the sampling events. The sampled branch type is the same for
The various filters must be specified as a comma separated list: --branch-filter any_ret,u,k
Note that this feature may not be available on all processors.
--W::
--weight::
Enable weightened sampling. An additional weight is recorded per sample and can be
displayed with the weight and local_weight sort keys. This currently works for TSX
abort events and some memory events in precise mode on modern Intel CPUs.
+--transaction::
+Record transaction flags for transaction related events.
+
SEE ALSO
--------
linkperf:perf-stat[1], linkperf:perf-list[1]
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index ae337e3..be5ad87 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -72,6 +72,10 @@ OPTIONS
- cpu: cpu number the task ran at the time of sample
- srcline: filename and line number executed at the time of sample. The
DWARF debugging info must be provided.
+ - weight: Event specific weight, e.g. memory latency or transaction
+ abort cost. This is the global weight.
+ - local_weight: Local weight version of the weight above.
+ - transaction: Transaction abort flags.
By default, comm, dso and symbol keys are used.
(i.e. --sort comm,dso,symbol)
diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt
index f852eb5..6d70fbf 100644
--- a/tools/perf/Documentation/perf-top.txt
+++ b/tools/perf/Documentation/perf-top.txt
@@ -113,7 +113,7 @@ Default is to monitor all CPUS.
-s::
--sort::
Sort by key(s): pid, comm, dso, symbol, parent, srcline, weight,
- local_weight, abort, in_tx
+ local_weight, abort, in_tx, transaction
-n::
--show-nr-samples::
diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 5ebd0c3..0393d98 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -63,7 +63,7 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel,
return 0;
}
- he = __hists__add_entry(&evsel->hists, al, NULL, 1, 1);
+ he = __hists__add_entry(&evsel->hists, al, NULL, 1, 1, 0);
if (he == NULL)
return -ENOMEM;
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index f28799e..2a78dc8 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -304,9 +304,10 @@ static int formula_fprintf(struct hist_entry *he, struct hist_entry *pair,
static int hists__add_entry(struct hists *self,
struct addr_location *al, u64 period,
- u64 weight)
+ u64 weight, u64 transaction)
{
- if (__hists__add_entry(self, al, NULL, period, weight) != NULL)
+ if (__hists__add_entry(self, al, NULL, period, weight, transaction)
+ != NULL)
return 0;
return -ENOMEM;
}
@@ -328,7 +329,8 @@ static int diff__process_sample_event(struct perf_tool *tool __maybe_unused,
if (al.filtered)
return 0;
- if (hists__add_entry(&evsel->hists, &al, sample->period, sample->weight)) {
+ if (hists__add_entry(&evsel->hists, &al, sample->period,
+ sample->weight, sample->transaction)) {
pr_warning("problem incrementing symbol period, skipping event\n");
return -1;
}
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 8384b54..a78db3f 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -894,6 +894,8 @@ const struct option record_options[] = {
parse_branch_stack),
OPT_BOOLEAN('W', "weight", &record.opts.sample_weight,
"sample by weight (on special events only)"),
+ OPT_BOOLEAN(0, "transaction", &record.opts.sample_transaction,
+ "sample transaction flags (special events only)"),
OPT_END()
};
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 1e84103..8657a3d 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -259,7 +259,7 @@ static int perf_evsel__add_hist_entry(struct perf_evsel *evsel,
}
he = __hists__add_entry(&evsel->hists, al, parent, sample->period,
- sample->weight);
+ sample->weight, sample->transaction);
if (he == NULL)
return -ENOMEM;
@@ -786,7 +786,7 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
"sort by key(s): pid, comm, dso, symbol, parent, cpu, srcline,"
" dso_to, dso_from, symbol_to, symbol_from, mispredict,"
" weight, local_weight, mem, symbol_daddr, dso_daddr, tlb, "
- "snoop, locked, abort, in_tx"),
+ "snoop, locked, abort, in_tx, transaction"),
OPT_BOOLEAN(0, "showcpuutilization", &symbol_conf.show_cpu_utilization,
"Show sample percentage for different cpu modes"),
OPT_STRING('p', "parent", &parent_pattern, "regex",
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 6534a37..b3e0229 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -247,9 +247,8 @@ static struct hist_entry *perf_evsel__add_hist_entry(struct perf_evsel *evsel,
pthread_mutex_lock(&evsel->hists.lock);
he = __hists__add_entry(&evsel->hists, al, NULL, sample->period,
- sample->weight);
+ sample->weight, sample->transaction);
pthread_mutex_unlock(&evsel->hists.lock);
-
if (he == NULL)
return NULL;
@@ -1104,7 +1103,7 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
"be more verbose (show counter open errors, etc)"),
OPT_STRING('s', "sort", &sort_order, "key[,key2...]",
"sort by key(s): pid, comm, dso, symbol, parent, weight, local_weight,"
- " abort, in_tx"),
+ " abort, in_tx, transaction"),
OPT_BOOLEAN('n', "show-nr-samples", &symbol_conf.show_nr_samples,
"Show a column with the number of samples"),
OPT_CALLBACK_DEFAULT('G', "call-graph", &top.record_opts,
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index acf3d66..84502e8 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -233,6 +233,7 @@ struct perf_record_opts {
u64 default_interval;
u64 user_interval;
u16 stack_dump_size;
+ bool sample_transaction;
};
#endif
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index 4228ffc..025503a 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -222,7 +222,8 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
&sample) < 0)
goto out;
- he = __hists__add_entry(&evsel->hists, &al, NULL, 1, 1);
+ he = __hists__add_entry(&evsel->hists, &al, NULL,
+ 1, 1, 0);
if (he == NULL)
goto out;
@@ -244,7 +245,8 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
&sample) < 0)
goto out;
- he = __hists__add_entry(&evsel->hists, &al, NULL, 1, 1);
+ he = __hists__add_entry(&evsel->hists, &al, NULL, 1, 1,
+ 0);
if (he == NULL)
goto out;
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index c67ecc4..17d9e16 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -111,6 +111,7 @@ struct perf_sample {
u64 stream_id;
u64 period;
u64 weight;
+ u64 transaction;
u32 cpu;
u32 raw_size;
u64 data_src;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 0ce9feb..abe69af 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -681,6 +681,9 @@ void perf_evsel__config(struct perf_evsel *evsel,
attr->mmap2 = track && !perf_missing_features.mmap2;
attr->comm = track;
+ if (opts->sample_transaction)
+ attr->sample_type |= PERF_SAMPLE_TRANSACTION;
+
/*
* XXX see the function comment above
*
@@ -1470,6 +1473,12 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
array++;
}
+ data->transaction = 0;
+ if (type & PERF_SAMPLE_TRANSACTION) {
+ data->transaction = *array;
+ array++;
+ }
+
return 0;
}
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 46a0d35..4714a72 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -160,6 +160,10 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
hists__new_col_len(hists, HISTC_MEM_LVL, 21 + 3);
hists__new_col_len(hists, HISTC_LOCAL_WEIGHT, 12);
hists__new_col_len(hists, HISTC_GLOBAL_WEIGHT, 12);
+
+ if (h->transaction)
+ hists__new_col_len(hists, HISTC_TRANSACTION,
+ hist_entry__transaction_len());
}
void hists__output_recalc_col_len(struct hists *hists, int max_rows)
@@ -466,7 +470,7 @@ struct hist_entry *__hists__add_branch_entry(struct hists *self,
struct hist_entry *__hists__add_entry(struct hists *self,
struct addr_location *al,
struct symbol *sym_parent, u64 period,
- u64 weight)
+ u64 weight, u64 transaction)
{
struct hist_entry entry = {
.thread = al->thread,
@@ -487,6 +491,7 @@ struct hist_entry *__hists__add_entry(struct hists *self,
.hists = self,
.branch_info = NULL,
.mem_info = NULL,
+ .transaction = transaction,
};
return add_hist_entry(self, &entry, al, period, weight);
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index f743e96..6a048c0 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -59,6 +59,7 @@ enum hist_column {
HISTC_MEM_TLB,
HISTC_MEM_LVL,
HISTC_MEM_SNOOP,
+ HISTC_TRANSACTION,
HISTC_NR_COLS, /* Last entry */
};
@@ -84,9 +85,10 @@ struct hists {
struct hist_entry *__hists__add_entry(struct hists *self,
struct addr_location *al,
struct symbol *parent, u64 period,
- u64 weight);
+ u64 weight, u64 transaction);
int64_t hist_entry__cmp(struct hist_entry *left, struct hist_entry *right);
int64_t hist_entry__collapse(struct hist_entry *left, struct hist_entry *right);
+int hist_entry__transaction_len(void);
int hist_entry__sort_snprintf(struct hist_entry *self, char *bf, size_t size,
struct hists *hists);
void hist_entry__free(struct hist_entry *);
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 51f5edf..ef5af04 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -855,6 +855,9 @@ static void dump_sample(struct perf_evsel *evsel, union perf_event *event,
if (sample_type & PERF_SAMPLE_DATA_SRC)
printf(" . data_src: 0x%"PRIx64"\n", sample->data_src);
+ if (sample_type & PERF_SAMPLE_TRANSACTION)
+ printf("... transaction: %" PRIx64 "\n", sample->transaction);
+
if (sample_type & PERF_SAMPLE_READ)
sample_read__printf(sample, evsel->attr.read_format);
}
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 1771566..b4ecc0e 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -907,6 +907,78 @@ struct sort_entry sort_in_tx = {
.se_width_idx = HISTC_IN_TX,
};
+static int64_t
+sort__transaction_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+ return left->transaction - right->transaction;
+}
+
+static inline char *add_str(char *p, const char *str)
+{
+ strcpy(p, str);
+ return p + strlen(str);
+}
+
+static struct txbit {
+ unsigned flag;
+ const char *name;
+ int skip_for_len;
+} txbits[] = {
+ { PERF_TXN_ELISION, "EL ", 0 },
+ { PERF_TXN_TRANSACTION, "TX ", 1 },
+ { PERF_TXN_SYNC, "SYNC ", 1 },
+ { PERF_TXN_ASYNC, "ASYNC ", 0 },
+ { PERF_TXN_RETRY, "RETRY ", 0 },
+ { PERF_TXN_CONFLICT, "CON ", 0 },
+ { PERF_TXN_CAPACITY_WRITE, "CAP-WRITE ", 1 },
+ { PERF_TXN_CAPACITY_READ, "CAP-READ ", 0 },
+ { 0, NULL, 0 }
+};
+
+int hist_entry__transaction_len(void)
+{
+ int i;
+ int len = 0;
+
+ for (i = 0; txbits[i].name; i++) {
+ if (!txbits[i].skip_for_len)
+ len += strlen(txbits[i].name);
+ }
+ len += 4; /* :XX<space> */
+ return len;
+}
+
+static int hist_entry__transaction_snprintf(struct hist_entry *self, char *bf,
+ size_t size, unsigned int width)
+{
+ u64 t = self->transaction;
+ char buf[128];
+ char *p = buf;
+ int i;
+
+ buf[0] = 0;
+ for (i = 0; txbits[i].name; i++)
+ if (txbits[i].flag & t)
+ p = add_str(p, txbits[i].name);
+ if (t && !(t & (PERF_TXN_SYNC|PERF_TXN_ASYNC)))
+ p = add_str(p, "NEITHER ");
+ if (t & PERF_TXN_ABORT_MASK) {
+ sprintf(p, ":%" PRIx64,
+ (t & PERF_TXN_ABORT_MASK) >>
+ PERF_TXN_ABORT_SHIFT);
+ p += strlen(p);
+ }
+
+ return repsep_snprintf(bf, size, "%-*s", width, buf);
+}
+
+struct sort_entry sort_transaction = {
+ .se_header = "Transaction ",
+ .se_cmp = sort__transaction_cmp,
+ .se_snprintf = hist_entry__transaction_snprintf,
+ .se_width_idx = HISTC_TRANSACTION,
+};
+
struct sort_dimension {
const char *name;
struct sort_entry *entry;
@@ -925,6 +997,7 @@ static struct sort_dimension common_sort_dimensions[] = {
DIM(SORT_SRCLINE, "srcline", sort_srcline),
DIM(SORT_LOCAL_WEIGHT, "local_weight", sort_local_weight),
DIM(SORT_GLOBAL_WEIGHT, "weight", sort_global_weight),
+ DIM(SORT_TRANSACTION, "transaction", sort_transaction),
};
#undef DIM
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 9dad3a0..bf43336 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -85,6 +85,7 @@ struct hist_entry {
struct map_symbol ms;
struct thread *thread;
u64 ip;
+ u64 transaction;
s32 cpu;
struct hist_entry_diff diff;
@@ -145,6 +146,7 @@ enum sort_type {
SORT_SRCLINE,
SORT_LOCAL_WEIGHT,
SORT_GLOBAL_WEIGHT,
+ SORT_TRANSACTION,
/* branch stack specific sort keys */
__SORT_BRANCH_STACK,
--
1.8.3.1
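
The transaction column built in the sort.c hunk above is a concatenation of short names for each set flag, a "NEITHER " marker when neither SYNC nor ASYNC is set, and the user abort code in hex after a colon. The standalone sketch below reproduces that formatting outside perf. The TXN_* values are assumptions (the real PERF_TXN_* constants are defined by patch 1/6, which is not shown here), `txn_format` is a hypothetical name, and bounded snprintf calls are swapped in for the strcpy/sprintf used in the patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed flag values; the real PERF_TXN_* constants come from patch 1/6. */
#define TXN_ELISION		(1ULL << 0)
#define TXN_TRANSACTION		(1ULL << 1)
#define TXN_SYNC		(1ULL << 2)
#define TXN_ASYNC		(1ULL << 3)
#define TXN_RETRY		(1ULL << 4)
#define TXN_CONFLICT		(1ULL << 5)
#define TXN_CAPACITY_WRITE	(1ULL << 6)
#define TXN_CAPACITY_READ	(1ULL << 7)
#define TXN_ABORT_SHIFT		32
#define TXN_ABORT_MASK		(0xffffffffULL << TXN_ABORT_SHIFT)

static const struct {
	uint64_t flag;
	const char *name;
} txbits[] = {
	{ TXN_ELISION,		"EL " },
	{ TXN_TRANSACTION,	"TX " },
	{ TXN_SYNC,		"SYNC " },
	{ TXN_ASYNC,		"ASYNC " },
	{ TXN_RETRY,		"RETRY " },
	{ TXN_CONFLICT,		"CON " },
	{ TXN_CAPACITY_WRITE,	"CAP-WRITE " },
	{ TXN_CAPACITY_READ,	"CAP-READ " },
};

/* Render the flag word t into buf (at most size bytes) and return buf. */
static char *txn_format(uint64_t t, char *buf, size_t size)
{
	size_t off = 0;
	size_t i;

	assert(buf != NULL && size > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(txbits) / sizeof(txbits[0]); i++)
		if ((t & txbits[i].flag) && off < size)
			off += snprintf(buf + off, size - off, "%s",
					txbits[i].name);
	/* Some flag is set, but neither SYNC nor ASYNC. */
	if (t && !(t & (TXN_SYNC | TXN_ASYNC)) && off < size)
		off += snprintf(buf + off, size - off, "NEITHER ");
	/* Program supplied abort code lives in the upper 32 bits. */
	if ((t & TXN_ABORT_MASK) && off < size)
		snprintf(buf + off, size - off, ":%" PRIx64,
			 (t & TXN_ABORT_MASK) >> TXN_ABORT_SHIFT);
	return buf;
}
```

For a flag word with TRANSACTION, ASYNC and CONFLICT set and an abort code of 1, this produces "TX ASYNC CON :1", matching the column layout the patch describes.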
* [tip:perf/core] tools/perf: Add support for record transaction flags
2013-09-20 14:40 ` [PATCH 5/6] perf, tools: Add support for record transaction flags v5 Andi Kleen
@ 2013-10-04 17:33 ` tip-bot for Andi Kleen
0 siblings, 0 replies; 22+ messages in thread
From: tip-bot for Andi Kleen @ 2013-10-04 17:33 UTC (permalink / raw)
To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, peterz, ak, tglx, jolsa
Commit-ID: 475eeab9f3c1579c8da89667496084db4867bf7c
Gitweb: http://git.kernel.org/tip/475eeab9f3c1579c8da89667496084db4867bf7c
Author: Andi Kleen <ak@linux.intel.com>
AuthorDate: Fri, 20 Sep 2013 07:40:43 -0700
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 4 Oct 2013 10:06:12 +0200
tools/perf: Add support for record transaction flags
Add support for recording and displaying the transaction flags.
They are exposed as a new sort key and printed in a
human-readable form.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379688044-14173-6-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
tools/perf/Documentation/perf-record.txt | 4 +-
tools/perf/Documentation/perf-report.txt | 4 ++
tools/perf/Documentation/perf-top.txt | 2 +-
tools/perf/builtin-annotate.c | 2 +-
tools/perf/builtin-diff.c | 8 ++--
tools/perf/builtin-record.c | 2 +
tools/perf/builtin-report.c | 4 +-
tools/perf/builtin-top.c | 5 +--
tools/perf/perf.h | 1 +
tools/perf/tests/hists_link.c | 6 ++-
tools/perf/util/event.h | 1 +
tools/perf/util/evsel.c | 9 ++++
tools/perf/util/hist.c | 7 ++-
tools/perf/util/hist.h | 4 +-
tools/perf/util/session.c | 3 ++
tools/perf/util/sort.c | 73 ++++++++++++++++++++++++++++++++
tools/perf/util/sort.h | 2 +
17 files changed, 122 insertions(+), 15 deletions(-)
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 6bec1c9..f732eaa 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -179,12 +179,14 @@ is enabled for all the sampling events. The sampled branch type is the same for
The various filters must be specified as a comma separated list: --branch-filter any_ret,u,k
Note that this feature may not be available on all processors.
--W::
--weight::
Enable weightened sampling. An additional weight is recorded per sample and can be
displayed with the weight and local_weight sort keys. This currently works for TSX
abort events and some memory events in precise mode on modern Intel CPUs.
+--transaction::
+Record transaction flags for transaction related events.
+
SEE ALSO
--------
linkperf:perf-stat[1], linkperf:perf-list[1]
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index ae337e3..be5ad87 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -72,6 +72,10 @@ OPTIONS
- cpu: cpu number the task ran at the time of sample
- srcline: filename and line number executed at the time of sample. The
DWARF debugging info must be provided.
+ - weight: Event specific weight, e.g. memory latency or transaction
+ abort cost. This is the global weight.
+ - local_weight: Local weight version of the weight above.
+ - transaction: Transaction abort flags.
By default, comm, dso and symbol keys are used.
(i.e. --sort comm,dso,symbol)
diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt
index f852eb5..6d70fbf 100644
--- a/tools/perf/Documentation/perf-top.txt
+++ b/tools/perf/Documentation/perf-top.txt
@@ -113,7 +113,7 @@ Default is to monitor all CPUS.
-s::
--sort::
Sort by key(s): pid, comm, dso, symbol, parent, srcline, weight,
- local_weight, abort, in_tx
+ local_weight, abort, in_tx, transaction
-n::
--show-nr-samples::
diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 5ebd0c3..0393d98 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -63,7 +63,7 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel,
return 0;
}
- he = __hists__add_entry(&evsel->hists, al, NULL, 1, 1);
+ he = __hists__add_entry(&evsel->hists, al, NULL, 1, 1, 0);
if (he == NULL)
return -ENOMEM;
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index f28799e..2a78dc8 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -304,9 +304,10 @@ static int formula_fprintf(struct hist_entry *he, struct hist_entry *pair,
static int hists__add_entry(struct hists *self,
struct addr_location *al, u64 period,
- u64 weight)
+ u64 weight, u64 transaction)
{
- if (__hists__add_entry(self, al, NULL, period, weight) != NULL)
+ if (__hists__add_entry(self, al, NULL, period, weight, transaction)
+ != NULL)
return 0;
return -ENOMEM;
}
@@ -328,7 +329,8 @@ static int diff__process_sample_event(struct perf_tool *tool __maybe_unused,
if (al.filtered)
return 0;
- if (hists__add_entry(&evsel->hists, &al, sample->period, sample->weight)) {
+ if (hists__add_entry(&evsel->hists, &al, sample->period,
+ sample->weight, sample->transaction)) {
pr_warning("problem incrementing symbol period, skipping event\n");
return -1;
}
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 8384b54..a78db3f 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -894,6 +894,8 @@ const struct option record_options[] = {
parse_branch_stack),
OPT_BOOLEAN('W', "weight", &record.opts.sample_weight,
"sample by weight (on special events only)"),
+ OPT_BOOLEAN(0, "transaction", &record.opts.sample_transaction,
+ "sample transaction flags (special events only)"),
OPT_END()
};
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 89b188d..06e1abe 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -259,7 +259,7 @@ static int perf_evsel__add_hist_entry(struct perf_evsel *evsel,
}
he = __hists__add_entry(&evsel->hists, al, parent, sample->period,
- sample->weight);
+ sample->weight, sample->transaction);
if (he == NULL)
return -ENOMEM;
@@ -787,7 +787,7 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
"sort by key(s): pid, comm, dso, symbol, parent, cpu, srcline,"
" dso_to, dso_from, symbol_to, symbol_from, mispredict,"
" weight, local_weight, mem, symbol_daddr, dso_daddr, tlb, "
- "snoop, locked, abort, in_tx"),
+ "snoop, locked, abort, in_tx, transaction"),
OPT_BOOLEAN(0, "showcpuutilization", &symbol_conf.show_cpu_utilization,
"Show sample percentage for different cpu modes"),
OPT_STRING('p', "parent", &parent_pattern, "regex",
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 6534a37..b3e0229 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -247,9 +247,8 @@ static struct hist_entry *perf_evsel__add_hist_entry(struct perf_evsel *evsel,
pthread_mutex_lock(&evsel->hists.lock);
he = __hists__add_entry(&evsel->hists, al, NULL, sample->period,
- sample->weight);
+ sample->weight, sample->transaction);
pthread_mutex_unlock(&evsel->hists.lock);
-
if (he == NULL)
return NULL;
@@ -1104,7 +1103,7 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
"be more verbose (show counter open errors, etc)"),
OPT_STRING('s', "sort", &sort_order, "key[,key2...]",
"sort by key(s): pid, comm, dso, symbol, parent, weight, local_weight,"
- " abort, in_tx"),
+ " abort, in_tx, transaction"),
OPT_BOOLEAN('n', "show-nr-samples", &symbol_conf.show_nr_samples,
"Show a column with the number of samples"),
OPT_CALLBACK_DEFAULT('G', "call-graph", &top.record_opts,
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index acf3d66..84502e8 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -233,6 +233,7 @@ struct perf_record_opts {
u64 default_interval;
u64 user_interval;
u16 stack_dump_size;
+ bool sample_transaction;
};
#endif
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index 4228ffc..025503a 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -222,7 +222,8 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
&sample) < 0)
goto out;
- he = __hists__add_entry(&evsel->hists, &al, NULL, 1, 1);
+ he = __hists__add_entry(&evsel->hists, &al, NULL,
+ 1, 1, 0);
if (he == NULL)
goto out;
@@ -244,7 +245,8 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
&sample) < 0)
goto out;
- he = __hists__add_entry(&evsel->hists, &al, NULL, 1, 1);
+ he = __hists__add_entry(&evsel->hists, &al, NULL, 1, 1,
+ 0);
if (he == NULL)
goto out;
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index c67ecc4..17d9e16 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -111,6 +111,7 @@ struct perf_sample {
u64 stream_id;
u64 period;
u64 weight;
+ u64 transaction;
u32 cpu;
u32 raw_size;
u64 data_src;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 0ce9feb..abe69af 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -681,6 +681,9 @@ void perf_evsel__config(struct perf_evsel *evsel,
attr->mmap2 = track && !perf_missing_features.mmap2;
attr->comm = track;
+ if (opts->sample_transaction)
+ attr->sample_type |= PERF_SAMPLE_TRANSACTION;
+
/*
* XXX see the function comment above
*
@@ -1470,6 +1473,12 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
array++;
}
+ data->transaction = 0;
+ if (type & PERF_SAMPLE_TRANSACTION) {
+ data->transaction = *array;
+ array++;
+ }
+
return 0;
}
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 97dc280..f3278a3 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -160,6 +160,10 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
hists__new_col_len(hists, HISTC_MEM_LVL, 21 + 3);
hists__new_col_len(hists, HISTC_LOCAL_WEIGHT, 12);
hists__new_col_len(hists, HISTC_GLOBAL_WEIGHT, 12);
+
+ if (h->transaction)
+ hists__new_col_len(hists, HISTC_TRANSACTION,
+ hist_entry__transaction_len());
}
void hists__output_recalc_col_len(struct hists *hists, int max_rows)
@@ -466,7 +470,7 @@ struct hist_entry *__hists__add_branch_entry(struct hists *self,
struct hist_entry *__hists__add_entry(struct hists *self,
struct addr_location *al,
struct symbol *sym_parent, u64 period,
- u64 weight)
+ u64 weight, u64 transaction)
{
struct hist_entry entry = {
.thread = al->thread,
@@ -487,6 +491,7 @@ struct hist_entry *__hists__add_entry(struct hists *self,
.hists = self,
.branch_info = NULL,
.mem_info = NULL,
+ .transaction = transaction,
};
return add_hist_entry(self, &entry, al, period, weight);
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index f743e96..6a048c0 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -59,6 +59,7 @@ enum hist_column {
HISTC_MEM_TLB,
HISTC_MEM_LVL,
HISTC_MEM_SNOOP,
+ HISTC_TRANSACTION,
HISTC_NR_COLS, /* Last entry */
};
@@ -84,9 +85,10 @@ struct hists {
struct hist_entry *__hists__add_entry(struct hists *self,
struct addr_location *al,
struct symbol *parent, u64 period,
- u64 weight);
+ u64 weight, u64 transaction);
int64_t hist_entry__cmp(struct hist_entry *left, struct hist_entry *right);
int64_t hist_entry__collapse(struct hist_entry *left, struct hist_entry *right);
+int hist_entry__transaction_len(void);
int hist_entry__sort_snprintf(struct hist_entry *self, char *bf, size_t size,
struct hists *hists);
void hist_entry__free(struct hist_entry *);
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 70ffa41..211b325 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -858,6 +858,9 @@ static void dump_sample(struct perf_evsel *evsel, union perf_event *event,
if (sample_type & PERF_SAMPLE_DATA_SRC)
printf(" . data_src: 0x%"PRIx64"\n", sample->data_src);
+ if (sample_type & PERF_SAMPLE_TRANSACTION)
+ printf("... transaction: %" PRIx64 "\n", sample->transaction);
+
if (sample_type & PERF_SAMPLE_READ)
sample_read__printf(sample, evsel->attr.read_format);
}
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 1771566..b4ecc0e 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -907,6 +907,78 @@ struct sort_entry sort_in_tx = {
.se_width_idx = HISTC_IN_TX,
};
+static int64_t
+sort__transaction_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+ return left->transaction - right->transaction;
+}
+
+static inline char *add_str(char *p, const char *str)
+{
+ strcpy(p, str);
+ return p + strlen(str);
+}
+
+static struct txbit {
+ unsigned flag;
+ const char *name;
+ int skip_for_len;
+} txbits[] = {
+ { PERF_TXN_ELISION, "EL ", 0 },
+ { PERF_TXN_TRANSACTION, "TX ", 1 },
+ { PERF_TXN_SYNC, "SYNC ", 1 },
+ { PERF_TXN_ASYNC, "ASYNC ", 0 },
+ { PERF_TXN_RETRY, "RETRY ", 0 },
+ { PERF_TXN_CONFLICT, "CON ", 0 },
+ { PERF_TXN_CAPACITY_WRITE, "CAP-WRITE ", 1 },
+ { PERF_TXN_CAPACITY_READ, "CAP-READ ", 0 },
+ { 0, NULL, 0 }
+};
+
+int hist_entry__transaction_len(void)
+{
+ int i;
+ int len = 0;
+
+ for (i = 0; txbits[i].name; i++) {
+ if (!txbits[i].skip_for_len)
+ len += strlen(txbits[i].name);
+ }
+ len += 4; /* :XX<space> */
+ return len;
+}
+
+static int hist_entry__transaction_snprintf(struct hist_entry *self, char *bf,
+ size_t size, unsigned int width)
+{
+ u64 t = self->transaction;
+ char buf[128];
+ char *p = buf;
+ int i;
+
+ buf[0] = 0;
+ for (i = 0; txbits[i].name; i++)
+ if (txbits[i].flag & t)
+ p = add_str(p, txbits[i].name);
+ if (t && !(t & (PERF_TXN_SYNC|PERF_TXN_ASYNC)))
+ p = add_str(p, "NEITHER ");
+ if (t & PERF_TXN_ABORT_MASK) {
+ sprintf(p, ":%" PRIx64,
+ (t & PERF_TXN_ABORT_MASK) >>
+ PERF_TXN_ABORT_SHIFT);
+ p += strlen(p);
+ }
+
+ return repsep_snprintf(bf, size, "%-*s", width, buf);
+}
+
+struct sort_entry sort_transaction = {
+ .se_header = "Transaction ",
+ .se_cmp = sort__transaction_cmp,
+ .se_snprintf = hist_entry__transaction_snprintf,
+ .se_width_idx = HISTC_TRANSACTION,
+};
+
struct sort_dimension {
const char *name;
struct sort_entry *entry;
@@ -925,6 +997,7 @@ static struct sort_dimension common_sort_dimensions[] = {
DIM(SORT_SRCLINE, "srcline", sort_srcline),
DIM(SORT_LOCAL_WEIGHT, "local_weight", sort_local_weight),
DIM(SORT_GLOBAL_WEIGHT, "weight", sort_global_weight),
+ DIM(SORT_TRANSACTION, "transaction", sort_transaction),
};
#undef DIM
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 9dad3a0..bf43336 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -85,6 +85,7 @@ struct hist_entry {
struct map_symbol ms;
struct thread *thread;
u64 ip;
+ u64 transaction;
s32 cpu;
struct hist_entry_diff diff;
@@ -145,6 +146,7 @@ enum sort_type {
SORT_SRCLINE,
SORT_LOCAL_WEIGHT,
SORT_GLOBAL_WEIGHT,
+ SORT_TRANSACTION,
/* branch stack specific sort keys */
__SORT_BRANCH_STACK,
^ permalink raw reply related [flat|nested] 22+ messages in thread
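The flag decoding that the sort.c hunk above adds can be tried outside of perf. What follows is only a minimal standalone sketch, not the code from the patch: the `TXN_*` constants are defined locally and their bit positions are an assumption (they should be checked against the `PERF_TXN_*` definitions in the uapi perf_event.h header), and `txn_append()` and `txn_format()` are hypothetical helper names. Unlike the patch, which builds the string with strcpy()/sprintf() into a fixed 128-byte buffer, this variant bounds every write with snprintf().

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed bit layout; verify against the PERF_TXN_* uapi definitions. */
#define TXN_ELISION        (1ULL << 0)
#define TXN_TRANSACTION    (1ULL << 1)
#define TXN_SYNC           (1ULL << 2)
#define TXN_ASYNC          (1ULL << 3)
#define TXN_RETRY          (1ULL << 4)
#define TXN_CONFLICT       (1ULL << 5)
#define TXN_CAPACITY_WRITE (1ULL << 6)
#define TXN_CAPACITY_READ  (1ULL << 7)
#define TXN_ABORT_SHIFT    32
#define TXN_ABORT_MASK     (0xffffffffULL << TXN_ABORT_SHIFT)

/* Same short names and ordering as the txbits[] table in the patch. */
static const struct {
	uint64_t flag;
	const char *name;
} txn_names[] = {
	{ TXN_ELISION,        "EL " },
	{ TXN_TRANSACTION,    "TX " },
	{ TXN_SYNC,           "SYNC " },
	{ TXN_ASYNC,          "ASYNC " },
	{ TXN_RETRY,          "RETRY " },
	{ TXN_CONFLICT,       "CON " },
	{ TXN_CAPACITY_WRITE, "CAP-WRITE " },
	{ TXN_CAPACITY_READ,  "CAP-READ " },
};

/* Append s to buf (capacity size, current length used); truncate if needed. */
static size_t txn_append(char *buf, size_t size, size_t used, const char *s)
{
	int n;

	assert(size > 0 && used < size);
	n = snprintf(buf + used, size - used, "%s", s);
	if (n < 0)
		return used;
	if ((size_t)n >= size - used)
		return size - 1;	/* output was truncated */
	return used + (size_t)n;
}

/* Render the flag word t as text; returns the resulting string length. */
size_t txn_format(uint64_t t, char *buf, size_t size)
{
	char code[16];
	size_t used = 0;
	size_t i;

	if (size == 0)
		return 0;
	buf[0] = '\0';

	for (i = 0; i < sizeof(txn_names) / sizeof(txn_names[0]); i++)
		if (t & txn_names[i].flag)
			used = txn_append(buf, size, used, txn_names[i].name);

	/* A non-zero word with neither SYNC nor ASYNC set is flagged. */
	if (t && !(t & (TXN_SYNC | TXN_ASYNC)))
		used = txn_append(buf, size, used, "NEITHER ");

	/* The upper 32 bits carry the program supplied abort code. */
	if (t & TXN_ABORT_MASK) {
		snprintf(code, sizeof(code), ":%" PRIx64,
			 (uint64_t)((t & TXN_ABORT_MASK) >> TXN_ABORT_SHIFT));
		used = txn_append(buf, size, used, code);
	}
	return used;
}
```

Under these assumptions, calling `txn_format()` on a word with TRANSACTION, ASYNC and CONFLICT set yields the string "TX ASYNC CON ", which is the form the new "Transaction" column of perf report would show for a conflict abort in an RTM region.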
* [PATCH 6/6] perf, x86: Suppress duplicated abort LBR records v2
2013-09-20 14:40 perf, x86: Add last TSX PMU code for Haswell v2 Andi Kleen
` (4 preceding siblings ...)
2013-09-20 14:40 ` [PATCH 5/6] perf, tools: Add support for record transaction flags v5 Andi Kleen
@ 2013-09-20 14:40 ` Andi Kleen
2013-10-04 17:33 ` [tip:perf/core] perf/x86: Suppress duplicated abort LBR records tip-bot for Andi Kleen
2013-09-26 16:34 ` perf, x86: Add last TSX PMU code for Haswell v2 Jiri Olsa
2013-09-30 20:17 ` Jiri Olsa
7 siblings, 1 reply; 22+ messages in thread
From: Andi Kleen @ 2013-09-20 14:40 UTC (permalink / raw)
To: linux-kernel; +Cc: acme, mingo, peterz, eranian, jolsa, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Haswell always gives an extra LBR record after every TSX abort.
Suppress the extra record.
This only works when the abort is visible in the LBR.
If the original abort has already left the 16 LBR entries,
the extra entry will stay.
v2: Adjust white space.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
arch/x86/kernel/cpu/perf_event.h | 1 +
arch/x86/kernel/cpu/perf_event_intel.c | 1 +
arch/x86/kernel/cpu/perf_event_intel_lbr.c | 29 +++++++++++++++++++++--------
3 files changed, 23 insertions(+), 8 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index cc16faa..3b303c6 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -440,6 +440,7 @@ struct x86_pmu {
int lbr_nr; /* hardware stack size */
u64 lbr_sel_mask; /* LBR_SELECT valid bits */
const int *lbr_sel_map; /* lbr_select mappings */
+ bool lbr_double_abort; /* duplicated lbr aborts */
/*
* Extra registers for events
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 7c53676..9262551 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -2515,6 +2515,7 @@ __init int intel_pmu_init(void)
x86_pmu.hw_config = hsw_hw_config;
x86_pmu.get_event_constraints = hsw_get_event_constraints;
x86_pmu.cpu_events = hsw_events_attrs;
+ x86_pmu.lbr_double_abort = true;
pr_cont("Haswell events, ");
break;
diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
index d5be06a..90ee6c1 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
@@ -284,6 +284,7 @@ static void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc)
int lbr_format = x86_pmu.intel_cap.lbr_format;
u64 tos = intel_pmu_lbr_tos();
int i;
+ int out = 0;
for (i = 0; i < x86_pmu.lbr_nr; i++) {
unsigned long lbr_idx = (tos - i) & mask;
@@ -306,15 +307,27 @@ static void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc)
}
from = (u64)((((s64)from) << skip) >> skip);
- cpuc->lbr_entries[i].from = from;
- cpuc->lbr_entries[i].to = to;
- cpuc->lbr_entries[i].mispred = mis;
- cpuc->lbr_entries[i].predicted = pred;
- cpuc->lbr_entries[i].in_tx = in_tx;
- cpuc->lbr_entries[i].abort = abort;
- cpuc->lbr_entries[i].reserved = 0;
+ /*
+ * Some CPUs report duplicated abort records,
+ * with the second entry not having an abort bit set.
+ * Skip them here. This loop runs backwards,
+ * so we need to undo the previous record.
+ * If the abort just happened outside the window
+ * the extra entry cannot be removed.
+ */
+ if (abort && x86_pmu.lbr_double_abort && out > 0)
+ out--;
+
+ cpuc->lbr_entries[out].from = from;
+ cpuc->lbr_entries[out].to = to;
+ cpuc->lbr_entries[out].mispred = mis;
+ cpuc->lbr_entries[out].predicted = pred;
+ cpuc->lbr_entries[out].in_tx = in_tx;
+ cpuc->lbr_entries[out].abort = abort;
+ cpuc->lbr_entries[out].reserved = 0;
+ out++;
}
- cpuc->lbr_stack.nr = i;
+ cpuc->lbr_stack.nr = out;
}
void intel_pmu_lbr_read(void)
--
1.8.3.1
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [tip:perf/core] perf/x86: Suppress duplicated abort LBR records
2013-09-20 14:40 ` [PATCH 6/6] perf, x86: Suppress duplicated abort LBR records v2 Andi Kleen
@ 2013-10-04 17:33 ` tip-bot for Andi Kleen
0 siblings, 0 replies; 22+ messages in thread
From: tip-bot for Andi Kleen @ 2013-10-04 17:33 UTC (permalink / raw)
To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, peterz, ak, tglx
Commit-ID: b7af41a1bc255c0098c37a4bcf5c7e5e168ce875
Gitweb: http://git.kernel.org/tip/b7af41a1bc255c0098c37a4bcf5c7e5e168ce875
Author: Andi Kleen <ak@linux.intel.com>
AuthorDate: Fri, 20 Sep 2013 07:40:44 -0700
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 4 Oct 2013 10:06:16 +0200
perf/x86: Suppress duplicated abort LBR records
Haswell always gives an extra LBR record after every TSX abort.
Suppress the extra record.
This only works when the abort is visible in the LBR.
If the original abort has already left the 16 LBR entries,
the extra entry will stay.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379688044-14173-7-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/cpu/perf_event.h | 1 +
arch/x86/kernel/cpu/perf_event_intel.c | 1 +
arch/x86/kernel/cpu/perf_event_intel_lbr.c | 29 +++++++++++++++++++++--------
3 files changed, 23 insertions(+), 8 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index ce84ede..fd00bb2 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -445,6 +445,7 @@ struct x86_pmu {
int lbr_nr; /* hardware stack size */
u64 lbr_sel_mask; /* LBR_SELECT valid bits */
const int *lbr_sel_map; /* lbr_select mappings */
+ bool lbr_double_abort; /* duplicated lbr aborts */
/*
* Extra registers for events
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 36b5ab8..0fa4f24 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -2519,6 +2519,7 @@ __init int intel_pmu_init(void)
x86_pmu.hw_config = hsw_hw_config;
x86_pmu.get_event_constraints = hsw_get_event_constraints;
x86_pmu.cpu_events = hsw_events_attrs;
+ x86_pmu.lbr_double_abort = true;
pr_cont("Haswell events, ");
break;
diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
index d5be06a..90ee6c1 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
@@ -284,6 +284,7 @@ static void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc)
int lbr_format = x86_pmu.intel_cap.lbr_format;
u64 tos = intel_pmu_lbr_tos();
int i;
+ int out = 0;
for (i = 0; i < x86_pmu.lbr_nr; i++) {
unsigned long lbr_idx = (tos - i) & mask;
@@ -306,15 +307,27 @@ static void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc)
}
from = (u64)((((s64)from) << skip) >> skip);
- cpuc->lbr_entries[i].from = from;
- cpuc->lbr_entries[i].to = to;
- cpuc->lbr_entries[i].mispred = mis;
- cpuc->lbr_entries[i].predicted = pred;
- cpuc->lbr_entries[i].in_tx = in_tx;
- cpuc->lbr_entries[i].abort = abort;
- cpuc->lbr_entries[i].reserved = 0;
+ /*
+ * Some CPUs report duplicated abort records,
+ * with the second entry not having an abort bit set.
+ * Skip them here. This loop runs backwards,
+ * so we need to undo the previous record.
+ * If the abort just happened outside the window
+ * the extra entry cannot be removed.
+ */
+ if (abort && x86_pmu.lbr_double_abort && out > 0)
+ out--;
+
+ cpuc->lbr_entries[out].from = from;
+ cpuc->lbr_entries[out].to = to;
+ cpuc->lbr_entries[out].mispred = mis;
+ cpuc->lbr_entries[out].predicted = pred;
+ cpuc->lbr_entries[out].in_tx = in_tx;
+ cpuc->lbr_entries[out].abort = abort;
+ cpuc->lbr_entries[out].reserved = 0;
+ out++;
}
- cpuc->lbr_stack.nr = i;
+ cpuc->lbr_stack.nr = out;
}
void intel_pmu_lbr_read(void)
^ permalink raw reply related [flat|nested] 22+ messages in thread
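The compaction done in intel_pmu_lbr_read_64() above can be modelled in user space. This is only a sketch under stated assumptions: `struct lbr_rec` and `lbr_filter()` are hypothetical names that do not exist in the kernel, and the input array stands in for the hardware LBR stack read newest-first, as in the patched loop. Because the walk goes from newest to oldest, the spurious record is seen before the abort record it follows, so it is dropped by stepping the output index back once the abort shows up.

```c
#include <stdbool.h>
#include <stddef.h>

/* Reduced stand-in for one LBR entry; only the fields the sketch needs. */
struct lbr_rec {
	unsigned long from;
	unsigned long to;
	bool abort;
};

/*
 * Copy n records from in[] (newest first) to out_buf[], dropping the
 * record that directly follows an abort in time. out_buf[] must have
 * room for n records. Returns the number of records kept.
 */
size_t lbr_filter(const struct lbr_rec *in, size_t n,
		  struct lbr_rec *out_buf, bool double_abort)
{
	size_t out = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/*
		 * The record stored just before this one is the newer
		 * duplicate; overwrite it with the abort record itself.
		 * If the abort is the newest entry (out == 0) there is
		 * nothing to undo.
		 */
		if (in[i].abort && double_abort && out > 0)
			out--;
		out_buf[out++] = in[i];
	}
	return out;
}
```

With a newest-first input of {A, extra, abort, D}, the function keeps {A, abort, D} and returns 3. If the abort record has already scrolled out of the window, the extra record stays, which matches the limitation described in the commit message.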
* Re: perf, x86: Add last TSX PMU code for Haswell v2
2013-09-20 14:40 perf, x86: Add last TSX PMU code for Haswell v2 Andi Kleen
` (5 preceding siblings ...)
2013-09-20 14:40 ` [PATCH 6/6] perf, x86: Suppress duplicated abort LBR records v2 Andi Kleen
@ 2013-09-26 16:34 ` Jiri Olsa
2013-09-30 20:17 ` Jiri Olsa
7 siblings, 0 replies; 22+ messages in thread
From: Jiri Olsa @ 2013-09-26 16:34 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel, acme, mingo, peterz, eranian
On Fri, Sep 20, 2013 at 07:40:38AM -0700, Andi Kleen wrote:
> [This has kernel and user parts, so will need
> review/ack/merges from both perf kernel and user land maintainers]
> [v2: Address Peter's feedback for the kernel parts]
>
> This is currently the last part of the TSX PMU code,
> just adding the left over bits:
>
> This adds some changes to the user interfaces.
> I'll send patches for the manpage separately.
>
> - Report the transaction abort flags to user space
> using a new field, and add the code to display them.
> This is used to classify abort types, also fairly
> important for tuning as it guides the tuning process,
> together with the abort weight that was added earlier.
>
> [3 patches, generic, x86, user tools]
>
> - Add support for reporting the two new TSX LBR flags: in_tx
> and abort_tx. The code to handle the LBRs was already
> added earlier, this just adds the code to report,
> filter and display them.
>
> - Add a workaround for a Haswell issue that it reports
> an extra LBR record for every abort. We just filter
> those out in the kernel.
>
> Open perf TSX issues left:
> - Revisit automatic enabling of precise for tx/el-abort
> - Need to fix the sort handling in the user tools
> to actually sort on other fields
> - The aggregated LBR display in the user tools is not
> very useful for transactions, need a way to report them
> in a histogram like backtraces.
> - May want some shortcut options for
> record --transaction --weight / report --sort symbol,transaction,weight
>
> -Andi
I checked and tried (with no actual data.. have no Haswell server)
and it seems ok
for the perf tool part:
Acked-by: Jiri Olsa <jolsa@redhat.com>
thanks,
jirka
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: perf, x86: Add last TSX PMU code for Haswell v2
2013-09-20 14:40 perf, x86: Add last TSX PMU code for Haswell v2 Andi Kleen
` (6 preceding siblings ...)
2013-09-26 16:34 ` perf, x86: Add last TSX PMU code for Haswell v2 Jiri Olsa
@ 2013-09-30 20:17 ` Jiri Olsa
2013-09-30 20:26 ` Jiri Olsa
7 siblings, 1 reply; 22+ messages in thread
From: Jiri Olsa @ 2013-09-30 20:17 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel, acme, mingo, peterz, eranian
On Fri, Sep 20, 2013 at 07:40:38AM -0700, Andi Kleen wrote:
> [This has kernel and user parts, so will need
> review/ack/merges from both perf kernel and user land maintainers]
> [v2: Address Peter's feedback for the kernel parts]
>
> This is currently the last part of the TSX PMU code,
> just adding the left over bits:
>
> This adds some changes to the user interfaces.
> I'll send patches for the manpage separately.
>
> - Report the transaction abort flags to user space
> using a new field, and add the code to display them.
> This is used to classify abort types, also fairly
> important for tuning as it guides the tuning process,
> together with the abort weight that was added earlier.
>
> [3 patches, generic, x86, user tools]
>
> - Add support for reporting the two new TSX LBR flags: in_tx
> and abort_tx. The code to handle the LBRs was already
> added earlier, this just adds the code to report,
> filter and display them.
perf tool changes look ok
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
btw, kernel changes didn't apply cleanly on tip tree for me
last week.. I couldn't test on Haswell, but I at least tried
your changes with no data and all seems ok ;-)
jirka
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: perf, x86: Add last TSX PMU code for Haswell v2
2013-09-30 20:17 ` Jiri Olsa
@ 2013-09-30 20:26 ` Jiri Olsa
2013-09-30 20:30 ` Andi Kleen
0 siblings, 1 reply; 22+ messages in thread
From: Jiri Olsa @ 2013-09-30 20:26 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel, acme, mingo, peterz, eranian
On Mon, Sep 30, 2013 at 10:17:21PM +0200, Jiri Olsa wrote:
> On Fri, Sep 20, 2013 at 07:40:38AM -0700, Andi Kleen wrote:
> > [This has kernel and user parts, so will need
> > review/ack/merges from both perf kernel and user land maintainers]
> > [v2: Address Peter's feedback for the kernel parts]
> >
> > This is currently the last part of the TSX PMU code,
> > just adding the left over bits:
> >
> > This adds some changes to the user interfaces.
> > I'll send patches for the manpage separately.
> >
> > - Report the transaction abort flags to user space
> > using a new field, and add the code to display them.
> > This is used to classify abort types, also fairly
> > important for tuning as it guides the tuning process,
> > together with the abort weight that was added earlier.
> >
> > [3 patches, generic, x86, user tools]
> >
> > - Add support for reporting the two new TSX LBR flags: in_tx
> > and abort_tx. The code to handle the LBRs was already
> > added earlier, this just adds the code to report,
> > filter and display them.
>
> perf tool changes look ok
>
> Reviewed-by: Jiri Olsa <jolsa@redhat.com>
>
> btw, kernel changes didn't apply cleanly on tip tree for me
> last week.. I couldn't test on Haswell, but I at least tried
> your changes with no data and all seems ok ;-)
I knew I've already sent this out! heh, even acked ;-)
http://marc.info/?l=linux-kernel&m=138021331913851&w=2
got confused by your new PING^2
jirka
^ permalink raw reply [flat|nested] 22+ messages in thread