From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ingo Molnar <mingo@kernel.org>, Thomas Gleixner <tglx@linutronix.de>
Cc: Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
Clark Williams <williams@redhat.com>,
linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
Leo Yan <leo.yan@linaro.org>, Mike Leach <mike.leach@linaro.org>,
Mathieu Poirier <mathieu.poirier@linaro.org>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Jiri Olsa <jolsa@redhat.com>, Mark Rutland <mark.rutland@arm.com>,
Peter Zijlstra <peterz@infradead.org>,
Robert Walker <robert.walker@arm.com>,
Suzuki Poulouse <suzuki.poulose@arm.com>,
coresight ml <coresight@lists.linaro.org>,
linux-arm-kernel@lists.infradead.org,
Arnaldo Carvalho de Melo <acme@redhat.com>
Subject: [PATCH 11/23] perf cs-etm: Correct synthesizing instruction samples
Date: Tue, 17 Mar 2020 18:32:47 -0300 [thread overview]
Message-ID: <20200317213259.15494-12-acme@kernel.org> (raw)
In-Reply-To: <20200317213259.15494-1-acme@kernel.org>
From: Leo Yan <leo.yan@linaro.org>
When 'etm->instructions_sample_period' is less than
'tidq->period_instructions', the function cs_etm__sample() cannot handle
this case properly with its logic.
Let's see below flow as an example:
- If we set itrace option '--itrace=i4', then function cs_etm__sample()
has variables with initialized values:
tidq->period_instructions = 0
etm->instructions_sample_period = 4
- When the first packet is coming:
packet->instr_count = 10; the number of instructions executed in this
packet is 10, thus update period_instructions as below:
tidq->period_instructions = 0 + 10 = 10
instrs_over = 10 - 4 = 6
offset = 10 - 6 - 1 = 3
tidq->period_instructions = instrs_over = 6
- When the second packet is coming:
packet->instr_count = 10; in the second pass, assume 10 instructions
in the trace sample again:
tidq->period_instructions = 6 + 10 = 16
instrs_over = 16 - 4 = 12
offset = 10 - 12 - 1 = -3 -> the negative value
tidq->period_instructions = instrs_over = 12
So after handle these two packets, there have below issues:
The first issue is that cs_etm__instr_addr() returns the address within
the current trace sample of the instruction related to offset, so the
offset is supposed to be always unsigned value. But in fact, function
cs_etm__sample() might calculate a negative offset value (in handling
the second packet, the offset is -3) and pass to cs_etm__instr_addr()
with u64 type with a big positive integer.
The second issue is it only synthesizes 2 samples for sample period = 4.
In theory, every packet has 10 instructions so the two packets have
total 20 instructions, 20 instructions should generate 5 samples
(4 x 5 = 20). This is because cs_etm__sample() only calls once
cs_etm__synth_instruction_sample() to generate instruction sample per
range packet.
This patch fixes the logic in function cs_etm__sample(); the basic
idea for handling coming packet is:
- To synthesize the first instruction sample, it combines the left
instructions from the previous packet and the head of the new
packet; then generate continuous samples with sample period;
- At the tail of the new packet, if it has the rest instructions,
these instructions will be left for the sequential sample.
Suggested-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Walker <robert.walker@arm.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight ml <coresight@lists.linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200219021811.20067-4-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/cs-etm.c | 87 ++++++++++++++++++++++++++++++++--------
1 file changed, 70 insertions(+), 17 deletions(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 2c4156c5ed09..1ddcc67e13dd 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -1358,9 +1358,12 @@ static int cs_etm__sample(struct cs_etm_queue *etmq,
struct cs_etm_auxtrace *etm = etmq->etm;
int ret;
u8 trace_chan_id = tidq->trace_chan_id;
- u64 instrs_executed = tidq->packet->instr_count;
+ u64 instrs_prev;
- tidq->period_instructions += instrs_executed;
+ /* Get instructions remainder from previous packet */
+ instrs_prev = tidq->period_instructions;
+
+ tidq->period_instructions += tidq->packet->instr_count;
/*
* Record a branch when the last instruction in
@@ -1378,26 +1381,76 @@ static int cs_etm__sample(struct cs_etm_queue *etmq,
* TODO: allow period to be defined in cycles and clock time
*/
- /* Get number of instructions executed after the sample point */
- u64 instrs_over = tidq->period_instructions -
- etm->instructions_sample_period;
+ /*
+ * Below diagram demonstrates the instruction samples
+ * generation flows:
+ *
+ * Instrs Instrs Instrs Instrs
+ * Sample(n) Sample(n+1) Sample(n+2) Sample(n+3)
+ * | | | |
+ * V V V V
+ * --------------------------------------------------
+ * ^ ^
+ * | |
+ * Period Period
+ * instructions(Pi) instructions(Pi')
+ *
+ * | |
+ * \---------------- -----------------/
+ * V
+ * tidq->packet->instr_count
+ *
+ * Instrs Sample(n...) are the synthesised samples occurring
+ * every etm->instructions_sample_period instructions - as
+ * defined on the perf command line. Sample(n) is being the
+ * last sample before the current etm packet, n+1 to n+3
+ * samples are generated from the current etm packet.
+ *
+ * tidq->packet->instr_count represents the number of
+ * instructions in the current etm packet.
+ *
+ * Period instructions (Pi) contains the the number of
+ * instructions executed after the sample point(n) from the
+ * previous etm packet. This will always be less than
+ * etm->instructions_sample_period.
+ *
+ * When generate new samples, it combines with two parts
+ * instructions, one is the tail of the old packet and another
+ * is the head of the new coming packet, to generate
+ * sample(n+1); sample(n+2) and sample(n+3) consume the
+ * instructions with sample period. After sample(n+3), the rest
+ * instructions will be used by later packet and it is assigned
+ * to tidq->period_instructions for next round calculation.
+ */
/*
- * Calculate the address of the sampled instruction (-1 as
- * sample is reported as though instruction has just been
- * executed, but PC has not advanced to next instruction)
+ * Get the initial offset into the current packet instructions;
+ * entry conditions ensure that instrs_prev is less than
+ * etm->instructions_sample_period.
*/
- u64 offset = (instrs_executed - instrs_over - 1);
- u64 addr = cs_etm__instr_addr(etmq, trace_chan_id,
- tidq->packet, offset);
+ u64 offset = etm->instructions_sample_period - instrs_prev;
+ u64 addr;
- ret = cs_etm__synth_instruction_sample(
- etmq, tidq, addr, etm->instructions_sample_period);
- if (ret)
- return ret;
+ while (tidq->period_instructions >=
+ etm->instructions_sample_period) {
+ /*
+ * Calculate the address of the sampled instruction (-1
+ * as sample is reported as though instruction has just
+ * been executed, but PC has not advanced to next
+ * instruction)
+ */
+ addr = cs_etm__instr_addr(etmq, trace_chan_id,
+ tidq->packet, offset - 1);
+ ret = cs_etm__synth_instruction_sample(
+ etmq, tidq, addr,
+ etm->instructions_sample_period);
+ if (ret)
+ return ret;
- /* Carry remaining instructions into next sample period */
- tidq->period_instructions = instrs_over;
+ offset += etm->instructions_sample_period;
+ tidq->period_instructions -=
+ etm->instructions_sample_period;
+ }
}
if (etm->sample_branches) {
--
2.21.1
WARNING: multiple messages have this Message-ID (diff)
From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ingo Molnar <mingo@kernel.org>, Thomas Gleixner <tglx@linutronix.de>
Cc: Mark Rutland <mark.rutland@arm.com>,
Arnaldo Carvalho de Melo <acme@redhat.com>,
Mathieu Poirier <mathieu.poirier@linaro.org>,
Suzuki Poulouse <suzuki.poulose@arm.com>,
Clark Williams <williams@redhat.com>,
coresight ml <coresight@lists.linaro.org>,
linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Peter Zijlstra <peterz@infradead.org>,
Jiri Olsa <jolsa@kernel.org>, Leo Yan <leo.yan@linaro.org>,
Namhyung Kim <namhyung@kernel.org>,
Robert Walker <robert.walker@arm.com>,
Jiri Olsa <jolsa@redhat.com>,
linux-arm-kernel@lists.infradead.org,
Mike Leach <mike.leach@linaro.org>
Subject: [PATCH 11/23] perf cs-etm: Correct synthesizing instruction samples
Date: Tue, 17 Mar 2020 18:32:47 -0300 [thread overview]
Message-ID: <20200317213259.15494-12-acme@kernel.org> (raw)
In-Reply-To: <20200317213259.15494-1-acme@kernel.org>
From: Leo Yan <leo.yan@linaro.org>
When 'etm->instructions_sample_period' is less than
'tidq->period_instructions', the function cs_etm__sample() cannot handle
this case properly with its logic.
Let's see below flow as an example:
- If we set itrace option '--itrace=i4', then function cs_etm__sample()
has variables with initialized values:
tidq->period_instructions = 0
etm->instructions_sample_period = 4
- When the first packet is coming:
packet->instr_count = 10; the number of instructions executed in this
packet is 10, thus update period_instructions as below:
tidq->period_instructions = 0 + 10 = 10
instrs_over = 10 - 4 = 6
offset = 10 - 6 - 1 = 3
tidq->period_instructions = instrs_over = 6
- When the second packet is coming:
packet->instr_count = 10; in the second pass, assume 10 instructions
in the trace sample again:
tidq->period_instructions = 6 + 10 = 16
instrs_over = 16 - 4 = 12
offset = 10 - 12 - 1 = -3 -> the negative value
tidq->period_instructions = instrs_over = 12
So after handle these two packets, there have below issues:
The first issue is that cs_etm__instr_addr() returns the address within
the current trace sample of the instruction related to offset, so the
offset is supposed to be always unsigned value. But in fact, function
cs_etm__sample() might calculate a negative offset value (in handling
the second packet, the offset is -3) and pass to cs_etm__instr_addr()
with u64 type with a big positive integer.
The second issue is it only synthesizes 2 samples for sample period = 4.
In theory, every packet has 10 instructions so the two packets have
total 20 instructions, 20 instructions should generate 5 samples
(4 x 5 = 20). This is because cs_etm__sample() only calls once
cs_etm__synth_instruction_sample() to generate instruction sample per
range packet.
This patch fixes the logic in function cs_etm__sample(); the basic
idea for handling coming packet is:
- To synthesize the first instruction sample, it combines the left
instructions from the previous packet and the head of the new
packet; then generate continuous samples with sample period;
- At the tail of the new packet, if it has the rest instructions,
these instructions will be left for the sequential sample.
Suggested-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Walker <robert.walker@arm.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight ml <coresight@lists.linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200219021811.20067-4-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/cs-etm.c | 87 ++++++++++++++++++++++++++++++++--------
1 file changed, 70 insertions(+), 17 deletions(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 2c4156c5ed09..1ddcc67e13dd 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -1358,9 +1358,12 @@ static int cs_etm__sample(struct cs_etm_queue *etmq,
struct cs_etm_auxtrace *etm = etmq->etm;
int ret;
u8 trace_chan_id = tidq->trace_chan_id;
- u64 instrs_executed = tidq->packet->instr_count;
+ u64 instrs_prev;
- tidq->period_instructions += instrs_executed;
+ /* Get instructions remainder from previous packet */
+ instrs_prev = tidq->period_instructions;
+
+ tidq->period_instructions += tidq->packet->instr_count;
/*
* Record a branch when the last instruction in
@@ -1378,26 +1381,76 @@ static int cs_etm__sample(struct cs_etm_queue *etmq,
* TODO: allow period to be defined in cycles and clock time
*/
- /* Get number of instructions executed after the sample point */
- u64 instrs_over = tidq->period_instructions -
- etm->instructions_sample_period;
+ /*
+ * Below diagram demonstrates the instruction samples
+ * generation flows:
+ *
+ * Instrs Instrs Instrs Instrs
+ * Sample(n) Sample(n+1) Sample(n+2) Sample(n+3)
+ * | | | |
+ * V V V V
+ * --------------------------------------------------
+ * ^ ^
+ * | |
+ * Period Period
+ * instructions(Pi) instructions(Pi')
+ *
+ * | |
+ * \---------------- -----------------/
+ * V
+ * tidq->packet->instr_count
+ *
+ * Instrs Sample(n...) are the synthesised samples occurring
+ * every etm->instructions_sample_period instructions - as
+ * defined on the perf command line. Sample(n) is being the
+ * last sample before the current etm packet, n+1 to n+3
+ * samples are generated from the current etm packet.
+ *
+ * tidq->packet->instr_count represents the number of
+ * instructions in the current etm packet.
+ *
+ * Period instructions (Pi) contains the the number of
+ * instructions executed after the sample point(n) from the
+ * previous etm packet. This will always be less than
+ * etm->instructions_sample_period.
+ *
+ * When generate new samples, it combines with two parts
+ * instructions, one is the tail of the old packet and another
+ * is the head of the new coming packet, to generate
+ * sample(n+1); sample(n+2) and sample(n+3) consume the
+ * instructions with sample period. After sample(n+3), the rest
+ * instructions will be used by later packet and it is assigned
+ * to tidq->period_instructions for next round calculation.
+ */
/*
- * Calculate the address of the sampled instruction (-1 as
- * sample is reported as though instruction has just been
- * executed, but PC has not advanced to next instruction)
+ * Get the initial offset into the current packet instructions;
+ * entry conditions ensure that instrs_prev is less than
+ * etm->instructions_sample_period.
*/
- u64 offset = (instrs_executed - instrs_over - 1);
- u64 addr = cs_etm__instr_addr(etmq, trace_chan_id,
- tidq->packet, offset);
+ u64 offset = etm->instructions_sample_period - instrs_prev;
+ u64 addr;
- ret = cs_etm__synth_instruction_sample(
- etmq, tidq, addr, etm->instructions_sample_period);
- if (ret)
- return ret;
+ while (tidq->period_instructions >=
+ etm->instructions_sample_period) {
+ /*
+ * Calculate the address of the sampled instruction (-1
+ * as sample is reported as though instruction has just
+ * been executed, but PC has not advanced to next
+ * instruction)
+ */
+ addr = cs_etm__instr_addr(etmq, trace_chan_id,
+ tidq->packet, offset - 1);
+ ret = cs_etm__synth_instruction_sample(
+ etmq, tidq, addr,
+ etm->instructions_sample_period);
+ if (ret)
+ return ret;
- /* Carry remaining instructions into next sample period */
- tidq->period_instructions = instrs_over;
+ offset += etm->instructions_sample_period;
+ tidq->period_instructions -=
+ etm->instructions_sample_period;
+ }
}
if (etm->sample_branches) {
--
2.21.1
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2020-03-17 21:32 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-03-17 21:32 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
2020-03-17 21:32 ` [PATCH 01/23] perf vendor events s390: Add new deflate counters for IBM z15 Arnaldo Carvalho de Melo
2020-03-17 21:32 ` [PATCH 02/23] perf jevents: Support metric constraint Arnaldo Carvalho de Melo
2020-03-17 21:32 ` [PATCH 03/23] perf metricgroup: Factor out metricgroup__add_metric_weak_group() Arnaldo Carvalho de Melo
2020-03-17 21:32 ` [PATCH 04/23] perf util: Factor out sysctl__nmi_watchdog_enabled() Arnaldo Carvalho de Melo
2020-03-17 21:32 ` [PATCH 05/23] perf metricgroup: Support metric constraint Arnaldo Carvalho de Melo
2020-03-17 21:32 ` [PATCH 06/23] perf vendor events intel: Add NO_NMI_WATCHDOG " Arnaldo Carvalho de Melo
2020-03-17 21:32 ` [PATCH 07/23] perf map: Fix off by one in strncpy() size argument Arnaldo Carvalho de Melo
2020-03-17 21:32 ` [PATCH 08/23] perf map: Use strstarts() to look for Android libraries Arnaldo Carvalho de Melo
2020-03-17 21:32 ` [PATCH 09/23] perf cs-etm: Swap packets for instruction samples Arnaldo Carvalho de Melo
2020-03-17 21:32 ` Arnaldo Carvalho de Melo
2020-03-17 21:32 ` [PATCH 10/23] perf cs-etm: Continuously record last branch Arnaldo Carvalho de Melo
2020-03-17 21:32 ` Arnaldo Carvalho de Melo
2020-03-17 21:32 ` Arnaldo Carvalho de Melo [this message]
2020-03-17 21:32 ` [PATCH 11/23] perf cs-etm: Correct synthesizing instruction samples Arnaldo Carvalho de Melo
2020-03-17 21:32 ` [PATCH 12/23] perf cs-etm: Optimize copying last branches Arnaldo Carvalho de Melo
2020-03-17 21:32 ` Arnaldo Carvalho de Melo
2020-03-17 21:32 ` [PATCH 13/23] perf cs-etm: Fix unsigned variable comparison to zero Arnaldo Carvalho de Melo
2020-03-17 21:32 ` Arnaldo Carvalho de Melo
2020-03-17 21:32 ` [PATCH 14/23] perf doc: Set man page date to last git commit Arnaldo Carvalho de Melo
2020-03-17 21:32 ` [PATCH 15/23] perf intel-pt: Rename intel-pt.txt and put it in man page format Arnaldo Carvalho de Melo
2020-03-17 21:32 ` [PATCH 16/23] perf intel-pt: Add Intel PT man page references Arnaldo Carvalho de Melo
2020-03-17 21:32 ` [PATCH 17/23] perf intel-pt: Update intel-pt.txt file with new location of the documentation Arnaldo Carvalho de Melo
2020-03-17 21:32 ` [PATCH 18/23] perf scripting perl: Add common_callchain to fix argument order Arnaldo Carvalho de Melo
2020-03-17 21:32 ` [PATCH 19/23] perf record: Fix binding of AIO user space buffers to nodes Arnaldo Carvalho de Melo
2020-03-17 21:32 ` [PATCH 20/23] perf test: Print if shell directory isn't present Arnaldo Carvalho de Melo
2020-03-17 21:32 ` [PATCH 21/23] perf tools: Give synthetic mmap events an inode generation Arnaldo Carvalho de Melo
2020-03-17 21:32 ` [PATCH 22/23] perf report: Fix no branch type statistics report issue Arnaldo Carvalho de Melo
2020-03-17 21:32 ` [PATCH 23/23] perf expr: Fix copy/paste mistake Arnaldo Carvalho de Melo
2020-03-19 14:03 ` [GIT PULL] perf/core improvements and fixes Ingo Molnar
2020-03-19 14:07 ` Arnaldo Carvalho de Melo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200317213259.15494-12-acme@kernel.org \
--to=acme@kernel.org \
--cc=acme@redhat.com \
--cc=alexander.shishkin@linux.intel.com \
--cc=coresight@lists.linaro.org \
--cc=jolsa@kernel.org \
--cc=jolsa@redhat.com \
--cc=leo.yan@linaro.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-perf-users@vger.kernel.org \
--cc=mark.rutland@arm.com \
--cc=mathieu.poirier@linaro.org \
--cc=mike.leach@linaro.org \
--cc=mingo@kernel.org \
--cc=namhyung@kernel.org \
--cc=peterz@infradead.org \
--cc=robert.walker@arm.com \
--cc=suzuki.poulose@arm.com \
--cc=tglx@linutronix.de \
--cc=williams@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.