* [PATCH v2 1/7] perf cs-etm: Don't flush when packet_queue fills up
2024-09-12 15:11 [PATCH v2 0/7] perf: cs-etm: Coresight decode and disassembly improvements James Clark
@ 2024-09-12 15:11 ` James Clark
2024-09-13 11:17 ` Leo Yan
2024-09-12 15:11 ` [PATCH v2 2/7] perf cs-etm: Use new OpenCSD consistency checks James Clark
` (6 subsequent siblings)
7 siblings, 1 reply; 21+ messages in thread
From: James Clark @ 2024-09-12 15:11 UTC (permalink / raw)
To: linux-perf-users, gankulkarni, coresight, leo.yan, scclevenger
Cc: James Clark, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Liang, Kan, Suzuki K Poulose, Mike Leach, John Garry, Will Deacon,
Leo Yan, Ben Gainey, Ruidong Tian, Benjamin Gray, Mathieu Poirier,
linux-kernel, linux-arm-kernel
cs_etm__flush(), like cs_etm__sample() is an operation that generates a
sample and then swaps the current with the previous packet. Calling
flush after processing the queues results in two swaps which corrupts
the next sample. Therefore it wasn't appropriate to call flush here so
remove it.
Flushing is still done on a discontinuity to explicitly clear the last
branch buffer, but when the packet_queue fills up before reaching a
timestamp, that's not a discontinuity and the call to
cs_etm__process_traceid_queue() already generated samples and drained
the buffers correctly.
This is visible by looking for a branch that has the same target as the
previous branch and the following source is before the address of the
last target, which is impossible as execution would have had to have
gone backwards:
ffff800080849d40 _find_next_and_bit+0x78 => ffff80008011cadc update_sg_lb_stats+0x94
(packet_queue fills here before a timestamp, resulting in a flush and
branch target ffff80008011cadc is duplicated.)
ffff80008011cb1c update_sg_lb_stats+0xd4 => ffff80008011cadc update_sg_lb_stats+0x94
ffff8000801117c4 cpu_util+0x24 => ffff8000801117d4 cpu_util+0x34
After removing the flush the correct branch target is used for the
second sample, and ffff8000801117c4 is no longer before the previous
address:
ffff800080849d40 _find_next_and_bit+0x78 => ffff80008011cadc update_sg_lb_stats+0x94
ffff80008011cb1c update_sg_lb_stats+0xd4 => ffff8000801117a0 cpu_util+0x0
ffff8000801117c4 cpu_util+0x24 => ffff8000801117d4 cpu_util+0x34
Make sure that a final branch stack is output at the end of the trace
by calling cs_etm__end_block(). This is already done for both the
timeless decode paths.
Fixes: 21fe8dc1191a ("perf cs-etm: Add support for CPU-wide trace scenarios")
Reported-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Closes: https://lore.kernel.org/all/20240719092619.274730-1-gankulkarni@os.amperecomputing.com/
Signed-off-by: James Clark <james.clark@linaro.org>
---
tools/perf/util/cs-etm.c | 25 ++++++++++++++++++-------
1 file changed, 18 insertions(+), 7 deletions(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 90f32f327b9b..242788ac9625 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -2490,12 +2490,6 @@ static void cs_etm__clear_all_traceid_queues(struct cs_etm_queue *etmq)
/* Ignore return value */
cs_etm__process_traceid_queue(etmq, tidq);
-
- /*
- * Generate an instruction sample with the remaining
- * branchstack entries.
- */
- cs_etm__flush(etmq, tidq);
}
}
@@ -2638,7 +2632,7 @@ static int cs_etm__process_timestamped_queues(struct cs_etm_auxtrace *etm)
while (1) {
if (!etm->heap.heap_cnt)
- goto out;
+ break;
/* Take the entry at the top of the min heap */
cs_queue_nr = etm->heap.heap_array[0].queue_nr;
@@ -2721,6 +2715,23 @@ static int cs_etm__process_timestamped_queues(struct cs_etm_auxtrace *etm)
ret = auxtrace_heap__add(&etm->heap, cs_queue_nr, cs_timestamp);
}
+ for (i = 0; i < etm->queues.nr_queues; i++) {
+ struct int_node *inode;
+
+ etmq = etm->queues.queue_array[i].priv;
+ if (!etmq)
+ continue;
+
+ intlist__for_each_entry(inode, etmq->traceid_queues_list) {
+ int idx = (int)(intptr_t)inode->priv;
+
+ /* Flush any remaining branch stack entries */
+ tidq = etmq->traceid_queues[idx];
+ ret = cs_etm__end_block(etmq, tidq);
+ if (ret)
+ return ret;
+ }
+ }
out:
return ret;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 21+ messages in thread* Re: [PATCH v2 1/7] perf cs-etm: Don't flush when packet_queue fills up
2024-09-12 15:11 ` [PATCH v2 1/7] perf cs-etm: Don't flush when packet_queue fills up James Clark
@ 2024-09-13 11:17 ` Leo Yan
0 siblings, 0 replies; 21+ messages in thread
From: Leo Yan @ 2024-09-13 11:17 UTC (permalink / raw)
To: James Clark, linux-perf-users, gankulkarni, coresight,
scclevenger
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Liang, Kan, Suzuki K Poulose,
Mike Leach, John Garry, Will Deacon, Leo Yan, Ben Gainey,
Ruidong Tian, Benjamin Gray, Mathieu Poirier, linux-kernel,
linux-arm-kernel
On 9/12/24 16:11, James Clark wrote:>
> cs_etm__flush(), like cs_etm__sample() is an operation that generates a
> sample and then swaps the current with the previous packet. Calling
> flush after processing the queues results in two swaps which corrupts
> the next sample. Therefore it wasn't appropriate to call flush here so
> remove it.
>
> Flushing is still done on a discontinuity to explicitly clear the last
> branch buffer, but when the packet_queue fills up before reaching a
> timestamp, that's not a discontinuity and the call to
> cs_etm__process_traceid_queue() already generated samples and drained
> the buffers correctly.
>
> This is visible by looking for a branch that has the same target as the
> previous branch and the following source is before the address of the
> last target, which is impossible as execution would have had to have
> gone backwards:
>
> ffff800080849d40 _find_next_and_bit+0x78 => ffff80008011cadc update_sg_lb_stats+0x94
> (packet_queue fills here before a timestamp, resulting in a flush and
> branch target ffff80008011cadc is duplicated.)
> ffff80008011cb1c update_sg_lb_stats+0xd4 => ffff80008011cadc update_sg_lb_stats+0x94
> ffff8000801117c4 cpu_util+0x24 => ffff8000801117d4 cpu_util+0x34
>
> After removing the flush the correct branch target is used for the
> second sample, and ffff8000801117c4 is no longer before the previous
> address:
>
> ffff800080849d40 _find_next_and_bit+0x78 => ffff80008011cadc update_sg_lb_stats+0x94
> ffff80008011cb1c update_sg_lb_stats+0xd4 => ffff8000801117a0 cpu_util+0x0
> ffff8000801117c4 cpu_util+0x24 => ffff8000801117d4 cpu_util+0x34
>
> Make sure that a final branch stack is output at the end of the trace
> by calling cs_etm__end_block(). This is already done for both the
> timeless decode paths.
It is right to call cs_etm__flush() for only discontinuity packet and use
cs_etm__end_block() for flushing the end of data block. Thanks for
distinguishing these two different things.
> Fixes: 21fe8dc1191a ("perf cs-etm: Add support for CPU-wide trace scenarios")
> Reported-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
> Closes: https://lore.kernel.org/all/20240719092619.274730-1-gankulkarni@os.amperecomputing.com/
> Signed-off-by: James Clark <james.clark@linaro.org>
Reviewed-by: Leo Yan <leo.yan@arm.com>
> ---
> tools/perf/util/cs-etm.c | 25 ++++++++++++++++++-------
> 1 file changed, 18 insertions(+), 7 deletions(-)
>
> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> index 90f32f327b9b..242788ac9625 100644
> --- a/tools/perf/util/cs-etm.c
> +++ b/tools/perf/util/cs-etm.c
> @@ -2490,12 +2490,6 @@ static void cs_etm__clear_all_traceid_queues(struct cs_etm_queue *etmq)
>
> /* Ignore return value */
> cs_etm__process_traceid_queue(etmq, tidq);
> -
> - /*
> - * Generate an instruction sample with the remaining
> - * branchstack entries.
> - */
> - cs_etm__flush(etmq, tidq);
> }
> }
>
> @@ -2638,7 +2632,7 @@ static int cs_etm__process_timestamped_queues(struct cs_etm_auxtrace *etm)
>
> while (1) {
> if (!etm->heap.heap_cnt)
> - goto out;
> + break;
>
> /* Take the entry at the top of the min heap */
> cs_queue_nr = etm->heap.heap_array[0].queue_nr;
> @@ -2721,6 +2715,23 @@ static int cs_etm__process_timestamped_queues(struct cs_etm_auxtrace *etm)
> ret = auxtrace_heap__add(&etm->heap, cs_queue_nr, cs_timestamp);
> }
>
> + for (i = 0; i < etm->queues.nr_queues; i++) {
> + struct int_node *inode;
> +
> + etmq = etm->queues.queue_array[i].priv;
> + if (!etmq)
> + continue;
> +
> + intlist__for_each_entry(inode, etmq->traceid_queues_list) {
> + int idx = (int)(intptr_t)inode->priv;
> +
> + /* Flush any remaining branch stack entries */
> + tidq = etmq->traceid_queues[idx];
> + ret = cs_etm__end_block(etmq, tidq);
> + if (ret)
> + return ret;
> + }
> + }
> out:
> return ret;
> }
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 2/7] perf cs-etm: Use new OpenCSD consistency checks
2024-09-12 15:11 [PATCH v2 0/7] perf: cs-etm: Coresight decode and disassembly improvements James Clark
2024-09-12 15:11 ` [PATCH v2 1/7] perf cs-etm: Don't flush when packet_queue fills up James Clark
@ 2024-09-12 15:11 ` James Clark
2024-09-13 11:54 ` Leo Yan
2024-09-12 15:11 ` [PATCH v2 3/7] perf scripting python: Add function to get a config value James Clark
` (5 subsequent siblings)
7 siblings, 1 reply; 21+ messages in thread
From: James Clark @ 2024-09-12 15:11 UTC (permalink / raw)
To: linux-perf-users, gankulkarni, coresight, leo.yan, scclevenger
Cc: James Clark, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Liang, Kan, Suzuki K Poulose, Mike Leach, John Garry, Will Deacon,
Leo Yan, Ben Gainey, Ruidong Tian, Benjamin Gray, Mathieu Poirier,
linux-kernel, linux-arm-kernel
Previously when the incorrect binary was used for decode, Perf would
silently continue to generate incorrect samples. With OpenCSD 1.5.4 we
can enable consistency checks that do a best effort to detect a mismatch
in the image. When one is detected a warning is printed and sample
generation stops until the trace resynchronizes with a good part of the
image.
Reported-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Closes: https://lore.kernel.org/all/20240719092619.274730-1-gankulkarni@os.amperecomputing.com/
Signed-off-by: James Clark <james.clark@linaro.org>
---
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
index b78ef0262135..b85a8837bddc 100644
--- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
@@ -685,9 +685,14 @@ cs_etm_decoder__create_etm_decoder(struct cs_etm_decoder_params *d_params,
}
if (d_params->operation == CS_ETM_OPERATION_DECODE) {
+ int decode_flags = OCSD_CREATE_FLG_FULL_DECODER;
+#ifdef OCSD_OPFLG_N_UNCOND_DIR_BR_CHK
+ decode_flags |= OCSD_OPFLG_N_UNCOND_DIR_BR_CHK | OCSD_OPFLG_CHK_RANGE_CONTINUE |
+ ETM4_OPFLG_PKTDEC_AA64_OPCODE_CHK;
+#endif
if (ocsd_dt_create_decoder(decoder->dcd_tree,
decoder->decoder_name,
- OCSD_CREATE_FLG_FULL_DECODER,
+ decode_flags,
trace_config, &csid))
return -1;
--
2.34.1
^ permalink raw reply related [flat|nested] 21+ messages in thread* Re: [PATCH v2 2/7] perf cs-etm: Use new OpenCSD consistency checks
2024-09-12 15:11 ` [PATCH v2 2/7] perf cs-etm: Use new OpenCSD consistency checks James Clark
@ 2024-09-13 11:54 ` Leo Yan
2024-09-13 12:09 ` James Clark
0 siblings, 1 reply; 21+ messages in thread
From: Leo Yan @ 2024-09-13 11:54 UTC (permalink / raw)
To: James Clark, linux-perf-users, gankulkarni, coresight,
scclevenger
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Liang, Kan, Suzuki K Poulose,
Mike Leach, John Garry, Will Deacon, Leo Yan, Ben Gainey,
Ruidong Tian, Benjamin Gray, Mathieu Poirier, linux-kernel,
linux-arm-kernel
On 9/12/24 16:11, James Clark wrote:>
>
> Previously when the incorrect binary was used for decode, Perf would
> silently continue to generate incorrect samples. With OpenCSD 1.5.4 we
> can enable consistency checks that do a best effort to detect a mismatch
> in the image. When one is detected a warning is printed and sample
> generation stops until the trace resynchronizes with a good part of the
> image.
>
> Reported-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
> Closes: https://lore.kernel.org/all/20240719092619.274730-1-gankulkarni@os.amperecomputing.com/
> Signed-off-by: James Clark <james.clark@linaro.org>
> ---
> tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> index b78ef0262135..b85a8837bddc 100644
> --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
> @@ -685,9 +685,14 @@ cs_etm_decoder__create_etm_decoder(struct cs_etm_decoder_params *d_params,
> }
>
> if (d_params->operation == CS_ETM_OPERATION_DECODE) {
> + int decode_flags = OCSD_CREATE_FLG_FULL_DECODER;
> +#ifdef OCSD_OPFLG_N_UNCOND_DIR_BR_CHK
> + decode_flags |= OCSD_OPFLG_N_UNCOND_DIR_BR_CHK | OCSD_OPFLG_CHK_RANGE_CONTINUE |
> + ETM4_OPFLG_PKTDEC_AA64_OPCODE_CHK;
> +#endif
Looks good to me.
Just one question: should the flag ETM4_OPFLG_PKTDEC_AA64_OPCODE_CHK be set
according to ETM version? E.g. it should be only set for ETMv4 or this is
fine for ETE as well.
Thanks,
Leo
> if (ocsd_dt_create_decoder(decoder->dcd_tree,
> decoder->decoder_name,
> - OCSD_CREATE_FLG_FULL_DECODER,
> + decode_flags,
> trace_config, &csid))
> return -1;
>
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 21+ messages in thread* Re: [PATCH v2 2/7] perf cs-etm: Use new OpenCSD consistency checks
2024-09-13 11:54 ` Leo Yan
@ 2024-09-13 12:09 ` James Clark
2024-09-13 13:03 ` Leo Yan
0 siblings, 1 reply; 21+ messages in thread
From: James Clark @ 2024-09-13 12:09 UTC (permalink / raw)
To: Leo Yan
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Liang, Kan, Suzuki K Poulose,
Mike Leach, John Garry, Will Deacon, Leo Yan, Ben Gainey,
Ruidong Tian, Benjamin Gray, Mathieu Poirier, linux-kernel,
linux-arm-kernel, linux-perf-users, gankulkarni, coresight,
scclevenger
On 13/09/2024 12:54, Leo Yan wrote:
> On 9/12/24 16:11, James Clark wrote:>
>>
>> Previously when the incorrect binary was used for decode, Perf would
>> silently continue to generate incorrect samples. With OpenCSD 1.5.4 we
>> can enable consistency checks that do a best effort to detect a mismatch
>> in the image. When one is detected a warning is printed and sample
>> generation stops until the trace resynchronizes with a good part of the
>> image.
>>
>> Reported-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
>> Closes:
>> https://lore.kernel.org/all/20240719092619.274730-1-gankulkarni@os.amperecomputing.com/
>> Signed-off-by: James Clark <james.clark@linaro.org>
>> ---
>> tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
>> b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
>> index b78ef0262135..b85a8837bddc 100644
>> --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
>> +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
>> @@ -685,9 +685,14 @@ cs_etm_decoder__create_etm_decoder(struct
>> cs_etm_decoder_params *d_params,
>> }
>>
>> if (d_params->operation == CS_ETM_OPERATION_DECODE) {
>> + int decode_flags = OCSD_CREATE_FLG_FULL_DECODER;
>> +#ifdef OCSD_OPFLG_N_UNCOND_DIR_BR_CHK
>> + decode_flags |= OCSD_OPFLG_N_UNCOND_DIR_BR_CHK |
>> OCSD_OPFLG_CHK_RANGE_CONTINUE |
>> + ETM4_OPFLG_PKTDEC_AA64_OPCODE_CHK;
>> +#endif
>
> Looks good to me.
>
> Just one question: should the flag ETM4_OPFLG_PKTDEC_AA64_OPCODE_CHK be set
> according to ETM version? E.g. it should be only set for ETMv4 or this is
> fine for ETE as well.
>
> Thanks,
> Leo
>
I asked Mike the same question about ETMv3 and he said none of the flags
overlap and it was ok to always pass them. So I assume the same applies
to ETE as well.
^ permalink raw reply [flat|nested] 21+ messages in thread* Re: [PATCH v2 2/7] perf cs-etm: Use new OpenCSD consistency checks
2024-09-13 12:09 ` James Clark
@ 2024-09-13 13:03 ` Leo Yan
0 siblings, 0 replies; 21+ messages in thread
From: Leo Yan @ 2024-09-13 13:03 UTC (permalink / raw)
To: James Clark
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Liang, Kan, Suzuki K Poulose,
Mike Leach, John Garry, Will Deacon, Leo Yan, Ben Gainey,
Ruidong Tian, Benjamin Gray, Mathieu Poirier, linux-kernel,
linux-arm-kernel, linux-perf-users, gankulkarni, coresight,
scclevenger
On 9/13/24 13:09, James Clark wrote:
[...]
>> Just one question: should the flag ETM4_OPFLG_PKTDEC_AA64_OPCODE_CHK be set
>> according to ETM version? E.g. it should be only set for ETMv4 or this is
>> fine for ETE as well.
>
> I asked Mike the same question about ETMv3 and he said none of the flags
> overlap and it was ok to always pass them. So I assume the same applies
> to ETE as well.
Thanks for confirmation. This is my review tag:
Reviewed-by: Leo Yan <leo.yan@arm.com>
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 3/7] perf scripting python: Add function to get a config value
2024-09-12 15:11 [PATCH v2 0/7] perf: cs-etm: Coresight decode and disassembly improvements James Clark
2024-09-12 15:11 ` [PATCH v2 1/7] perf cs-etm: Don't flush when packet_queue fills up James Clark
2024-09-12 15:11 ` [PATCH v2 2/7] perf cs-etm: Use new OpenCSD consistency checks James Clark
@ 2024-09-12 15:11 ` James Clark
2024-09-13 13:40 ` Leo Yan
2024-09-12 15:11 ` [PATCH v2 4/7] perf scripts python cs-etm: Update to use argparse James Clark
` (4 subsequent siblings)
7 siblings, 1 reply; 21+ messages in thread
From: James Clark @ 2024-09-12 15:11 UTC (permalink / raw)
To: linux-perf-users, gankulkarni, coresight, leo.yan, scclevenger
Cc: James Clark, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Liang, Kan, Suzuki K Poulose, Mike Leach, John Garry, Will Deacon,
Leo Yan, Ben Gainey, Ruidong Tian, Benjamin Gray, Mathieu Poirier,
linux-kernel, linux-arm-kernel
This can be used to get config values like which objdump Perf uses for
disassembly.
Signed-off-by: James Clark <james.clark@linaro.org>
---
.../perf/Documentation/perf-script-python.txt | 2 +-
.../scripts/python/Perf-Trace-Util/Context.c | 11 ++++++++++
tools/perf/util/config.c | 22 +++++++++++++++++++
tools/perf/util/config.h | 1 +
4 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/tools/perf/Documentation/perf-script-python.txt b/tools/perf/Documentation/perf-script-python.txt
index 13e37e9385ee..27a1cac6fe76 100644
--- a/tools/perf/Documentation/perf-script-python.txt
+++ b/tools/perf/Documentation/perf-script-python.txt
@@ -624,7 +624,7 @@ as perf_trace_context.perf_script_context .
perf_set_itrace_options(context, itrace_options) - set --itrace options if they have not been set already
perf_sample_srcline(context) - returns source_file_name, line_number
perf_sample_srccode(context) - returns source_file_name, line_number, source_line
-
+ perf_config_get(config_name) - returns the value of the named config item, or None if unset
Util.py Module
~~~~~~~~~~~~~~
diff --git a/tools/perf/scripts/python/Perf-Trace-Util/Context.c b/tools/perf/scripts/python/Perf-Trace-Util/Context.c
index 3954bd1587ce..01f54d6724a5 100644
--- a/tools/perf/scripts/python/Perf-Trace-Util/Context.c
+++ b/tools/perf/scripts/python/Perf-Trace-Util/Context.c
@@ -12,6 +12,7 @@
#define PY_SSIZE_T_CLEAN
#include <Python.h>
+#include "../../../util/config.h"
#include "../../../util/trace-event.h"
#include "../../../util/event.h"
#include "../../../util/symbol.h"
@@ -182,6 +183,15 @@ static PyObject *perf_sample_srccode(PyObject *obj, PyObject *args)
return perf_sample_src(obj, args, true);
}
+static PyObject *__perf_config_get(PyObject *obj, PyObject *args)
+{
+ const char *config_name;
+
+ if (!PyArg_ParseTuple(args, "s", &config_name))
+ return NULL;
+ return Py_BuildValue("s", perf_config_get(config_name));
+}
+
static PyMethodDef ContextMethods[] = {
#ifdef HAVE_LIBTRACEEVENT
{ "common_pc", perf_trace_context_common_pc, METH_VARARGS,
@@ -199,6 +209,7 @@ static PyMethodDef ContextMethods[] = {
METH_VARARGS, "Get source file name and line number."},
{ "perf_sample_srccode", perf_sample_srccode,
METH_VARARGS, "Get source file name, line number and line."},
+ { "perf_config_get", __perf_config_get, METH_VARARGS, "Get perf config entry"},
{ NULL, NULL, 0, NULL}
};
diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c
index 7a650de0db83..68f9407ca74b 100644
--- a/tools/perf/util/config.c
+++ b/tools/perf/util/config.c
@@ -912,6 +912,7 @@ void set_buildid_dir(const char *dir)
struct perf_config_scan_data {
const char *name;
const char *fmt;
+ const char *value;
va_list args;
int ret;
};
@@ -939,3 +940,24 @@ int perf_config_scan(const char *name, const char *fmt, ...)
return d.ret;
}
+
+static int perf_config_get_cb(const char *var, const char *value, void *data)
+{
+ struct perf_config_scan_data *d = data;
+
+ if (!strcmp(var, d->name))
+ d->value = value;
+
+ return 0;
+}
+
+const char *perf_config_get(const char *name)
+{
+ struct perf_config_scan_data d = {
+ .name = name,
+ .value = NULL,
+ };
+
+ perf_config(perf_config_get_cb, &d);
+ return d.value;
+}
diff --git a/tools/perf/util/config.h b/tools/perf/util/config.h
index 2e5e808928a5..9971313d61c1 100644
--- a/tools/perf/util/config.h
+++ b/tools/perf/util/config.h
@@ -30,6 +30,7 @@ typedef int (*config_fn_t)(const char *, const char *, void *);
int perf_default_config(const char *, const char *, void *);
int perf_config(config_fn_t fn, void *);
int perf_config_scan(const char *name, const char *fmt, ...) __scanf(2, 3);
+const char *perf_config_get(const char *name);
int perf_config_set(struct perf_config_set *set,
config_fn_t fn, void *data);
int perf_config_int(int *dest, const char *, const char *);
--
2.34.1
^ permalink raw reply related [flat|nested] 21+ messages in thread* Re: [PATCH v2 3/7] perf scripting python: Add function to get a config value
2024-09-12 15:11 ` [PATCH v2 3/7] perf scripting python: Add function to get a config value James Clark
@ 2024-09-13 13:40 ` Leo Yan
0 siblings, 0 replies; 21+ messages in thread
From: Leo Yan @ 2024-09-13 13:40 UTC (permalink / raw)
To: James Clark, linux-perf-users, gankulkarni, coresight,
scclevenger
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Liang, Kan, Suzuki K Poulose,
Mike Leach, John Garry, Will Deacon, Leo Yan, Ben Gainey,
Ruidong Tian, Benjamin Gray, Mathieu Poirier, linux-kernel,
linux-arm-kernel
On 9/12/24 16:11, James Clark wrote:
> Warning: EXTERNAL SENDER, use caution when opening links or attachments.
>
>
> This can be used to get config values like which objdump Perf uses for
> disassembly.
>
> Signed-off-by: James Clark <james.clark@linaro.org>
Reviewed-by: Leo Yan <leo.yan@arm.com>
> ---
> .../perf/Documentation/perf-script-python.txt | 2 +-
> .../scripts/python/Perf-Trace-Util/Context.c | 11 ++++++++++
> tools/perf/util/config.c | 22 +++++++++++++++++++
> tools/perf/util/config.h | 1 +
> 4 files changed, 35 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/Documentation/perf-script-python.txt b/tools/perf/Documentation/perf-script-python.txt
> index 13e37e9385ee..27a1cac6fe76 100644
> --- a/tools/perf/Documentation/perf-script-python.txt
> +++ b/tools/perf/Documentation/perf-script-python.txt
> @@ -624,7 +624,7 @@ as perf_trace_context.perf_script_context .
> perf_set_itrace_options(context, itrace_options) - set --itrace options if they have not been set already
> perf_sample_srcline(context) - returns source_file_name, line_number
> perf_sample_srccode(context) - returns source_file_name, line_number, source_line
> -
> + perf_config_get(config_name) - returns the value of the named config item, or None if unset
>
> Util.py Module
> ~~~~~~~~~~~~~~
> diff --git a/tools/perf/scripts/python/Perf-Trace-Util/Context.c b/tools/perf/scripts/python/Perf-Trace-Util/Context.c
> index 3954bd1587ce..01f54d6724a5 100644
> --- a/tools/perf/scripts/python/Perf-Trace-Util/Context.c
> +++ b/tools/perf/scripts/python/Perf-Trace-Util/Context.c
> @@ -12,6 +12,7 @@
> #define PY_SSIZE_T_CLEAN
>
> #include <Python.h>
> +#include "../../../util/config.h"
> #include "../../../util/trace-event.h"
> #include "../../../util/event.h"
> #include "../../../util/symbol.h"
> @@ -182,6 +183,15 @@ static PyObject *perf_sample_srccode(PyObject *obj, PyObject *args)
> return perf_sample_src(obj, args, true);
> }
>
> +static PyObject *__perf_config_get(PyObject *obj, PyObject *args)
> +{
> + const char *config_name;
> +
> + if (!PyArg_ParseTuple(args, "s", &config_name))
> + return NULL;
> + return Py_BuildValue("s", perf_config_get(config_name));
> +}
> +
> static PyMethodDef ContextMethods[] = {
> #ifdef HAVE_LIBTRACEEVENT
> { "common_pc", perf_trace_context_common_pc, METH_VARARGS,
> @@ -199,6 +209,7 @@ static PyMethodDef ContextMethods[] = {
> METH_VARARGS, "Get source file name and line number."},
> { "perf_sample_srccode", perf_sample_srccode,
> METH_VARARGS, "Get source file name, line number and line."},
> + { "perf_config_get", __perf_config_get, METH_VARARGS, "Get perf config entry"},
> { NULL, NULL, 0, NULL}
> };
>
> diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c
> index 7a650de0db83..68f9407ca74b 100644
> --- a/tools/perf/util/config.c
> +++ b/tools/perf/util/config.c
> @@ -912,6 +912,7 @@ void set_buildid_dir(const char *dir)
> struct perf_config_scan_data {
> const char *name;
> const char *fmt;
> + const char *value;
> va_list args;
> int ret;
> };
> @@ -939,3 +940,24 @@ int perf_config_scan(const char *name, const char *fmt, ...)
>
> return d.ret;
> }
> +
> +static int perf_config_get_cb(const char *var, const char *value, void *data)
> +{
> + struct perf_config_scan_data *d = data;
> +
> + if (!strcmp(var, d->name))
> + d->value = value;
> +
> + return 0;
> +}
> +
> +const char *perf_config_get(const char *name)
> +{
> + struct perf_config_scan_data d = {
> + .name = name,
> + .value = NULL,
> + };
> +
> + perf_config(perf_config_get_cb, &d);
> + return d.value;
> +}
> diff --git a/tools/perf/util/config.h b/tools/perf/util/config.h
> index 2e5e808928a5..9971313d61c1 100644
> --- a/tools/perf/util/config.h
> +++ b/tools/perf/util/config.h
> @@ -30,6 +30,7 @@ typedef int (*config_fn_t)(const char *, const char *, void *);
> int perf_default_config(const char *, const char *, void *);
> int perf_config(config_fn_t fn, void *);
> int perf_config_scan(const char *name, const char *fmt, ...) __scanf(2, 3);
> +const char *perf_config_get(const char *name);
> int perf_config_set(struct perf_config_set *set,
> config_fn_t fn, void *data);
> int perf_config_int(int *dest, const char *, const char *);
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 4/7] perf scripts python cs-etm: Update to use argparse
2024-09-12 15:11 [PATCH v2 0/7] perf: cs-etm: Coresight decode and disassembly improvements James Clark
` (2 preceding siblings ...)
2024-09-12 15:11 ` [PATCH v2 3/7] perf scripting python: Add function to get a config value James Clark
@ 2024-09-12 15:11 ` James Clark
2024-09-13 12:44 ` Leo Yan
2024-09-12 15:11 ` [PATCH v2 5/7] perf scripts python cs-etm: Improve arguments James Clark
` (3 subsequent siblings)
7 siblings, 1 reply; 21+ messages in thread
From: James Clark @ 2024-09-12 15:11 UTC (permalink / raw)
To: linux-perf-users, gankulkarni, coresight, leo.yan, scclevenger
Cc: James Clark, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Liang, Kan, Suzuki K Poulose, Mike Leach, John Garry, Will Deacon,
Leo Yan, Ben Gainey, Ruidong Tian, Benjamin Gray, Mathieu Poirier,
linux-kernel, linux-arm-kernel
optparse is deprecated and less flexible than argparse so update it.
Signed-off-by: James Clark <james.clark@linaro.org>
---
.../scripts/python/arm-cs-trace-disasm.py | 28 +++++++------------
1 file changed, 10 insertions(+), 18 deletions(-)
diff --git a/tools/perf/scripts/python/arm-cs-trace-disasm.py b/tools/perf/scripts/python/arm-cs-trace-disasm.py
index 7aff02d84ffb..45f682a8b34d 100755
--- a/tools/perf/scripts/python/arm-cs-trace-disasm.py
+++ b/tools/perf/scripts/python/arm-cs-trace-disasm.py
@@ -11,7 +11,7 @@ import os
from os import path
import re
from subprocess import *
-from optparse import OptionParser, make_option
+import argparse
from perf_trace_context import perf_set_itrace_options, \
perf_sample_insn, perf_sample_srccode
@@ -28,19 +28,11 @@ from perf_trace_context import perf_set_itrace_options, \
# perf script -s scripts/python/arm-cs-trace-disasm.py
# Command line parsing.
-option_list = [
- # formatting options for the bottom entry of the stack
- make_option("-k", "--vmlinux", dest="vmlinux_name",
- help="Set path to vmlinux file"),
- make_option("-d", "--objdump", dest="objdump_name",
- help="Set path to objdump executable file"),
- make_option("-v", "--verbose", dest="verbose",
- action="store_true", default=False,
- help="Enable debugging log")
-]
-
-parser = OptionParser(option_list=option_list)
-(options, args) = parser.parse_args()
+args = argparse.ArgumentParser()
+args.add_argument("-k", "--vmlinux", help="Set path to vmlinux file")
+args.add_argument("-d", "--objdump", help="Set path to objdump executable file"),
+args.add_argument("-v", "--verbose", action="store_true", help="Enable debugging log")
+options = args.parse_args()
# Initialize global dicts and regular expression
disasm_cache = dict()
@@ -65,8 +57,8 @@ def get_offset(perf_dict, field):
def get_dso_file_path(dso_name, dso_build_id):
if (dso_name == "[kernel.kallsyms]" or dso_name == "vmlinux"):
- if (options.vmlinux_name):
- return options.vmlinux_name;
+ if (options.vmlinux):
+ return options.vmlinux;
else:
return dso_name
@@ -92,7 +84,7 @@ def read_disam(dso_fname, dso_start, start_addr, stop_addr):
else:
start_addr = start_addr - dso_start;
stop_addr = stop_addr - dso_start;
- disasm = [ options.objdump_name, "-d", "-z",
+ disasm = [ options.objdump, "-d", "-z",
"--start-address="+format(start_addr,"#x"),
"--stop-address="+format(stop_addr,"#x") ]
disasm += [ dso_fname ]
@@ -256,7 +248,7 @@ def process_event(param_dict):
print("Stop address 0x%x is out of range [ 0x%x .. 0x%x ] for dso %s" % (stop_addr, int(dso_start), int(dso_end), dso))
return
- if (options.objdump_name != None):
+ if (options.objdump != None):
# It doesn't need to decrease virtual memory offset for disassembly
# for kernel dso and executable file dso, so in this case we set
# vm_start to zero.
--
2.34.1
^ permalink raw reply related [flat|nested] 21+ messages in thread* Re: [PATCH v2 4/7] perf scripts python cs-etm: Update to use argparse
2024-09-12 15:11 ` [PATCH v2 4/7] perf scripts python cs-etm: Update to use argparse James Clark
@ 2024-09-13 12:44 ` Leo Yan
0 siblings, 0 replies; 21+ messages in thread
From: Leo Yan @ 2024-09-13 12:44 UTC (permalink / raw)
To: James Clark, linux-perf-users, gankulkarni, coresight,
scclevenger
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Liang, Kan, Suzuki K Poulose,
Mike Leach, John Garry, Will Deacon, Leo Yan, Ben Gainey,
Ruidong Tian, Benjamin Gray, Mathieu Poirier, linux-kernel,
linux-arm-kernel
On 9/12/24 16:11, James Clark wrote:>
> optparse is deprecated and less flexible than argparse so update it.
>
> Signed-off-by: James Clark <james.clark@linaro.org>
Reviewed-by: Leo Yan <leo.yan@arm.com>
> ---
> .../scripts/python/arm-cs-trace-disasm.py | 28 +++++++------------
> 1 file changed, 10 insertions(+), 18 deletions(-)
>
> diff --git a/tools/perf/scripts/python/arm-cs-trace-disasm.py b/tools/perf/scripts/python/arm-cs-trace-disasm.py
> index 7aff02d84ffb..45f682a8b34d 100755
> --- a/tools/perf/scripts/python/arm-cs-trace-disasm.py
> +++ b/tools/perf/scripts/python/arm-cs-trace-disasm.py
> @@ -11,7 +11,7 @@ import os
> from os import path
> import re
> from subprocess import *
> -from optparse import OptionParser, make_option
> +import argparse
>
> from perf_trace_context import perf_set_itrace_options, \
> perf_sample_insn, perf_sample_srccode
> @@ -28,19 +28,11 @@ from perf_trace_context import perf_set_itrace_options, \
> # perf script -s scripts/python/arm-cs-trace-disasm.py
>
> # Command line parsing.
> -option_list = [
> - # formatting options for the bottom entry of the stack
> - make_option("-k", "--vmlinux", dest="vmlinux_name",
> - help="Set path to vmlinux file"),
> - make_option("-d", "--objdump", dest="objdump_name",
> - help="Set path to objdump executable file"),
> - make_option("-v", "--verbose", dest="verbose",
> - action="store_true", default=False,
> - help="Enable debugging log")
> -]
> -
> -parser = OptionParser(option_list=option_list)
> -(options, args) = parser.parse_args()
> +args = argparse.ArgumentParser()
> +args.add_argument("-k", "--vmlinux", help="Set path to vmlinux file")
> +args.add_argument("-d", "--objdump", help="Set path to objdump executable file"),
> +args.add_argument("-v", "--verbose", action="store_true", help="Enable debugging log")
> +options = args.parse_args()
>
> # Initialize global dicts and regular expression
> disasm_cache = dict()
> @@ -65,8 +57,8 @@ def get_offset(perf_dict, field):
>
> def get_dso_file_path(dso_name, dso_build_id):
> if (dso_name == "[kernel.kallsyms]" or dso_name == "vmlinux"):
> - if (options.vmlinux_name):
> - return options.vmlinux_name;
> + if (options.vmlinux):
> + return options.vmlinux;
> else:
> return dso_name
>
> @@ -92,7 +84,7 @@ def read_disam(dso_fname, dso_start, start_addr, stop_addr):
> else:
> start_addr = start_addr - dso_start;
> stop_addr = stop_addr - dso_start;
> - disasm = [ options.objdump_name, "-d", "-z",
> + disasm = [ options.objdump, "-d", "-z",
> "--start-address="+format(start_addr,"#x"),
> "--stop-address="+format(stop_addr,"#x") ]
> disasm += [ dso_fname ]
> @@ -256,7 +248,7 @@ def process_event(param_dict):
> print("Stop address 0x%x is out of range [ 0x%x .. 0x%x ] for dso %s" % (stop_addr, int(dso_start), int(dso_end), dso))
> return
>
> - if (options.objdump_name != None):
> + if (options.objdump != None):
> # It doesn't need to decrease virtual memory offset for disassembly
> # for kernel dso and executable file dso, so in this case we set
> # vm_start to zero.
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 5/7] perf scripts python cs-etm: Improve arguments
2024-09-12 15:11 [PATCH v2 0/7] perf: cs-etm: Coresight decode and disassembly improvements James Clark
` (3 preceding siblings ...)
2024-09-12 15:11 ` [PATCH v2 4/7] perf scripts python cs-etm: Update to use argparse James Clark
@ 2024-09-12 15:11 ` James Clark
2024-09-13 13:01 ` Leo Yan
2024-09-12 15:11 ` [PATCH v2 6/7] perf scripts python cs-etm: Add start and stop arguments James Clark
` (2 subsequent siblings)
7 siblings, 1 reply; 21+ messages in thread
From: James Clark @ 2024-09-12 15:11 UTC (permalink / raw)
To: linux-perf-users, gankulkarni, coresight, leo.yan, scclevenger
Cc: James Clark, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Liang, Kan, Suzuki K Poulose, Mike Leach, John Garry, Will Deacon,
Leo Yan, Ben Gainey, Ruidong Tian, Benjamin Gray, Mathieu Poirier,
linux-kernel, linux-arm-kernel
Make vmlinux detection automatic and use Perf's default objdump
when -d is specified. This will make it easier for a test to use the
script without having to provide arguments. And similarly for users.
Signed-off-by: James Clark <james.clark@linaro.org>
---
.../scripts/python/arm-cs-trace-disasm.py | 63 ++++++++++++++++---
1 file changed, 55 insertions(+), 8 deletions(-)
diff --git a/tools/perf/scripts/python/arm-cs-trace-disasm.py b/tools/perf/scripts/python/arm-cs-trace-disasm.py
index 45f682a8b34d..02e957d037ea 100755
--- a/tools/perf/scripts/python/arm-cs-trace-disasm.py
+++ b/tools/perf/scripts/python/arm-cs-trace-disasm.py
@@ -12,25 +12,48 @@ from os import path
import re
from subprocess import *
import argparse
+import platform
-from perf_trace_context import perf_set_itrace_options, \
- perf_sample_insn, perf_sample_srccode
+from perf_trace_context import perf_sample_srccode, perf_config_get
# Below are some example commands for using this script.
+# Note a --kcore recording is required for accurate decode
+# due to the alternatives patching mechanism. However this
+# script only supports reading vmlinux for disassembly dump,
+# meaning that any patched instructions will appear
+# as unpatched, but the instruction ranges themselves will
+# be correct. In addition to this, source line info comes
+# from Perf, and when using kcore there is no debug info. The
+# following lists the supported features in each mode:
+#
+# +-----------+-----------------+------------------+------------------+
+# | Recording | Accurate decode | Source line dump | Disassembly dump |
+# +-----------+-----------------+------------------+------------------+
+# | --kcore | yes | no | yes |
+# | normal | no | yes | yes |
+# +-----------+-----------------+------------------+------------------+
+#
+# Output disassembly with objdump and auto detect vmlinux
+# (when running on same machine.)
+# perf script -s scripts/python/arm-cs-trace-disasm.py -d
#
-# Output disassembly with objdump:
-# perf script -s scripts/python/arm-cs-trace-disasm.py \
-# -- -d objdump -k path/to/vmlinux
# Output disassembly with llvm-objdump:
# perf script -s scripts/python/arm-cs-trace-disasm.py \
# -- -d llvm-objdump-11 -k path/to/vmlinux
+#
# Output only source line and symbols:
# perf script -s scripts/python/arm-cs-trace-disasm.py
+def default_objdump():
+ config = perf_config_get("annotate.objdump")
+ return config if config else "objdump"
+
# Command line parsing.
args = argparse.ArgumentParser()
-args.add_argument("-k", "--vmlinux", help="Set path to vmlinux file")
-args.add_argument("-d", "--objdump", help="Set path to objdump executable file"),
+args.add_argument("-k", "--vmlinux",
+ help="Set path to vmlinux file. Omit to autodetect if running on same machine")
+args.add_argument("-d", "--objdump", nargs="?", const=default_objdump(),
+ help="Show disassembly. Can also be used to change the objdump path"),
args.add_argument("-v", "--verbose", action="store_true", help="Enable debugging log")
options = args.parse_args()
@@ -45,6 +68,17 @@ glb_source_file_name = None
glb_line_number = None
glb_dso = None
+kver = platform.release()
+vmlinux_paths = [
+ f"/usr/lib/debug/boot/vmlinux-{kver}.debug",
+ f"/usr/lib/debug/lib/modules/{kver}/vmlinux",
+ f"/lib/modules/{kver}/build/vmlinux",
+ f"/usr/lib/debug/boot/vmlinux-{kver}",
+ f"/boot/vmlinux-{kver}",
+ f"/boot/vmlinux",
+ f"vmlinux"
+]
+
def get_optional(perf_dict, field):
if field in perf_dict:
return perf_dict[field]
@@ -55,12 +89,25 @@ def get_offset(perf_dict, field):
return "+%#x" % perf_dict[field]
return ""
+def find_vmlinux():
+ if hasattr(find_vmlinux, "path"):
+ return find_vmlinux.path
+
+ for v in vmlinux_paths:
+ if os.access(v, os.R_OK):
+ find_vmlinux.path = v
+ break
+ else:
+ find_vmlinux.path = None
+
+ return find_vmlinux.path
+
def get_dso_file_path(dso_name, dso_build_id):
if (dso_name == "[kernel.kallsyms]" or dso_name == "vmlinux"):
if (options.vmlinux):
return options.vmlinux;
else:
- return dso_name
+ return find_vmlinux() if find_vmlinux() else dso_name
if (dso_name == "[vdso]") :
append = "/vdso"
--
2.34.1
^ permalink raw reply related [flat|nested] 21+ messages in thread* Re: [PATCH v2 5/7] perf scripts python cs-etm: Improve arguments
2024-09-12 15:11 ` [PATCH v2 5/7] perf scripts python cs-etm: Improve arguments James Clark
@ 2024-09-13 13:01 ` Leo Yan
0 siblings, 0 replies; 21+ messages in thread
From: Leo Yan @ 2024-09-13 13:01 UTC (permalink / raw)
To: James Clark, linux-perf-users, gankulkarni, coresight,
scclevenger
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Liang, Kan, Suzuki K Poulose,
Mike Leach, John Garry, Will Deacon, Leo Yan, Ben Gainey,
Ruidong Tian, Benjamin Gray, Mathieu Poirier, linux-kernel,
linux-arm-kernel
On 9/12/24 16:11, James Clark wrote:
>
> Make vmlinux detection automatic and use Perf's default objdump
> when -d is specified. This will make it easier for a test to use the
> script without having to provide arguments. And similarly for users.
>
> Signed-off-by: James Clark <james.clark@linaro.org>
Reviewed-by: Leo Yan <leo.yan@arm.com>
> ---
> .../scripts/python/arm-cs-trace-disasm.py | 63 ++++++++++++++++---
> 1 file changed, 55 insertions(+), 8 deletions(-)
>
> diff --git a/tools/perf/scripts/python/arm-cs-trace-disasm.py b/tools/perf/scripts/python/arm-cs-trace-disasm.py
> index 45f682a8b34d..02e957d037ea 100755
> --- a/tools/perf/scripts/python/arm-cs-trace-disasm.py
> +++ b/tools/perf/scripts/python/arm-cs-trace-disasm.py
> @@ -12,25 +12,48 @@ from os import path
> import re
> from subprocess import *
> import argparse
> +import platform
>
> -from perf_trace_context import perf_set_itrace_options, \
> - perf_sample_insn, perf_sample_srccode
> +from perf_trace_context import perf_sample_srccode, perf_config_get
>
> # Below are some example commands for using this script.
> +# Note a --kcore recording is required for accurate decode
> +# due to the alternatives patching mechanism. However this
> +# script only supports reading vmlinux for disassembly dump,
> +# meaning that any patched instructions will appear
> +# as unpatched, but the instruction ranges themselves will
> +# be correct. In addition to this, source line info comes
> +# from Perf, and when using kcore there is no debug info. The
> +# following lists the supported features in each mode:
> +#
> +# +-----------+-----------------+------------------+------------------+
> +# | Recording | Accurate decode | Source line dump | Disassembly dump |
> +# +-----------+-----------------+------------------+------------------+
> +# | --kcore | yes | no | yes |
> +# | normal | no | yes | yes |
> +# +-----------+-----------------+------------------+------------------+
> +#
> +# Output disassembly with objdump and auto detect vmlinux
> +# (when running on same machine.)
> +# perf script -s scripts/python/arm-cs-trace-disasm.py -d
> #
> -# Output disassembly with objdump:
> -# perf script -s scripts/python/arm-cs-trace-disasm.py \
> -# -- -d objdump -k path/to/vmlinux
> # Output disassembly with llvm-objdump:
> # perf script -s scripts/python/arm-cs-trace-disasm.py \
> # -- -d llvm-objdump-11 -k path/to/vmlinux
> +#
> # Output only source line and symbols:
> # perf script -s scripts/python/arm-cs-trace-disasm.py
>
> +def default_objdump():
> + config = perf_config_get("annotate.objdump")
> + return config if config else "objdump"
> +
> # Command line parsing.
> args = argparse.ArgumentParser()
> -args.add_argument("-k", "--vmlinux", help="Set path to vmlinux file")
> -args.add_argument("-d", "--objdump", help="Set path to objdump executable file"),
> +args.add_argument("-k", "--vmlinux",
> + help="Set path to vmlinux file. Omit to autodetect if running on same machine")
> +args.add_argument("-d", "--objdump", nargs="?", const=default_objdump(),
> + help="Show disassembly. Can also be used to change the objdump path"),
> args.add_argument("-v", "--verbose", action="store_true", help="Enable debugging log")
> options = args.parse_args()
>
> @@ -45,6 +68,17 @@ glb_source_file_name = None
> glb_line_number = None
> glb_dso = None
>
> +kver = platform.release()
> +vmlinux_paths = [
> + f"/usr/lib/debug/boot/vmlinux-{kver}.debug",
> + f"/usr/lib/debug/lib/modules/{kver}/vmlinux",
> + f"/lib/modules/{kver}/build/vmlinux",
> + f"/usr/lib/debug/boot/vmlinux-{kver}",
> + f"/boot/vmlinux-{kver}",
> + f"/boot/vmlinux",
> + f"vmlinux"
> +]
> +
> def get_optional(perf_dict, field):
> if field in perf_dict:
> return perf_dict[field]
> @@ -55,12 +89,25 @@ def get_offset(perf_dict, field):
> return "+%#x" % perf_dict[field]
> return ""
>
> +def find_vmlinux():
> + if hasattr(find_vmlinux, "path"):
> + return find_vmlinux.path
> +
> + for v in vmlinux_paths:
> + if os.access(v, os.R_OK):
> + find_vmlinux.path = v
> + break
> + else:
> + find_vmlinux.path = None
> +
> + return find_vmlinux.path
> +
> def get_dso_file_path(dso_name, dso_build_id):
> if (dso_name == "[kernel.kallsyms]" or dso_name == "vmlinux"):
> if (options.vmlinux):
> return options.vmlinux;
> else:
> - return dso_name
> + return find_vmlinux() if find_vmlinux() else dso_name
>
> if (dso_name == "[vdso]") :
> append = "/vdso"
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 6/7] perf scripts python cs-etm: Add start and stop arguments
2024-09-12 15:11 [PATCH v2 0/7] perf: cs-etm: Coresight decode and disassembly improvements James Clark
` (4 preceding siblings ...)
2024-09-12 15:11 ` [PATCH v2 5/7] perf scripts python cs-etm: Improve arguments James Clark
@ 2024-09-12 15:11 ` James Clark
2024-09-13 13:20 ` Leo Yan
2024-09-12 15:11 ` [PATCH v2 7/7] perf test: cs-etm: Test Coresight disassembly script James Clark
2024-09-12 19:23 ` [PATCH v2 0/7] perf: cs-etm: Coresight decode and disassembly improvements Arnaldo Carvalho de Melo
7 siblings, 1 reply; 21+ messages in thread
From: James Clark @ 2024-09-12 15:11 UTC (permalink / raw)
To: linux-perf-users, gankulkarni, coresight, leo.yan, scclevenger
Cc: James Clark, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Liang, Kan, Suzuki K Poulose, Mike Leach, John Garry, Will Deacon,
Leo Yan, Ben Gainey, Ruidong Tian, Benjamin Gray, Mathieu Poirier,
linux-kernel, linux-arm-kernel
Make it possible to only disassemble a range of timestamps or sample
indexes. This will be used by the test to limit the runtime, but it's
also useful for users.
Signed-off-by: James Clark <james.clark@linaro.org>
---
.../scripts/python/arm-cs-trace-disasm.py | 22 +++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)
diff --git a/tools/perf/scripts/python/arm-cs-trace-disasm.py b/tools/perf/scripts/python/arm-cs-trace-disasm.py
index 02e957d037ea..a097995d8e7b 100755
--- a/tools/perf/scripts/python/arm-cs-trace-disasm.py
+++ b/tools/perf/scripts/python/arm-cs-trace-disasm.py
@@ -55,6 +55,11 @@ args.add_argument("-k", "--vmlinux",
args.add_argument("-d", "--objdump", nargs="?", const=default_objdump(),
help="Show disassembly. Can also be used to change the objdump path"),
args.add_argument("-v", "--verbose", action="store_true", help="Enable debugging log")
+args.add_argument("--start-time", type=int, help="Time of sample to start from")
+args.add_argument("--stop-time", type=int, help="Time of sample to stop at")
+args.add_argument("--start-sample", type=int, help="Index of sample to start from")
+args.add_argument("--stop-sample", type=int, help="Index of sample to stop at")
+
options = args.parse_args()
# Initialize global dicts and regular expression
@@ -63,6 +68,7 @@ cpu_data = dict()
disasm_re = re.compile(r"^\s*([0-9a-fA-F]+):")
disasm_func_re = re.compile(r"^\s*([0-9a-fA-F]+)\s.*:")
cache_size = 64*1024
+sample_idx = -1
glb_source_file_name = None
glb_line_number = None
@@ -151,10 +157,10 @@ def print_disam(dso_fname, dso_start, start_addr, stop_addr):
def print_sample(sample):
print("Sample = { cpu: %04d addr: 0x%016x phys_addr: 0x%016x ip: 0x%016x " \
- "pid: %d tid: %d period: %d time: %d }" % \
+ "pid: %d tid: %d period: %d time: %d index: %d}" % \
(sample['cpu'], sample['addr'], sample['phys_addr'], \
sample['ip'], sample['pid'], sample['tid'], \
- sample['period'], sample['time']))
+ sample['period'], sample['time'], sample_idx))
def trace_begin():
print('ARM CoreSight Trace Data Assembler Dump')
@@ -216,6 +222,7 @@ def print_srccode(comm, param_dict, sample, symbol, dso):
def process_event(param_dict):
global cache_size
global options
+ global sample_idx
sample = param_dict["sample"]
comm = param_dict["comm"]
@@ -231,6 +238,17 @@ def process_event(param_dict):
ip = sample["ip"]
addr = sample["addr"]
+ sample_idx += 1
+
+ if (options.start_time and sample["time"] < options.start_time):
+ return
+ if (options.stop_time and sample["time"] > options.stop_time):
+ exit(0)
+ if (options.start_sample and sample_idx < options.start_sample):
+ return
+ if (options.stop_sample and sample_idx > options.stop_sample):
+ exit(0)
+
if (options.verbose == True):
print("Event type: %s" % name)
print_sample(sample)
--
2.34.1
^ permalink raw reply related [flat|nested] 21+ messages in thread* Re: [PATCH v2 6/7] perf scripts python cs-etm: Add start and stop arguments
2024-09-12 15:11 ` [PATCH v2 6/7] perf scripts python cs-etm: Add start and stop arguments James Clark
@ 2024-09-13 13:20 ` Leo Yan
2024-09-16 10:41 ` James Clark
0 siblings, 1 reply; 21+ messages in thread
From: Leo Yan @ 2024-09-13 13:20 UTC (permalink / raw)
To: James Clark, linux-perf-users, gankulkarni, coresight,
scclevenger
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Liang, Kan, Suzuki K Poulose,
Mike Leach, John Garry, Will Deacon, Leo Yan, Ben Gainey,
Ruidong Tian, Benjamin Gray, Mathieu Poirier, linux-kernel,
linux-arm-kernel
On 9/12/24 16:11, James Clark wrote:>
> Make it possible to only disassemble a range of timestamps or sample
> indexes. This will be used by the test to limit the runtime, but it's
> also useful for users.
>
> Signed-off-by: James Clark <james.clark@linaro.org>
> ---
> .../scripts/python/arm-cs-trace-disasm.py | 22 +++++++++++++++++--
> 1 file changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/scripts/python/arm-cs-trace-disasm.py b/tools/perf/scripts/python/arm-cs-trace-disasm.py
> index 02e957d037ea..a097995d8e7b 100755
> --- a/tools/perf/scripts/python/arm-cs-trace-disasm.py
> +++ b/tools/perf/scripts/python/arm-cs-trace-disasm.py
> @@ -55,6 +55,11 @@ args.add_argument("-k", "--vmlinux",
> args.add_argument("-d", "--objdump", nargs="?", const=default_objdump(),
> help="Show disassembly. Can also be used to change the objdump path"),
> args.add_argument("-v", "--verbose", action="store_true", help="Enable debugging log")
> +args.add_argument("--start-time", type=int, help="Time of sample to start from")
> +args.add_argument("--stop-time", type=int, help="Time of sample to stop at")
> +args.add_argument("--start-sample", type=int, help="Index of sample to start from")
> +args.add_argument("--stop-sample", type=int, help="Index of sample to stop at")
> +
It is good to validate the ranges for time and sample indexes, in case
the user passes unexpected values and can directly report the error.
BTW, I think it is good to clarify the time is based on monotonic clock
but not wall-clock time.
With above changes, LGTM:
Reviewed-by: Leo Yan <leo.yan@arm.com>
> options = args.parse_args()
>
> # Initialize global dicts and regular expression
> @@ -63,6 +68,7 @@ cpu_data = dict()
> disasm_re = re.compile(r"^\s*([0-9a-fA-F]+):")
> disasm_func_re = re.compile(r"^\s*([0-9a-fA-F]+)\s.*:")
> cache_size = 64*1024
> +sample_idx = -1
>
> glb_source_file_name = None
> glb_line_number = None
> @@ -151,10 +157,10 @@ def print_disam(dso_fname, dso_start, start_addr, stop_addr):
>
> def print_sample(sample):
> print("Sample = { cpu: %04d addr: 0x%016x phys_addr: 0x%016x ip: 0x%016x " \
> - "pid: %d tid: %d period: %d time: %d }" % \
> + "pid: %d tid: %d period: %d time: %d index: %d}" % \
> (sample['cpu'], sample['addr'], sample['phys_addr'], \
> sample['ip'], sample['pid'], sample['tid'], \
> - sample['period'], sample['time']))
> + sample['period'], sample['time'], sample_idx))
>
> def trace_begin():
> print('ARM CoreSight Trace Data Assembler Dump')
> @@ -216,6 +222,7 @@ def print_srccode(comm, param_dict, sample, symbol, dso):
> def process_event(param_dict):
> global cache_size
> global options
> + global sample_idx
>
> sample = param_dict["sample"]
> comm = param_dict["comm"]
> @@ -231,6 +238,17 @@ def process_event(param_dict):
> ip = sample["ip"]
> addr = sample["addr"]
>
> + sample_idx += 1
> +
> + if (options.start_time and sample["time"] < options.start_time):
> + return
> + if (options.stop_time and sample["time"] > options.stop_time):
> + exit(0)
> + if (options.start_sample and sample_idx < options.start_sample):
> + return
> + if (options.stop_sample and sample_idx > options.stop_sample):
> + exit(0)
> +
> if (options.verbose == True):
> print("Event type: %s" % name)
> print_sample(sample)
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 21+ messages in thread* Re: [PATCH v2 6/7] perf scripts python cs-etm: Add start and stop arguments
2024-09-13 13:20 ` Leo Yan
@ 2024-09-16 10:41 ` James Clark
0 siblings, 0 replies; 21+ messages in thread
From: James Clark @ 2024-09-16 10:41 UTC (permalink / raw)
To: Leo Yan
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Liang, Kan, Suzuki K Poulose,
Mike Leach, John Garry, Will Deacon, Leo Yan, Ben Gainey,
Ruidong Tian, Benjamin Gray, Mathieu Poirier, linux-kernel,
linux-arm-kernel, linux-perf-users, gankulkarni, coresight,
scclevenger
On 13/09/2024 14:20, Leo Yan wrote:
> On 9/12/24 16:11, James Clark wrote:>
>> Make it possible to only disassemble a range of timestamps or sample
>> indexes. This will be used by the test to limit the runtime, but it's
>> also useful for users.
>>
>> Signed-off-by: James Clark <james.clark@linaro.org>
>> ---
>> .../scripts/python/arm-cs-trace-disasm.py | 22 +++++++++++++++++--
>> 1 file changed, 20 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/scripts/python/arm-cs-trace-disasm.py
>> b/tools/perf/scripts/python/arm-cs-trace-disasm.py
>> index 02e957d037ea..a097995d8e7b 100755
>> --- a/tools/perf/scripts/python/arm-cs-trace-disasm.py
>> +++ b/tools/perf/scripts/python/arm-cs-trace-disasm.py
>> @@ -55,6 +55,11 @@ args.add_argument("-k", "--vmlinux",
>> args.add_argument("-d", "--objdump", nargs="?",
>> const=default_objdump(),
>> help="Show disassembly. Can also be used to change
>> the objdump path"),
>> args.add_argument("-v", "--verbose", action="store_true",
>> help="Enable debugging log")
>> +args.add_argument("--start-time", type=int, help="Time of sample to
>> start from")
>> +args.add_argument("--stop-time", type=int, help="Time of sample to
>> stop at")
>> +args.add_argument("--start-sample", type=int, help="Index of sample
>> to start from")
>> +args.add_argument("--stop-sample", type=int, help="Index of sample to
>> stop at")
>> +
>
> It is good to validate the ranges for time and sample indexes, in case
> the user passes unexpected values and can directly report the error.
>
> BTW, I think it is good to clarify the time is based on monotonic clock
> but not wall-clock time.
>
> With above changes, LGTM:
>
> Reviewed-by: Leo Yan <leo.yan@arm.com>
>
Yep makes sense, I can add that
>> options = args.parse_args()
>>
>> # Initialize global dicts and regular expression
>> @@ -63,6 +68,7 @@ cpu_data = dict()
>> disasm_re = re.compile(r"^\s*([0-9a-fA-F]+):")
>> disasm_func_re = re.compile(r"^\s*([0-9a-fA-F]+)\s.*:")
>> cache_size = 64*1024
>> +sample_idx = -1
>>
>> glb_source_file_name = None
>> glb_line_number = None
>> @@ -151,10 +157,10 @@ def print_disam(dso_fname, dso_start,
>> start_addr, stop_addr):
>>
>> def print_sample(sample):
>> print("Sample = { cpu: %04d addr: 0x%016x phys_addr: 0x%016x
>> ip: 0x%016x " \
>> - "pid: %d tid: %d period: %d time: %d }" % \
>> + "pid: %d tid: %d period: %d time: %d index: %d}" % \
>> (sample['cpu'], sample['addr'], sample['phys_addr'], \
>> sample['ip'], sample['pid'], sample['tid'], \
>> - sample['period'], sample['time']))
>> + sample['period'], sample['time'], sample_idx))
>>
>> def trace_begin():
>> print('ARM CoreSight Trace Data Assembler Dump')
>> @@ -216,6 +222,7 @@ def print_srccode(comm, param_dict, sample,
>> symbol, dso):
>> def process_event(param_dict):
>> global cache_size
>> global options
>> + global sample_idx
>>
>> sample = param_dict["sample"]
>> comm = param_dict["comm"]
>> @@ -231,6 +238,17 @@ def process_event(param_dict):
>> ip = sample["ip"]
>> addr = sample["addr"]
>>
>> + sample_idx += 1
>> +
>> + if (options.start_time and sample["time"] < options.start_time):
>> + return
>> + if (options.stop_time and sample["time"] > options.stop_time):
>> + exit(0)
>> + if (options.start_sample and sample_idx < options.start_sample):
>> + return
>> + if (options.stop_sample and sample_idx > options.stop_sample):
>> + exit(0)
>> +
>> if (options.verbose == True):
>> print("Event type: %s" % name)
>> print_sample(sample)
>> --
>> 2.34.1
>>
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 7/7] perf test: cs-etm: Test Coresight disassembly script
2024-09-12 15:11 [PATCH v2 0/7] perf: cs-etm: Coresight decode and disassembly improvements James Clark
` (5 preceding siblings ...)
2024-09-12 15:11 ` [PATCH v2 6/7] perf scripts python cs-etm: Add start and stop arguments James Clark
@ 2024-09-12 15:11 ` James Clark
2024-09-13 13:35 ` Leo Yan
2024-09-12 19:23 ` [PATCH v2 0/7] perf: cs-etm: Coresight decode and disassembly improvements Arnaldo Carvalho de Melo
7 siblings, 1 reply; 21+ messages in thread
From: James Clark @ 2024-09-12 15:11 UTC (permalink / raw)
To: linux-perf-users, gankulkarni, coresight, leo.yan, scclevenger
Cc: James Clark, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Liang, Kan, Suzuki K Poulose, Mike Leach, John Garry, Will Deacon,
Leo Yan, Ben Gainey, Ruidong Tian, Benjamin Gray, Mathieu Poirier,
linux-kernel, linux-arm-kernel
Run a few samples through the disassembly script and check to see that
at least one branch instruction is printed.
Signed-off-by: James Clark <james.clark@linaro.org>
---
.../tests/shell/test_arm_coresight_disasm.sh | 63 +++++++++++++++++++
1 file changed, 63 insertions(+)
create mode 100755 tools/perf/tests/shell/test_arm_coresight_disasm.sh
diff --git a/tools/perf/tests/shell/test_arm_coresight_disasm.sh b/tools/perf/tests/shell/test_arm_coresight_disasm.sh
new file mode 100755
index 000000000000..6d004bf29f80
--- /dev/null
+++ b/tools/perf/tests/shell/test_arm_coresight_disasm.sh
@@ -0,0 +1,63 @@
+#!/bin/sh
+# Check Arm CoreSight disassembly script completes without errors
+# SPDX-License-Identifier: GPL-2.0
+
+# The disassembly script reconstructs ranges of instructions and gives these to objdump to
+# decode. objdump doesn't like ranges that go backwards, but these are a good indication
+# that decoding has gone wrong either in OpenCSD, Perf or in the range reconstruction in
+# the script. Test all 3 parts are working correctly by running the script.
+
+skip_if_no_cs_etm_event() {
+ perf list | grep -q 'cs_etm//' && return 0
+
+ # cs_etm event doesn't exist
+ return 2
+}
+
+skip_if_no_cs_etm_event || exit 2
+
+# Assume an error unless we reach the very end
+set -e
+glb_err=1
+
+perfdata_dir=$(mktemp -d /tmp/__perf_test.perf.data.XXXXX)
+perfdata=${perfdata_dir}/perf.data
+file=$(mktemp /tmp/temporary_file.XXXXX)
+
+cleanup_files()
+{
+ set +e
+ rm -rf ${perfdata_dir}
+ rm -f ${file}
+ trap - EXIT TERM INT
+ exit $glb_err
+}
+
+trap cleanup_files EXIT TERM INT
+
+# Ranges start and end on branches, so check for some likely branch instructions
+sep="\s\|\s"
+branch_search="\sbl${sep}b${sep}b.ne${sep}b.eq${sep}cbz\s"
+
+## Test kernel ##
+if [ -e /proc/kcore ]; then
+ echo "Testing kernel disassembly"
+ perf record -o ${perfdata} -e cs_etm//k --kcore -- touch $file > /dev/null 2>&1
+ perf script -i ${perfdata} -s python:tools/perf/scripts/python/arm-cs-trace-disasm.py -- \
+ -d --stop-sample=30 2> /dev/null > ${file}
+ grep -q -e ${branch_search} ${file}
+ echo "Found kernel branches"
+else
+ # kcore is required for correct kernel decode due to runtime code patching
+ echo "No kcore, skipping kernel test"
+fi
+
+## Test user ##
+echo "Testing userspace disassembly"
+perf record -o ${perfdata} -e cs_etm//u -- touch $file > /dev/null 2>&1
+perf script -i ${perfdata} -s python:tools/perf/scripts/python/arm-cs-trace-disasm.py -- \
+ -d --stop-sample=30 2> /dev/null > ${file}
+grep -q -e ${branch_search} ${file}
+echo "Found userspace branches"
+
+glb_err=0
--
2.34.1
^ permalink raw reply related [flat|nested] 21+ messages in thread* Re: [PATCH v2 7/7] perf test: cs-etm: Test Coresight disassembly script
2024-09-12 15:11 ` [PATCH v2 7/7] perf test: cs-etm: Test Coresight disassembly script James Clark
@ 2024-09-13 13:35 ` Leo Yan
2024-09-16 13:25 ` James Clark
0 siblings, 1 reply; 21+ messages in thread
From: Leo Yan @ 2024-09-13 13:35 UTC (permalink / raw)
To: James Clark, linux-perf-users, gankulkarni, coresight,
scclevenger
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Liang, Kan, Suzuki K Poulose,
Mike Leach, John Garry, Will Deacon, Leo Yan, Ben Gainey,
Ruidong Tian, Benjamin Gray, Mathieu Poirier, linux-kernel,
linux-arm-kernel
On 9/12/24 16:11, James Clark wrote:
>
> Run a few samples through the disassembly script and check to see that
> at least one branch instruction is printed.
>
> Signed-off-by: James Clark <james.clark@linaro.org>
> ---
> .../tests/shell/test_arm_coresight_disasm.sh | 63 +++++++++++++++++++
> 1 file changed, 63 insertions(+)
> create mode 100755 tools/perf/tests/shell/test_arm_coresight_disasm.sh
>
> diff --git a/tools/perf/tests/shell/test_arm_coresight_disasm.sh b/tools/perf/tests/shell/test_arm_coresight_disasm.sh
> new file mode 100755
> index 000000000000..6d004bf29f80
> --- /dev/null
> +++ b/tools/perf/tests/shell/test_arm_coresight_disasm.sh
> @@ -0,0 +1,63 @@
> +#!/bin/sh
> +# Check Arm CoreSight disassembly script completes without errors
> +# SPDX-License-Identifier: GPL-2.0
> +
> +# The disassembly script reconstructs ranges of instructions and gives these to objdump to
> +# decode. objdump doesn't like ranges that go backwards, but these are a good indication
> +# that decoding has gone wrong either in OpenCSD, Perf or in the range reconstruction in
> +# the script. Test all 3 parts are working correctly by running the script.
> +
> +skip_if_no_cs_etm_event() {
> + perf list | grep -q 'cs_etm//' && return 0
> +
> + # cs_etm event doesn't exist
> + return 2
> +}
> +
> +skip_if_no_cs_etm_event || exit 2
> +
> +# Assume an error unless we reach the very end
> +set -e
> +glb_err=1
> +
> +perfdata_dir=$(mktemp -d /tmp/__perf_test.perf.data.XXXXX)
> +perfdata=${perfdata_dir}/perf.data
> +file=$(mktemp /tmp/temporary_file.XXXXX)
> +
> +cleanup_files()
> +{
> + set +e
> + rm -rf ${perfdata_dir}
> + rm -f ${file}
> + trap - EXIT TERM INT
> + exit $glb_err
> +}
> +
> +trap cleanup_files EXIT TERM INT
> +
> +# Ranges start and end on branches, so check for some likely branch instructions
> +sep="\s\|\s"
> +branch_search="\sbl${sep}b${sep}b.ne${sep}b.eq${sep}cbz\s"
> +
> +## Test kernel ##
> +if [ -e /proc/kcore ]; then
> + echo "Testing kernel disassembly"
> + perf record -o ${perfdata} -e cs_etm//k --kcore -- touch $file > /dev/null 2>&1
> + perf script -i ${perfdata} -s python:tools/perf/scripts/python/arm-cs-trace-disasm.py -- \
> + -d --stop-sample=30 2> /dev/null > ${file}
This is fine for self test. But for a CI test in a distro, will it fail to
find script with prefix 'tools/perf/...'?
Thanks,
Leo
> + grep -q -e ${branch_search} ${file}
> + echo "Found kernel branches"
> +else
> + # kcore is required for correct kernel decode due to runtime code patching
> + echo "No kcore, skipping kernel test"
> +fi
> +
> +## Test user ##
> +echo "Testing userspace disassembly"
> +perf record -o ${perfdata} -e cs_etm//u -- touch $file > /dev/null 2>&1
> +perf script -i ${perfdata} -s python:tools/perf/scripts/python/arm-cs-trace-disasm.py -- \
> + -d --stop-sample=30 2> /dev/null > ${file}
> +grep -q -e ${branch_search} ${file}
> +echo "Found userspace branches"
> +
> +glb_err=0
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 21+ messages in thread* Re: [PATCH v2 7/7] perf test: cs-etm: Test Coresight disassembly script
2024-09-13 13:35 ` Leo Yan
@ 2024-09-16 13:25 ` James Clark
0 siblings, 0 replies; 21+ messages in thread
From: James Clark @ 2024-09-16 13:25 UTC (permalink / raw)
To: Leo Yan
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Liang, Kan, Suzuki K Poulose,
Mike Leach, John Garry, Will Deacon, Leo Yan, Ben Gainey,
Ruidong Tian, Benjamin Gray, Mathieu Poirier, linux-kernel,
linux-arm-kernel, linux-perf-users, gankulkarni, coresight,
scclevenger
On 13/09/2024 14:35, Leo Yan wrote:
>
>
> On 9/12/24 16:11, James Clark wrote:
>>
>> Run a few samples through the disassembly script and check to see that
>> at least one branch instruction is printed.
>>
>> Signed-off-by: James Clark <james.clark@linaro.org>
>> ---
>> .../tests/shell/test_arm_coresight_disasm.sh | 63 +++++++++++++++++++
>> 1 file changed, 63 insertions(+)
>> create mode 100755 tools/perf/tests/shell/test_arm_coresight_disasm.sh
>>
>> diff --git a/tools/perf/tests/shell/test_arm_coresight_disasm.sh
>> b/tools/perf/tests/shell/test_arm_coresight_disasm.sh
>> new file mode 100755
>> index 000000000000..6d004bf29f80
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/test_arm_coresight_disasm.sh
>> @@ -0,0 +1,63 @@
>> +#!/bin/sh
>> +# Check Arm CoreSight disassembly script completes without errors
>> +# SPDX-License-Identifier: GPL-2.0
>> +
>> +# The disassembly script reconstructs ranges of instructions and
>> gives these to objdump to
>> +# decode. objdump doesn't like ranges that go backwards, but these
>> are a good indication
>> +# that decoding has gone wrong either in OpenCSD, Perf or in the
>> range reconstruction in
>> +# the script. Test all 3 parts are working correctly by running the
>> script.
>> +
>> +skip_if_no_cs_etm_event() {
>> + perf list | grep -q 'cs_etm//' && return 0
>> +
>> + # cs_etm event doesn't exist
>> + return 2
>> +}
>> +
>> +skip_if_no_cs_etm_event || exit 2
>> +
>> +# Assume an error unless we reach the very end
>> +set -e
>> +glb_err=1
>> +
>> +perfdata_dir=$(mktemp -d /tmp/__perf_test.perf.data.XXXXX)
>> +perfdata=${perfdata_dir}/perf.data
>> +file=$(mktemp /tmp/temporary_file.XXXXX)
>> +
>> +cleanup_files()
>> +{
>> + set +e
>> + rm -rf ${perfdata_dir}
>> + rm -f ${file}
>> + trap - EXIT TERM INT
>> + exit $glb_err
>> +}
>> +
>> +trap cleanup_files EXIT TERM INT
>> +
>> +# Ranges start and end on branches, so check for some likely branch
>> instructions
>> +sep="\s\|\s"
>> +branch_search="\sbl${sep}b${sep}b.ne${sep}b.eq${sep}cbz\s"
>> +
>> +## Test kernel ##
>> +if [ -e /proc/kcore ]; then
>> + echo "Testing kernel disassembly"
>> + perf record -o ${perfdata} -e cs_etm//k --kcore -- touch $file
>> > /dev/null 2>&1
>> + perf script -i ${perfdata} -s
>> python:tools/perf/scripts/python/arm-cs-trace-disasm.py -- \
>> + -d --stop-sample=30 2> /dev/null > ${file}
>
> This is fine for self test. But for a CI test in a distro, will it fail to
> find script with prefix 'tools/perf/...'?
>
> Thanks,
> Leo
>
Nice catch, it should be this:
# Relative path works whether it's installed or running from repo
script_path=$(dirname "$0")/../../scripts/python/arm-cs-trace\
-disasm.py
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 0/7] perf: cs-etm: Coresight decode and disassembly improvements
2024-09-12 15:11 [PATCH v2 0/7] perf: cs-etm: Coresight decode and disassembly improvements James Clark
` (6 preceding siblings ...)
2024-09-12 15:11 ` [PATCH v2 7/7] perf test: cs-etm: Test Coresight disassembly script James Clark
@ 2024-09-12 19:23 ` Arnaldo Carvalho de Melo
2024-09-17 8:15 ` James Clark
7 siblings, 1 reply; 21+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-09-12 19:23 UTC (permalink / raw)
To: James Clark
Cc: linux-perf-users, gankulkarni, coresight, leo.yan, scclevenger,
Peter Zijlstra, Ingo Molnar, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Liang, Kan, Suzuki K Poulose, Mike Leach, John Garry, Will Deacon,
Leo Yan, Ben Gainey, Ruidong Tian, Benjamin Gray, Mathieu Poirier,
linux-kernel, linux-arm-kernel
On Thu, Sep 12, 2024 at 04:11:31PM +0100, James Clark wrote:
> A set of changes that came out of the issues reported here [1].
>
> * First 2 patches fix a decode bug in Perf and add support for new
> consistency checks in OpenCSD
> * The remaining ones make the disassembly script easier to test
> and use. This also involves adding a new Python binding to
> Perf to get a config value (perf_config_get())
>
> [1]: https://lore.kernel.org/linux-arm-kernel/20240719092619.274730-1-gankulkarni@os.amperecomputing.com/
Looks ok from a quick look, but I can't test it, so since there are
reporters for problems that are being fixed, it would be great to have a
Tested-by: from the reporters and probably from someone with access to
the hardware where this can be tested.
- Arnaldo
> Changes since V1:
> * Keep the flush function for discontinuities
> * Still remove the flush when the buffer fills, but now add
> cs_etm__end_block() for the end trace. That way we won't drop
> the last branch stack if the instruction sample period wasn't
> hit at the very end.
>
> James Clark (7):
> perf cs-etm: Don't flush when packet_queue fills up
> perf cs-etm: Use new OpenCSD consistency checks
> perf scripting python: Add function to get a config value
> perf scripts python cs-etm: Update to use argparse
> perf scripts python cs-etm: Improve arguments
> perf scripts python cs-etm: Add start and stop arguments
> perf test: cs-etm: Test Coresight disassembly script
>
> .../perf/Documentation/perf-script-python.txt | 2 +-
> .../scripts/python/Perf-Trace-Util/Context.c | 11 ++
> .../scripts/python/arm-cs-trace-disasm.py | 109 +++++++++++++-----
> .../tests/shell/test_arm_coresight_disasm.sh | 63 ++++++++++
> tools/perf/util/config.c | 22 ++++
> tools/perf/util/config.h | 1 +
> .../perf/util/cs-etm-decoder/cs-etm-decoder.c | 7 +-
> tools/perf/util/cs-etm.c | 25 ++--
> 8 files changed, 205 insertions(+), 35 deletions(-)
> create mode 100755 tools/perf/tests/shell/test_arm_coresight_disasm.sh
>
> --
> 2.34.1
^ permalink raw reply [flat|nested] 21+ messages in thread* Re: [PATCH v2 0/7] perf: cs-etm: Coresight decode and disassembly improvements
2024-09-12 19:23 ` [PATCH v2 0/7] perf: cs-etm: Coresight decode and disassembly improvements Arnaldo Carvalho de Melo
@ 2024-09-17 8:15 ` James Clark
0 siblings, 0 replies; 21+ messages in thread
From: James Clark @ 2024-09-17 8:15 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: linux-perf-users, gankulkarni, coresight, leo.yan, scclevenger,
Peter Zijlstra, Ingo Molnar, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Liang, Kan, Suzuki K Poulose, Mike Leach, John Garry, Will Deacon,
Leo Yan, Ben Gainey, Ruidong Tian, Benjamin Gray, Mathieu Poirier,
linux-kernel, linux-arm-kernel
On 12/09/2024 20:23, Arnaldo Carvalho de Melo wrote:
> On Thu, Sep 12, 2024 at 04:11:31PM +0100, James Clark wrote:
>> A set of changes that came out of the issues reported here [1].
>>
>> * First 2 patches fix a decode bug in Perf and add support for new
>> consistency checks in OpenCSD
>> * The remaining ones make the disassembly script easier to test
>> and use. This also involves adding a new Python binding to
>> Perf to get a config value (perf_config_get())
>>
>> [1]: https://lore.kernel.org/linux-arm-kernel/20240719092619.274730-1-gankulkarni@os.amperecomputing.com/
>
> Looks ok from a quick look, but I can't test it, so since there are
> reporters for problems that are being fixed, it would be great to have a
> Tested-by: from the reporters and probably from someone with access to
> the hardware where this can be tested.
>
> - Arnaldo
>
I pinged Ganapat, hopefully he has some time to test it. But we can wait
a bit and maybe find someone else if not.
^ permalink raw reply [flat|nested] 21+ messages in thread