* [PATCH 6.11 481/817] perf cs-etm: Dont flush when packet_queue fills up
[not found] <20241203143955.605130076@linuxfoundation.org>
@ 2024-12-03 14:40 ` Greg Kroah-Hartman
2024-12-03 14:40 ` [PATCH 6.11 486/817] perf stat: Uniquify event name improvements Greg Kroah-Hartman
1 sibling, 0 replies; 2+ messages in thread
From: Greg Kroah-Hartman @ 2024-12-03 14:40 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Ganapatrao Kulkarni, Leo Yan,
James Clark, Ben Gainey, Suzuki K Poulose, Will Deacon,
Mathieu Poirier, Mike Leach, Ruidong Tian, Benjamin Gray,
linux-arm-kernel, coresight, John Garry, scclevenger,
Namhyung Kim, Sasha Levin
6.11-stable review patch. If anyone has any objections, please let me know.
------------------
From: James Clark <james.clark@linaro.org>
[ Upstream commit 5afd032961e8465808c4bc385c06e7676fbe1951 ]
cs_etm__flush(), like cs_etm__sample() is an operation that generates a
sample and then swaps the current with the previous packet. Calling
flush after processing the queues results in two swaps which corrupts
the next sample. Therefore it wasn't appropriate to call flush here so
remove it.
Flushing is still done on a discontinuity to explicitly clear the last
branch buffer, but when the packet_queue fills up before reaching a
timestamp, that's not a discontinuity and the call to
cs_etm__process_traceid_queue() already generated samples and drained
the buffers correctly.
This is visible by looking for a branch that has the same target as the
previous branch and the following source is before the address of the
last target, which is impossible as execution would have had to have
gone backwards:
ffff800080849d40 _find_next_and_bit+0x78 => ffff80008011cadc update_sg_lb_stats+0x94
(packet_queue fills here before a timestamp, resulting in a flush and
branch target ffff80008011cadc is duplicated.)
ffff80008011cb1c update_sg_lb_stats+0xd4 => ffff80008011cadc update_sg_lb_stats+0x94
ffff8000801117c4 cpu_util+0x24 => ffff8000801117d4 cpu_util+0x34
After removing the flush the correct branch target is used for the
second sample, and ffff8000801117c4 is no longer before the previous
address:
ffff800080849d40 _find_next_and_bit+0x78 => ffff80008011cadc update_sg_lb_stats+0x94
ffff80008011cb1c update_sg_lb_stats+0xd4 => ffff8000801117a0 cpu_util+0x0
ffff8000801117c4 cpu_util+0x24 => ffff8000801117d4 cpu_util+0x34
Make sure that a final branch stack is output at the end of the trace
by calling cs_etm__end_block(). This is already done for both the
timeless decode paths.
Fixes: 21fe8dc1191a ("perf cs-etm: Add support for CPU-wide trace scenarios")
Reported-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Closes: https://lore.kernel.org/all/20240719092619.274730-1-gankulkarni@os.amperecomputing.com/
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Tested-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Ruidong Tian <tianruidong@linux.alibaba.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: coresight@lists.linaro.org
Cc: John Garry <john.g.garry@oracle.com>
Cc: scclevenger@os.amperecomputing.com
Link: https://lore.kernel.org/r/20240916135743.1490403-2-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
tools/perf/util/cs-etm.c | 25 ++++++++++++++++++-------
1 file changed, 18 insertions(+), 7 deletions(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 5e9fbcfad7d44..f4615fa4280d8 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -2439,12 +2439,6 @@ static void cs_etm__clear_all_traceid_queues(struct cs_etm_queue *etmq)
/* Ignore return value */
cs_etm__process_traceid_queue(etmq, tidq);
-
- /*
- * Generate an instruction sample with the remaining
- * branchstack entries.
- */
- cs_etm__flush(etmq, tidq);
}
}
@@ -2587,7 +2581,7 @@ static int cs_etm__process_timestamped_queues(struct cs_etm_auxtrace *etm)
while (1) {
if (!etm->heap.heap_cnt)
- goto out;
+ break;
/* Take the entry at the top of the min heap */
cs_queue_nr = etm->heap.heap_array[0].queue_nr;
@@ -2670,6 +2664,23 @@ static int cs_etm__process_timestamped_queues(struct cs_etm_auxtrace *etm)
ret = auxtrace_heap__add(&etm->heap, cs_queue_nr, cs_timestamp);
}
+ for (i = 0; i < etm->queues.nr_queues; i++) {
+ struct int_node *inode;
+
+ etmq = etm->queues.queue_array[i].priv;
+ if (!etmq)
+ continue;
+
+ intlist__for_each_entry(inode, etmq->traceid_queues_list) {
+ int idx = (int)(intptr_t)inode->priv;
+
+ /* Flush any remaining branch stack entries */
+ tidq = etmq->traceid_queues[idx];
+ ret = cs_etm__end_block(etmq, tidq);
+ if (ret)
+ return ret;
+ }
+ }
out:
return ret;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 2+ messages in thread
* [PATCH 6.11 486/817] perf stat: Uniquify event name improvements
[not found] <20241203143955.605130076@linuxfoundation.org>
2024-12-03 14:40 ` [PATCH 6.11 481/817] perf cs-etm: Dont flush when packet_queue fills up Greg Kroah-Hartman
@ 2024-12-03 14:40 ` Greg Kroah-Hartman
1 sibling, 0 replies; 2+ messages in thread
From: Greg Kroah-Hartman @ 2024-12-03 14:40 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Namhyung Kim, Ian Rogers, Kan Liang,
James Clark, Yang Jihong, Dominique Martinet, Colin Ian King,
Howard Chu, Ze Gao, Yicong Yang, Weilin Wang, Will Deacon,
Mike Leach, Jing Zhang, Yang Li, Leo Yan, ak, Athira Rajeev,
linux-arm-kernel, Sun Haiyong, John Garry, Sasha Levin
6.11-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ian Rogers <irogers@google.com>
[ Upstream commit 057f8bfc6f7070577523d1e3081081bbf4229c1c ]
Without aggregation on Intel:
```
$ perf stat -e instructions,cycles ...
```
Will use "cycles" for the name of the legacy cycles event but as
"instructions" has a sysfs name it will and a "[cpu]" PMU suffix. This
often breaks things as the space between the event and the PMU name
look like an extra column. The existing uniquify logic was also
uniquifying in cases when all events are core and not with uncore
events, it was not correctly handling modifiers, etc.
Change the logic so that an initial pass that can disable
uniquification is run. For individual counters, disable uniquification
in more cases such as for consistency with legacy events or for
libpfm4 events. Don't use the "[pmu]" style suffix in uniquification,
always use "pmu/.../". Change how modifiers/terms are handled in the
uniquification so that they look like parse-able events.
This fixes "102: perf stat metrics (shadow stat) test:" that has been
failing due to "instructions [cpu]" breaking its column/awk logic when
values aren't aggregated. This started happening when instructions
could match a sysfs rather than a legacy event, so the fixes tag
reflects this.
Fixes: 617824a7f0f7 ("perf parse-events: Prefer sysfs/JSON hardware events over legacy")
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
[ Fix Intel TPEBS counting mode test ]
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: ak@linux.intel.com
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20240926144851.245903-3-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
.../perf/tests/shell/test_stat_intel_tpebs.sh | 11 +-
tools/perf/util/stat-display.c | 101 ++++++++++++++----
2 files changed, 85 insertions(+), 27 deletions(-)
diff --git a/tools/perf/tests/shell/test_stat_intel_tpebs.sh b/tools/perf/tests/shell/test_stat_intel_tpebs.sh
index c60b29add9801..9a11f42d153ca 100755
--- a/tools/perf/tests/shell/test_stat_intel_tpebs.sh
+++ b/tools/perf/tests/shell/test_stat_intel_tpebs.sh
@@ -8,12 +8,15 @@ grep -q GenuineIntel /proc/cpuinfo || { echo Skipping non-Intel; exit 2; }
# Use this event for testing because it should exist in all platforms
event=cache-misses:R
+# Hybrid platforms output like "cpu_atom/cache-misses/R", rather than as above
+alt_name=/cache-misses/R
+
# Without this cmd option, default value or zero is returned
-echo "Testing without --record-tpebs"
-result=$(perf stat -e "$event" true 2>&1)
-[[ "$result" =~ $event ]] || exit 1
+#echo "Testing without --record-tpebs"
+#result=$(perf stat -e "$event" true 2>&1)
+#[[ "$result" =~ $event || "$result" =~ $alt_name ]] || exit 1
# In platforms that do not support TPEBS, it should execute without error.
echo "Testing with --record-tpebs"
result=$(perf stat -e "$event" --record-tpebs -a sleep 0.01 2>&1)
-[[ "$result" =~ "perf record" && "$result" =~ $event ]] || exit 1
+[[ "$result" =~ "perf record" && "$result" =~ $event || "$result" =~ $alt_name ]] || exit 1
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index ea96e4ebad8c8..cbff43ff8d0fb 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -871,38 +871,66 @@ static void printout(struct perf_stat_config *config, struct outstate *os,
static void uniquify_event_name(struct evsel *counter)
{
- char *new_name;
- char *config;
- int ret = 0;
+ const char *name, *pmu_name;
+ char *new_name, *config;
+ int ret;
- if (counter->uniquified_name || counter->use_config_name ||
- !counter->pmu_name || !strncmp(evsel__name(counter), counter->pmu_name,
- strlen(counter->pmu_name)))
+ /* The evsel was already uniquified. */
+ if (counter->uniquified_name)
return;
- config = strchr(counter->name, '/');
+ /* Avoid checking to uniquify twice. */
+ counter->uniquified_name = true;
+
+ /* The evsel has a "name=" config term or is from libpfm. */
+ if (counter->use_config_name || counter->is_libpfm_event)
+ return;
+
+ /* Legacy no PMU event, don't uniquify. */
+ if (!counter->pmu ||
+ (counter->pmu->type < PERF_TYPE_MAX && counter->pmu->type != PERF_TYPE_RAW))
+ return;
+
+ /* A sysfs or json event replacing a legacy event, don't uniquify. */
+ if (counter->pmu->is_core && counter->alternate_hw_config != PERF_COUNT_HW_MAX)
+ return;
+
+ name = evsel__name(counter);
+ pmu_name = counter->pmu->name;
+ /* Already prefixed by the PMU name. */
+ if (!strncmp(name, pmu_name, strlen(pmu_name)))
+ return;
+
+ config = strchr(name, '/');
if (config) {
- if (asprintf(&new_name,
- "%s%s", counter->pmu_name, config) > 0) {
- free(counter->name);
- counter->name = new_name;
- }
- } else {
- if (evsel__is_hybrid(counter)) {
- ret = asprintf(&new_name, "%s/%s/",
- counter->pmu_name, counter->name);
+ int len = config - name;
+
+ if (config[1] == '/') {
+ /* case: event// */
+ ret = asprintf(&new_name, "%s/%.*s/%s", pmu_name, len, name, config + 2);
} else {
- ret = asprintf(&new_name, "%s [%s]",
- counter->name, counter->pmu_name);
+ /* case: event/.../ */
+ ret = asprintf(&new_name, "%s/%.*s,%s", pmu_name, len, name, config + 1);
}
+ } else {
+ config = strchr(name, ':');
+ if (config) {
+ /* case: event:.. */
+ int len = config - name;
- if (ret) {
- free(counter->name);
- counter->name = new_name;
+ ret = asprintf(&new_name, "%s/%.*s/%s", pmu_name, len, name, config + 1);
+ } else {
+ /* case: event */
+ ret = asprintf(&new_name, "%s/%s/", pmu_name, name);
}
}
-
- counter->uniquified_name = true;
+ if (ret > 0) {
+ free(counter->name);
+ counter->name = new_name;
+ } else {
+ /* ENOMEM from asprintf. */
+ counter->uniquified_name = false;
+ }
}
static bool hybrid_uniquify(struct evsel *evsel, struct perf_stat_config *config)
@@ -1559,6 +1587,31 @@ static void print_cgroup_counter(struct perf_stat_config *config, struct evlist
print_metric_end(config, os);
}
+static void disable_uniquify(struct evlist *evlist)
+{
+ struct evsel *counter;
+ struct perf_pmu *last_pmu = NULL;
+ bool first = true;
+
+ evlist__for_each_entry(evlist, counter) {
+ /* If PMUs vary then uniquify can be useful. */
+ if (!first && counter->pmu != last_pmu)
+ return;
+ first = false;
+ if (counter->pmu) {
+ /* Allow uniquify for uncore PMUs. */
+ if (!counter->pmu->is_core)
+ return;
+ /* Keep hybrid event names uniquified for clarity. */
+ if (perf_pmus__num_core_pmus() > 1)
+ return;
+ }
+ }
+ evlist__for_each_entry_continue(evlist, counter) {
+ counter->uniquified_name = true;
+ }
+}
+
void evlist__print_counters(struct evlist *evlist, struct perf_stat_config *config,
struct target *_target, struct timespec *ts,
int argc, const char **argv)
@@ -1572,6 +1625,8 @@ void evlist__print_counters(struct evlist *evlist, struct perf_stat_config *conf
.first = true,
};
+ disable_uniquify(evlist);
+
if (config->iostat_run)
evlist->selected = evlist__first(evlist);
--
2.43.0
^ permalink raw reply related [flat|nested] 2+ messages in thread
end of thread, other threads:[~2024-12-03 15:20 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20241203143955.605130076@linuxfoundation.org>
2024-12-03 14:40 ` [PATCH 6.11 481/817] perf cs-etm: Dont flush when packet_queue fills up Greg Kroah-Hartman
2024-12-03 14:40 ` [PATCH 6.11 486/817] perf stat: Uniquify event name improvements Greg Kroah-Hartman
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).