* [PATCH v1 0/5] perf cpumap: Correct for perf_cpu_map__merge()
@ 2024-09-25 19:53 Leo Yan
2024-09-25 19:53 ` [PATCH v1 1/5] libperf cpumap: Correct reference count " Leo Yan
` (5 more replies)
0 siblings, 6 replies; 13+ messages in thread
From: Leo Yan @ 2024-09-25 19:53 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Adrian Hunter, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Liang, Kan, James Clark, linux-perf-users, linux-kernel
Cc: Leo Yan
perf_cpu_map__merge() has two arguments, 'orig' and 'other', as
described in its original comment:
"orig either gets freed and replaced with a new map, or reused
with no reference count change (similar to "realloc")
other has its reference count increased."
This causes confusion due to the different reference counting on the CPU
map objects, which complicates its usage and makes maintenance
difficult. We also discussed this in the email [1].
This patch series makes that a new CPU map is allocated for the
merging result, or the reference count is increased if an existing CPU
map is reused. This means that once perf_cpu_map__merge() is invoked,
the caller gains ownership of the resulting map and must release it
with perf_cpu_map__put().
The perf test has been updated accordingly. Tested result is:
# ./perf test 41
41: CPU map :
41.1: Synthesize cpu map : Ok
41.2: Print cpu map : Ok
41.3: Merge cpu map : Ok
41.4: Intersect cpu map : Ok
41.5: Equal cpu map : Ok
[1] https://lore.kernel.org/linux-perf-users/3f03541e-6dab-472f-bad9-4cdc0c0dc061@intel.com/
Leo Yan (5):
libperf cpumap: Correct reference count for perf_cpu_map__merge()
perf: Release old CPU maps after merging
perf cpumap: Update CPU map merging test
perf cpumap: Add more tests for CPU map merging
perf cpumap: Add checking for reference counter
tools/lib/perf/cpumap.c | 11 ++------
tools/lib/perf/evlist.c | 4 +++
tools/perf/tests/cpumap.c | 55 ++++++++++++++++++++++++++++++++----
tools/perf/util/mem-events.c | 4 ++-
4 files changed, 60 insertions(+), 14 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v1 1/5] libperf cpumap: Correct reference count for perf_cpu_map__merge()
2024-09-25 19:53 [PATCH v1 0/5] perf cpumap: Correct for perf_cpu_map__merge() Leo Yan
@ 2024-09-25 19:53 ` Leo Yan
2024-10-10 17:41 ` Adrian Hunter
2024-09-25 19:53 ` [PATCH v1 2/5] perf: Release old CPU maps after merging Leo Yan
` (4 subsequent siblings)
5 siblings, 1 reply; 13+ messages in thread
From: Leo Yan @ 2024-09-25 19:53 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Adrian Hunter, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Liang, Kan, James Clark, linux-perf-users, linux-kernel
Cc: Leo Yan
The perf_cpu_map__merge() function has two arguments, 'orig' and 'other',
and it returns results for three different cases:
Case 1: 'other' is a subset of 'orig'.
Case 2: 'orig' is a subset of 'other'.
Case 3: 'orig' and 'other' are not subsets of each other.
The result combinations are:
+--------+-------------+-----------+-----------+
| Cases | Result | orig | other |
+--------+-------------+-----------+-----------+
| Case1 | orig | No change | No change |
+--------+-------------+-----------+-----------+
| Case2 | other | No change | refcnt++ |
+--------+-------------+-----------+-----------+
| Case3 | New CPU map | refcnt-- | No change |
+--------+-------------+-----------+-----------+
Both Case 1 and Case 3 have a risk of releasing maps unexpectedly. This
is because the reference counter operations are not consistent crossing
different cases and leads to difficulty for callers handling them.
For Case 1, because 'other' is a subset of 'orig', 'orig' is returned as
the merging result, but its refcnt is not incremented. This can lead to
the map being released repeatedly:
struct perf_cpu_map *a = perf_cpu_map__new("1,2");
struct perf_cpu_map *b = perf_cpu_map__new("2");
/* 'm' and 'a' point to the same CPU map */
struct perf_cpu_map *m = perf_cpu_map__merge(a, b);
...
perf_cpu_map__put(m); -> Release the map
perf_cpu_map__put(b);
perf_cpu_map__put(a); -> Release the same merged again
For Case 3, it is possible that the CPU map pointed to by 'orig' can be
released twice: within the function and outside of it.
struct perf_cpu_map *a = perf_cpu_map__new("1,2");
struct perf_cpu_map *b = perf_cpu_map__new("3");
struct perf_cpu_map *m = perf_cpu_map__merge(a, b);
`> 'm' is new allocated map. But 'a' has
been released in the function.
...
perf_cpu_map__put(m);
perf_cpu_map__put(b);
perf_cpu_map__put(a); -> Release the 'a' map again
This commit increases the reference counter if a passed map is returned.
For the case of a newly allocated map, it does not change the reference
counter for passed maps.
Signed-off-by: Leo Yan <leo.yan@arm.com>
---
tools/lib/perf/cpumap.c | 11 +++--------
1 file changed, 3 insertions(+), 8 deletions(-)
diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
index cae799ad44e1..3f80eade8b1c 100644
--- a/tools/lib/perf/cpumap.c
+++ b/tools/lib/perf/cpumap.c
@@ -438,9 +438,7 @@ bool perf_cpu_map__is_subset(const struct perf_cpu_map *a, const struct perf_cpu
/*
* Merge two cpumaps
*
- * orig either gets freed and replaced with a new map, or reused
- * with no reference count change (similar to "realloc")
- * other has its reference count increased.
+ * Return a new map, or reused with its reference count increased.
*/
struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
@@ -452,11 +450,9 @@ struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
struct perf_cpu_map *merged;
if (perf_cpu_map__is_subset(orig, other))
- return orig;
- if (perf_cpu_map__is_subset(other, orig)) {
- perf_cpu_map__put(orig);
+ return perf_cpu_map__get(orig);
+ if (perf_cpu_map__is_subset(other, orig))
return perf_cpu_map__get(other);
- }
tmp_len = __perf_cpu_map__nr(orig) + __perf_cpu_map__nr(other);
tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
@@ -483,7 +479,6 @@ struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
merged = cpu_map__trim_new(k, tmp_cpus);
free(tmp_cpus);
- perf_cpu_map__put(orig);
return merged;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v1 2/5] perf: Release old CPU maps after merging
2024-09-25 19:53 [PATCH v1 0/5] perf cpumap: Correct for perf_cpu_map__merge() Leo Yan
2024-09-25 19:53 ` [PATCH v1 1/5] libperf cpumap: Correct reference count " Leo Yan
@ 2024-09-25 19:53 ` Leo Yan
2024-09-25 19:53 ` [PATCH v1 3/5] perf cpumap: Update CPU map merging test Leo Yan
` (3 subsequent siblings)
5 siblings, 0 replies; 13+ messages in thread
From: Leo Yan @ 2024-09-25 19:53 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Adrian Hunter, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Liang, Kan, James Clark, linux-perf-users, linux-kernel
Cc: Leo Yan
After CPU maps are merged, the old CPU map will not automatically
released.
This commit adds a new variable to record old CPU map, after merging the
new allocated map is returned, and release the old CPU map.
Signed-off-by: Leo Yan <leo.yan@arm.com>
---
tools/lib/perf/evlist.c | 4 ++++
tools/perf/util/mem-events.c | 4 +++-
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index c6d67fc9e57e..37920e0b0cd6 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -36,6 +36,8 @@ void perf_evlist__init(struct perf_evlist *evlist)
static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
struct perf_evsel *evsel)
{
+ struct perf_cpu_map *old_all_cpus;
+
if (evsel->system_wide) {
/* System wide: set the cpu map of the evsel to all online CPUs. */
perf_cpu_map__put(evsel->cpus);
@@ -75,7 +77,9 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
evsel->threads = perf_thread_map__get(evlist->threads);
}
+ old_all_cpus = evlist->all_cpus;
evlist->all_cpus = perf_cpu_map__merge(evlist->all_cpus, evsel->cpus);
+ perf_cpu_map__put(old_all_cpus);
}
static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index 051feb93ed8d..016a1f4adb5d 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -257,7 +257,7 @@ int perf_mem_events__record_args(const char **rec_argv, int *argv_nr)
int i = *argv_nr;
const char *s;
char *copy;
- struct perf_cpu_map *cpu_map = NULL;
+ struct perf_cpu_map *cpu_map = NULL, *old_cpu_map;
while ((pmu = perf_pmus__scan_mem(pmu)) != NULL) {
for (int j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
@@ -283,7 +283,9 @@ int perf_mem_events__record_args(const char **rec_argv, int *argv_nr)
rec_argv[i++] = "-e";
rec_argv[i++] = copy;
+ old_cpu_map = cpu_map;
cpu_map = perf_cpu_map__merge(cpu_map, pmu->cpus);
+ perf_cpu_map__put(old_cpu_map);
}
}
--
2.34.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v1 3/5] perf cpumap: Update CPU map merging test
2024-09-25 19:53 [PATCH v1 0/5] perf cpumap: Correct for perf_cpu_map__merge() Leo Yan
2024-09-25 19:53 ` [PATCH v1 1/5] libperf cpumap: Correct reference count " Leo Yan
2024-09-25 19:53 ` [PATCH v1 2/5] perf: Release old CPU maps after merging Leo Yan
@ 2024-09-25 19:53 ` Leo Yan
2024-09-25 19:53 ` [PATCH v1 4/5] perf cpumap: Add more tests for CPU map merging Leo Yan
` (2 subsequent siblings)
5 siblings, 0 replies; 13+ messages in thread
From: Leo Yan @ 2024-09-25 19:53 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Adrian Hunter, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Liang, Kan, James Clark, linux-perf-users, linux-kernel
Cc: Leo Yan
Since the semantics of perf_cpu_map__merge() have changed, the two
arguments will not been released automatically in the function.
Update the test to release all CPU maps.
Signed-off-by: Leo Yan <leo.yan@arm.com>
---
tools/perf/tests/cpumap.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/perf/tests/cpumap.c b/tools/perf/tests/cpumap.c
index 2f0168b2a5a9..ed07ef6d7e33 100644
--- a/tools/perf/tests/cpumap.c
+++ b/tools/perf/tests/cpumap.c
@@ -166,6 +166,7 @@ static int test__cpu_map_merge(struct test_suite *test __maybe_unused, int subte
TEST_ASSERT_VAL("failed to merge map: bad nr", perf_cpu_map__nr(c) == 5);
cpu_map__snprint(c, buf, sizeof(buf));
TEST_ASSERT_VAL("failed to merge map: bad result", !strcmp(buf, "1-2,4-5,7"));
+ perf_cpu_map__put(a);
perf_cpu_map__put(b);
perf_cpu_map__put(c);
return 0;
--
2.34.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v1 4/5] perf cpumap: Add more tests for CPU map merging
2024-09-25 19:53 [PATCH v1 0/5] perf cpumap: Correct for perf_cpu_map__merge() Leo Yan
` (2 preceding siblings ...)
2024-09-25 19:53 ` [PATCH v1 3/5] perf cpumap: Update CPU map merging test Leo Yan
@ 2024-09-25 19:53 ` Leo Yan
2024-09-25 19:53 ` [PATCH v1 5/5] perf cpumap: Add checking for reference counter Leo Yan
2024-10-10 15:01 ` [PATCH v1 0/5] perf cpumap: Correct for perf_cpu_map__merge() Leo Yan
5 siblings, 0 replies; 13+ messages in thread
From: Leo Yan @ 2024-09-25 19:53 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Adrian Hunter, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Liang, Kan, James Clark, linux-perf-users, linux-kernel
Cc: Leo Yan
Add additional tests for CPU map merging to cover more cases.
These tests include different types of arguments, such as when one CPU
map is a subset of another, as well as cases with or without overlap
between the two maps.
Signed-off-by: Leo Yan <leo.yan@arm.com>
---
tools/perf/tests/cpumap.c | 34 +++++++++++++++++++++++++++++-----
1 file changed, 29 insertions(+), 5 deletions(-)
diff --git a/tools/perf/tests/cpumap.c b/tools/perf/tests/cpumap.c
index ed07ef6d7e33..d5b6c450f5c9 100644
--- a/tools/perf/tests/cpumap.c
+++ b/tools/perf/tests/cpumap.c
@@ -156,22 +156,46 @@ static int test__cpu_map_print(struct test_suite *test __maybe_unused, int subte
return 0;
}
-static int test__cpu_map_merge(struct test_suite *test __maybe_unused, int subtest __maybe_unused)
+static int __test__cpu_map_merge(const char *lhs, const char *rhs, int nr, const char *expected)
{
- struct perf_cpu_map *a = perf_cpu_map__new("4,2,1");
- struct perf_cpu_map *b = perf_cpu_map__new("4,5,7");
+ struct perf_cpu_map *a = perf_cpu_map__new(lhs);
+ struct perf_cpu_map *b = perf_cpu_map__new(rhs);
struct perf_cpu_map *c = perf_cpu_map__merge(a, b);
char buf[100];
- TEST_ASSERT_VAL("failed to merge map: bad nr", perf_cpu_map__nr(c) == 5);
+ TEST_ASSERT_VAL("failed to merge map: bad nr", perf_cpu_map__nr(c) == nr);
cpu_map__snprint(c, buf, sizeof(buf));
- TEST_ASSERT_VAL("failed to merge map: bad result", !strcmp(buf, "1-2,4-5,7"));
+ TEST_ASSERT_VAL("failed to merge map: bad result", !strcmp(buf, expected));
perf_cpu_map__put(a);
perf_cpu_map__put(b);
perf_cpu_map__put(c);
return 0;
}
+static int test__cpu_map_merge(struct test_suite *test __maybe_unused,
+ int subtest __maybe_unused)
+{
+ int ret;
+
+ ret = __test__cpu_map_merge("4,2,1", "4,5,7", 5, "1-2,4-5,7");
+ if (ret)
+ return ret;
+ ret = __test__cpu_map_merge("1-8", "6-9", 9, "1-9");
+ if (ret)
+ return ret;
+ ret = __test__cpu_map_merge("1-8,12-20", "6-9,15", 18, "1-9,12-20");
+ if (ret)
+ return ret;
+ ret = __test__cpu_map_merge("4,2,1", "1", 3, "1-2,4");
+ if (ret)
+ return ret;
+ ret = __test__cpu_map_merge("1", "4,2,1", 3, "1-2,4");
+ if (ret)
+ return ret;
+ ret = __test__cpu_map_merge("1", "1", 1, "1");
+ return ret;
+}
+
static int __test__cpu_map_intersect(const char *lhs, const char *rhs, int nr, const char *expected)
{
struct perf_cpu_map *a = perf_cpu_map__new(lhs);
--
2.34.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v1 5/5] perf cpumap: Add checking for reference counter
2024-09-25 19:53 [PATCH v1 0/5] perf cpumap: Correct for perf_cpu_map__merge() Leo Yan
` (3 preceding siblings ...)
2024-09-25 19:53 ` [PATCH v1 4/5] perf cpumap: Add more tests for CPU map merging Leo Yan
@ 2024-09-25 19:53 ` Leo Yan
2024-10-10 15:01 ` [PATCH v1 0/5] perf cpumap: Correct for perf_cpu_map__merge() Leo Yan
5 siblings, 0 replies; 13+ messages in thread
From: Leo Yan @ 2024-09-25 19:53 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Adrian Hunter, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Liang, Kan, James Clark, linux-perf-users, linux-kernel
Cc: Leo Yan
For the CPU map merging and intersection cases, add an extra check for
the reference counter before releasing the last CPU map.
Signed-off-by: Leo Yan <leo.yan@arm.com>
---
tools/perf/tests/cpumap.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/tools/perf/tests/cpumap.c b/tools/perf/tests/cpumap.c
index d5b6c450f5c9..0bcf603a0ccf 100644
--- a/tools/perf/tests/cpumap.c
+++ b/tools/perf/tests/cpumap.c
@@ -168,6 +168,16 @@ static int __test__cpu_map_merge(const char *lhs, const char *rhs, int nr, const
TEST_ASSERT_VAL("failed to merge map: bad result", !strcmp(buf, expected));
perf_cpu_map__put(a);
perf_cpu_map__put(b);
+
+ /*
+ * If 'b' is a superset of 'a', 'c' points to the same map with the
+ * map 'b'. In this case, the two owners 'b' and 'c' both point to the
+ * same map. The owner 'b' has released the resource above but 'c'
+ * still keeps the ownership, so the reference counter should be 1.
+ */
+ TEST_ASSERT_VAL("unexpected refcnt: bad result",
+ refcount_read(perf_cpu_map__refcnt(c)) == 1);
+
perf_cpu_map__put(c);
return 0;
}
@@ -208,6 +218,16 @@ static int __test__cpu_map_intersect(const char *lhs, const char *rhs, int nr, c
TEST_ASSERT_VAL("failed to intersect map: bad result", !strcmp(buf, expected));
perf_cpu_map__put(a);
perf_cpu_map__put(b);
+
+ /*
+ * If 'b' is a subset of 'a', 'c' points to the same map with the
+ * map 'b'. In this case, the two owners 'b' and 'c' both point to the
+ * same map. The owner 'b' has released the resource above but 'c'
+ * still keeps the ownership, so the reference counter should be 1.
+ */
+ TEST_ASSERT_VAL("unexpected refcnt: bad result",
+ refcount_read(perf_cpu_map__refcnt(c)) == 1);
+
perf_cpu_map__put(c);
return 0;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v1 0/5] perf cpumap: Correct for perf_cpu_map__merge()
2024-09-25 19:53 [PATCH v1 0/5] perf cpumap: Correct for perf_cpu_map__merge() Leo Yan
` (4 preceding siblings ...)
2024-09-25 19:53 ` [PATCH v1 5/5] perf cpumap: Add checking for reference counter Leo Yan
@ 2024-10-10 15:01 ` Leo Yan
2024-10-10 16:10 ` Namhyung Kim
5 siblings, 1 reply; 13+ messages in thread
From: Leo Yan @ 2024-10-10 15:01 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Adrian Hunter, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Liang, Kan, James Clark, linux-perf-users, linux-kernel
On 9/25/24 20:53, Leo Yan wrote:
> perf_cpu_map__merge() has two arguments, 'orig' and 'other', as
> described in its original comment:
>
> "orig either gets freed and replaced with a new map, or reused
> with no reference count change (similar to "realloc")
> other has its reference count increased."
>
> This causes confusion due to the different reference counting on the CPU
> map objects, which complicates its usage and makes maintenance
> difficult. We also discussed this in the email [1].
>
> This patch series makes that a new CPU map is allocated for the
> merging result, or the reference count is increased if an existing CPU
> map is reused. This means that once perf_cpu_map__merge() is invoked,
> the caller gains ownership of the resulting map and must release it
> with perf_cpu_map__put().
Gentle ping ...
Thanks,
Leo
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v1 0/5] perf cpumap: Correct for perf_cpu_map__merge()
2024-10-10 15:01 ` [PATCH v1 0/5] perf cpumap: Correct for perf_cpu_map__merge() Leo Yan
@ 2024-10-10 16:10 ` Namhyung Kim
0 siblings, 0 replies; 13+ messages in thread
From: Namhyung Kim @ 2024-10-10 16:10 UTC (permalink / raw)
To: Leo Yan, Ian Rogers, Adrian Hunter
Cc: Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Liang, Kan, James Clark, linux-perf-users,
linux-kernel
On Thu, Oct 10, 2024 at 04:01:21PM +0100, Leo Yan wrote:
> On 9/25/24 20:53, Leo Yan wrote:
> > perf_cpu_map__merge() has two arguments, 'orig' and 'other', as
> > described in its original comment:
> >
> > "orig either gets freed and replaced with a new map, or reused
> > with no reference count change (similar to "realloc")
> > other has its reference count increased."
> >
> > This causes confusion due to the different reference counting on the CPU
> > map objects, which complicates its usage and makes maintenance
> > difficult. We also discussed this in the email [1].
> >
> > This patch series makes that a new CPU map is allocated for the
> > merging result, or the reference count is increased if an existing CPU
> > map is reused. This means that once perf_cpu_map__merge() is invoked,
> > the caller gains ownership of the resulting map and must release it
> > with perf_cpu_map__put().
>
> Gentle ping ...
Ian and Adrian, can you please review this patchset?
Thanks,
Namhyung
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v1 1/5] libperf cpumap: Correct reference count for perf_cpu_map__merge()
2024-09-25 19:53 ` [PATCH v1 1/5] libperf cpumap: Correct reference count " Leo Yan
@ 2024-10-10 17:41 ` Adrian Hunter
2024-10-11 9:34 ` Leo Yan
0 siblings, 1 reply; 13+ messages in thread
From: Adrian Hunter @ 2024-10-10 17:41 UTC (permalink / raw)
To: Leo Yan, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Liang, Kan,
James Clark, linux-perf-users, linux-kernel
On 25/09/24 22:53, Leo Yan wrote:
> The perf_cpu_map__merge() function has two arguments, 'orig' and 'other',
> and it returns results for three different cases:
>
> Case 1: 'other' is a subset of 'orig'.
> Case 2: 'orig' is a subset of 'other'.
> Case 3: 'orig' and 'other' are not subsets of each other.
>
> The result combinations are:
>
> +--------+-------------+-----------+-----------+
> | Cases | Result | orig | other |
> +--------+-------------+-----------+-----------+
> | Case1 | orig | No change | No change |
> +--------+-------------+-----------+-----------+
> | Case2 | other | No change | refcnt++ |
> +--------+-------------+-----------+-----------+
> | Case3 | New CPU map | refcnt-- | No change |
> +--------+-------------+-----------+-----------+
>
> Both Case 1 and Case 3 have a risk of releasing maps unexpectedly. This
> is because the reference counter operations are not consistent crossing
> different cases and leads to difficulty for callers handling them.
>
> For Case 1, because 'other' is a subset of 'orig', 'orig' is returned as
> the merging result, but its refcnt is not incremented. This can lead to
> the map being released repeatedly:
>
> struct perf_cpu_map *a = perf_cpu_map__new("1,2");
> struct perf_cpu_map *b = perf_cpu_map__new("2");
>
> /* 'm' and 'a' point to the same CPU map */
> struct perf_cpu_map *m = perf_cpu_map__merge(a, b);
>
> ...
>
> perf_cpu_map__put(m); -> Release the map
> perf_cpu_map__put(b);
> perf_cpu_map__put(a); -> Release the same merged again
>
> For Case 3, it is possible that the CPU map pointed to by 'orig' can be
> released twice: within the function and outside of it.
>
> struct perf_cpu_map *a = perf_cpu_map__new("1,2");
> struct perf_cpu_map *b = perf_cpu_map__new("3");
>
> struct perf_cpu_map *m = perf_cpu_map__merge(a, b);
> `> 'm' is new allocated map. But 'a' has
> been released in the function.
> ...
>
> perf_cpu_map__put(m);
> perf_cpu_map__put(b);
> perf_cpu_map__put(a); -> Release the 'a' map again
>
> This commit increases the reference counter if a passed map is returned.
> For the case of a newly allocated map, it does not change the reference
> counter for passed maps.
The 2 non-test uses of perf_cpu_map__merge both do:
a = perf_cpu_map__merge(a, b);
so another way to make the API less misleading would be
to introduce:
err = perf_cpu_map__merge_in(&a, b);
where:
int perf_cpu_map__merge_in(struct perf_cpu_map **orig, struct perf_cpu_map *other)
{
struct perf_cpu_map *result = perf_cpu_map__merge(*orig, other);
if (!result)
return -ENOMEM;
*orig = result;
return 0;
}
without any changes to perf_cpu_map__merge().
>
> Signed-off-by: Leo Yan <leo.yan@arm.com>
> ---
> tools/lib/perf/cpumap.c | 11 +++--------
> 1 file changed, 3 insertions(+), 8 deletions(-)
>
> diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
> index cae799ad44e1..3f80eade8b1c 100644
> --- a/tools/lib/perf/cpumap.c
> +++ b/tools/lib/perf/cpumap.c
> @@ -438,9 +438,7 @@ bool perf_cpu_map__is_subset(const struct perf_cpu_map *a, const struct perf_cpu
> /*
> * Merge two cpumaps
> *
> - * orig either gets freed and replaced with a new map, or reused
> - * with no reference count change (similar to "realloc")
> - * other has its reference count increased.
> + * Return a new map, or reused with its reference count increased.
> */
>
> struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
> @@ -452,11 +450,9 @@ struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
> struct perf_cpu_map *merged;
>
> if (perf_cpu_map__is_subset(orig, other))
> - return orig;
> - if (perf_cpu_map__is_subset(other, orig)) {
> - perf_cpu_map__put(orig);
> + return perf_cpu_map__get(orig);
> + if (perf_cpu_map__is_subset(other, orig))
> return perf_cpu_map__get(other);
> - }
>
> tmp_len = __perf_cpu_map__nr(orig) + __perf_cpu_map__nr(other);
> tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
> @@ -483,7 +479,6 @@ struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
>
> merged = cpu_map__trim_new(k, tmp_cpus);
> free(tmp_cpus);
> - perf_cpu_map__put(orig);
> return merged;
> }
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v1 1/5] libperf cpumap: Correct reference count for perf_cpu_map__merge()
2024-10-10 17:41 ` Adrian Hunter
@ 2024-10-11 9:34 ` Leo Yan
2024-10-11 9:40 ` Leo Yan
0 siblings, 1 reply; 13+ messages in thread
From: Leo Yan @ 2024-10-11 9:34 UTC (permalink / raw)
To: Adrian Hunter, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Liang, Kan, James Clark, linux-perf-users, linux-kernel
Hi Adrian,
On 10/10/24 18:41, Adrian Hunter wrote:>
> On 25/09/24 22:53, Leo Yan wrote:
>> The perf_cpu_map__merge() function has two arguments, 'orig' and 'other',
>> and it returns results for three different cases:
>>
>> Case 1: 'other' is a subset of 'orig'.
>> Case 2: 'orig' is a subset of 'other'.
>> Case 3: 'orig' and 'other' are not subsets of each other.
>>
>> The result combinations are:
>>
>> +--------+-------------+-----------+-----------+
>> | Cases | Result | orig | other |
>> +--------+-------------+-----------+-----------+
>> | Case1 | orig | No change | No change |
>> +--------+-------------+-----------+-----------+
>> | Case2 | other | No change | refcnt++ |
>> +--------+-------------+-----------+-----------+
>> | Case3 | New CPU map | refcnt-- | No change |
>> +--------+-------------+-----------+-----------+
>>
>> Both Case 1 and Case 3 have a risk of releasing maps unexpectedly. This
>> is because the reference counter operations are not consistent crossing
>> different cases and leads to difficulty for callers handling them.
>>
>> For Case 1, because 'other' is a subset of 'orig', 'orig' is returned as
>> the merging result, but its refcnt is not incremented. This can lead to
>> the map being released repeatedly:
>>
>> struct perf_cpu_map *a = perf_cpu_map__new("1,2");
>> struct perf_cpu_map *b = perf_cpu_map__new("2");
>>
>> /* 'm' and 'a' point to the same CPU map */
>> struct perf_cpu_map *m = perf_cpu_map__merge(a, b);
>>
>> ...
>>
>> perf_cpu_map__put(m); -> Release the map
>> perf_cpu_map__put(b);
>> perf_cpu_map__put(a); -> Release the same merged again
>>
>> For Case 3, it is possible that the CPU map pointed to by 'orig' can be
>> released twice: within the function and outside of it.
>>
>> struct perf_cpu_map *a = perf_cpu_map__new("1,2");
>> struct perf_cpu_map *b = perf_cpu_map__new("3");
>>
>> struct perf_cpu_map *m = perf_cpu_map__merge(a, b);
>> `> 'm' is new allocated map. But 'a' has
>> been released in the function.
>> ...
>>
>> perf_cpu_map__put(m);
>> perf_cpu_map__put(b);
>> perf_cpu_map__put(a); -> Release the 'a' map again
>>
>> This commit increases the reference counter if a passed map is returned.
>> For the case of a newly allocated map, it does not change the reference
>> counter for passed maps.
>
> The 2 non-test uses of perf_cpu_map__merge both do:
>
> a = perf_cpu_map__merge(a, b);
>
> so another way to make the API less misleading would be
> to introduce:
>
> err = perf_cpu_map__merge_in(&a, b);
>
> where:
>
> int perf_cpu_map__merge_in(struct perf_cpu_map **orig, struct perf_cpu_map *other)
> {
> struct perf_cpu_map *result = perf_cpu_map__merge(*orig, other);
>
> if (!result)
> return -ENOMEM;
>
> *orig = result;
> return 0;
> }
>
> without any changes to perf_cpu_map__merge().
Just wandering why we cannot do the same thing for the perf_cpu_map__merge()
function?
int perf_cpu_map__merge_in(struct perf_cpu_map **orig,
struct perf_cpu_map *other)
This can allow us to avoid any confusion in the first place. And we don't need
to maintain two functions for the same thing.
Thanks,
Leo
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v1 1/5] libperf cpumap: Correct reference count for perf_cpu_map__merge()
2024-10-11 9:34 ` Leo Yan
@ 2024-10-11 9:40 ` Leo Yan
2024-10-11 9:46 ` Adrian Hunter
0 siblings, 1 reply; 13+ messages in thread
From: Leo Yan @ 2024-10-11 9:40 UTC (permalink / raw)
To: Adrian Hunter, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Liang, Kan, James Clark, linux-perf-users, linux-kernel
On 10/11/24 10:34, Leo Yan wrote:
>> The 2 non-test uses of perf_cpu_map__merge both do:
>>
>> a = perf_cpu_map__merge(a, b);
>>
>> so another way to make the API less misleading would be
>> to introduce:
>>
>> err = perf_cpu_map__merge_in(&a, b);
>>
>> where:
>>
>> int perf_cpu_map__merge_in(struct perf_cpu_map **orig, struct perf_cpu_map
>> *other)
>> {
>> struct perf_cpu_map *result = perf_cpu_map__merge(*orig, other);
>>
>> if (!result)
>> return -ENOMEM;
>>
>> *orig = result;
>> return 0;
>> }
>>
>> without any changes to perf_cpu_map__merge().
>
> Just wandering why we cannot do the same thing for the perf_cpu_map__merge()
> function?
>
> int perf_cpu_map__merge_in(struct perf_cpu_map **orig,
> struct perf_cpu_map *other)
Sorry for typo and spamming. The above suggested definition is for
perf_cpu_map__merge().
> This can allow us to avoid any confusion in the first place. And we don't need
> to maintain two functions for the same thing.
>
> Thanks,
> Leo
>
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v1 1/5] libperf cpumap: Correct reference count for perf_cpu_map__merge()
2024-10-11 9:40 ` Leo Yan
@ 2024-10-11 9:46 ` Adrian Hunter
2024-10-11 9:51 ` Leo Yan
0 siblings, 1 reply; 13+ messages in thread
From: Adrian Hunter @ 2024-10-11 9:46 UTC (permalink / raw)
To: Leo Yan, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Liang, Kan,
James Clark, linux-perf-users, linux-kernel
On 11/10/24 12:40, Leo Yan wrote:
>
>
> On 10/11/24 10:34, Leo Yan wrote:
>
>>> The 2 non-test uses of perf_cpu_map__merge both do:
>>>
>>> a = perf_cpu_map__merge(a, b);
>>>
>>> so another way to make the API less misleading would be
>>> to introduce:
>>>
>>> err = perf_cpu_map__merge_in(&a, b);
>>>
>>> where:
>>>
>>> int perf_cpu_map__merge_in(struct perf_cpu_map **orig, struct perf_cpu_map *other)
>>> {
>>> struct perf_cpu_map *result = perf_cpu_map__merge(*orig, other);
>>>
>>> if (!result)
>>> return -ENOMEM;
>>>
>>> *orig = result;
>>> return 0;
>>> }
>>>
>>> without any changes to perf_cpu_map__merge().
>>
>> Just wandering why we cannot do the same thing for the perf_cpu_map__merge()
>> function?
>>
>> int perf_cpu_map__merge_in(struct perf_cpu_map **orig,
>> struct perf_cpu_map *other)
>
> Sorry for typo and spamming. The above suggested definition is for perf_cpu_map__merge().
Yes - there is not much reason to have perf_cpu_map__merge()
and perf_cpu_map__merge_in().
>
>
>> This can allow us to avoid any confusion in the first place. And we don't need
>> to maintain two functions for the same thing.
>>
>> Thanks,
>> Leo
>>
>>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v1 1/5] libperf cpumap: Correct reference count for perf_cpu_map__merge()
2024-10-11 9:46 ` Adrian Hunter
@ 2024-10-11 9:51 ` Leo Yan
0 siblings, 0 replies; 13+ messages in thread
From: Leo Yan @ 2024-10-11 9:51 UTC (permalink / raw)
To: Adrian Hunter, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Liang, Kan, James Clark, linux-perf-users, linux-kernel
On 10/11/24 10:46, Adrian Hunter wrote:
[...]
>>>> int perf_cpu_map__merge_in(struct perf_cpu_map **orig, struct perf_cpu_map *other)
>>>> {
>>>> struct perf_cpu_map *result = perf_cpu_map__merge(*orig, other);
>>>>
>>>> if (!result)
>>>> return -ENOMEM;
>>>>
>>>> *orig = result;
>>>> return 0;
>>>> }
>>>>
>>>> without any changes to perf_cpu_map__merge().
>>>
>>> Just wandering why we cannot do the same thing for the perf_cpu_map__merge()
>>> function?
>>>
>>> int perf_cpu_map__merge_in(struct perf_cpu_map **orig,
>>> struct perf_cpu_map *other)
>>
>> Sorry for typo and spamming. The above suggested definition is for perf_cpu_map__merge().
>
> Yes - there is not much reason to have perf_cpu_map__merge()
> and perf_cpu_map__merge_in().
Thanks for suggestion! Will move towards this.
Leo
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2024-10-11 9:51 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-25 19:53 [PATCH v1 0/5] perf cpumap: Correct for perf_cpu_map__merge() Leo Yan
2024-09-25 19:53 ` [PATCH v1 1/5] libperf cpumap: Correct reference count " Leo Yan
2024-10-10 17:41 ` Adrian Hunter
2024-10-11 9:34 ` Leo Yan
2024-10-11 9:40 ` Leo Yan
2024-10-11 9:46 ` Adrian Hunter
2024-10-11 9:51 ` Leo Yan
2024-09-25 19:53 ` [PATCH v1 2/5] perf: Release old CPU maps after merging Leo Yan
2024-09-25 19:53 ` [PATCH v1 3/5] perf cpumap: Update CPU map merging test Leo Yan
2024-09-25 19:53 ` [PATCH v1 4/5] perf cpumap: Add more tests for CPU map merging Leo Yan
2024-09-25 19:53 ` [PATCH v1 5/5] perf cpumap: Add checking for reference counter Leo Yan
2024-10-10 15:01 ` [PATCH v1 0/5] perf cpumap: Correct for perf_cpu_map__merge() Leo Yan
2024-10-10 16:10 ` Namhyung Kim
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).