* [PATCH v1 1/3] perf maps: Fix use after free in __maps__fixup_overlap_and_insert
2024-05-21 16:51 [PATCH v1 0/3] Fix and improve __maps__fixup_overlap_and_insert Ian Rogers
@ 2024-05-21 16:51 ` Ian Rogers
2024-05-21 16:51 ` [PATCH v1 2/3] perf maps: Reduce sorting for overlapping mappings Ian Rogers
` (4 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: Ian Rogers @ 2024-05-21 16:51 UTC (permalink / raw)
To: Steinar H . Gunderson, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Kan Liang, linux-perf-users, linux-kernel
In the case 'before' and 'after' are broken out from pos,
maps_by_address may be changed by __maps__insert, as such it needs
re-reading.
Don't ignore the return value from __maps_insert.
Fixes: 659ad3492b91 ("perf maps: Switch from rbtree to lazily sorted array for addresses")
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/util/maps.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/tools/perf/util/maps.c b/tools/perf/util/maps.c
index 16b39db594f4..eaada3e0f5b4 100644
--- a/tools/perf/util/maps.c
+++ b/tools/perf/util/maps.c
@@ -741,7 +741,6 @@ static unsigned int first_ending_after(struct maps *maps, const struct map *map)
*/
static int __maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
{
- struct map **maps_by_address;
int err = 0;
FILE *fp = debug_file();
@@ -749,12 +748,12 @@ static int __maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
if (!maps__maps_by_address_sorted(maps))
__maps__sort_by_address(maps);
- maps_by_address = maps__maps_by_address(maps);
/*
* Iterate through entries where the end of the existing entry is
* greater-than the new map's start.
*/
for (unsigned int i = first_ending_after(maps, new); i < maps__nr_maps(maps); ) {
+ struct map **maps_by_address = maps__maps_by_address(maps);
struct map *pos = maps_by_address[i];
struct map *before = NULL, *after = NULL;
@@ -821,8 +820,10 @@ static int __maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
/* Maps are still ordered, go to next one. */
i++;
if (after) {
- __maps__insert(maps, after);
+ err = __maps__insert(maps, after);
map__put(after);
+ if (err)
+ goto out_err;
if (!maps__maps_by_address_sorted(maps)) {
/*
* Sorting broken so invariants don't
@@ -851,7 +852,7 @@ static int __maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
check_invariants(maps);
}
/* Add the map. */
- __maps__insert(maps, new);
+ err = __maps__insert(maps, new);
out_err:
return err;
}
--
2.45.0.rc1.225.g2a3ae87e7f-goog
^ permalink raw reply related [flat|nested] 8+ messages in thread* [PATCH v1 2/3] perf maps: Reduce sorting for overlapping mappings
2024-05-21 16:51 [PATCH v1 0/3] Fix and improve __maps__fixup_overlap_and_insert Ian Rogers
2024-05-21 16:51 ` [PATCH v1 1/3] perf maps: Fix use after free in __maps__fixup_overlap_and_insert Ian Rogers
@ 2024-05-21 16:51 ` Ian Rogers
2024-05-21 16:51 ` [PATCH v1 3/3] perf maps: Add/use a sorted insert for fixup overlap and insert Ian Rogers
` (3 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: Ian Rogers @ 2024-05-21 16:51 UTC (permalink / raw)
To: Steinar H . Gunderson, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Kan Liang, linux-perf-users, linux-kernel
When an 'after' map is generated the 'new' map must be before it so
terminate iterating and don't resort. If the entry 'pos' is entirely
overlapped by the 'new' mapping then don't remove and insert the
mapping, just replace - again to remove sorting.
For a perf report on a perf.data file containing overlapping mappings
the time numbers are:
Before:
real 0m9.856s
user 0m9.637s
sys 0m0.204s
After:
real 0m5.894s
user 0m5.650s
sys 0m0.231s
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/util/maps.c | 55 +++++++++++++++++++++++++++---------------
1 file changed, 36 insertions(+), 19 deletions(-)
diff --git a/tools/perf/util/maps.c b/tools/perf/util/maps.c
index eaada3e0f5b4..f6b6df82f4cf 100644
--- a/tools/perf/util/maps.c
+++ b/tools/perf/util/maps.c
@@ -744,7 +744,6 @@ static int __maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
int err = 0;
FILE *fp = debug_file();
-sort_again:
if (!maps__maps_by_address_sorted(maps))
__maps__sort_by_address(maps);
@@ -820,36 +819,54 @@ static int __maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
/* Maps are still ordered, go to next one. */
i++;
if (after) {
- err = __maps__insert(maps, after);
- map__put(after);
- if (err)
- goto out_err;
- if (!maps__maps_by_address_sorted(maps)) {
- /*
- * Sorting broken so invariants don't
- * hold, sort and go again.
- */
- goto sort_again;
- }
/*
- * Maps are still ordered, skip after and go to
- * next one (terminate loop).
+ * 'before' and 'after' mean 'new' split the
+ * 'pos' mapping and therefore there are no
+ * later mappings.
*/
- i++;
+ err = __maps__insert(maps, new);
+ if (!err)
+ err = __maps__insert(maps, after);
+ map__put(after);
+ check_invariants(maps);
+ return err;
}
+ check_invariants(maps);
} else if (after) {
+ /*
+ * 'after' means 'new' split 'pos' and there are no
+ * later mappings.
+ */
map__put(maps_by_address[i]);
- maps_by_address[i] = after;
- /* Maps are ordered, go to next one. */
- i++;
+ maps_by_address[i] = map__get(new);
+ err = __maps__insert(maps, after);
+ map__put(after);
+ check_invariants(maps);
+ return err;
} else {
+ struct map *next = NULL;
+
+ if (i + 1 < maps__nr_maps(maps))
+ next = maps_by_address[i + 1];
+
+ if (!next || map__start(next) >= map__end(new)) {
+ /*
+ * Replace existing mapping and end knowing
+ * there aren't later overlapping or any
+ * mappings.
+ */
+ map__put(maps_by_address[i]);
+ maps_by_address[i] = map__get(new);
+ check_invariants(maps);
+ return err;
+ }
__maps__remove(maps, pos);
+ check_invariants(maps);
/*
* Maps are ordered but no need to increase `i` as the
* later maps were moved down.
*/
}
- check_invariants(maps);
}
/* Add the map. */
err = __maps__insert(maps, new);
--
2.45.0.rc1.225.g2a3ae87e7f-goog
^ permalink raw reply related [flat|nested] 8+ messages in thread* [PATCH v1 3/3] perf maps: Add/use a sorted insert for fixup overlap and insert
2024-05-21 16:51 [PATCH v1 0/3] Fix and improve __maps__fixup_overlap_and_insert Ian Rogers
2024-05-21 16:51 ` [PATCH v1 1/3] perf maps: Fix use after free in __maps__fixup_overlap_and_insert Ian Rogers
2024-05-21 16:51 ` [PATCH v1 2/3] perf maps: Reduce sorting for overlapping mappings Ian Rogers
@ 2024-05-21 16:51 ` Ian Rogers
2024-06-05 23:54 ` [PATCH v1 0/3] Fix and improve __maps__fixup_overlap_and_insert Ian Rogers
` (2 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: Ian Rogers @ 2024-05-21 16:51 UTC (permalink / raw)
To: Steinar H . Gunderson, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Kan Liang, linux-perf-users, linux-kernel
Data may have lots of overlapping mmaps. The regular insert adds at
the end and relies on a later sort. For data with overlapping mappings
the sort will happen during a subsequent maps__find or
__maps__fixup_overlap_and_insert, there's never a period where the
inserted maps buffer up and a single sort happens. To avoid back to
back sorts, maintain the sort order when fixing up and
inserting. Previously the first_ending_after search was O(log n) where
n is the size of maps, and the insert was O(1) but because of the
continuous sorting was becoming O(n*log(n)). With maintaining sort
order, the insert now becomes O(n) for a memmove.
For a perf report on a perf.data file containing overlapping mappings
the time numbers are:
Before:
real 0m5.894s
user 0m5.650s
sys 0m0.231s
After:
real 0m0.675s
user 0m0.454s
sys 0m0.196s
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/util/maps.c | 65 ++++++++++++++++++++++++++++++++++++++----
1 file changed, 59 insertions(+), 6 deletions(-)
diff --git a/tools/perf/util/maps.c b/tools/perf/util/maps.c
index f6b6df82f4cf..432399cbe5dd 100644
--- a/tools/perf/util/maps.c
+++ b/tools/perf/util/maps.c
@@ -735,6 +735,60 @@ static unsigned int first_ending_after(struct maps *maps, const struct map *map)
return first;
}
+static int __maps__insert_sorted(struct maps *maps, unsigned int first_after_index,
+ struct map *new1, struct map *new2)
+{
+ struct map **maps_by_address = maps__maps_by_address(maps);
+ struct map **maps_by_name = maps__maps_by_name(maps);
+ unsigned int nr_maps = maps__nr_maps(maps);
+ unsigned int nr_allocate = RC_CHK_ACCESS(maps)->nr_maps_allocated;
+ unsigned int to_add = new2 ? 2 : 1;
+
+ assert(maps__maps_by_address_sorted(maps));
+ assert(first_after_index == nr_maps ||
+ map__end(new1) <= map__start(maps_by_address[first_after_index]));
+ assert(!new2 || map__end(new1) <= map__start(new2));
+ assert(first_after_index == nr_maps || !new2 ||
+ map__end(new2) <= map__start(maps_by_address[first_after_index]));
+
+ if (nr_maps + to_add > nr_allocate) {
+ nr_allocate = !nr_allocate ? 32 : nr_allocate * 2;
+
+ maps_by_address = realloc(maps_by_address, nr_allocate * sizeof(new1));
+ if (!maps_by_address)
+ return -ENOMEM;
+
+ maps__set_maps_by_address(maps, maps_by_address);
+ if (maps_by_name) {
+ maps_by_name = realloc(maps_by_name, nr_allocate * sizeof(new1));
+ if (!maps_by_name) {
+ /*
+ * If by name fails, just disable by name and it will
+ * recompute next time it is required.
+ */
+ __maps__free_maps_by_name(maps);
+ }
+ maps__set_maps_by_name(maps, maps_by_name);
+ }
+ RC_CHK_ACCESS(maps)->nr_maps_allocated = nr_allocate;
+ }
+ memmove(&maps_by_address[first_after_index+to_add],
+ &maps_by_address[first_after_index],
+ (nr_maps - first_after_index) * sizeof(new1));
+ maps_by_address[first_after_index] = map__get(new1);
+ if (maps_by_name)
+ maps_by_name[nr_maps] = map__get(new1);
+ if (new2) {
+ maps_by_address[first_after_index + 1] = map__get(new2);
+ if (maps_by_name)
+ maps_by_name[nr_maps + 1] = map__get(new2);
+ }
+ RC_CHK_ACCESS(maps)->nr_maps = nr_maps + to_add;
+ maps__set_maps_by_name_sorted(maps, false);
+ check_invariants(maps);
+ return 0;
+}
+
/*
* Adds new to maps, if new overlaps existing entries then the existing maps are
* adjusted or removed so that new fits without overlapping any entries.
@@ -743,6 +797,7 @@ static int __maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
{
int err = 0;
FILE *fp = debug_file();
+ unsigned int i;
if (!maps__maps_by_address_sorted(maps))
__maps__sort_by_address(maps);
@@ -751,7 +806,7 @@ static int __maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
* Iterate through entries where the end of the existing entry is
* greater-than the new map's start.
*/
- for (unsigned int i = first_ending_after(maps, new); i < maps__nr_maps(maps); ) {
+ for (i = first_ending_after(maps, new); i < maps__nr_maps(maps); ) {
struct map **maps_by_address = maps__maps_by_address(maps);
struct map *pos = maps_by_address[i];
struct map *before = NULL, *after = NULL;
@@ -824,9 +879,7 @@ static int __maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
* 'pos' mapping and therefore there are no
* later mappings.
*/
- err = __maps__insert(maps, new);
- if (!err)
- err = __maps__insert(maps, after);
+ err = __maps__insert_sorted(maps, i, new, after);
map__put(after);
check_invariants(maps);
return err;
@@ -839,7 +892,7 @@ static int __maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
*/
map__put(maps_by_address[i]);
maps_by_address[i] = map__get(new);
- err = __maps__insert(maps, after);
+ err = __maps__insert_sorted(maps, i + 1, after, NULL);
map__put(after);
check_invariants(maps);
return err;
@@ -869,7 +922,7 @@ static int __maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
}
}
/* Add the map. */
- err = __maps__insert(maps, new);
+ err = __maps__insert_sorted(maps, i, new, NULL);
out_err:
return err;
}
--
2.45.0.rc1.225.g2a3ae87e7f-goog
^ permalink raw reply related [flat|nested] 8+ messages in thread* Re: [PATCH v1 0/3] Fix and improve __maps__fixup_overlap_and_insert
2024-05-21 16:51 [PATCH v1 0/3] Fix and improve __maps__fixup_overlap_and_insert Ian Rogers
` (2 preceding siblings ...)
2024-05-21 16:51 ` [PATCH v1 3/3] perf maps: Add/use a sorted insert for fixup overlap and insert Ian Rogers
@ 2024-06-05 23:54 ` Ian Rogers
2024-06-06 10:56 ` James Clark
2024-06-07 17:51 ` Namhyung Kim
5 siblings, 0 replies; 8+ messages in thread
From: Ian Rogers @ 2024-06-05 23:54 UTC (permalink / raw)
To: Steinar H . Gunderson, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Kan Liang, linux-perf-users, linux-kernel
On Tue, May 21, 2024 at 9:51 AM Ian Rogers <irogers@google.com> wrote:
>
> Fix latent unlikely bugs in __maps__fixup_overlap_and_insert.
>
> Improve __maps__fixup_overlap_and_insert's performance 21x in the case
> of overlapping mmaps. sesse@google.com reported slowness opening
> perf.data files from chromium where the files contained a large number
> of overlapping mappings. Improve this case primarily by avoiding
> unnecessary sorting.
>
> Unscientific timing data processing a perf.data file with overlapping
> mmap events from chromium:
>
> Before:
> real 0m9.856s
> user 0m9.637s
> sys 0m0.204s
>
> After:
> real 0m0.675s
> user 0m0.454s
> sys 0m0.196s
>
> Tested with address/leak sanitizer, invariant checks and validating
> the before and after output are identical.
>
> Ian Rogers (3):
> perf maps: Fix use after free in __maps__fixup_overlap_and_insert
> perf maps: Reduce sorting for overlapping mappings
> perf maps: Add/use a sorted insert for fixup overlap and insert
Ping. Thanks,
Ian
> tools/perf/util/maps.c | 113 +++++++++++++++++++++++++++++++++--------
> 1 file changed, 92 insertions(+), 21 deletions(-)
>
> --
> 2.45.0.rc1.225.g2a3ae87e7f-goog
>
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [PATCH v1 0/3] Fix and improve __maps__fixup_overlap_and_insert
2024-05-21 16:51 [PATCH v1 0/3] Fix and improve __maps__fixup_overlap_and_insert Ian Rogers
` (3 preceding siblings ...)
2024-06-05 23:54 ` [PATCH v1 0/3] Fix and improve __maps__fixup_overlap_and_insert Ian Rogers
@ 2024-06-06 10:56 ` James Clark
2024-06-07 4:31 ` Ian Rogers
2024-06-07 17:51 ` Namhyung Kim
5 siblings, 1 reply; 8+ messages in thread
From: James Clark @ 2024-06-06 10:56 UTC (permalink / raw)
To: Ian Rogers
Cc: Steinar H . Gunderson, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang,
linux-perf-users, linux-kernel
On 21/05/2024 17:51, Ian Rogers wrote:
> Fix latent unlikely bugs in __maps__fixup_overlap_and_insert.
>
> Improve __maps__fixup_overlap_and_insert's performance 21x in the case
> of overlapping mmaps. sesse@google.com reported slowness opening
> perf.data files from chromium where the files contained a large number
> of overlapping mappings. Improve this case primarily by avoiding
> unnecessary sorting.
>
> Unscientific timing data processing a perf.data file with overlapping
> mmap events from chromium:
>
> Before:
> real 0m9.856s
> user 0m9.637s
> sys 0m0.204s
>
> After:
> real 0m0.675s
> user 0m0.454s
> sys 0m0.196s
>
> Tested with address/leak sanitizer, invariant checks and validating
> the before and after output are identical.
>
> Ian Rogers (3):
> perf maps: Fix use after free in __maps__fixup_overlap_and_insert
> perf maps: Reduce sorting for overlapping mappings
> perf maps: Add/use a sorted insert for fixup overlap and insert
>
> tools/perf/util/maps.c | 113 +++++++++++++++++++++++++++++++++--------
> 1 file changed, 92 insertions(+), 21 deletions(-)
>
Reviewed-by: James Clark <james.clark@arm.com>
I'm wondering if there is any point in the non sorted insert any more?
Maps could be always either sorted by name or sorted by address and
insert() is always a sorted/fixup-overlaps insert depending on the sort
style of the list.
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [PATCH v1 0/3] Fix and improve __maps__fixup_overlap_and_insert
2024-06-06 10:56 ` James Clark
@ 2024-06-07 4:31 ` Ian Rogers
0 siblings, 0 replies; 8+ messages in thread
From: Ian Rogers @ 2024-06-07 4:31 UTC (permalink / raw)
To: James Clark
Cc: Steinar H . Gunderson, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang,
linux-perf-users, linux-kernel
On Thu, Jun 6, 2024 at 3:56 AM James Clark <james.clark@arm.com> wrote:
>
>
>
> On 21/05/2024 17:51, Ian Rogers wrote:
> > Fix latent unlikely bugs in __maps__fixup_overlap_and_insert.
> >
> > Improve __maps__fixup_overlap_and_insert's performance 21x in the case
> > of overlapping mmaps. sesse@google.com reported slowness opening
> > perf.data files from chromium where the files contained a large number
> > of overlapping mappings. Improve this case primarily by avoiding
> > unnecessary sorting.
> >
> > Unscientific timing data processing a perf.data file with overlapping
> > mmap events from chromium:
> >
> > Before:
> > real 0m9.856s
> > user 0m9.637s
> > sys 0m0.204s
> >
> > After:
> > real 0m0.675s
> > user 0m0.454s
> > sys 0m0.196s
> >
> > Tested with address/leak sanitizer, invariant checks and validating
> > the before and after output are identical.
> >
> > Ian Rogers (3):
> > perf maps: Fix use after free in __maps__fixup_overlap_and_insert
> > perf maps: Reduce sorting for overlapping mappings
> > perf maps: Add/use a sorted insert for fixup overlap and insert
> >
> > tools/perf/util/maps.c | 113 +++++++++++++++++++++++++++++++++--------
> > 1 file changed, 92 insertions(+), 21 deletions(-)
> >
>
> Reviewed-by: James Clark <james.clark@arm.com>
>
> I'm wondering if there is any point in the non sorted insert any more?
>
> Maps could be always either sorted by name or sorted by address and
> insert() is always a sorted/fixup-overlaps insert depending on the sort
> style of the list.
One of the motivations for the sorted array, instead of an rbtree, was
to simplify reference count checking. We're in much better shape with
that these days, I think the worst "memory leak" is the dsos holding
onto a dso and its symbols indefinitely, instead of removing older
unused dsos. It'd be hard to go back to an rb tree with reference
counting.
For the rbtree insert it was O(log N), the sorted insert these changes
add is O(N) and the regular insert without sorting is O(1). A common
case from scanning /proc/pid/maps is that when map entries are
inserted they are inserted in order. The O(1) insert checks that and
maintains that the maps are sorted.
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/maps.c?h=perf-tools-next#n469
For the synthesized maps we should be able to insert N map entries in
O(N). The sorted insert would be similar as the memmoves would always
copy 0 elements. So I can see an argument for keeping the array always
sorted. For the perf.data files we commonly process synthesized mmaps
dominate. In the case for this patch, JIT data dominated, with
frequent overlapping mapping changes. I guess the biggest hurdle to
just always keeping things sorted is the mental hurdle of ignoring the
worst insert complexity, which should never apply given how the maps
are commonly built.
Thanks,
Ian
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v1 0/3] Fix and improve __maps__fixup_overlap_and_insert
2024-05-21 16:51 [PATCH v1 0/3] Fix and improve __maps__fixup_overlap_and_insert Ian Rogers
` (4 preceding siblings ...)
2024-06-06 10:56 ` James Clark
@ 2024-06-07 17:51 ` Namhyung Kim
5 siblings, 0 replies; 8+ messages in thread
From: Namhyung Kim @ 2024-06-07 17:51 UTC (permalink / raw)
To: Steinar H . Gunderson, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Adrian Hunter, Kan Liang, linux-perf-users,
linux-kernel, Ian Rogers
On Tue, 21 May 2024 09:51:06 -0700, Ian Rogers wrote:
> Fix latent unlikely bugs in __maps__fixup_overlap_and_insert.
>
> Improve __maps__fixup_overlap_and_insert's performance 21x in the case
> of overlapping mmaps. sesse@google.com reported slowness opening
> perf.data files from chromium where the files contained a large number
> of overlapping mappings. Improve this case primarily by avoiding
> unnecessary sorting.
>
> [...]
Applied to perf-tools-next, thanks!
Best regards,
Namhyung
^ permalink raw reply [flat|nested] 8+ messages in thread