From: Matt Fleming <matt@codeblueprint.co.uk>
To: Jiri Olsa <jolsa@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>,
Arnaldo Carvalho de Melo <acme@redhat.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] perf ordered_events: Optimise event object reuse
Date: Wed, 20 May 2020 14:00:49 +0100 [thread overview]
Message-ID: <20200520130049.GC19431@codeblueprint.co.uk> (raw)
In-Reply-To: <20200518120408.GD3726797@krava>
On Mon, 18 May, at 02:04:08PM, Jiri Olsa wrote:
> On Fri, May 15, 2020 at 10:01:51PM +0100, Matt Fleming wrote:
> > ordered_event objects can be placed on the free object cache list in any
> > order which means future allocations may not return objects at
> > sequential locations in memory. Getting non-contiguous objects from the
> > free cache has bad consequences when later iterating over those objects
> > in ordered_events__queue().
> >
> > For example, large perf.data files can contain trillions of events and
> > since objects that are next to each other in the free cache linked list
> > can point to pretty much anywhere in the object address space, lots of
> > cycles in ordered_events__queue() are spent servicing DTLB misses.
> >
> > Implement the free object cache using the in-kernel implementation of
> > interval trees so that objects can always be allocated from the free
> > object cache in sequential order, improving spatial locality and
> > reducing DTLB misses.
> >
> > Here are some numbers showing the speed up (reducing in execution time)
> > when running perf sched latency on sched events data and perf report on
> > HW_CPU_CYCLES.
>
> really nice, few questions below
>
> >
> > $ perf stat --null -r 10 -- bash -c \
> > "export PAGER=cat ; perf sched latency -i $file --stdio &>/dev/null"
> >
> > Nr events File Size Before After Speed up
> > -------------- --------- -------- ------- ----------
> > 123318457470 29MB 0.2149 0.2440 -13.5%
>
> should we be concerned about small data and the extra processing?
I didn't look into this slowdown originally because it's ~2.9 ms, but
FYI it looks like this is caused by:
- Longer code paths (more instructions)
- More branches
- More branch mispredicts
> maybe we could add some option that disables this, at leat to be
> able to compare times in the future
Sure. Do you mean a command-line option or build-time config?
> > diff --git a/tools/include/linux/interval_tree_generic.h b/tools/include/linux/interval_tree_generic.h
> > new file mode 100644
> > index 000000000000..aaa8a0767aa3
> > --- /dev/null
> > +++ b/tools/include/linux/interval_tree_generic.h
> > @@ -0,0 +1,187 @@
> > +/* SPDX-License-Identifier: GPL-2.0-or-later */
> > +/*
> > + Interval Trees
> > + (C) 2012 Michel Lespinasse <walken@google.com>
> > +
> > +
> > + include/linux/interval_tree_generic.h
> > +*/
> > +
> > +#include <linux/rbtree_augmented.h>
> > +
> > +/*
> > + * Template for implementing interval trees
> > + *
> > + * ITSTRUCT: struct type of the interval tree nodes
> > + * ITRB: name of struct rb_node field within ITSTRUCT
> > + * ITTYPE: type of the interval endpoints
> > + * ITSUBTREE: name of ITTYPE field within ITSTRUCT holding last-in-subtree
> > + * ITSTART(n): start endpoint of ITSTRUCT node n
> > + * ITLAST(n): last endpoint of ITSTRUCT node n
> > + * ITSTATIC: 'static' or empty
> > + * ITPREFIX: prefix to use for the inline tree definitions
> > + *
> > + * Note - before using this, please consider if generic version
> > + * (interval_tree.h) would work for you...
>
> the interval_tree.h looks like what you have added with the generic
> header.. is there some reason to use the _generic header?
Yes, the _generic version contains the actual implementation of
interval trees, so you need both.
> please add the header file also to check-headers.sh
Done!
> > diff --git a/tools/perf/tests/free-object-cache.c b/tools/perf/tests/free-object-cache.c
> > new file mode 100644
> > index 000000000000..e4395ece7d2b
> > --- /dev/null
> > +++ b/tools/perf/tests/free-object-cache.c
> > @@ -0,0 +1,200 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#include "tests.h"
> > +#include <linux/kernel.h>
> > +
> > +#define ordered_events__flush_time __test_ordered_events__flush_time
> > +#define ordered_events__first_time __test_ordered_events__first_time
> > +#define ordered_events__delete __test_ordered_events__delete
> > +#define ordered_events__init __test_ordered_events__init
> > +#define ordered_events__free __test_ordered_events__free
> > +#define ordered_events__queue __test_ordered_events__queue
> > +#define ordered_events__reinit __test_ordered_events__reinit
> > +#define ordered_events__flush __test_ordered_events__flush
>
> I'm excited to see these tests, but why is above needed?
>
> can't you use ordered-events interface as it is? you used only
> exported functions right?
Nope, the tests in this file are unit tests so I'm testing
free_cache_{get,put} which are file-local functions by #include'ing
ordered-events.c.
The above define are required to avoid duplicate symbol errors at
link-time, e.g.
util/perf-in.o: In function `ordered_events__flush_time':
/home/matt/src/kernels/linux/tools/perf/util/ordered-events.c:461: multiple definition of `ordered_events__flush_time'
tests/perf-in.o:/home/matt/src/kernels/linux/tools/perf/tests/../util/ordered-events.c:461: first defined here
There are other ways to resolve this (linker flags to change the
symbols) but I couldn't find any precedent with that, so this seemed
like the easiest and most obvious solution. I'm happy to fix this up any
other way if you have suggestions though.
> > +static struct ordered_event *free_cache_get(struct ordered_events *oe)
> > +{
> > + struct interval_tree_node *it;
> > + struct ordered_event *new;
> > + size_t bytes = sizeof(*new);
> > +
> > + it = interval_tree_iter_first(&oe->cache, 0, ULONG_MAX);
> > + if (!it)
> > + return NULL;
> > +
> > + /* Has the cache memory been exhausted? */
> > + assert(cache_region_size(it) >= bytes);
> > +
> > + new = (void *)it->start;
> > + interval_tree_remove(it, &oe->cache);
> > +
> > + it->start += bytes;
> > + if (cache_region_size(it))
> > + interval_tree_insert(it, &oe->cache);
>
> in case we did not use the whole node, do we need to remove
> and insert back the node? it should be still the lowest node
> in the tree no? we could just fix 'start'
Hmm.. I originally did this remove/insert dance to be sure that the
rbtree semantics are maintained. But I think you're right that adjusting
->start cannot cause this rbtree node to move places in the tree and the
->start value isn't stored anywhere other than this interval_tree node
(a copy of ->last, on the other hand, can be stored in the parent node).
> > diff --git a/tools/perf/util/ordered-events.h b/tools/perf/util/ordered-events.h
> > index 0920fb0ec6cc..20d108baa572 100644
> > --- a/tools/perf/util/ordered-events.h
> > +++ b/tools/perf/util/ordered-events.h
> > @@ -3,9 +3,15 @@
> > #define __ORDERED_EVENTS_H
> >
> > #include <linux/types.h>
> > +#include <linux/interval_tree.h>
> >
> > struct perf_sample;
> >
> > +struct cache_region {
> > + struct interval_tree_node node;
> > + size_t len;
> > +};
>
> ^^^ this looks like some leftover
Oops :) Yes, you're right. This should not be here.
next prev parent reply other threads:[~2020-05-20 13:00 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-05-15 21:01 [PATCH] perf ordered_events: Optimise event object reuse Matt Fleming
2020-05-18 12:04 ` Jiri Olsa
2020-05-20 13:00 ` Matt Fleming [this message]
2020-05-20 21:52 ` Jiri Olsa
2020-05-21 14:41 ` Matt Fleming
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200520130049.GC19431@codeblueprint.co.uk \
--to=matt@codeblueprint.co.uk \
--cc=acme@redhat.com \
--cc=jolsa@kernel.org \
--cc=jolsa@redhat.com \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.