* [PATCH v3 0/2] perf tool: improves DSO long names search speed with RB tree
@ 2014-09-18 13:30 Waiman Long
2014-09-18 13:30 ` [PATCH v3 1/2] " Waiman Long
2014-09-18 13:30 ` [PATCH v3 2/2] perf tool: iterate DSOs with rbtree instead of list Waiman Long
0 siblings, 2 replies; 5+ messages in thread
From: Waiman Long @ 2014-09-18 13:30 UTC (permalink / raw)
To: Peter Zijlstra, Paul Mackerras, Ingo Molnar,
Arnaldo Carvalho de Melo
Cc: linux-kernel, Scott J Norton, Douglas Hatch, Don Zickus,
Jiri Olsa, Adrian Hunter, Waiman Long
v2->v3:
- Move the rbtree linking operation from dso__set_long_name() to
dsos__add(), where the list_add() operation was done.
- Add a second patch to remove the linked list and iterates the
DSO structures by going through them in the rbtree. This requires
changes in quite a number of files, but it makes for neater code.
- Rebased to the 3.17-rc5 kernel.
v1->v2:
- Rename DSO longname RBtree find function to segregate its two
different uses of searching and linking DSO into RB tree.
This patch set replaces the list that is linking the DSO structures
of the perf tool by rbtree sorted by its long name. This can
significantly speed up DSO processing when a large number of DSOs
are beining profiled.
Waiman Long (2):
perf tool: improves DSO long names search speed with RB tree
perf tool: iterate DSOs with rbtree instead of list
tools/perf/util/dso.c | 118 +++++++++++++++++++++++++++++++++--------
tools/perf/util/dso.h | 25 ++++++---
tools/perf/util/header.c | 36 ++++++------
tools/perf/util/machine.c | 14 ++---
tools/perf/util/machine.h | 4 +-
tools/perf/util/probe-event.c | 4 +-
tools/perf/util/symbol-elf.c | 2 +-
7 files changed, 142 insertions(+), 61 deletions(-)
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH v3 1/2] perf tool: improves DSO long names search speed with RB tree
2014-09-18 13:30 [PATCH v3 0/2] perf tool: improves DSO long names search speed with RB tree Waiman Long
@ 2014-09-18 13:30 ` Waiman Long
2014-09-18 15:10 ` Arnaldo Carvalho de Melo
2014-09-18 13:30 ` [PATCH v3 2/2] perf tool: iterate DSOs with rbtree instead of list Waiman Long
1 sibling, 1 reply; 5+ messages in thread
From: Waiman Long @ 2014-09-18 13:30 UTC (permalink / raw)
To: Peter Zijlstra, Paul Mackerras, Ingo Molnar,
Arnaldo Carvalho de Melo
Cc: linux-kernel, Scott J Norton, Douglas Hatch, Don Zickus,
Jiri Olsa, Adrian Hunter, Waiman Long
With workload that spawns and destroys many threads and processes,
it was found that perf-mem could took a long time to post-process
the perf data after the target workload had completed its operation.
The performance bottleneck was found to be searching and insertion
of the new DSO structures (thousands of them in this case).
In a dual-socket Ivy-Bridge E7-4890 v2 machine (30-core, 60-thread),
the perf profile below shows what perf was doing after the profiled
AIM7 shared workload completed:
- 83.94% perf libc-2.11.3.so [.] __strcmp_sse42
- __strcmp_sse42
- 99.82% map__new
machine__process_mmap_event
perf_session_deliver_event
perf_session__process_event
__perf_session__process_events
cmd_record
cmd_mem
run_builtin
main
__libc_start_main
- 13.17% perf perf [.] __dsos__findnew
__dsos__findnew
map__new
machine__process_mmap_event
perf_session_deliver_event
perf_session__process_event
__perf_session__process_events
cmd_record
cmd_mem
run_builtin
main
__libc_start_main
So about 97% of CPU times were spent in the map__new() function
trying to insert new DSO entry into the DSO linked list. The whole
post-processing step took about 9 minutes.
The DSO structures are currently searched linearly. So the total
processing time will be proportional to n^2.
To overcome this performance problem, the DSO code is modified to
also put the DSO structures in a RB tree sorted by its long name
in additional to being in a simple linked list. With this change,
the processing time will become proportional to n*log(n) which will
be much quicker for large n. However, the short name will still
be searched using the old linear searching method which is slow.
With that patch in place, the same perf-mem post-processing step took
less than 30 seconds to complete.
Signed-off-by: Waiman Long <Waiman.Long@hp.com>
---
tools/perf/util/dso.c | 87 ++++++++++++++++++++++++++++++++++++++++++++++--
tools/perf/util/dso.h | 2 +
2 files changed, 85 insertions(+), 4 deletions(-)
diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 90d02c6..6293f89 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -651,6 +651,80 @@ struct dso *dso__kernel_findnew(struct machine *machine, const char *name,
return dso;
}
+/*
+ * RB root of DSOs sorted by the long name
+ */
+struct rb_root dso__root = { NULL };
+
+/*
+ * Find a matching entry and/or link current entry to RB tree.
+ * Either one of the dso or name parameter must be non-NULL or the
+ * function will not work.
+ */
+static struct dso *dso__findlink_by_longname(struct rb_root *root,
+ struct dso *dso, const char *name)
+{
+ struct rb_node **p = &root->rb_node;
+ struct rb_node *parent = NULL;
+ int warned = false;
+
+ if (!name)
+ name = dso->long_name;
+ /*
+ * Find node with the matching name
+ */
+ while (*p) {
+ struct dso *this = rb_entry(*p, struct dso, rb_node);
+ long rc = (long)strcmp(name, this->long_name);
+
+ parent = *p;
+ if (rc == 0) {
+ /*
+ * In case the new DSO is a duplicate of an existing
+ * one, print an one-time warning & sort the entry
+ * by its DSO address.
+ */
+ if (!dso || (dso == this))
+ return this; /* Find matching dso */
+ /*
+ * The kernel DSOs may have duplicated long name,
+ * so don't print warning for them.
+ */
+ if (!warned && !strstr(name, "kernel.kallsyms")
+ && !strstr(name, "/vmlinux")) {
+ pr_warning("Duplicated dso long name: %s\n",
+ name);
+ warned = true;
+ }
+ rc = (long)dso - (long)this;
+ }
+ if (rc < 0)
+ p = &parent->rb_left;
+ else
+ p = &parent->rb_right;
+ }
+ if (dso) {
+ /* Add new node and rebalance tree */
+ rb_link_node(&dso->rb_node, parent, p);
+ rb_insert_color(&dso->rb_node, root);
+ }
+ return NULL;
+}
+
+static inline struct dso *
+dso__find_by_longname(struct rb_root *root, const char *name)
+{
+ return dso__findlink_by_longname(root, NULL, name);
+}
+
+/*
+ * Unlink the longname-sorted RB tree node
+ */
+static inline void dso__rb_unlink(struct rb_root *root, struct dso *dso)
+{
+ rb_erase(&dso->rb_node, root);
+}
+
void dso__set_long_name(struct dso *dso, const char *name, bool name_allocated)
{
if (name == NULL)
@@ -753,6 +827,8 @@ struct dso *dso__new(const char *name)
dso->a2l_fails = 1;
dso->kernel = DSO_TYPE_USER;
dso->needs_swap = DSO_SWAP__UNSET;
+ dso->rb_root = NULL;
+ RB_CLEAR_NODE(&dso->rb_node);
INIT_LIST_HEAD(&dso->node);
INIT_LIST_HEAD(&dso->data.open_entry);
}
@@ -775,6 +851,10 @@ void dso__delete(struct dso *dso)
zfree((char **)&dso->long_name);
dso->long_name_allocated = false;
}
+ if (dso->rb_root) {
+ dso__rb_unlink(dso->rb_root, dso);
+ dso->rb_root = NULL;
+ }
dso__data_close(dso);
dso_cache__free(&dso->data.cache);
@@ -852,6 +932,8 @@ bool __dsos__read_build_ids(struct list_head *head, bool with_hits)
void dsos__add(struct list_head *head, struct dso *dso)
{
list_add_tail(&dso->node, head);
+ dso__findlink_by_longname(&dso__root, dso, NULL);
+ dso->rb_root = &dso__root;
}
struct dso *dsos__find(const struct list_head *head, const char *name, bool cmp_short)
@@ -864,10 +946,7 @@ struct dso *dsos__find(const struct list_head *head, const char *name, bool cmp_
return pos;
return NULL;
}
- list_for_each_entry(pos, head, node)
- if (strcmp(pos->long_name, name) == 0)
- return pos;
- return NULL;
+ return dso__find_by_longname(&dso__root, name);
}
struct dso *__dsos__findnew(struct list_head *head, const char *name)
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 5e463c0..75cda1d 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -92,6 +92,8 @@ struct dso_cache {
struct dso {
struct list_head node;
+ struct rb_node rb_node; /* rbtree sorted by long name */
+ struct rb_root *rb_root; /* pointer to rbtree root */
struct rb_root symbols[MAP__NR_TYPES];
struct rb_root symbol_names[MAP__NR_TYPES];
void *a2l;
--
1.7.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH v3 2/2] perf tool: iterate DSOs with rbtree instead of list
2014-09-18 13:30 [PATCH v3 0/2] perf tool: improves DSO long names search speed with RB tree Waiman Long
2014-09-18 13:30 ` [PATCH v3 1/2] " Waiman Long
@ 2014-09-18 13:30 ` Waiman Long
1 sibling, 0 replies; 5+ messages in thread
From: Waiman Long @ 2014-09-18 13:30 UTC (permalink / raw)
To: Peter Zijlstra, Paul Mackerras, Ingo Molnar,
Arnaldo Carvalho de Melo
Cc: linux-kernel, Scott J Norton, Douglas Hatch, Don Zickus,
Jiri Olsa, Adrian Hunter, Waiman Long
This patch changes the iteration process for DSOs from using the
list_head to rbtree. As a result, the node field in the DSO structure
can be eliminated. The user_dsos and kernel_dsos fields of the machine
structure are now rb_root.
The changes in this patch are mainly in one of the following 4
categories:
1) Change list_head to rb_root and list to root
2) Change list_for_each_entry*() to
rbtree_postorder_for_each_entry_safe() and add a temporary variable
needed by the rbtree macro
3) Initialize the modified fields accordingly
4) Remove the global dso__root variable and use root to replace
dso__root
Signed-off-by: Waiman Long <Waiman.Long@hp.com>
---
tools/perf/util/dso.c | 47 +++++++++++++++++-----------------------
tools/perf/util/dso.h | 23 ++++++++++++++------
| 36 +++++++++++++++---------------
tools/perf/util/machine.c | 14 +++++-------
tools/perf/util/machine.h | 4 +-
tools/perf/util/probe-event.c | 4 +-
tools/perf/util/symbol-elf.c | 2 +-
7 files changed, 65 insertions(+), 65 deletions(-)
diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 6293f89..7597d88 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -652,11 +652,6 @@ struct dso *dso__kernel_findnew(struct machine *machine, const char *name,
}
/*
- * RB root of DSOs sorted by the long name
- */
-struct rb_root dso__root = { NULL };
-
-/*
* Find a matching entry and/or link current entry to RB tree.
* Either one of the dso or name parameter must be non-NULL or the
* function will not work.
@@ -829,7 +824,6 @@ struct dso *dso__new(const char *name)
dso->needs_swap = DSO_SWAP__UNSET;
dso->rb_root = NULL;
RB_CLEAR_NODE(&dso->rb_node);
- INIT_LIST_HEAD(&dso->node);
INIT_LIST_HEAD(&dso->data.open_entry);
}
@@ -907,12 +901,12 @@ int dso__kernel_module_get_build_id(struct dso *dso,
return 0;
}
-bool __dsos__read_build_ids(struct list_head *head, bool with_hits)
+bool __dsos__read_build_ids(struct rb_root *root, bool with_hits)
{
bool have_build_id = false;
- struct dso *pos;
+ struct dso *pos, *n;
- list_for_each_entry(pos, head, node) {
+ dso__for_each_entry_safe(pos, n, root) {
if (with_hits && !pos->hit)
continue;
if (pos->has_build_id) {
@@ -929,34 +923,33 @@ bool __dsos__read_build_ids(struct list_head *head, bool with_hits)
return have_build_id;
}
-void dsos__add(struct list_head *head, struct dso *dso)
+void dsos__add(struct rb_root *root, struct dso *dso)
{
- list_add_tail(&dso->node, head);
- dso__findlink_by_longname(&dso__root, dso, NULL);
- dso->rb_root = &dso__root;
+ dso__findlink_by_longname(root, dso, NULL);
+ dso->rb_root = root;
}
-struct dso *dsos__find(const struct list_head *head, const char *name, bool cmp_short)
+struct dso *dsos__find(const struct rb_root *root, const char *name, bool cmp_short)
{
- struct dso *pos;
-
if (cmp_short) {
- list_for_each_entry(pos, head, node)
+ struct dso *pos, *n;
+
+ dso__for_each_entry_safe(pos, n, root)
if (strcmp(pos->short_name, name) == 0)
return pos;
return NULL;
}
- return dso__find_by_longname(&dso__root, name);
+ return dso__find_by_longname((struct rb_root *)root, name);
}
-struct dso *__dsos__findnew(struct list_head *head, const char *name)
+struct dso *__dsos__findnew(struct rb_root *root, const char *name)
{
- struct dso *dso = dsos__find(head, name, false);
+ struct dso *dso = dsos__find(root, name, false);
if (!dso) {
dso = dso__new(name);
if (dso != NULL) {
- dsos__add(head, dso);
+ dsos__add(root, dso);
dso__set_basename(dso);
}
}
@@ -964,13 +957,13 @@ struct dso *__dsos__findnew(struct list_head *head, const char *name)
return dso;
}
-size_t __dsos__fprintf_buildid(struct list_head *head, FILE *fp,
+size_t __dsos__fprintf_buildid(struct rb_root *root, FILE *fp,
bool (skip)(struct dso *dso, int parm), int parm)
{
- struct dso *pos;
+ struct dso *pos, *n;
size_t ret = 0;
- list_for_each_entry(pos, head, node) {
+ dso__for_each_entry_safe(pos, n, root) {
if (skip && skip(pos, parm))
continue;
ret += dso__fprintf_buildid(pos, fp);
@@ -979,12 +972,12 @@ size_t __dsos__fprintf_buildid(struct list_head *head, FILE *fp,
return ret;
}
-size_t __dsos__fprintf(struct list_head *head, FILE *fp)
+size_t __dsos__fprintf(struct rb_root *root, FILE *fp)
{
- struct dso *pos;
+ struct dso *pos, *n;
size_t ret = 0;
- list_for_each_entry(pos, head, node) {
+ dso__for_each_entry_safe(pos, n, root) {
int i;
for (i = 0; i < MAP__NR_TYPES; ++i)
ret += dso__fprintf(pos, i, fp);
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 75cda1d..23fff82 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -91,7 +91,6 @@ struct dso_cache {
};
struct dso {
- struct list_head node;
struct rb_node rb_node; /* rbtree sorted by long name */
struct rb_root *rb_root; /* pointer to rbtree root */
struct rb_root symbols[MAP__NR_TYPES];
@@ -143,6 +142,16 @@ struct dso {
#define dso__for_each_symbol(dso, pos, n, type) \
symbols__for_each_entry(&(dso)->symbols[(type)], pos, n)
+/*
+ * dso__for_each_entry_safe - iterate over all the dso entries in the rbtree
+ *
+ * @root: the rbtree root pointer
+ * @pos : the 'struct dso *' to use as a loop cursor
+ * @n : the 'struct dso *' to use as a temporary storage
+ */
+#define dso__for_each_entry_safe(root, pos, n) \
+ rbtree_postorder_for_each_entry_safe(root, pos, n, rb_node)
+
static inline void dso__set_loaded(struct dso *dso, enum map_type type)
{
dso->loaded |= (1 << type);
@@ -226,15 +235,15 @@ struct map *dso__new_map(const char *name);
struct dso *dso__kernel_findnew(struct machine *machine, const char *name,
const char *short_name, int dso_type);
-void dsos__add(struct list_head *head, struct dso *dso);
-struct dso *dsos__find(const struct list_head *head, const char *name,
+void dsos__add(struct rb_root *root, struct dso *dso);
+struct dso *dsos__find(const struct rb_root *root, const char *name,
bool cmp_short);
-struct dso *__dsos__findnew(struct list_head *head, const char *name);
-bool __dsos__read_build_ids(struct list_head *head, bool with_hits);
+struct dso *__dsos__findnew(struct rb_root *root, const char *name);
+bool __dsos__read_build_ids(struct rb_root *root, bool with_hits);
-size_t __dsos__fprintf_buildid(struct list_head *head, FILE *fp,
+size_t __dsos__fprintf_buildid(struct rb_root *root, FILE *fp,
bool (skip)(struct dso *dso, int parm), int parm);
-size_t __dsos__fprintf(struct list_head *head, FILE *fp);
+size_t __dsos__fprintf(struct rb_root *root, FILE *fp);
size_t dso__fprintf_buildid(struct dso *dso, FILE *fp);
size_t dso__fprintf_symbols_by_name(struct dso *dso,
--git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 158c787..636c183 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -171,10 +171,10 @@ perf_header__set_cmdline(int argc, const char **argv)
return 0;
}
-#define dsos__for_each_with_build_id(pos, head) \
- list_for_each_entry(pos, head, node) \
- if (!pos->has_build_id) \
- continue; \
+#define dsos__for_each_with_build_id(pos, n, root) \
+ dso__for_each_entry_safe(pos, n, root) \
+ if (!pos->has_build_id) \
+ continue; \
else
static int write_buildid(const char *name, size_t name_len, u8 *build_id,
@@ -200,11 +200,11 @@ static int write_buildid(const char *name, size_t name_len, u8 *build_id,
return write_padded(fd, name, name_len + 1, len);
}
-static int __dsos__hit_all(struct list_head *head)
+static int __dsos__hit_all(struct rb_root *root)
{
- struct dso *pos;
+ struct dso *pos, *n;
- list_for_each_entry(pos, head, node)
+ dso__for_each_entry_safe(pos, n, root)
pos->hit = true;
return 0;
@@ -241,14 +241,14 @@ int dsos__hit_all(struct perf_session *session)
return 0;
}
-static int __dsos__write_buildid_table(struct list_head *head,
+static int __dsos__write_buildid_table(struct rb_root *root,
struct machine *machine,
pid_t pid, u16 misc, int fd)
{
char nm[PATH_MAX];
- struct dso *pos;
+ struct dso *pos, *n;
- dsos__for_each_with_build_id(pos, head) {
+ dsos__for_each_with_build_id(pos, n, root) {
int err;
const char *name;
size_t name_len;
@@ -440,13 +440,13 @@ static int dso__cache_build_id(struct dso *dso, struct machine *machine,
debugdir, is_kallsyms, is_vdso);
}
-static int __dsos__cache_build_ids(struct list_head *head,
+static int __dsos__cache_build_ids(struct rb_root *root,
struct machine *machine, const char *debugdir)
{
- struct dso *pos;
+ struct dso *pos, *n;
int err = 0;
- dsos__for_each_with_build_id(pos, head)
+ dsos__for_each_with_build_id(pos, n, root)
if (dso__cache_build_id(pos, machine, debugdir))
err = -1;
@@ -1548,7 +1548,7 @@ static int __event_process_build_id(struct build_id_event *bev,
struct perf_session *session)
{
int err = -1;
- struct list_head *head;
+ struct rb_root *root;
struct machine *machine;
u16 misc;
struct dso *dso;
@@ -1563,22 +1563,22 @@ static int __event_process_build_id(struct build_id_event *bev,
switch (misc) {
case PERF_RECORD_MISC_KERNEL:
dso_type = DSO_TYPE_KERNEL;
- head = &machine->kernel_dsos;
+ root = &machine->kernel_dsos;
break;
case PERF_RECORD_MISC_GUEST_KERNEL:
dso_type = DSO_TYPE_GUEST_KERNEL;
- head = &machine->kernel_dsos;
+ root = &machine->kernel_dsos;
break;
case PERF_RECORD_MISC_USER:
case PERF_RECORD_MISC_GUEST_USER:
dso_type = DSO_TYPE_USER;
- head = &machine->user_dsos;
+ root = &machine->user_dsos;
break;
default:
goto out;
}
- dso = __dsos__findnew(head, filename);
+ dso = __dsos__findnew(root, filename);
if (dso != NULL) {
char sbuild_id[BUILD_ID_SIZE * 2 + 1];
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 16bba9f..317f50b 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -17,8 +17,8 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid)
{
map_groups__init(&machine->kmaps);
RB_CLEAR_NODE(&machine->rb_node);
- INIT_LIST_HEAD(&machine->user_dsos);
- INIT_LIST_HEAD(&machine->kernel_dsos);
+ machine->user_dsos = RB_ROOT;
+ machine->kernel_dsos = RB_ROOT;
machine->threads = RB_ROOT;
INIT_LIST_HEAD(&machine->dead_threads);
@@ -70,14 +70,12 @@ out_delete:
return NULL;
}
-static void dsos__delete(struct list_head *dsos)
+static void dsos__delete(struct rb_root *dsos)
{
struct dso *pos, *n;
- list_for_each_entry_safe(pos, n, dsos, node) {
- list_del(&pos->node);
+ dso__for_each_entry_safe(pos, n, dsos)
dso__delete(pos);
- }
}
void machine__delete_dead_threads(struct machine *machine)
@@ -963,9 +961,9 @@ static void machine__set_kernel_mmap_len(struct machine *machine,
static bool machine__uses_kcore(struct machine *machine)
{
- struct dso *dso;
+ struct dso *dso, *n;
- list_for_each_entry(dso, &machine->kernel_dsos, node) {
+ dso__for_each_entry_safe(dso, n, &machine->kernel_dsos) {
if (dso__is_kcore(dso))
return true;
}
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index b972824..c75ce9e 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -31,8 +31,8 @@ struct machine {
struct list_head dead_threads;
struct thread *last_match;
struct vdso_info *vdso_info;
- struct list_head user_dsos;
- struct list_head kernel_dsos;
+ struct rb_root user_dsos;
+ struct rb_root kernel_dsos;
struct map_groups kmaps;
struct map *vmlinux_maps[MAP__NR_TYPES];
symbol_filter_t symbol_filter;
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 9a0a183..b093d25 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -179,12 +179,12 @@ static struct map *kernel_get_module_map(const char *module)
static struct dso *kernel_get_module_dso(const char *module)
{
- struct dso *dso;
+ struct dso *dso, *n;
struct map *map;
const char *vmlinux_name;
if (module) {
- list_for_each_entry(dso, &host_machine->kernel_dsos, node) {
+ dso__for_each_entry_safe(dso, n, &host_machine->kernel_dsos) {
if (strncmp(dso->short_name + 1, module,
dso->short_name_len - 2) == 0)
goto found;
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index d753499..97b0d18 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -916,7 +916,7 @@ int dso__load_sym(struct dso *dso, struct map *map,
}
curr_dso->symtab_type = dso->symtab_type;
map_groups__insert(kmap->kmaps, curr_map);
- dsos__add(&dso->node, curr_dso);
+ dsos__add(dso->rb_root, curr_dso);
dso__set_loaded(curr_dso, map->type);
} else
curr_dso = curr_map->dso;
--
1.7.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v3 1/2] perf tool: improves DSO long names search speed with RB tree
2014-09-18 13:30 ` [PATCH v3 1/2] " Waiman Long
@ 2014-09-18 15:10 ` Arnaldo Carvalho de Melo
2014-09-24 15:29 ` Waiman Long
0 siblings, 1 reply; 5+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-18 15:10 UTC (permalink / raw)
To: Waiman Long
Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, linux-kernel,
Scott J Norton, Douglas Hatch, Don Zickus, Jiri Olsa,
Adrian Hunter
Em Thu, Sep 18, 2014 at 09:30:20AM -0400, Waiman Long escreveu:
> +++ b/tools/perf/util/dso.c
> @@ -651,6 +651,80 @@ struct dso *dso__kernel_findnew(struct machine *machine, const char *name,
> return dso;
> +/*
> + * RB root of DSOs sorted by the long name
> + */
> +struct rb_root dso__root = { NULL };
Why do we still have a global variable for this? I thought that we would
be having this in struct machine?
Ok, I shouldn't have done this, but I went on and looked at the second
patch, and there, this goes away, why not avoid introducing the global
in the first place?
I.e. the existing code operates on a data structure that holds struct
dsos, you are switching to a new data structure, so it looked natural to
me to do this in one step, no?
Also at some point I thought about adding rb_tree helper functions to do
some rb__for_each() like operation, i.e. to sequentially access the
rb_tree instead of using it for searching using its key. PeterZ
rightfully nacked that because that would, IIRC, encourage people to use
a rb_tree to do linear searches for normal operation, i.e. not just for
rb_tree__printf() dump like routines:
https://lkml.org/lkml/2010/1/13/227
Also I saw at least one place where some foo__for_each_entry_safe() is
used but the loop doesn't look like it will remove/add anything to the
data structure that is being made "_safe", i.e. it should remain
foo__for_each_entry(), as it was before.
So, I would _keep_ the list_head, or else replace it with a another
rb_node to do lookups on it by shortname the same way we do for long
names.
The cheapest thing now would be for solving your problem, i.e. use a
rb_tree for searching for long names, keep the list_head for short names
linear searches.
I suggest having a
struct dsos {
struct list_head short_names;
struct rb_root long_names;
};
Then make struct machine use this type for:
struct dsos kernel_dsos, user_dsos;
Then all those dsos__find* routines stop receiving a list_head pointer
and start receiving a "struct dsos" instance.
That way it can add the dso to both containers, the one "sorted" by
short names (that linear search, just like before) and to the rb_tree
sorted by long names.
- Arnaldo
> +/*
> + * Find a matching entry and/or link current entry to RB tree.
> + * Either one of the dso or name parameter must be non-NULL or the
> + * function will not work.
> + */
> +static struct dso *dso__findlink_by_longname(struct rb_root *root,
> + struct dso *dso, const char *name)
> +{
> + struct rb_node **p = &root->rb_node;
> + struct rb_node *parent = NULL;
> + int warned = false;
> +
> + if (!name)
> + name = dso->long_name;
> + /*
> + * Find node with the matching name
> + */
> + while (*p) {
> + struct dso *this = rb_entry(*p, struct dso, rb_node);
> + long rc = (long)strcmp(name, this->long_name);
> +
> + parent = *p;
> + if (rc == 0) {
> + /*
> + * In case the new DSO is a duplicate of an existing
> + * one, print an one-time warning & sort the entry
> + * by its DSO address.
> + */
> + if (!dso || (dso == this))
> + return this; /* Find matching dso */
> + /*
> + * The kernel DSOs may have duplicated long name,
> + * so don't print warning for them.
> + */
> + if (!warned && !strstr(name, "kernel.kallsyms")
> + && !strstr(name, "/vmlinux")) {
> + pr_warning("Duplicated dso long name: %s\n",
> + name);
> + warned = true;
> + }
> + rc = (long)dso - (long)this;
> + }
> + if (rc < 0)
> + p = &parent->rb_left;
> + else
> + p = &parent->rb_right;
> + }
> + if (dso) {
> + /* Add new node and rebalance tree */
> + rb_link_node(&dso->rb_node, parent, p);
> + rb_insert_color(&dso->rb_node, root);
> + }
> + return NULL;
> +}
> +
> +static inline struct dso *
> +dso__find_by_longname(struct rb_root *root, const char *name)
> +{
> + return dso__findlink_by_longname(root, NULL, name);
> +}
> +
> +/*
> + * Unlink the longname-sorted RB tree node
> + */
> +static inline void dso__rb_unlink(struct rb_root *root, struct dso *dso)
> +{
> + rb_erase(&dso->rb_node, root);
> +}
> +
> void dso__set_long_name(struct dso *dso, const char *name, bool name_allocated)
> {
> if (name == NULL)
> @@ -753,6 +827,8 @@ struct dso *dso__new(const char *name)
> dso->a2l_fails = 1;
> dso->kernel = DSO_TYPE_USER;
> dso->needs_swap = DSO_SWAP__UNSET;
> + dso->rb_root = NULL;
> + RB_CLEAR_NODE(&dso->rb_node);
> INIT_LIST_HEAD(&dso->node);
> INIT_LIST_HEAD(&dso->data.open_entry);
> }
> @@ -775,6 +851,10 @@ void dso__delete(struct dso *dso)
> zfree((char **)&dso->long_name);
> dso->long_name_allocated = false;
> }
> + if (dso->rb_root) {
> + dso__rb_unlink(dso->rb_root, dso);
> + dso->rb_root = NULL;
> + }
>
> dso__data_close(dso);
> dso_cache__free(&dso->data.cache);
> @@ -852,6 +932,8 @@ bool __dsos__read_build_ids(struct list_head *head, bool with_hits)
> void dsos__add(struct list_head *head, struct dso *dso)
> {
> list_add_tail(&dso->node, head);
> + dso__findlink_by_longname(&dso__root, dso, NULL);
> + dso->rb_root = &dso__root;
> }
>
> struct dso *dsos__find(const struct list_head *head, const char *name, bool cmp_short)
> @@ -864,10 +946,7 @@ struct dso *dsos__find(const struct list_head *head, const char *name, bool cmp_
> return pos;
> return NULL;
> }
> - list_for_each_entry(pos, head, node)
> - if (strcmp(pos->long_name, name) == 0)
> - return pos;
> - return NULL;
> + return dso__find_by_longname(&dso__root, name);
> }
>
> struct dso *__dsos__findnew(struct list_head *head, const char *name)
> diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
> index 5e463c0..75cda1d 100644
> --- a/tools/perf/util/dso.h
> +++ b/tools/perf/util/dso.h
> @@ -92,6 +92,8 @@ struct dso_cache {
>
> struct dso {
> struct list_head node;
> + struct rb_node rb_node; /* rbtree sorted by long name */
> + struct rb_root *rb_root; /* pointer to rbtree root */
> struct rb_root symbols[MAP__NR_TYPES];
> struct rb_root symbol_names[MAP__NR_TYPES];
> void *a2l;
> --
> 1.7.1
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v3 1/2] perf tool: improves DSO long names search speed with RB tree
2014-09-18 15:10 ` Arnaldo Carvalho de Melo
@ 2014-09-24 15:29 ` Waiman Long
0 siblings, 0 replies; 5+ messages in thread
From: Waiman Long @ 2014-09-24 15:29 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, linux-kernel,
Scott J Norton, Douglas Hatch, Don Zickus, Jiri Olsa,
Adrian Hunter
On 09/18/2014 11:10 AM, Arnaldo Carvalho de Melo wrote:
> Em Thu, Sep 18, 2014 at 09:30:20AM -0400, Waiman Long escreveu:
>> +++ b/tools/perf/util/dso.c
>> @@ -651,6 +651,80 @@ struct dso *dso__kernel_findnew(struct machine *machine, const char *name,
>> return dso;
>
>> +/*
>> + * RB root of DSOs sorted by the long name
>> + */
>> +struct rb_root dso__root = { NULL };
> Why do we still have a global variable for this? I thought that we would
> be having this in struct machine?
>
> Ok, I shouldn't have done this, but I went on and looked at the second
> patch, and there, this goes away, why not avoid introducing the global
> in the first place?
The global variable was added to make the first patch compilable by
itself. I will take this out in the next version of the patch.
> I.e. the existing code operates on a data structure that holds struct
> dsos, you are switching to a new data structure, so it looked natural to
> me to do this in one step, no?
Yes, I think that makes sense.
> Also at some point I thought about adding rb_tree helper functions to do
> some rb__for_each() like operation, i.e. to sequentially access the
> rb_tree instead of using it for searching using its key. PeterZ
> rightfully nacked that because that would, IIRC, encourage people to use
> a rb_tree to do linear searches for normal operation, i.e. not just for
> rb_tree__printf() dump like routines:
>
> https://lkml.org/lkml/2010/1/13/227
>
> Also I saw at least one place where some foo__for_each_entry_safe() is
> used but the loop doesn't look like it will remove/add anything to the
> data structure that is being made "_safe", i.e. it should remain
> foo__for_each_entry(), as it was before.
>
> So, I would _keep_ the list_head, or else replace it with a another
> rb_node to do lookups on it by shortname the same way we do for long
> names.
>
> The cheapest thing now would be for solving your problem, i.e. use a
> rb_tree for searching for long names, keep the list_head for short names
> linear searches.
>
> I suggest having a
>
> struct dsos {
> struct list_head short_names;
> struct rb_root long_names;
> };
>
> Then make struct machine use this type for:
>
> struct dsos kernel_dsos, user_dsos;
>
> Then all those dsos__find* routines stop receiving a list_head pointer
> and start receiving a "struct dsos" instance.
>
> That way it can add the dso to both containers, the one "sorted" by
> short names (that linear search, just like before) and to the rb_tree
> sorted by long names.
>
> - Arnaldo
I think this is a good idea. I will incorporate that in my next patch.
BTW, the list isn't sorted in any way. So I won't use the same structure
field names as you have suggested.
-Longman
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2014-09-24 15:29 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-09-18 13:30 [PATCH v3 0/2] perf tool: improves DSO long names search speed with RB tree Waiman Long
2014-09-18 13:30 ` [PATCH v3 1/2] " Waiman Long
2014-09-18 15:10 ` Arnaldo Carvalho de Melo
2014-09-24 15:29 ` Waiman Long
2014-09-18 13:30 ` [PATCH v3 2/2] perf tool: iterate DSOs with rbtree instead of list Waiman Long
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox