public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/2] perf tool: improves DSO long names search speed with RB tree
@ 2014-09-18 13:30 Waiman Long
  2014-09-18 13:30 ` [PATCH v3 1/2] " Waiman Long
  2014-09-18 13:30 ` [PATCH v3 2/2] perf tool: iterate DSOs with rbtree instead of list Waiman Long
  0 siblings, 2 replies; 5+ messages in thread
From: Waiman Long @ 2014-09-18 13:30 UTC (permalink / raw)
  To: Peter Zijlstra, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo
  Cc: linux-kernel, Scott J Norton, Douglas Hatch, Don Zickus,
	Jiri Olsa, Adrian Hunter, Waiman Long

v2->v3:
  - Move the rbtree linking operation from dso__set_long_name() to
    dsos__add(), where the list_add() operation was done.
  - Add a second patch to remove the linked list and iterates the
    DSO structures by going through them in the rbtree. This requires
    changes in quite a number of files, but it makes for neater code.
  - Rebased to the 3.17-rc5 kernel.

v1->v2:
 - Rename DSO longname RBtree find function to segregate its two
   different uses of searching and linking DSO into RB tree.

This patch set replaces the list that is linking the DSO structures
of the perf tool by rbtree sorted by its long name. This can
significantly speed up DSO processing when a large number of DSOs
are beining profiled.

Waiman Long (2):
  perf tool: improves DSO long names search speed with RB     tree
  perf tool: iterate DSOs with rbtree instead of list

 tools/perf/util/dso.c         |  118 +++++++++++++++++++++++++++++++++--------
 tools/perf/util/dso.h         |   25 ++++++---
 tools/perf/util/header.c      |   36 ++++++------
 tools/perf/util/machine.c     |   14 ++---
 tools/perf/util/machine.h     |    4 +-
 tools/perf/util/probe-event.c |    4 +-
 tools/perf/util/symbol-elf.c  |    2 +-
 7 files changed, 142 insertions(+), 61 deletions(-)


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v3 1/2] perf tool: improves DSO long names search speed with RB tree
  2014-09-18 13:30 [PATCH v3 0/2] perf tool: improves DSO long names search speed with RB tree Waiman Long
@ 2014-09-18 13:30 ` Waiman Long
  2014-09-18 15:10   ` Arnaldo Carvalho de Melo
  2014-09-18 13:30 ` [PATCH v3 2/2] perf tool: iterate DSOs with rbtree instead of list Waiman Long
  1 sibling, 1 reply; 5+ messages in thread
From: Waiman Long @ 2014-09-18 13:30 UTC (permalink / raw)
  To: Peter Zijlstra, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo
  Cc: linux-kernel, Scott J Norton, Douglas Hatch, Don Zickus,
	Jiri Olsa, Adrian Hunter, Waiman Long

With workload that spawns and destroys many threads and processes,
it was found that perf-mem could took a long time to post-process
the perf data after the target workload had completed its operation.
The performance bottleneck was found to be searching and insertion
of the new DSO structures (thousands of them in this case).

In a dual-socket Ivy-Bridge E7-4890 v2 machine (30-core, 60-thread),
the perf profile below shows what perf was doing after the profiled
AIM7 shared workload completed:

-     83.94%  perf  libc-2.11.3.so     [.] __strcmp_sse42
   - __strcmp_sse42
      - 99.82% map__new
           machine__process_mmap_event
           perf_session_deliver_event
           perf_session__process_event
           __perf_session__process_events
           cmd_record
           cmd_mem
           run_builtin
           main
           __libc_start_main
-     13.17%  perf  perf               [.] __dsos__findnew
     __dsos__findnew
     map__new
     machine__process_mmap_event
     perf_session_deliver_event
     perf_session__process_event
     __perf_session__process_events
     cmd_record
     cmd_mem
     run_builtin
     main
     __libc_start_main

So about 97% of CPU times were spent in the map__new() function
trying to insert new DSO entry into the DSO linked list. The whole
post-processing step took about 9 minutes.

The DSO structures are currently searched linearly. So the total
processing time will be proportional to n^2.

To overcome this performance problem, the DSO code is modified to
also put the DSO structures in a RB tree sorted by its long name
in additional to being in a simple linked list. With this change,
the processing time will become proportional to n*log(n) which will
be much quicker for large n. However, the short name will still
be searched using the old linear searching method which is slow.
With that patch in place, the same perf-mem post-processing step took
less than 30 seconds to complete.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
---
 tools/perf/util/dso.c |   87 ++++++++++++++++++++++++++++++++++++++++++++++--
 tools/perf/util/dso.h |    2 +
 2 files changed, 85 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 90d02c6..6293f89 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -651,6 +651,80 @@ struct dso *dso__kernel_findnew(struct machine *machine, const char *name,
 	return dso;
 }
 
+/*
+ * RB root of DSOs sorted by the long name
+ */
+struct rb_root dso__root = { NULL };
+
+/*
+ * Find a matching entry and/or link current entry to RB tree.
+ * Either one of the dso or name parameter must be non-NULL or the
+ * function will not work.
+ */
+static struct dso *dso__findlink_by_longname(struct rb_root *root,
+					     struct dso *dso, const char *name)
+{
+	struct rb_node **p = &root->rb_node;
+	struct rb_node  *parent = NULL;
+	int warned = false;
+
+	if (!name)
+		name = dso->long_name;
+	/*
+	 * Find node with the matching name
+	 */
+	while (*p) {
+		struct dso *this = rb_entry(*p, struct dso, rb_node);
+		long rc = (long)strcmp(name, this->long_name);
+
+		parent = *p;
+		if (rc == 0) {
+			/*
+			 * In case the new DSO is a duplicate of an existing
+			 * one, print an one-time warning & sort the entry
+			 * by its DSO address.
+			 */
+			if (!dso || (dso == this))
+				return this;	/* Find matching dso */
+			/*
+			 * The kernel DSOs may have duplicated long name,
+			 * so don't print warning for them.
+			 */
+			if (!warned && !strstr(name, "kernel.kallsyms")
+				    && !strstr(name, "/vmlinux")) {
+				pr_warning("Duplicated dso long name: %s\n",
+					   name);
+				warned = true;
+			}
+			rc = (long)dso - (long)this;
+		}
+		if (rc < 0)
+			p = &parent->rb_left;
+		else
+			p = &parent->rb_right;
+	}
+	if (dso) {
+		/* Add new node and rebalance tree */
+		rb_link_node(&dso->rb_node, parent, p);
+		rb_insert_color(&dso->rb_node, root);
+	}
+	return NULL;
+}
+
+static inline struct dso *
+dso__find_by_longname(struct rb_root *root, const char *name)
+{
+	return dso__findlink_by_longname(root, NULL, name);
+}
+
+/*
+ * Unlink the longname-sorted RB tree node
+ */
+static inline void dso__rb_unlink(struct rb_root *root, struct dso *dso)
+{
+	rb_erase(&dso->rb_node, root);
+}
+
 void dso__set_long_name(struct dso *dso, const char *name, bool name_allocated)
 {
 	if (name == NULL)
@@ -753,6 +827,8 @@ struct dso *dso__new(const char *name)
 		dso->a2l_fails = 1;
 		dso->kernel = DSO_TYPE_USER;
 		dso->needs_swap = DSO_SWAP__UNSET;
+		dso->rb_root = NULL;
+		RB_CLEAR_NODE(&dso->rb_node);
 		INIT_LIST_HEAD(&dso->node);
 		INIT_LIST_HEAD(&dso->data.open_entry);
 	}
@@ -775,6 +851,10 @@ void dso__delete(struct dso *dso)
 		zfree((char **)&dso->long_name);
 		dso->long_name_allocated = false;
 	}
+	if (dso->rb_root) {
+		dso__rb_unlink(dso->rb_root, dso);
+		dso->rb_root = NULL;
+	}
 
 	dso__data_close(dso);
 	dso_cache__free(&dso->data.cache);
@@ -852,6 +932,8 @@ bool __dsos__read_build_ids(struct list_head *head, bool with_hits)
 void dsos__add(struct list_head *head, struct dso *dso)
 {
 	list_add_tail(&dso->node, head);
+	dso__findlink_by_longname(&dso__root, dso, NULL);
+	dso->rb_root = &dso__root;
 }
 
 struct dso *dsos__find(const struct list_head *head, const char *name, bool cmp_short)
@@ -864,10 +946,7 @@ struct dso *dsos__find(const struct list_head *head, const char *name, bool cmp_
 				return pos;
 		return NULL;
 	}
-	list_for_each_entry(pos, head, node)
-		if (strcmp(pos->long_name, name) == 0)
-			return pos;
-	return NULL;
+	return dso__find_by_longname(&dso__root, name);
 }
 
 struct dso *__dsos__findnew(struct list_head *head, const char *name)
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 5e463c0..75cda1d 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -92,6 +92,8 @@ struct dso_cache {
 
 struct dso {
 	struct list_head node;
+	struct rb_node   rb_node;	/* rbtree sorted by long name */
+	struct rb_root   *rb_root;	/* pointer to rbtree root */
 	struct rb_root	 symbols[MAP__NR_TYPES];
 	struct rb_root	 symbol_names[MAP__NR_TYPES];
 	void		 *a2l;
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v3 2/2] perf tool: iterate DSOs with rbtree instead of list
  2014-09-18 13:30 [PATCH v3 0/2] perf tool: improves DSO long names search speed with RB tree Waiman Long
  2014-09-18 13:30 ` [PATCH v3 1/2] " Waiman Long
@ 2014-09-18 13:30 ` Waiman Long
  1 sibling, 0 replies; 5+ messages in thread
From: Waiman Long @ 2014-09-18 13:30 UTC (permalink / raw)
  To: Peter Zijlstra, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo
  Cc: linux-kernel, Scott J Norton, Douglas Hatch, Don Zickus,
	Jiri Olsa, Adrian Hunter, Waiman Long

This patch changes the iteration process for DSOs from using the
list_head to rbtree. As a result, the node field in the DSO structure
can be eliminated. The user_dsos and kernel_dsos fields of the machine
structure are now rb_root.

The changes in this patch are mainly in one of the following 4
categories:

 1) Change list_head to rb_root and list to root
 2) Change list_for_each_entry*() to
    rbtree_postorder_for_each_entry_safe() and add a temporary variable
    needed by the rbtree macro
 3) Initialize the modified fields accordingly
 4) Remove the global dso__root variable and use root to replace
    dso__root

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
---
 tools/perf/util/dso.c         |   47 +++++++++++++++++-----------------------
 tools/perf/util/dso.h         |   23 ++++++++++++++------
 tools/perf/util/header.c      |   36 +++++++++++++++---------------
 tools/perf/util/machine.c     |   14 +++++-------
 tools/perf/util/machine.h     |    4 +-
 tools/perf/util/probe-event.c |    4 +-
 tools/perf/util/symbol-elf.c  |    2 +-
 7 files changed, 65 insertions(+), 65 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 6293f89..7597d88 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -652,11 +652,6 @@ struct dso *dso__kernel_findnew(struct machine *machine, const char *name,
 }
 
 /*
- * RB root of DSOs sorted by the long name
- */
-struct rb_root dso__root = { NULL };
-
-/*
  * Find a matching entry and/or link current entry to RB tree.
  * Either one of the dso or name parameter must be non-NULL or the
  * function will not work.
@@ -829,7 +824,6 @@ struct dso *dso__new(const char *name)
 		dso->needs_swap = DSO_SWAP__UNSET;
 		dso->rb_root = NULL;
 		RB_CLEAR_NODE(&dso->rb_node);
-		INIT_LIST_HEAD(&dso->node);
 		INIT_LIST_HEAD(&dso->data.open_entry);
 	}
 
@@ -907,12 +901,12 @@ int dso__kernel_module_get_build_id(struct dso *dso,
 	return 0;
 }
 
-bool __dsos__read_build_ids(struct list_head *head, bool with_hits)
+bool __dsos__read_build_ids(struct rb_root *root, bool with_hits)
 {
 	bool have_build_id = false;
-	struct dso *pos;
+	struct dso *pos, *n;
 
-	list_for_each_entry(pos, head, node) {
+	dso__for_each_entry_safe(pos, n, root) {
 		if (with_hits && !pos->hit)
 			continue;
 		if (pos->has_build_id) {
@@ -929,34 +923,33 @@ bool __dsos__read_build_ids(struct list_head *head, bool with_hits)
 	return have_build_id;
 }
 
-void dsos__add(struct list_head *head, struct dso *dso)
+void dsos__add(struct rb_root *root, struct dso *dso)
 {
-	list_add_tail(&dso->node, head);
-	dso__findlink_by_longname(&dso__root, dso, NULL);
-	dso->rb_root = &dso__root;
+	dso__findlink_by_longname(root, dso, NULL);
+	dso->rb_root = root;
 }
 
-struct dso *dsos__find(const struct list_head *head, const char *name, bool cmp_short)
+struct dso *dsos__find(const struct rb_root *root, const char *name, bool cmp_short)
 {
-	struct dso *pos;
-
 	if (cmp_short) {
-		list_for_each_entry(pos, head, node)
+		struct dso *pos, *n;
+
+		dso__for_each_entry_safe(pos, n, root)
 			if (strcmp(pos->short_name, name) == 0)
 				return pos;
 		return NULL;
 	}
-	return dso__find_by_longname(&dso__root, name);
+	return dso__find_by_longname((struct rb_root *)root, name);
 }
 
-struct dso *__dsos__findnew(struct list_head *head, const char *name)
+struct dso *__dsos__findnew(struct rb_root *root, const char *name)
 {
-	struct dso *dso = dsos__find(head, name, false);
+	struct dso *dso = dsos__find(root, name, false);
 
 	if (!dso) {
 		dso = dso__new(name);
 		if (dso != NULL) {
-			dsos__add(head, dso);
+			dsos__add(root, dso);
 			dso__set_basename(dso);
 		}
 	}
@@ -964,13 +957,13 @@ struct dso *__dsos__findnew(struct list_head *head, const char *name)
 	return dso;
 }
 
-size_t __dsos__fprintf_buildid(struct list_head *head, FILE *fp,
+size_t __dsos__fprintf_buildid(struct rb_root *root, FILE *fp,
 			       bool (skip)(struct dso *dso, int parm), int parm)
 {
-	struct dso *pos;
+	struct dso *pos, *n;
 	size_t ret = 0;
 
-	list_for_each_entry(pos, head, node) {
+	dso__for_each_entry_safe(pos, n, root) {
 		if (skip && skip(pos, parm))
 			continue;
 		ret += dso__fprintf_buildid(pos, fp);
@@ -979,12 +972,12 @@ size_t __dsos__fprintf_buildid(struct list_head *head, FILE *fp,
 	return ret;
 }
 
-size_t __dsos__fprintf(struct list_head *head, FILE *fp)
+size_t __dsos__fprintf(struct rb_root *root, FILE *fp)
 {
-	struct dso *pos;
+	struct dso *pos, *n;
 	size_t ret = 0;
 
-	list_for_each_entry(pos, head, node) {
+	dso__for_each_entry_safe(pos, n, root) {
 		int i;
 		for (i = 0; i < MAP__NR_TYPES; ++i)
 			ret += dso__fprintf(pos, i, fp);
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 75cda1d..23fff82 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -91,7 +91,6 @@ struct dso_cache {
 };
 
 struct dso {
-	struct list_head node;
 	struct rb_node   rb_node;	/* rbtree sorted by long name */
 	struct rb_root   *rb_root;	/* pointer to rbtree root */
 	struct rb_root	 symbols[MAP__NR_TYPES];
@@ -143,6 +142,16 @@ struct dso {
 #define dso__for_each_symbol(dso, pos, n, type)	\
 	symbols__for_each_entry(&(dso)->symbols[(type)], pos, n)
 
+/*
+ * dso__for_each_entry_safe - iterate over all the dso entries in the rbtree
+ *
+ * @root: the rbtree root pointer
+ * @pos : the 'struct dso *' to use as a loop cursor
+ * @n   : the 'struct dso *' to use as a temporary storage
+ */
+#define dso__for_each_entry_safe(root, pos, n) \
+	rbtree_postorder_for_each_entry_safe(root, pos, n, rb_node)
+
 static inline void dso__set_loaded(struct dso *dso, enum map_type type)
 {
 	dso->loaded |= (1 << type);
@@ -226,15 +235,15 @@ struct map *dso__new_map(const char *name);
 struct dso *dso__kernel_findnew(struct machine *machine, const char *name,
 				const char *short_name, int dso_type);
 
-void dsos__add(struct list_head *head, struct dso *dso);
-struct dso *dsos__find(const struct list_head *head, const char *name,
+void dsos__add(struct rb_root *root, struct dso *dso);
+struct dso *dsos__find(const struct rb_root *root, const char *name,
 		       bool cmp_short);
-struct dso *__dsos__findnew(struct list_head *head, const char *name);
-bool __dsos__read_build_ids(struct list_head *head, bool with_hits);
+struct dso *__dsos__findnew(struct rb_root *root, const char *name);
+bool __dsos__read_build_ids(struct rb_root *root, bool with_hits);
 
-size_t __dsos__fprintf_buildid(struct list_head *head, FILE *fp,
+size_t __dsos__fprintf_buildid(struct rb_root *root, FILE *fp,
 			       bool (skip)(struct dso *dso, int parm), int parm);
-size_t __dsos__fprintf(struct list_head *head, FILE *fp);
+size_t __dsos__fprintf(struct rb_root *root, FILE *fp);
 
 size_t dso__fprintf_buildid(struct dso *dso, FILE *fp);
 size_t dso__fprintf_symbols_by_name(struct dso *dso,
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 158c787..636c183 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -171,10 +171,10 @@ perf_header__set_cmdline(int argc, const char **argv)
 	return 0;
 }
 
-#define dsos__for_each_with_build_id(pos, head)	\
-	list_for_each_entry(pos, head, node)	\
-		if (!pos->has_build_id)		\
-			continue;		\
+#define dsos__for_each_with_build_id(pos, n, root)	\
+	dso__for_each_entry_safe(pos, n, root)		\
+		if (!pos->has_build_id)			\
+			continue;			\
 		else
 
 static int write_buildid(const char *name, size_t name_len, u8 *build_id,
@@ -200,11 +200,11 @@ static int write_buildid(const char *name, size_t name_len, u8 *build_id,
 	return write_padded(fd, name, name_len + 1, len);
 }
 
-static int __dsos__hit_all(struct list_head *head)
+static int __dsos__hit_all(struct rb_root *root)
 {
-	struct dso *pos;
+	struct dso *pos, *n;
 
-	list_for_each_entry(pos, head, node)
+	dso__for_each_entry_safe(pos, n, root)
 		pos->hit = true;
 
 	return 0;
@@ -241,14 +241,14 @@ int dsos__hit_all(struct perf_session *session)
 	return 0;
 }
 
-static int __dsos__write_buildid_table(struct list_head *head,
+static int __dsos__write_buildid_table(struct rb_root *root,
 				       struct machine *machine,
 				       pid_t pid, u16 misc, int fd)
 {
 	char nm[PATH_MAX];
-	struct dso *pos;
+	struct dso *pos, *n;
 
-	dsos__for_each_with_build_id(pos, head) {
+	dsos__for_each_with_build_id(pos, n, root) {
 		int err;
 		const char *name;
 		size_t name_len;
@@ -440,13 +440,13 @@ static int dso__cache_build_id(struct dso *dso, struct machine *machine,
 				     debugdir, is_kallsyms, is_vdso);
 }
 
-static int __dsos__cache_build_ids(struct list_head *head,
+static int __dsos__cache_build_ids(struct rb_root *root,
 				   struct machine *machine, const char *debugdir)
 {
-	struct dso *pos;
+	struct dso *pos, *n;
 	int err = 0;
 
-	dsos__for_each_with_build_id(pos, head)
+	dsos__for_each_with_build_id(pos, n, root)
 		if (dso__cache_build_id(pos, machine, debugdir))
 			err = -1;
 
@@ -1548,7 +1548,7 @@ static int __event_process_build_id(struct build_id_event *bev,
 				    struct perf_session *session)
 {
 	int err = -1;
-	struct list_head *head;
+	struct rb_root *root;
 	struct machine *machine;
 	u16 misc;
 	struct dso *dso;
@@ -1563,22 +1563,22 @@ static int __event_process_build_id(struct build_id_event *bev,
 	switch (misc) {
 	case PERF_RECORD_MISC_KERNEL:
 		dso_type = DSO_TYPE_KERNEL;
-		head = &machine->kernel_dsos;
+		root = &machine->kernel_dsos;
 		break;
 	case PERF_RECORD_MISC_GUEST_KERNEL:
 		dso_type = DSO_TYPE_GUEST_KERNEL;
-		head = &machine->kernel_dsos;
+		root = &machine->kernel_dsos;
 		break;
 	case PERF_RECORD_MISC_USER:
 	case PERF_RECORD_MISC_GUEST_USER:
 		dso_type = DSO_TYPE_USER;
-		head = &machine->user_dsos;
+		root = &machine->user_dsos;
 		break;
 	default:
 		goto out;
 	}
 
-	dso = __dsos__findnew(head, filename);
+	dso = __dsos__findnew(root, filename);
 	if (dso != NULL) {
 		char sbuild_id[BUILD_ID_SIZE * 2 + 1];
 
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 16bba9f..317f50b 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -17,8 +17,8 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid)
 {
 	map_groups__init(&machine->kmaps);
 	RB_CLEAR_NODE(&machine->rb_node);
-	INIT_LIST_HEAD(&machine->user_dsos);
-	INIT_LIST_HEAD(&machine->kernel_dsos);
+	machine->user_dsos = RB_ROOT;
+	machine->kernel_dsos = RB_ROOT;
 
 	machine->threads = RB_ROOT;
 	INIT_LIST_HEAD(&machine->dead_threads);
@@ -70,14 +70,12 @@ out_delete:
 	return NULL;
 }
 
-static void dsos__delete(struct list_head *dsos)
+static void dsos__delete(struct rb_root *dsos)
 {
 	struct dso *pos, *n;
 
-	list_for_each_entry_safe(pos, n, dsos, node) {
-		list_del(&pos->node);
+	dso__for_each_entry_safe(pos, n, dsos)
 		dso__delete(pos);
-	}
 }
 
 void machine__delete_dead_threads(struct machine *machine)
@@ -963,9 +961,9 @@ static void machine__set_kernel_mmap_len(struct machine *machine,
 
 static bool machine__uses_kcore(struct machine *machine)
 {
-	struct dso *dso;
+	struct dso *dso, *n;
 
-	list_for_each_entry(dso, &machine->kernel_dsos, node) {
+	dso__for_each_entry_safe(dso, n, &machine->kernel_dsos) {
 		if (dso__is_kcore(dso))
 			return true;
 	}
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index b972824..c75ce9e 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -31,8 +31,8 @@ struct machine {
 	struct list_head  dead_threads;
 	struct thread	  *last_match;
 	struct vdso_info  *vdso_info;
-	struct list_head  user_dsos;
-	struct list_head  kernel_dsos;
+	struct rb_root	  user_dsos;
+	struct rb_root	  kernel_dsos;
 	struct map_groups kmaps;
 	struct map	  *vmlinux_maps[MAP__NR_TYPES];
 	symbol_filter_t	  symbol_filter;
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 9a0a183..b093d25 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -179,12 +179,12 @@ static struct map *kernel_get_module_map(const char *module)
 
 static struct dso *kernel_get_module_dso(const char *module)
 {
-	struct dso *dso;
+	struct dso *dso, *n;
 	struct map *map;
 	const char *vmlinux_name;
 
 	if (module) {
-		list_for_each_entry(dso, &host_machine->kernel_dsos, node) {
+		dso__for_each_entry_safe(dso, n, &host_machine->kernel_dsos) {
 			if (strncmp(dso->short_name + 1, module,
 				    dso->short_name_len - 2) == 0)
 				goto found;
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index d753499..97b0d18 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -916,7 +916,7 @@ int dso__load_sym(struct dso *dso, struct map *map,
 				}
 				curr_dso->symtab_type = dso->symtab_type;
 				map_groups__insert(kmap->kmaps, curr_map);
-				dsos__add(&dso->node, curr_dso);
+				dsos__add(dso->rb_root, curr_dso);
 				dso__set_loaded(curr_dso, map->type);
 			} else
 				curr_dso = curr_map->dso;
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v3 1/2] perf tool: improves DSO long names search speed with RB tree
  2014-09-18 13:30 ` [PATCH v3 1/2] " Waiman Long
@ 2014-09-18 15:10   ` Arnaldo Carvalho de Melo
  2014-09-24 15:29     ` Waiman Long
  0 siblings, 1 reply; 5+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-18 15:10 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, linux-kernel,
	Scott J Norton, Douglas Hatch, Don Zickus, Jiri Olsa,
	Adrian Hunter

Em Thu, Sep 18, 2014 at 09:30:20AM -0400, Waiman Long escreveu:
> +++ b/tools/perf/util/dso.c
> @@ -651,6 +651,80 @@ struct dso *dso__kernel_findnew(struct machine *machine, const char *name,
>  	return dso;
  
> +/*
> + * RB root of DSOs sorted by the long name
> + */
> +struct rb_root dso__root = { NULL };

Why do we still have a global variable for this? I thought that we would
be having this in struct machine?

Ok, I shouldn't have done this, but I went on and looked at the second
patch, and there, this goes away, why not avoid introducing the global
in the first place?

I.e. the existing code operates on a data structure that holds struct
dsos, you are switching to a new data structure, so it looked natural to
me to do this in one step, no?

Also at some point I thought about adding rb_tree helper functions to do
some rb__for_each() like operation, i.e. to sequentially access the
rb_tree instead of using it for searching using its key. PeterZ
rightfully nacked that because that would, IIRC, encourage people to use
a rb_tree to do linear searches for normal operation, i.e. not just for
rb_tree__printf() dump like routines:

https://lkml.org/lkml/2010/1/13/227

Also I saw at least one place where some foo__for_each_entry_safe() is
used but the loop doesn't look like it will remove/add anything to the
data structure that is being made "_safe", i.e. it should remain
foo__for_each_entry(), as it was before.

So, I would _keep_ the list_head, or else replace it with a another
rb_node to do lookups on it by shortname the same way we do for long
names.

The cheapest thing now would be for solving your problem, i.e. use a
rb_tree for searching for long names, keep the list_head for short names
linear searches.

I suggest having a

struct dsos {
	struct list_head short_names;
	struct rb_root	 long_names;
};

Then make struct machine use this type for:

	struct dsos	kernel_dsos, user_dsos;

Then all those dsos__find* routines stop receiving a list_head pointer
and start receiving a "struct dsos" instance.

That way it can add the dso to both containers, the one "sorted" by
short names (that linear search, just like before) and to the rb_tree
sorted by long names.

- Arnaldo

> +/*
> + * Find a matching entry and/or link current entry to RB tree.
> + * Either one of the dso or name parameter must be non-NULL or the
> + * function will not work.
> + */
> +static struct dso *dso__findlink_by_longname(struct rb_root *root,
> +					     struct dso *dso, const char *name)
> +{
> +	struct rb_node **p = &root->rb_node;
> +	struct rb_node  *parent = NULL;
> +	int warned = false;
> +
> +	if (!name)
> +		name = dso->long_name;
> +	/*
> +	 * Find node with the matching name
> +	 */
> +	while (*p) {
> +		struct dso *this = rb_entry(*p, struct dso, rb_node);
> +		long rc = (long)strcmp(name, this->long_name);
> +
> +		parent = *p;
> +		if (rc == 0) {
> +			/*
> +			 * In case the new DSO is a duplicate of an existing
> +			 * one, print an one-time warning & sort the entry
> +			 * by its DSO address.
> +			 */
> +			if (!dso || (dso == this))
> +				return this;	/* Find matching dso */
> +			/*
> +			 * The kernel DSOs may have duplicated long name,
> +			 * so don't print warning for them.
> +			 */
> +			if (!warned && !strstr(name, "kernel.kallsyms")
> +				    && !strstr(name, "/vmlinux")) {
> +				pr_warning("Duplicated dso long name: %s\n",
> +					   name);
> +				warned = true;
> +			}
> +			rc = (long)dso - (long)this;
> +		}
> +		if (rc < 0)
> +			p = &parent->rb_left;
> +		else
> +			p = &parent->rb_right;
> +	}
> +	if (dso) {
> +		/* Add new node and rebalance tree */
> +		rb_link_node(&dso->rb_node, parent, p);
> +		rb_insert_color(&dso->rb_node, root);
> +	}
> +	return NULL;
> +}
> +
> +static inline struct dso *
> +dso__find_by_longname(struct rb_root *root, const char *name)
> +{
> +	return dso__findlink_by_longname(root, NULL, name);
> +}
> +
> +/*
> + * Unlink the longname-sorted RB tree node
> + */
> +static inline void dso__rb_unlink(struct rb_root *root, struct dso *dso)
> +{
> +	rb_erase(&dso->rb_node, root);
> +}
> +
>  void dso__set_long_name(struct dso *dso, const char *name, bool name_allocated)
>  {
>  	if (name == NULL)
> @@ -753,6 +827,8 @@ struct dso *dso__new(const char *name)
>  		dso->a2l_fails = 1;
>  		dso->kernel = DSO_TYPE_USER;
>  		dso->needs_swap = DSO_SWAP__UNSET;
> +		dso->rb_root = NULL;
> +		RB_CLEAR_NODE(&dso->rb_node);
>  		INIT_LIST_HEAD(&dso->node);
>  		INIT_LIST_HEAD(&dso->data.open_entry);
>  	}
> @@ -775,6 +851,10 @@ void dso__delete(struct dso *dso)
>  		zfree((char **)&dso->long_name);
>  		dso->long_name_allocated = false;
>  	}
> +	if (dso->rb_root) {
> +		dso__rb_unlink(dso->rb_root, dso);
> +		dso->rb_root = NULL;
> +	}
>  
>  	dso__data_close(dso);
>  	dso_cache__free(&dso->data.cache);
> @@ -852,6 +932,8 @@ bool __dsos__read_build_ids(struct list_head *head, bool with_hits)
>  void dsos__add(struct list_head *head, struct dso *dso)
>  {
>  	list_add_tail(&dso->node, head);
> +	dso__findlink_by_longname(&dso__root, dso, NULL);
> +	dso->rb_root = &dso__root;
>  }
>  
>  struct dso *dsos__find(const struct list_head *head, const char *name, bool cmp_short)
> @@ -864,10 +946,7 @@ struct dso *dsos__find(const struct list_head *head, const char *name, bool cmp_
>  				return pos;
>  		return NULL;
>  	}
> -	list_for_each_entry(pos, head, node)
> -		if (strcmp(pos->long_name, name) == 0)
> -			return pos;
> -	return NULL;
> +	return dso__find_by_longname(&dso__root, name);
>  }
>  
>  struct dso *__dsos__findnew(struct list_head *head, const char *name)
> diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
> index 5e463c0..75cda1d 100644
> --- a/tools/perf/util/dso.h
> +++ b/tools/perf/util/dso.h
> @@ -92,6 +92,8 @@ struct dso_cache {
>  
>  struct dso {
>  	struct list_head node;
> +	struct rb_node   rb_node;	/* rbtree sorted by long name */
> +	struct rb_root   *rb_root;	/* pointer to rbtree root */
>  	struct rb_root	 symbols[MAP__NR_TYPES];
>  	struct rb_root	 symbol_names[MAP__NR_TYPES];
>  	void		 *a2l;
> -- 
> 1.7.1

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v3 1/2] perf tool: improves DSO long names search speed with RB tree
  2014-09-18 15:10   ` Arnaldo Carvalho de Melo
@ 2014-09-24 15:29     ` Waiman Long
  0 siblings, 0 replies; 5+ messages in thread
From: Waiman Long @ 2014-09-24 15:29 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, linux-kernel,
	Scott J Norton, Douglas Hatch, Don Zickus, Jiri Olsa,
	Adrian Hunter

On 09/18/2014 11:10 AM, Arnaldo Carvalho de Melo wrote:
> Em Thu, Sep 18, 2014 at 09:30:20AM -0400, Waiman Long escreveu:
>> +++ b/tools/perf/util/dso.c
>> @@ -651,6 +651,80 @@ struct dso *dso__kernel_findnew(struct machine *machine, const char *name,
>>   	return dso;
>
>> +/*
>> + * RB root of DSOs sorted by the long name
>> + */
>> +struct rb_root dso__root = { NULL };
> Why do we still have a global variable for this? I thought that we would
> be having this in struct machine?
>
> Ok, I shouldn't have done this, but I went on and looked at the second
> patch, and there, this goes away, why not avoid introducing the global
> in the first place?

The global variable was added to make the first patch compilable by 
itself. I will take this out in the next version of the patch.

> I.e. the existing code operates on a data structure that holds struct
> dsos, you are switching to a new data structure, so it looked natural to
> me to do this in one step, no?

Yes, I think that makes sense.

> Also at some point I thought about adding rb_tree helper functions to do
> some rb__for_each() like operation, i.e. to sequentially access the
> rb_tree instead of using it for searching using its key. PeterZ
> rightfully nacked that because that would, IIRC, encourage people to use
> a rb_tree to do linear searches for normal operation, i.e. not just for
> rb_tree__printf() dump like routines:
>
> https://lkml.org/lkml/2010/1/13/227
>
> Also I saw at least one place where some foo__for_each_entry_safe() is
> used but the loop doesn't look like it will remove/add anything to the
> data structure that is being made "_safe", i.e. it should remain
> foo__for_each_entry(), as it was before.
>
> So, I would _keep_ the list_head, or else replace it with a another
> rb_node to do lookups on it by shortname the same way we do for long
> names.
>
> The cheapest thing now would be for solving your problem, i.e. use a
> rb_tree for searching for long names, keep the list_head for short names
> linear searches.
>
> I suggest having a
>
> struct dsos {
> 	struct list_head short_names;
> 	struct rb_root	 long_names;
> };
>
> Then make struct machine use this type for:
>
> 	struct dsos	kernel_dsos, user_dsos;
>
> Then all those dsos__find* routines stop receiving a list_head pointer
> and start receiving a "struct dsos" instance.
>
> That way it can add the dso to both containers, the one "sorted" by
> short names (that linear search, just like before) and to the rb_tree
> sorted by long names.
>
> - Arnaldo

I think this is a good idea. I will incorporate that in my next patch. 
BTW, the list isn't sorted in any way. So I won't use the same structure 
field names as you have suggested.


-Longman

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-09-24 15:29 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-09-18 13:30 [PATCH v3 0/2] perf tool: improves DSO long names search speed with RB tree Waiman Long
2014-09-18 13:30 ` [PATCH v3 1/2] " Waiman Long
2014-09-18 15:10   ` Arnaldo Carvalho de Melo
2014-09-24 15:29     ` Waiman Long
2014-09-18 13:30 ` [PATCH v3 2/2] perf tool: iterate DSOs with rbtree instead of list Waiman Long

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox