* [PATCH v6 0/9] latched RB-trees and __module_address()
@ 2015-05-06 13:51 Peter Zijlstra
From: Peter Zijlstra @ 2015-05-06 13:51 UTC (permalink / raw)
  To: mingo, rusty, mathieu.desnoyers, oleg, paulmck, torvalds
  Cc: linux-kernel, andi, rostedt, tglx, laijs, linux, peterz

This series is aimed at making __module_address() go fast(er).

The reason for doing so is that most stack unwinders use kernel_text_address()
to validate each frame. Perf and ftrace (can) end up doing a lot of stack
traces from performance sensitive code.

On the way there it:
 - annotates and sanitizes module locking
 - introduces the latched RB-tree
 - employs it to make __module_address() go fast.
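
For readers unfamiliar with the latch idea behind the second item, the
following is a minimal userspace sketch, not code from this series. It keeps
two copies of the data and a sequence count whose low bit tells readers which
copy is currently stable, so a lookup never has to take a lock. Everything in
it is an assumption made for illustration: the names (struct latch,
latch_update(), latch_contains()) are hypothetical, a single address range
stands in for the RB-tree, the writer is assumed to be serialized externally,
and sequentially consistent C11 atomics are used for simplicity where kernel
code would use a seqcount with weaker barriers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical payload: one [start, end) address range. */
struct range {
	atomic_ulong start;
	atomic_ulong end;
};

/* Two copies of the payload plus a sequence count (the "latch"). */
struct latch {
	atomic_uint seq;
	struct range copy[2];
};

static void range_set(struct range *r, unsigned long start, unsigned long end)
{
	atomic_store(&r->start, start);
	atomic_store(&r->end, end);
}

/*
 * Writer side. Assumes writers are serialized by the caller.
 * While seq is odd, readers use copy[1], so copy[0] may be rewritten;
 * while seq is even, readers use copy[0], so copy[1] may be rewritten.
 */
static void latch_update(struct latch *l, unsigned long start, unsigned long end)
{
	assert(start < end);

	atomic_fetch_add(&l->seq, 1);	/* seq odd: steer readers to copy[1] */
	range_set(&l->copy[0], start, end);

	atomic_fetch_add(&l->seq, 1);	/* seq even: steer readers to copy[0] */
	range_set(&l->copy[1], start, end);
}

/*
 * Reader side. Never blocks: it picks the copy selected by the low bit
 * of seq and retries only if seq changed while it was reading.
 */
static bool latch_contains(struct latch *l, unsigned long addr)
{
	unsigned int seq;
	bool hit;

	do {
		seq = atomic_load(&l->seq);
		struct range *r = &l->copy[seq & 1];
		unsigned long start = atomic_load(&r->start);
		unsigned long end = atomic_load(&r->end);

		hit = addr >= start && addr < end;
	} while (seq != atomic_load(&l->seq));

	return hit;
}
```

A reader that interrupts the writer in the middle of an update still sees one
complete copy, which is the property that matters when the lookup runs from
stack-unwinding context. For example, after latch_update(&l, 0x1000, 0x2000),
latch_contains(&l, 0x1800) is true and latch_contains(&l, 0x2000) is false.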

I've built and boot tested this on x86_64 with modules and lockdep
enabled.  Performance numbers (below) were taken with lockdep disabled.

As previously mentioned, the latched RB-tree is written as generic code
mostly for clarity and documentation purposes, as the complete solution has
a number of separate and non-trivial parts.

As measured on my ivb-ep system with 84 modules loaded, the test module reports
(cache hot, performance cpufreq):

          avg +- stdev
Before:   611 +- 10 [ns] per __module_address() call
After:     17 +-  5 [ns] per __module_address() call

PMI measurements for a cpu running loops in a module (also [ns]):

Before: Mean: 2719 +- 1, Stdev: 214, Samples: 40036
After:  Mean:  947 +- 0, Stdev: 132, Samples: 40037

Note: I have also tested things like 'perf record -a -g modprobe mod_test'
to make 'sure' some of the more interesting paths get hit.

Changes since last time:

 - rebased against Rusty's tree
 - raw_read_seqcount_latch() -- (mingo)

Based on rusty/linux.git/pending-rebases; please consider for 4.2

Thanks!



Thread overview: 14+ messages
2015-05-06 13:51 [PATCH v6 0/9] latched RB-trees and __module_address() Peter Zijlstra
2015-05-06 13:51 ` [PATCH v6 1/9] rbtree: Make lockless searches non-fatal Peter Zijlstra
2015-05-06 13:51 ` [PATCH v6 2/9] seqlock: Better document raw_write_seqcount_latch() Peter Zijlstra
2015-05-06 13:51 ` [PATCH v6 3/9] rcu: Move lockless_dereference() out of rcupdate.h Peter Zijlstra
2015-05-06 13:51 ` [PATCH v6 4/9] seqlock: Introduce raw_read_seqcount_latch() Peter Zijlstra
2015-05-06 13:51 ` [PATCH v6 5/9] rbtree: Implement generic latch_tree Peter Zijlstra
2015-05-06 13:51 ` [PATCH v6 6/9] module: Optimize __module_address() using a latched RB-tree Peter Zijlstra
2015-05-06 13:51 ` [PATCH v6 7/9] module: Make the mod_tree stuff conditional on PERF_EVENTS || TRACING Peter Zijlstra
2015-05-06 13:51 ` [PATCH v6 8/9] module: Use __module_address() for module_address_lookup() Peter Zijlstra
2015-05-06 13:51 ` [PATCH v6 9/9] module: Rework module_addr_{min,max} Peter Zijlstra
2015-05-07  1:20 ` [PATCH v6 0/9] latched RB-trees and __module_address() Rusty Russell
2015-05-07 19:28   ` Ingo Molnar
2015-05-08 17:42     ` Rusty Russell
2015-05-12 11:52       ` Peter Zijlstra
