public inbox for linux-ia64@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] ia64: possible module unwind table optimisation
@ 2010-09-10 10:40 Phil Carmody
  2010-09-10 10:40 ` [PATCH 1/2] ia64: unwind: remove preprocesser noise, and correct comment Phil Carmody
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Phil Carmody @ 2010-09-10 10:40 UTC (permalink / raw)
  To: tony.luck; +Cc: fenghua.yu, linux-ia64, linux-kernel


This move-to-front linked list optimisation was a huge win on ARM, so I 
checked for other architectures which manage their module unwind tables
in a similar way.

Alas I have no way to test this patch.

Phil

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/2] ia64: unwind: remove preprocesser noise, and correct comment
  2010-09-10 10:40 [PATCH 0/2] ia64: possible module unwind table optimisation Phil Carmody
@ 2010-09-10 10:40 ` Phil Carmody
  2010-09-10 10:40 ` [PATCH 2/2] ia64: unwind - optimise linked-list searches for modules Phil Carmody
  2010-09-14 23:18 ` [PATCH 0/2] ia64: possible module unwind table optimisation Tony Luck
  2 siblings, 0 replies; 10+ messages in thread
From: Phil Carmody @ 2010-09-10 10:40 UTC (permalink / raw)
  To: tony.luck; +Cc: fenghua.yu, linux-ia64, linux-kernel

The expression in the comment was mis-bracketted.
There was also no need to introduce a coding-style breaking
macro for a single use, so just made it a static const.

Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
---
 arch/ia64/kernel/unwind.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/ia64/kernel/unwind.c b/arch/ia64/kernel/unwind.c
index b6c0e63..f47217b 100644
--- a/arch/ia64/kernel/unwind.c
+++ b/arch/ia64/kernel/unwind.c
@@ -1204,10 +1204,10 @@ desc_spill_sprel_p (unsigned char qp, unw_word t, unsigned char abreg, unw_word
 static inline unw_hash_index_t
 hash (unsigned long ip)
 {
-#	define hashmagic	0x9e3779b97f4a7c16UL	/* based on (sqrt(5)/2-1)*2^64 */
+	/* magic number = ((sqrt(5)-1)/2)*2^64 */
+	static const unsigned long hashmagic = 0x9e3779b97f4a7c16UL;
 
-	return (ip >> 4)*hashmagic >> (64 - UNW_LOG_HASH_SIZE);
-#undef hashmagic
+	return (ip >> 4) * hashmagic >> (64 - UNW_LOG_HASH_SIZE);
 }
 
 static inline long
-- 
1.7.2.rc1.37.gf8c40


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/2] ia64: unwind - optimise linked-list searches for modules
  2010-09-10 10:40 [PATCH 0/2] ia64: possible module unwind table optimisation Phil Carmody
  2010-09-10 10:40 ` [PATCH 1/2] ia64: unwind: remove preprocesser noise, and correct comment Phil Carmody
@ 2010-09-10 10:40 ` Phil Carmody
  2010-09-14 23:18 ` [PATCH 0/2] ia64: possible module unwind table optimisation Tony Luck
  2 siblings, 0 replies; 10+ messages in thread
From: Phil Carmody @ 2010-09-10 10:40 UTC (permalink / raw)
  To: tony.luck; +Cc: fenghua.yu, linux-ia64, linux-kernel

It's clear from the comment in the code about keeping the
kernel's unwind table at the front of the list that some
attention has been paid to access patterns. Tests on other
architectures have shown that a move-to-front optimisation
improves searches dramatically.

Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com>

---
This might not be a useful optimisation for the ia64
due to the caching that's already in place. If someone
can count the number of linked list operations with and
without this patch in a real-world scenario, I'd be
grateful.
---
 arch/ia64/kernel/unwind.c |   17 ++++++++++++++++-
 1 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/arch/ia64/kernel/unwind.c b/arch/ia64/kernel/unwind.c
index f47217b..fed6afa 100644
--- a/arch/ia64/kernel/unwind.c
+++ b/arch/ia64/kernel/unwind.c
@@ -1531,7 +1531,7 @@ build_script (struct unw_frame_info *info)
 	struct unw_labeled_state *ls, *next;
 	unsigned long ip = info->ip;
 	struct unw_state_record sr;
-	struct unw_table *table;
+	struct unw_table *table, *prev;
 	struct unw_reg_info *r;
 	struct unw_insn insn;
 	u8 *dp, *desc_end;
@@ -1560,11 +1560,26 @@ build_script (struct unw_frame_info *info)
 
 	STAT(parse_start = ia64_get_itc());
 
+	prev = NULL;
 	for (table = unw.tables; table; table = table->next) {
 		if (ip >= table->start && ip < table->end) {
+			/*
+			 * Leave the kernel unwind table at the very front,
+			 * lest moving it breaks some assumption elsewhere.
+			 * Otherwise, move the matching table to the second
+			 * position in the list so that traversals can benefit
+			 * from commonality in backtrace paths.
+			 */
+			if (prev && prev != unw.tables) {
+				/* unw is safe - we're already spinlocked */
+				prev->next = table->next;
+				table->next = unw.tables->next;
+				unw.tables->next = table;
+			}
 			e = lookup(table, ip - table->segment_base);
 			break;
 		}
+		prev = table;
 	}
 	if (!e) {
 		/* no info, return default unwinder (leaf proc, no mem stack, no saved regs)  */
-- 
1.7.2.rc1.37.gf8c40


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/2] ia64: possible module unwind table optimisation
  2010-09-10 10:40 [PATCH 0/2] ia64: possible module unwind table optimisation Phil Carmody
  2010-09-10 10:40 ` [PATCH 1/2] ia64: unwind: remove preprocesser noise, and correct comment Phil Carmody
  2010-09-10 10:40 ` [PATCH 2/2] ia64: unwind - optimise linked-list searches for modules Phil Carmody
@ 2010-09-14 23:18 ` Tony Luck
  2010-09-15 12:48   ` Phil Carmody
  2 siblings, 1 reply; 10+ messages in thread
From: Tony Luck @ 2010-09-14 23:18 UTC (permalink / raw)
  To: Phil Carmody; +Cc: fenghua.yu, linux-ia64, linux-kernel

On Fri, Sep 10, 2010 at 3:40 AM, Phil Carmody
<ext-phil.2.carmody@nokia.com> wrote:
> Alas I have no way to test this patch.

Phil,

I have this built and booted on an ia64 ... what do you want me to measure? The
number of iterations in:

  for (table = unw.tables; table; table = table->next)

before we find the right table to look?

-Tony

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/2] ia64: possible module unwind table optimisation
  2010-09-14 23:18 ` [PATCH 0/2] ia64: possible module unwind table optimisation Tony Luck
@ 2010-09-15 12:48   ` Phil Carmody
  2010-09-15 17:28     ` Tony Luck
  0 siblings, 1 reply; 10+ messages in thread
From: Phil Carmody @ 2010-09-15 12:48 UTC (permalink / raw)
  To: ext Tony Luck
  Cc: fenghua.yu@intel.com, linux-ia64@vger.kernel.org,
	linux-kernel@vger.kernel.org

On 15/09/10 01:18 +0200, ext Tony Luck wrote:
> On Fri, Sep 10, 2010 at 3:40 AM, Phil Carmody
> <ext-phil.2.carmody@nokia.com> wrote:
> > Alas I have no way to test this patch.
> 
> Phil,
> 
> I have this built and booted on an ia64 ... what do you want me to measure? The
> number of iterations in:
> 
>   for (table = unw.tables; table; table = table->next)
> 
> before we find the right table to look?

Yes, that's the one. Sorry I wasn't more explicit.

Phil

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/2] ia64: possible module unwind table optimisation
  2010-09-15 12:48   ` Phil Carmody
@ 2010-09-15 17:28     ` Tony Luck
  2010-09-15 22:22       ` Phil Carmody
  0 siblings, 1 reply; 10+ messages in thread
From: Tony Luck @ 2010-09-15 17:28 UTC (permalink / raw)
  To: Phil Carmody
  Cc: fenghua.yu@intel.com, linux-ia64@vger.kernel.org,
	linux-kernel@vger.kernel.org

On Wed, Sep 15, 2010 at 5:48 AM, Phil Carmody
<ext-phil.2.carmody@nokia.com> wrote:
>>   for (table = unw.tables; table; table = table->next)
>>
>> before we find the right table to look?
>
> Yes, that's the one. Sorry I wasn't more explicit.

My usual workload (building new Linux kernels - what else :-) doesn't
seem to generate
any unwind requests.  I reached the loop just 58 times while booting,
and only did 23
steps past the first item [didn't track a histogram, so I don't know
whether that was a
single 23 step lookup, or 23 one-step ones].  Building the kernel
(make -j32) didn't add
to either count.

Workload ideas?

-Tony

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/2] ia64: possible module unwind table optimisation
  2010-09-15 17:28     ` Tony Luck
@ 2010-09-15 22:22       ` Phil Carmody
  2010-09-16 21:56         ` Tony Luck
  0 siblings, 1 reply; 10+ messages in thread
From: Phil Carmody @ 2010-09-15 22:22 UTC (permalink / raw)
  To: ext Tony Luck
  Cc: fenghua.yu@intel.com, linux-ia64@vger.kernel.org,
	linux-kernel@vger.kernel.org

On 15/09/10 19:28 +0200, ext Tony Luck wrote:
> On Wed, Sep 15, 2010 at 5:48 AM, Phil Carmody
> <ext-phil.2.carmody@nokia.com> wrote:
> >>   for (table = unw.tables; table; table = table->next)
> >>
> >> before we find the right table to look?
> >
> > Yes, that's the one. Sorry I wasn't more explicit.
> 
> My usual workload (building new Linux kernels - what else :-) doesn't
> seem to generate
> any unwind requests.  I reached the loop just 58 times while booting,
> and only did 23
> steps past the first item [didn't track a histogram, so I don't know
> whether that was a
> single 23 step lookup, or 23 one-step ones].  Building the kernel
> (make -j32) didn't add
> to either count.
> 
> Workload ideas?

Turn on kmemleak, or anything else which repeatedly runs up the stack just
for the fun of gathering backtraces. Kmemleak's what caused us to notice 
the issue in our ARM-based environment.

Phil

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/2] ia64: possible module unwind table optimisation
  2010-09-15 22:22       ` Phil Carmody
@ 2010-09-16 21:56         ` Tony Luck
  2010-09-17  8:44           ` Phil Carmody
  0 siblings, 1 reply; 10+ messages in thread
From: Tony Luck @ 2010-09-16 21:56 UTC (permalink / raw)
  To: Phil Carmody
  Cc: fenghua.yu@intel.com, linux-ia64@vger.kernel.org,
	linux-kernel@vger.kernel.org

On Wed, Sep 15, 2010 at 3:22 PM, Phil Carmody
<ext-phil.2.carmody@nokia.com> wrote:
> Turn on kmemleak, or anything else which repeatedly runs up the stack just
> for the fun of gathering backtraces. Kmemleak's what caused us to notice
> the issue in our ARM-based environment.

kmemleak needs CONFIG_STACKTRACE - which I haven't got around to putting
into ia64 yet.  I dug up an earlier attempt at doing so, and turned on kmemleak,
but it crashes early in boot (before ever getting to my stack tracer)
in possibly
the first call to kmem_cache_alloc().  With SLUB the error is an unaligned
reference to 0xffffffffffffffff, with SLAB I get a NULL deref.

:-(

-Tony

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/2] ia64: possible module unwind table optimisation
  2010-09-16 21:56         ` Tony Luck
@ 2010-09-17  8:44           ` Phil Carmody
  2010-09-17 17:44             ` Tony Luck
  0 siblings, 1 reply; 10+ messages in thread
From: Phil Carmody @ 2010-09-17  8:44 UTC (permalink / raw)
  To: ext Tony Luck
  Cc: fenghua.yu@intel.com, linux-ia64@vger.kernel.org,
	linux-kernel@vger.kernel.org

On 16/09/10 23:56 +0200, ext Tony Luck wrote:
> On Wed, Sep 15, 2010 at 3:22 PM, Phil Carmody
> <ext-phil.2.carmody@nokia.com> wrote:
> > Turn on kmemleak, or anything else which repeatedly runs up the stack just
> > for the fun of gathering backtraces. Kmemleak's what caused us to notice
> > the issue in our ARM-based environment.
> 
> kmemleak needs CONFIG_STACKTRACE - which I haven't got around to putting
> into ia64 yet.  I dug up an earlier attempt at doing so, and turned on kmemleak,
> but it crashes early in boot (before ever getting to my stack tracer)
> in possibly
> the first call to kmem_cache_alloc().  With SLUB the error is an unaligned
> reference to 0xffffffffffffffff, with SLAB I get a NULL deref.
> 
> :-(

This is a shame, but not a real problem in the context of my patch - you can't 
usefully optimise something that's seemingly almost never used. It can be filed
in the cylindrical filing cabinet.

Phil

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/2] ia64: possible module unwind table optimisation
  2010-09-17  8:44           ` Phil Carmody
@ 2010-09-17 17:44             ` Tony Luck
  0 siblings, 0 replies; 10+ messages in thread
From: Tony Luck @ 2010-09-17 17:44 UTC (permalink / raw)
  To: Phil Carmody
  Cc: fenghua.yu@intel.com, linux-ia64@vger.kernel.org,
	linux-kernel@vger.kernel.org

> This is a shame, but not a real problem in the context of my patch - you can't
> usefully optimise something that's seemingly almost never used. It can be filed
> in the cylindrical filing cabinet.

I'll probably apply it anyway ... it looks like it will be useful when
I get things like this
working.  The initial problem with kmemleak seems to be configuration related. I
cut back from 4096 cpus on 1024 nodes to a more modest 32 cpus on 8 nodes,
and my system boots.  Still can't test your patch because although debugfs got
configured and apparently built-in ok ... I can't see it in
/proc/filesystems, so it
won't mount.

Thanks

-Tony

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2010-09-17 17:44 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-09-10 10:40 [PATCH 0/2] ia64: possible module unwind table optimisation Phil Carmody
2010-09-10 10:40 ` [PATCH 1/2] ia64: unwind: remove preprocesser noise, and correct comment Phil Carmody
2010-09-10 10:40 ` [PATCH 2/2] ia64: unwind - optimise linked-list searches for modules Phil Carmody
2010-09-14 23:18 ` [PATCH 0/2] ia64: possible module unwind table optimisation Tony Luck
2010-09-15 12:48   ` Phil Carmody
2010-09-15 17:28     ` Tony Luck
2010-09-15 22:22       ` Phil Carmody
2010-09-16 21:56         ` Tony Luck
2010-09-17  8:44           ` Phil Carmody
2010-09-17 17:44             ` Tony Luck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox