public inbox for linux-ia64@vger.kernel.org
 help / color / mirror / Atom feed
* [Linux-ia64] Re: new ia64 kernel patch (relative to 2.5.69)
@ 2003-05-10 14:02 Keith Owens
  2003-05-12 23:20 ` David Mosberger
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Keith Owens @ 2003-05-10 14:02 UTC (permalink / raw)
  To: linux-ia64

On Sat, 10 May 2003 04:19:20 -0700, 
David Mosberger <davidm@napali.hpl.hp.com> wrote:
>The kernel patch linux-2.5.69-ia64-030509.diff.gz is now available at
>
>Oh, Keith, you might like the fact that I added NEW_LOCK back again
>and it's even turned on by default!

A couple of months of experience inside SGI with my NEW_LOCK patch
resulted in some changes.

* Always save the return address in r28 in addition to b6 (btw, your
  patch still mentions b7).  b6 is not in minstate which means it is
  missing for INIT and MCA dumps, making it impossible to determine
  which code was in contention.  Also kernprof only gets the minstate
  data and it needs the return address to report the real contention
  code.  Using r28 as well as b6 makes INT, MCA and kernprof (with a
  patch to kernprof) useful again.

* Add more registers to the clobber list.  ar.ccv is really clobbered
  and is missing from the list.  Some of the optional code that is
  added to the contention path (e.g. exponential backoff) needs more
  registers.  Changing spinlock.h forces a complete recompile, so
  predefine a suitable set of free registers.

  There is also a problem with binary only modules, I hate them but
  they are a fact of life.  Changing the clobber list in spinlock.h
  will result in binary only modules that have an incorrect view of
  what the kernel is doing and will result in register corruption on
  contended locks.  I don't like the idea that tuning and debugging
  patches introduce Heisenbugs!

  Extending the predefined list of clobbers means that any reasonable
  code can be added to the contention path without causing problems for
  the rest of the kernel.  The extra registers do not cause any
  problems for gcc, the "memory" clobber means that only constants can
  be held in registers over the spinlock code, there are still 13 free
  registers to hold constant values or addresses, not to mention local
  stacked registers.


This patch has not even been compiled, it is just for discussion.


Index: 69.2/arch/ia64/kernel/head.S
--- 69.2/arch/ia64/kernel/head.S Sat, 10 May 2003 22:34:16 +1000 kaos (linux-2.5/t/35_head.S 1.1.2.1.1.1.2.1.1.1.1.2 444)
+++ 69.2(w)/arch/ia64/kernel/head.S Sat, 10 May 2003 23:27:30 +1000 kaos (linux-2.5/t/35_head.S 1.1.2.1.1.1.2.1.1.1.1.2 444)
@@ -747,14 +747,19 @@ SET_REG(b5);
 	 * - do not use any registers other than the ones listed below
 	 *
 	 * Inputs:
-	 *   ar.pfs - saved CFM of caller
-	 *   ar.ccv - 0 (and available for use)
-	 *   r28    - available for use.
-	 *   r29    - available for use.
-	 *   r30    - available for use.
-	 *   r31    - address of lock, available for use.
-	 *   b7     - return address
-	 *   p14    - available for use.
+	 *   ar.pfs	- saved CFM of caller
+	 *   ar.ccv	- 0 (and available for use)
+	 *   r12	- kernel stack pointer, but see above.
+	 *   r13	- current process.
+	 *   r28	- address of start of spin_lock code.  Used as a "return"
+	 *		  address from this contention path.  Do not change, b6
+	 *		  is not always dumped, r28 is always dumped.  kernprof
+	 *		  relies on getting r28 to work out what is contending.
+	 *   r31	- address of lock.
+	 *   r21-r27, r29, r30 - available for use.
+	 *   b6		- return address, do not use.
+	 *   p12-p15	- available for use.
+	 *   Rest	- caller's state, do not use, especially ar.pfs.
 	 *
 	 * If you patch this code to use more registers, do not forget to update
 	 * the clobber lists for spin_lock() in include/asm-ia64/spinlock.h.
@@ -765,16 +770,10 @@ SET_REG(b5);
 GLOBAL_ENTRY(ia64_spinlock_contention_pre3_4)
 	.prologue
 	.save ar.pfs, r0	// this code effectively has a zero frame size
-	.save rp, r28
-	.body
-	nop 0
-	nop 0
-	.restore sp		// pop existing prologue after next insn
-	mov b6 = r28
-	.prologue
-	.save ar.pfs, r0
-	.altrp b6
+	.spillreg rp, r28
 	.body
+	mov b6=r28		// copy to b6 for return, preserve r28 for debug
+
 .wait:
 	// exponential backoff, kdb, lockmeter etc. go in here
 	hint @pause
@@ -792,6 +791,8 @@ GLOBAL_ENTRY(ia64_spinlock_contention)
 	.prologue
 	.altrp b6
 	.body
+	mov r28¶		// copy return address to r28 for debug
+
 .wait:
 	// exponential backoff, kdb, lockmeter etc. go in here
 	hint @pause
Index: 69.2/include/asm-ia64/spinlock.h
--- 69.2/include/asm-ia64/spinlock.h Sat, 10 May 2003 22:34:16 +1000 kaos (linux-2.5/D/c/9_spinlock.h 1.1.2.1.1.1.1.1.1.4 444)
+++ 69.2(w)/include/asm-ia64/spinlock.h Sun, 11 May 2003 00:02:09 +1000 kaos (linux-2.5/D/c/9_spinlock.h 1.1.2.1.1.1.1.1.1.4 444)
@@ -30,9 +30,26 @@ typedef struct {
  * ia64_spinlock_contention().  We do not use a normal call because that would force all
  * callers of spin_lock() to be non-leaf routines.  Instead, ia64_spinlock_contention() is
  * carefully coded to touch only those registers that spin_lock() marks "clobbered".
+ *
+ * ia64_spinlock_contention is entered with :-
+ *   r28 - address of start of spin_lock code.  Used as a "return" address
+ *         from the contention path.
+ *   r31 - address of lock.
+ *   r21-r27, r29, r30 - available for use.
+ *   b6  - available for use.
+ *   p12-p15 - available for use.
+ *
+ * If you patch ia64_spinlock_contention to use more registers, do not forget to
+ * update the clobber list below.  Although the inline code does not use
+ * r21-r27, r29, p12, p14, p15, they are marked as clobbers so the contention
+ * path has a decent set of registers to use without requiring changes to this
+ * file and forcing a complete kernel rebuild, not to mention invalidating every
+ * module in a way that cannot be detected by modversions.
  */
 
-#define IA64_SPINLOCK_CLOBBERS "ar.pfs", "p14", "r28", "r29", "r30", "b6", "memory"
+#define IA64_SPINLOCK_CLOBBERS "ar.ccv", "ar.pfs",					\
+		  "r21", "r22", "r23", "r24", "r25", "r26", "r27", "r28", "r29", "r30", \
+		  "b6", "p12", "p13", "p14", "p15", "memory"
 
 static inline void
 _raw_spin_lock (spinlock_t *lock)



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Linux-ia64] Re: new ia64 kernel patch (relative to 2.5.69)
  2003-05-10 14:02 [Linux-ia64] Re: new ia64 kernel patch (relative to 2.5.69) Keith Owens
@ 2003-05-12 23:20 ` David Mosberger
  2003-05-12 23:38 ` Keith Owens
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: David Mosberger @ 2003-05-12 23:20 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Sun, 11 May 2003 00:02:50 +1000, Keith Owens <kaos@sgi.com> said:

  Keith> * Always save the return address in r28 in addition to b6
  Keith> (btw, your patch still mentions b7).

You mean in the comment?  I fixed that now, thanks.

  Keith> b6 is not in minstate which means it is missing for INIT and
  Keith> MCA dumps, making it impossible to determine which code was
  Keith> in contention.

This doesn't make any sense to me.  If it's not in MINSTATE, it's
preserved across the firmware INIT path.  You may want to take a
look at how the 2.5 MCA code is handling this.

  Keith> * Add more registers to the clobber list.

  Keith> ar.ccv is really clobbered and is missing from the list.

Indeed.

  Keith> Some of the optional code that is added to the contention
  Keith> path (e.g. exponential backoff) needs more registers.
  Keith> Changing spinlock.h forces a complete recompile, so predefine
  Keith> a suitable set of free registers.

I'd like to see that exponential backoff code first.  The old code in
head.S did exponential backoff with 4 general registers just fine,
IIRC.

  Keith>   There is also a problem with binary only modules, I hate
  Keith> them but they are a fact of life.  Changing the clobber list
  Keith> in spinlock.h will result in binary only modules that have an
  Keith> incorrect view of what the kernel is doing and will result in
  Keith> register corruption on contended locks.  I don't like the
  Keith> idea that tuning and debugging patches introduce Heisenbugs!

That's an argument against binary-only driver.  Not much else.

	--david


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Linux-ia64] Re: new ia64 kernel patch (relative to 2.5.69)
  2003-05-10 14:02 [Linux-ia64] Re: new ia64 kernel patch (relative to 2.5.69) Keith Owens
  2003-05-12 23:20 ` David Mosberger
@ 2003-05-12 23:38 ` Keith Owens
  2003-05-13  0:01 ` Keith Owens
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Keith Owens @ 2003-05-12 23:38 UTC (permalink / raw)
  To: linux-ia64

On Mon, 12 May 2003 16:20:09 -0700, 
David Mosberger <davidm@napali.hpl.hp.com> wrote:
>>>>>> On Sun, 11 May 2003 00:02:50 +1000, Keith Owens <kaos@sgi.com> said:
>  Keith> b6 is not in minstate which means it is missing for INIT and
>  Keith> MCA dumps, making it impossible to determine which code was
>  Keith> in contention.
>
>This doesn't make any sense to me.  If it's not in MINSTATE, it's
>preserved across the firmware INIT path.  You may want to take a
>look at how the 2.5 MCA code is handling this.

Preserved but not listed in the INIT/MCA dumps, i.e. on a first cut of
problem diagnosis you are missing the address of the mainline code that
is in contention.  Also not available to code that only gets pt_regs,
e.g. perfmon and any other code that relies on interrupt state to
record where the kernel is spending its time.

>  Keith> Some of the optional code that is added to the contention
>  Keith> path (e.g. exponential backoff) needs more registers.
>  Keith> Changing spinlock.h forces a complete recompile, so predefine
>  Keith> a suitable set of free registers.
>
>I'd like to see that exponential backoff code first.  The old code in
>head.S did exponential backoff with 4 general registers just fine,
>IIRC.

Not just exponential backoff.  Also kdb, checking for hung spinlocks,
calculating spinlock hold time plus any other debugging or performance
monitoring code that we can add now that there is only one copy of the
spinlock contention code.  They all need additional and separate sets
of registers.

Adding debugging or monitoring code for spinlocks should not require a
complete kernel recompile.  It only takes one user to forget to rebuild
or reinstall their modules after changing spinlock.h and you get
Heisenbugs.  Predefining a reasonable list saves us from all that grief
and costs nothing.



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Linux-ia64] Re: new ia64 kernel patch (relative to 2.5.69)
  2003-05-10 14:02 [Linux-ia64] Re: new ia64 kernel patch (relative to 2.5.69) Keith Owens
  2003-05-12 23:20 ` David Mosberger
  2003-05-12 23:38 ` Keith Owens
@ 2003-05-13  0:01 ` Keith Owens
  2003-05-13  0:25 ` David Mosberger
  2003-05-13  0:29 ` David Mosberger
  4 siblings, 0 replies; 6+ messages in thread
From: Keith Owens @ 2003-05-13  0:01 UTC (permalink / raw)
  To: linux-ia64

On Sun, 11 May 2003 00:02:50 +1000, 
Keith Owens <kaos@sgi.com> wrote:
>* Always save the return address in r28 in addition to b6 ...
>  ... kernprof only gets the minstate
>  data and it needs the return address to report the real contention
>  code.

Ignore that last bit, b6 is in pt_regs.  I don't know what I was
thinking when I wrote that.  But the bit about not appearing in the
INIT/MCA minstate dump still applies.



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Linux-ia64] Re: new ia64 kernel patch (relative to 2.5.69)
  2003-05-10 14:02 [Linux-ia64] Re: new ia64 kernel patch (relative to 2.5.69) Keith Owens
                   ` (2 preceding siblings ...)
  2003-05-13  0:01 ` Keith Owens
@ 2003-05-13  0:25 ` David Mosberger
  2003-05-13  0:29 ` David Mosberger
  4 siblings, 0 replies; 6+ messages in thread
From: David Mosberger @ 2003-05-13  0:25 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Tue, 13 May 2003 10:01:00 +1000, Keith Owens <kaos@sgi.com> said:

  Keith> On Sun, 11 May 2003 00:02:50 +1000, Keith Owens
  Keith> <kaos@sgi.com> wrote:
  >> * Always save the return address in r28 in addition to b6 ...
  >> ... kernprof only gets the minstate data and it needs the return
  >> address to report the real contention code.

  Keith> Ignore that last bit, b6 is in pt_regs.  I don't know what I
  Keith> was thinking when I wrote that.  But the bit about not
  Keith> appearing in the INIT/MCA minstate dump still applies.

Then fix the dump.  There is certainly not a reason to change normal
kernel code because of something that (for ordinary users) happens
once in a blue moon, if ever.

	--david


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Linux-ia64] Re: new ia64 kernel patch (relative to 2.5.69)
  2003-05-10 14:02 [Linux-ia64] Re: new ia64 kernel patch (relative to 2.5.69) Keith Owens
                   ` (3 preceding siblings ...)
  2003-05-13  0:25 ` David Mosberger
@ 2003-05-13  0:29 ` David Mosberger
  4 siblings, 0 replies; 6+ messages in thread
From: David Mosberger @ 2003-05-13  0:29 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Tue, 13 May 2003 09:38:15 +1000, Keith Owens <kaos@sgi.com> said:

  Keith> Not just exponential backoff.  Also kdb, checking for hung
  Keith> spinlocks, calculating spinlock hold time plus any other
  Keith> debugging or performance monitoring code that we can add now
  Keith> that there is only one copy of the spinlock contention code.
  Keith> They all need additional and separate sets of registers.

Why do they need separate sets of registers?

Please show some real code that justifies the number of registers you
want to add.  The cost may be small, but it's certainly not zero.

	--david


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2003-05-13  0:29 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2003-05-10 14:02 [Linux-ia64] Re: new ia64 kernel patch (relative to 2.5.69) Keith Owens
2003-05-12 23:20 ` David Mosberger
2003-05-12 23:38 ` Keith Owens
2003-05-13  0:01 ` Keith Owens
2003-05-13  0:25 ` David Mosberger
2003-05-13  0:29 ` David Mosberger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox