public inbox for linux-ia64@vger.kernel.org
* [patch 3/6] prefetch bottom of kernel rbs stack at syscall exit path
@ 2006-01-31  9:13 Chen, Kenneth W
  2006-01-31 17:44 ` David Mosberger-Tang
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Chen, Kenneth W @ 2006-01-31  9:13 UTC (permalink / raw)
  To: linux-ia64

The kernel knows where the register backing store for the user's
dirty stacked registers is located.  Prefetch those cache lines as
early as possible in the ia64_leave_syscall path.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>


--- ./arch/ia64/kernel/entry.S.orig	2006-01-02 19:21:10.000000000 -0800
+++ ./arch/ia64/kernel/entry.S	2006-01-30 13:26:41.161634409 -0800
@@ -715,10 +715,11 @@ ENTRY(ia64_leave_syscall)
 	;;
 (p6)	ld4 r31=[r18]				// load current_thread_info()->flags
 	ld8 r19=[r2],PT(B6)-PT(LOADRS)		// load ar.rsc value for "loadrs"
-	nop.i 0
+	add r17=IA64_RBS_OFFSET,r13
 	;;
 	mov r16=ar.bsp				// M2  get existing backing store pointer
 	ld8 r18=[r2],PT(R9)-PT(B6)		// load b6
+	shr r22=r19,16
 (p6)	and r15=TIF_WORK_MASK,r31		// any work other than TIF_SYSCALL_TRACE?
 	;;
 	ld8 r23=[r3],PT(R11)-PT(AR_BSPSTORE)	// load ar.bspstore (may be garbage)
@@ -729,6 +730,12 @@ ENTRY(ia64_leave_syscall)
 	ld8 r9=[r2],PT(CR_IPSR)-PT(R9)
 	ld8 r11=[r3],PT(CR_IIP)-PT(R11)
 (pNonSys) break 0		//      bug check: we shouldn't be here if pNonSys is TRUE!
+1:
+(pUStk)	lfetch [r17],128
+	add r22=-128,r22
+	;;
+(pUStk)	cmp.gt.unc p7,p0=r22,r0
+(p7)	br.dptk.few 1b
 	;;
 	invala			// M0|1 invalidate ALAT
 	rsm psr.i | psr.ic	// M2   turn off interrupts and interruption collection
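
For readers following the second hunk above, the loop it adds can be modelled in C. This is only a sketch under stated assumptions: `prefetch_count` and `LFETCH_STRIDE` are hypothetical names, pUStk is assumed to be true, and the saved loadrs word is assumed to hold the size of the dirty partition in bytes shifted left by 16, which is what `shr r22=r19,16` suggests.

```c
#include <stdint.h>

/* Post-increment used by the patch's "lfetch [r17],128". */
#define LFETCH_STRIDE 128

/*
 * Hypothetical model of the prefetch loop: returns how many lfetch
 * instructions would be issued for a given saved loadrs word.
 * Assumes pUStk is true and that bits 16 and up hold the number of
 * dirty bytes (an assumption, not taken from the patch text).
 */
static unsigned prefetch_count(uint64_t loadrs_word)
{
    int64_t remaining = (int64_t)(loadrs_word >> 16); /* shr r22=r19,16 */
    unsigned issued = 0;

    do {
        issued++;                   /* (pUStk) lfetch [r17],128 */
        remaining -= LFETCH_STRIDE; /* add r22=-128,r22 */
    } while (remaining > 0);        /* cmp.gt.unc p7,p0=r22,r0; (p7) br 1b */

    return issued;
}
```

Under this model, a saved value of 0 still issues one lfetch and 1024 dirty bytes issue eight, so the loop makes at least one pass even when there is nothing to prefetch.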




* Re: [patch 3/6] prefetch bottom of kernel rbs stack at syscall exit path
  2006-01-31  9:13 [patch 3/6] prefetch bottom of kernel rbs stack at syscall exit path Chen, Kenneth W
@ 2006-01-31 17:44 ` David Mosberger-Tang
  2006-01-31 19:27 ` Chen, Kenneth W
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: David Mosberger-Tang @ 2006-01-31 17:44 UTC (permalink / raw)
  To: linux-ia64

How much latency does this add for a minimal syscall when everything is cached?

  --david

On 1/31/06, Chen, Kenneth W <kenneth.w.chen@intel.com> wrote:
> The kernel knows where the register backing store for the user's
> dirty stacked registers is located.  Prefetch those cache lines as
> early as possible in the ia64_leave_syscall path.
>
> Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
>
>
> --- ./arch/ia64/kernel/entry.S.orig     2006-01-02 19:21:10.000000000 -0800
> +++ ./arch/ia64/kernel/entry.S  2006-01-30 13:26:41.161634409 -0800
> @@ -715,10 +715,11 @@ ENTRY(ia64_leave_syscall)
>         ;;
>  (p6)   ld4 r31=[r18]                           // load current_thread_info()->flags
>         ld8 r19=[r2],PT(B6)-PT(LOADRS)          // load ar.rsc value for "loadrs"
> -       nop.i 0
> +       add r17=IA64_RBS_OFFSET,r13
>         ;;
>         mov r16=ar.bsp                          // M2  get existing backing store pointer
>         ld8 r18=[r2],PT(R9)-PT(B6)              // load b6
> +       shr r22=r19,16
>  (p6)   and r15=TIF_WORK_MASK,r31               // any work other than TIF_SYSCALL_TRACE?
>         ;;
>         ld8 r23=[r3],PT(R11)-PT(AR_BSPSTORE)    // load ar.bspstore (may be garbage)
> @@ -729,6 +730,12 @@ ENTRY(ia64_leave_syscall)
>         ld8 r9=[r2],PT(CR_IPSR)-PT(R9)
>         ld8 r11=[r3],PT(CR_IIP)-PT(R11)
>  (pNonSys) break 0              //      bug check: we shouldn't be here if pNonSys is TRUE!
> +1:
> +(pUStk)        lfetch [r17],128
> +       add r22=-128,r22
> +       ;;
> +(pUStk)        cmp.gt.unc p7,p0=r22,r0
> +(p7)   br.dptk.few 1b
>         ;;
>         invala                  // M0|1 invalidate ALAT
>         rsm psr.i | psr.ic      // M2   turn off interrupts and interruption collection


--
Mosberger Consulting LLC, http://www.mosberger-consulting.com/


* RE: [patch 3/6] prefetch bottom of kernel rbs stack at syscall exit path
  2006-01-31  9:13 [patch 3/6] prefetch bottom of kernel rbs stack at syscall exit path Chen, Kenneth W
  2006-01-31 17:44 ` David Mosberger-Tang
@ 2006-01-31 19:27 ` Chen, Kenneth W
  2006-02-07  2:56 ` Chen, Kenneth W
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Chen, Kenneth W @ 2006-01-31 19:27 UTC (permalink / raw)
  To: linux-ia64

David Mosberger-Tang wrote on Tuesday, January 31, 2006 9:45 AM
> How much latency does this add for a minimal syscall when
> everything is cached?

I will get the measurement out shortly.

- Ken



* RE: [patch 3/6] prefetch bottom of kernel rbs stack at syscall exit path
  2006-01-31  9:13 [patch 3/6] prefetch bottom of kernel rbs stack at syscall exit path Chen, Kenneth W
  2006-01-31 17:44 ` David Mosberger-Tang
  2006-01-31 19:27 ` Chen, Kenneth W
@ 2006-02-07  2:56 ` Chen, Kenneth W
  2006-02-07 21:19 ` David Mosberger-Tang
  2006-02-07 21:27 ` Chen, Kenneth W
  4 siblings, 0 replies; 6+ messages in thread
From: Chen, Kenneth W @ 2006-02-07  2:56 UTC (permalink / raw)
  To: linux-ia64

Chen, Kenneth wrote on Tuesday, January 31, 2006 11:27 AM
> David Mosberger-Tang wrote on Tuesday, January 31, 2006 9:45 AM
> > How much latency does this add for a minimal syscall when
> > everything is cached?
> 
> I will get the measurement out shortly.

Here are the numbers for break-based getpid():

Vanilla 2.6.15:    238 cycles
Vanilla + patches: 250 cycles

- Ken
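
The cost implied by the two figures above can be worked out with a few lines of C. This is a minimal sketch: `overhead_cycles` and `overhead_percent` are hypothetical helper names, and the result describes the patched kernel as measured (the whole series), not necessarily this one patch in isolation.

```c
/* Extra cycles of the patched kernel relative to the baseline. */
static long overhead_cycles(long baseline, long patched)
{
    return patched - baseline;
}

/* The same difference, expressed as a percentage of the baseline. */
static double overhead_percent(long baseline, long patched)
{
    return 100.0 * (double)(patched - baseline) / (double)baseline;
}
```

With the reported values, `overhead_cycles(238, 250)` gives 12 cycles and `overhead_percent(238, 250)` gives roughly 5% of the baseline getpid() cost.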



* Re: [patch 3/6] prefetch bottom of kernel rbs stack at syscall exit path
  2006-01-31  9:13 [patch 3/6] prefetch bottom of kernel rbs stack at syscall exit path Chen, Kenneth W
                   ` (2 preceding siblings ...)
  2006-02-07  2:56 ` Chen, Kenneth W
@ 2006-02-07 21:19 ` David Mosberger-Tang
  2006-02-07 21:27 ` Chen, Kenneth W
  4 siblings, 0 replies; 6+ messages in thread
From: David Mosberger-Tang @ 2006-02-07 21:19 UTC (permalink / raw)
  To: linux-ia64

On 2/6/06, Chen, Kenneth W <kenneth.w.chen@intel.com> wrote:
> Chen, Kenneth wrote on Tuesday, January 31, 2006 11:27 AM
> > David Mosberger-Tang wrote on Tuesday, January 31, 2006 9:45 AM
> > > How much latency does this add for a minimal syscall when
> > > everything is cached?
> >
> > I will get the measurement out shortly.
>
> Here are the numbers for break-based getpid():
>
> Vanilla 2.6.15:    238 cycles
> Vanilla + patches: 250 cycles

Hmmh, 12 cycles lost.  Can you see if you can do better?  There really
are apps out there that care about this case (not getpid per se, but
minimal syscall entry/exit overhead when everything is cached).

  --david
--
Mosberger Consulting LLC, http://www.mosberger-consulting.com/


* RE: [patch 3/6] prefetch bottom of kernel rbs stack at syscall exit path
  2006-01-31  9:13 [patch 3/6] prefetch bottom of kernel rbs stack at syscall exit path Chen, Kenneth W
                   ` (3 preceding siblings ...)
  2006-02-07 21:19 ` David Mosberger-Tang
@ 2006-02-07 21:27 ` Chen, Kenneth W
  4 siblings, 0 replies; 6+ messages in thread
From: Chen, Kenneth W @ 2006-02-07 21:27 UTC (permalink / raw)
  To: linux-ia64

David Mosberger-Tang wrote on Tuesday, February 07, 2006 1:19 PM
> > Chen, Kenneth wrote on Tuesday, January 31, 2006 11:27 AM
> > > David Mosberger-Tang wrote on Tuesday, January 31, 2006 9:45 AM
> > > > How much latency does this add for a minimal syscall when
> > > > everything is cached?
> > >
> > > I will get the measurement out shortly.
> >
> > Here are the numbers for break-based getpid():
> >
> > Vanilla 2.6.15:    238 cycles
> > Vanilla + patches: 250 cycles
> 
> Hmmh, 12 cycles lost.  Can you see if you can do better?  There really
> are apps out there that care about this case (not getpid per se, but
> minimal syscall entry/exit overhead when everything is cached).

I'm with you completely on this.  I'm not happy with it either. So back
to the drawing board.

- Ken

