* Re: [patch 3/6] prefetch bottom of kernel rbs stack at syscall exit path
2006-01-31 9:13 [patch 3/6] prefetch bottom of kernel rbs stack at syscall exit path Chen, Kenneth W
@ 2006-01-31 17:44 ` David Mosberger-Tang
2006-01-31 19:27 ` Chen, Kenneth W
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: David Mosberger-Tang @ 2006-01-31 17:44 UTC (permalink / raw)
To: linux-ia64
How much latency does this add for a minimal syscall when everything is cached?
--david
On 1/31/06, Chen, Kenneth W <kenneth.w.chen@intel.com> wrote:
> The kernel knows where the register backing store corresponding
> to the user's dirty stacked registers is located. Prefetch those
> cache lines as early as possible in the ia64_leave_syscall path.
>
> Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
>
>
> --- ./arch/ia64/kernel/entry.S.orig 2006-01-02 19:21:10.000000000 -0800
> +++ ./arch/ia64/kernel/entry.S 2006-01-30 13:26:41.161634409 -0800
> @@ -715,10 +715,11 @@ ENTRY(ia64_leave_syscall)
> ;;
> (p6) ld4 r31=[r18] // load current_thread_info()->flags
> ld8 r19=[r2],PT(B6)-PT(LOADRS) // load ar.rsc value for "loadrs"
> - nop.i 0
> + add r17=IA64_RBS_OFFSET,r13
> ;;
> mov r16=ar.bsp // M2 get existing backing store pointer
> ld8 r18=[r2],PT(R9)-PT(B6) // load b6
> + shr r22=r19,16
> (p6) and r15=TIF_WORK_MASK,r31 // any work other than TIF_SYSCALL_TRACE?
> ;;
> ld8 r23=[r3],PT(R11)-PT(AR_BSPSTORE) // load ar.bspstore (may be garbage)
> @@ -729,6 +730,12 @@ ENTRY(ia64_leave_syscall)
> ld8 r9=[r2],PT(CR_IPSR)-PT(R9)
> ld8 r11=[r3],PT(CR_IIP)-PT(R11)
> (pNonSys) break 0 // bug check: we shouldn't be here if pNonSys is TRUE!
> +1:
> +(pUStk) lfetch [r17],128
> + add r22=-128,r22
> + ;;
> +(pUStk) cmp.gt.unc p7,p0=r22,r0
> +(p7) br.dptk.few 1b
> ;;
> invala // M0|1 invalidate ALAT
> rsm psr.i | psr.ic // M2 turn off interrupts and interruption collection
>
>
--
Mosberger Consulting LLC, http://www.mosberger-consulting.com/
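
For readers who do not read ia64 assembly, the loop the quoted patch adds can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not the kernel code: prefetch_rbs is a hypothetical name, __builtin_prefetch (a GCC/Clang builtin) stands in for lfetch, and the saved "loadrs" word is assumed to hold the dirty byte count in its bits above bit 16, which is what the patch's "shr r22=r19,16" suggests. The 128-byte stride is taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define PREFETCH_STRIDE 128   /* post-increment used by "lfetch [r17],128" */

/* Hypothetical helper: issue one prefetch per 128-byte step over the
 * dirty region that starts at rbs_base. Returns the number of
 * prefetches issued so the loop can be checked in isolation. */
static size_t prefetch_rbs(const char *rbs_base, uint64_t loadrs_word)
{
    /* Assumption: dirty byte count lives above bit 16 (shr r22=r19,16). */
    int64_t remaining = (int64_t)(loadrs_word >> 16);
    const char *p = rbs_base;
    size_t issued = 0;

    do {
        __builtin_prefetch(p);          /* lfetch [r17] */
        p += PREFETCH_STRIDE;           /* ...,128 post-increment */
        remaining -= PREFETCH_STRIDE;   /* add r22=-128,r22 */
        issued++;
    } while (remaining > 0);            /* cmp.gt.unc / br 1b */

    return issued;
}
```

Like the assembly, the sketch is a do-while loop: it always issues at least one prefetch, and every extra 128 bytes of dirty registers costs another trip through the loop on the syscall exit path.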
* RE: [patch 3/6] prefetch bottom of kernel rbs stack at syscall exit path
From: Chen, Kenneth W @ 2006-01-31 19:27 UTC (permalink / raw)
To: linux-ia64
David Mosberger-Tang wrote on Tuesday, January 31, 2006 9:45 AM
> How much latency does this add for a minimal syscall when
> everything is cached?
I will get the measurement out shortly.
- Ken
* RE: [patch 3/6] prefetch bottom of kernel rbs stack at syscall exit path
From: Chen, Kenneth W @ 2006-02-07 2:56 UTC (permalink / raw)
To: linux-ia64
Chen, Kenneth wrote on Tuesday, January 31, 2006 11:27 AM
> David Mosberger-Tang wrote on Tuesday, January 31, 2006 9:45 AM
> > How much latency does this add for a minimal syscall when
> > everything is cached?
>
> I will get the measurement out shortly.
Here are the numbers for break-based getpid():
Vanilla 2.6.15: 238 cycles
Vanilla + patches: 250 cycles
- Ken
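
A measurement of this kind can be approximated from user space with a loop like the one below. It is a rough sketch, not the harness used for the numbers above: getpid_ns_per_call is a hypothetical name, it reports wall-clock nanoseconds from clock_gettime() rather than CPU cycles, and it assumes a Linux libc whose syscall() wrapper enters the kernel on every call.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for syscall() */
#endif
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: average nanoseconds per getpid() system call
 * over `iters` back-to-back calls (iters must be greater than 0). */
static double getpid_ns_per_call(long iters)
{
    struct timespec t0, t1;

    syscall(SYS_getpid);       /* warm up so the timed calls hit in cache */

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iters; i++)
        syscall(SYS_getpid);   /* raw syscall, avoids any libc pid caching */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / (double)iters;
}
```

The result includes the loop and libc wrapper overhead, so it is only useful for comparing two kernels under the same conditions, not as an absolute syscall cost.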
* Re: [patch 3/6] prefetch bottom of kernel rbs stack at syscall exit path
From: David Mosberger-Tang @ 2006-02-07 21:19 UTC (permalink / raw)
To: linux-ia64
On 2/6/06, Chen, Kenneth W <kenneth.w.chen@intel.com> wrote:
> Chen, Kenneth wrote on Tuesday, January 31, 2006 11:27 AM
> > David Mosberger-Tang wrote on Tuesday, January 31, 2006 9:45 AM
> > > How much latency does this add for a minimal syscall when
> > > everything is cached?
> >
> > I will get the measurement out shortly.
>
> Here are the numbers for break-based getpid():
>
> Vanilla 2.6.15: 238 cycles
> Vanilla + patches: 250 cycles
Hmmh, 12 cycles lost. Can you see if you can do better? There really
are apps out there that care about this case (not getpid per se, but
minimal syscall entry/exit overhead when everything is cached).
--david
--
Mosberger Consulting LLC, http://www.mosberger-consulting.com/
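
To put the 12-cycle difference in relative terms, the slowdown can be computed as below. overhead_percent is a hypothetical helper written for this illustration; the inputs are the two cycle counts reported in the thread.

```c
/* Hypothetical helper: relative slowdown of the patched kernel over
 * the baseline, in percent. */
static double overhead_percent(double base_cycles, double patched_cycles)
{
    return (patched_cycles - base_cycles) / base_cycles * 100.0;
}
```

Called as overhead_percent(238.0, 250.0), this gives 12 / 238, or about 5% added to the fully cached getpid() round trip.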
* RE: [patch 3/6] prefetch bottom of kernel rbs stack at syscall exit path
From: Chen, Kenneth W @ 2006-02-07 21:27 UTC (permalink / raw)
To: linux-ia64
David Mosberger-Tang wrote on Tuesday, February 07, 2006 1:19 PM
> > Chen, Kenneth wrote on Tuesday, January 31, 2006 11:27 AM
> > > David Mosberger-Tang wrote on Tuesday, January 31, 2006 9:45 AM
> > > > How much latency does this add for a minimal syscall when
> > > > everything is cached?
> > >
> > > I will get the measurement out shortly.
> >
> > Here are the numbers for break-based getpid():
> >
> > Vanilla 2.6.15: 238 cycles
> > Vanilla + patches: 250 cycles
>
> Hmmh, 12 cycles lost. Can you see if you can do better? There really
> are apps out there that care about this case (not getpid per se, but
> minimal syscall entry/exit overhead when everything is cached).
I'm with you completely on this. I'm not happy with it either. So back
to the drawing board.
- Ken