* [parisc-linux] itlb miss handler optimizations!
@ 2003-07-25  7:04 Carlos O'Donell
  2003-07-25 11:46 ` Matthew Wilcox
  0 siblings, 1 reply; 23+ messages in thread
From: Carlos O'Donell @ 2003-07-25  7:04 UTC (permalink / raw)
  To: parisc-linux

[-- Attachment #1: Type: text/plain, Size: 1767 bytes --]


pa,

Lamont and I were discussing the lightweight syscall
implementations and ran across some interesting itlb optimizations.

We first looked at the itlb_miss_XX functions, where XX is 11 or 20
depending on whether your kernel is 32-bit or 64-bit, respectively. We
saw that there is an interlocked 'or' that nullifies a compare-and-branch.
This, as Lamont argued, is not as optimal as it could be.

Before:
	mfsp current space
	/* if faulting space is kernel space that's okay */
	or with nullify the current space and 0.
	/* die bad userspace die */
	cmpb if the faulting space <> current space then die.

This can mean that branch prediction borks _all_ the time, since if
userspace were constantly faulting then there wouldn't be much userspace
left.

Now:
	mfsp current space
	/* branch prediction forward is winning */
	cmpb to itlb_user_fault if faulting space <> current space.
	/* ... else life is good */


	itlb_user_fault:
	/* Was it the kernel? Oh yeah... that's okay then */
	/* branch prediction winning again! */
	cmpb if the faulting space was 0, then go back up.
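
As a sanity check on the two sequences above, here is a minimal sketch
that models only the fault / no-fault decision, following the
instructions in the attached diff (t0 read from %sr7, compared against
spc). The function names are hypothetical and exist only for this
illustration. It does not model where control resumes, nor any timing
or branch prediction behaviour.

```python
# Toy model of the space check in itlb_miss_XX, before and after the patch.
#   cur_space   stands for t0  (the value mfsp reads from %sr7)
#   fault_space stands for spc
# All names here are hypothetical; this is not kernel code.

def faults_before(cur_space: int, fault_space: int) -> bool:
    # or,= %r0,t0,%r0 : the result is zero when t0 == 0, which
    # nullifies the following cmpb, so no fault is raised.
    if cur_space == 0:
        return False
    # cmpb,<>,n t0,spc,itlb_fault
    return cur_space != fault_space


def faults_after(cur_space: int, fault_space: int) -> bool:
    # cmpb,<>,n t0,spc,itlb_user_fault : the common case falls through.
    if cur_space == fault_space:
        return False
    # itlb_user_fault: cmpb,= %r0,t0,... : t0 == 0 branches back up.
    if cur_space == 0:
        return False
    # Otherwise we fall through into itlb_fault.
    return True


def decisions_agree(max_space: int = 8) -> bool:
    # Exhaustively compare both versions over a small range of space ids.
    return all(
        faults_before(c, f) == faults_after(c, f)
        for c in range(max_space)
        for f in range(max_space)
    )
```

Under this model both versions fault exactly when t0 is non-zero and
differs from spc; the patch only changes which path is the straight-line
one.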

The nice part seems to be the predicted branches. We still have
one interlock between the mfsp and the cmpb, but the processor has
already filled its queues with the upcoming insns from the next part of
the itlb handler, so we keep the processor looking forward in the common
case. Maybe it's early in the morning and I'm not thinking well, or maybe
it's Lamont's ability to convince you of something you aren't sure of :)

Patch attached. We also moved a zdep earlier to improve the forward path,
placing it among a set of insns that weren't doing much besides waiting
around for a memory read.
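
To illustrate why the zdep can move, here is a rough sketch of what the
zdep/dep pair in the diff computes. It assumes 32-bit registers and the
PA-RISC bit numbering where bit 0 is the most significant bit and bit 31
the least significant. The helper names are hypothetical. The point is
that the zdep result depends only on spc and not on pte, so it does not
need to wait for the ldw of the pte to complete.

```python
MASK32 = 0xFFFFFFFF


def zdep(src: int, pos: int, length: int) -> int:
    # Zero-and-deposit: take the low `length` bits of src and place them
    # so that the rightmost bit of the field sits at PA-RISC bit `pos`.
    # Every other bit of the result is zero.
    field = src & ((1 << length) - 1)
    return (field << (31 - pos)) & MASK32


def dep(src: int, pos: int, length: int, target: int) -> int:
    # Deposit: same field placement as zdep, but the bits of `target`
    # outside the field are left unchanged.
    shift = 31 - pos
    mask = ((1 << length) - 1) << shift
    return ((target & ~mask) | ((src << shift) & mask)) & MASK32


def make_prot(spc: int, pte: int) -> int:
    prot = zdep(spc, 30, 15)      # zdep spc,30,15,prot  (needs only spc)
    return dep(pte, 8, 7, prot)   # dep  pte,8,7,prot    (needs the pte)
```

With these definitions, zdep(spc, 30, 15) is simply the low 15 bits of
spc shifted left by one, and the later dep writes a separate 7-bit field,
so the order of the zdep relative to the pte load does not change the
result.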

THE PATCH IS UNTESTED! If you want to give it a shot... please do so and
tell us if your box dies^H^H^H^H runs faster :)

Cheers,
Carlos.


[-- Attachment #2: entry.S.diff --]
[-- Type: text/plain, Size: 2027 bytes --]

Index: entry.S
===================================================================
RCS file: /var/cvs/linux/arch/parisc/kernel/entry.S,v
retrieving revision 1.98
diff -u -p -r1.98 entry.S
--- entry.S	9 Dec 2002 06:09:08 -0000	1.98
+++ entry.S	25 Jul 2003 06:37:58 -0000
@@ -1535,8 +1535,7 @@ itlb_miss_11:
 	mfctl           %cr25,ptp	/* load user pgd */
 
 	mfsp            %sr7,t0		/* Get current space */
-	or,=            %r0,t0,%r0	/* If kernel, nullify following test */
-	cmpb,<>,n       t0,spc,itlb_fault /* forward */
+	cmpb,<>,n	t0,spc,itlb_user_fault /* forward */
 
 	/* First level page table lookup */
 
@@ -1551,6 +1550,10 @@ itlb_miss_common_11:
 	sh2addl 	 t0,ptp,ptp
 	ldi		_PAGE_ACCESSED,t1
 	ldw		 0(ptp),pte
+
+	/* Running parallel, taken from below 'zdep0' */
+	zdep            spc,30,15,prot  /* create prot id from space */
+
 	bb,>=,n 	 pte,_PAGE_PRESENT_BIT,itlb_fault
 
 	/* Check whether the "accessed" bit was set, otherwise do so */
@@ -1559,7 +1562,7 @@ itlb_miss_common_11:
 	and,<>		t1,pte,%r0	/* test and nullify if already set */
 	stw		t0,0(ptp)	/* write back pte */
 
-	zdep            spc,30,15,prot  /* create prot id from space */
+	/* zdep0 moved back */
 	dep             pte,8,7,prot    /* add in prot bits from pte */
 
 	extru,=		pte,_PAGE_NO_CACHE_BIT,1,r0
@@ -1602,8 +1605,7 @@ itlb_miss_20:
 	mfctl           %cr25,ptp	/* load user pgd */
 
 	mfsp            %sr7,t0		/* Get current space */
-	or,=            %r0,t0,%r0	/* If kernel, nullify following test */
-	cmpb,<>,n       t0,spc,itlb_fault /* forward */
+	cmpb,<>,n	t0,spc,itlb_user_fault	/* forward */
 
 	/* First level page table lookup */
 
@@ -1882,6 +1884,15 @@ kernel_bad_space:
 dbit_fault:
 	b               intr_save
 	ldi             20,%r8
+
+itlb_user_fault:
+	/* User tlb missed for other than his own space. Optimization. */
+#ifdef __LP64__
+	cmpb,=		%r0,t0,itlb_miss_common_20 /* backward */
+#else
+	cmpb,=		%r0,t0,itlb_miss_common_11 /* backward */
+#endif
+	nop
 
 itlb_fault:
 	b               intr_save

* [parisc-linux] itlb miss handler optimizations!
@ 2003-08-19 12:33 Joel Soete
  2003-08-19 13:42 ` Matthew Wilcox
  0 siblings, 1 reply; 23+ messages in thread
From: Joel Soete @ 2003-08-19 12:33 UTC (permalink / raw)
  To: parisc-linux; +Cc: Matthew Wilcox

>On Thu, Aug 14, 2003 at 08:02:04AM +0200, Joel Soete wrote:
>> btw is it for that reason (interlock) that in your patch we can read:
>> [...]
>>         cmpb,=        %r0,t0,itlb_miss_...
>>         nop
>> [...]
>> I am always wondering why the 'nop' is there.
>
>To fill the delayed branch slot (a silly idea, but ...)

Would it be the same 'by setting the "nullify" bit' (i.e. cmpb,=,n ...)?

Joel
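
For reference, a small sketch of the delay-slot nullification rule for
PA-RISC conditional branches as I understand it from the architecture
manual: with the ,n completer the delay slot is nullified when a forward
branch is taken or when a backward branch is not taken. The function
name is hypothetical, and this models the rule only, not the handler.

```python
def delay_slot_nullified(backward: bool, taken: bool, n_bit: bool) -> bool:
    # Without ,n the delay slot instruction always executes.
    if not n_bit:
        return False
    # With ,n: nullify on "forward and taken" or "backward and not taken".
    return (taken and not backward) or (backward and not taken)
```

Under this rule a backward cmpb,=,n would nullify the instruction that
follows it exactly when the branch is not taken, that is, on the
fall-through path. So dropping the nop and adding ,n is probably not
equivalent here, unless the instruction that ends up in the slot is one
that should be skipped on fall-through.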




