* [PATCH] faster signal handling on x86
@ 2004-10-26 21:55 Zachary Amsden
  2004-10-26 23:15 ` Linus Torvalds
From: Zachary Amsden @ 2004-10-26 21:55 UTC
  To: linux-kernel, davej, hpa, Linus Torvalds

[-- Attachment #1: Type: text/plain, Size: 374 bytes --]

I noticed an unneeded write to dr7 in the signal handling path for x86.  
We only need to write to dr7 if there is a breakpoint to re-enable, and 
MOVDR is a serializing instruction, which is expensive.  Getting rid of 
it gets a 33% faster signal delivery path (at least on Xeon - I didn't 
test other CPUs, so your gain may vary).

Cheers,

Zachary Amsden
zach@vmware.com

[-- Attachment #2: README.i386-fast-signal --]
[-- Type: text/plain, Size: 918 bytes --]

Optimize away the unconditional write to the debug registers on the signal
delivery path.  This is already done on x86_64.  Measured delta TSC for
three paths on a 2.4GHz Xeon.

1) With unconditional write to dr7 :  800-1000 cycles
2) With conditional write to dr7   :  84-112 cycles
3) With unlikely write to dr7      :  84 cycles
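
For anyone wanting to take similar measurements, here is a rough
user-space sketch of a delta-TSC measurement.  It is not the code used
for the numbers above: read_cycles(), min_delta(), sample_region() and
region_calls are hypothetical names, and the non-x86 fallback to clock()
is only there so the sketch builds elsewhere.

```c
#include <stdint.h>
#include <time.h>

#if defined(__i386__) || defined(__x86_64__)
/* Read the time-stamp counter; rdtsc returns the value in EDX:EAX.
 * rdtsc is not itself serializing, so very short regions are noisy. */
static inline uint64_t read_cycles(void)
{
	uint32_t lo, hi;

	__asm__ __volatile__("rdtsc" : "=a" (lo), "=d" (hi));
	return ((uint64_t)hi << 32) | lo;
}
#else
/* Assumption: on non-x86 builds fall back to clock() ticks, not cycles. */
static inline uint64_t read_cycles(void)
{
	return (uint64_t)clock();
}
#endif

/* Hypothetical region under test; it only counts how often it ran. */
static volatile unsigned long region_calls;

static void sample_region(void)
{
	region_calls++;
}

/* Hypothetical helper: smallest delta seen over `runs` calls of fn(). */
static uint64_t min_delta(void (*fn)(void), int runs)
{
	uint64_t best = UINT64_MAX;
	int i;

	for (i = 0; i < runs; i++) {
		uint64_t t0 = read_cycles();
		uint64_t t1;

		fn();
		t1 = read_cycles();
		if (t1 - t0 < best)
			best = t1 - t0;
	}
	return best;
}
```

Taking the minimum over many runs is one way to filter out interrupts
and cache misses; reporting a range, as above, works just as well.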

Performance test using divzero microbenchmark (3 million divides by zero;
times in seconds):

With unconditional write: 
   7.445 real / 6.136 system
   7.529 real / 6.482 system
   7.541 real / 5.974 system
   7.546 real / 6.217 system
   7.445 real / 6.167 system

With unlikely write:
   5.779 real / 4.518 system
   5.783 real / 4.591 system
   5.552 real / 4.569 system
   5.790 real / 4.528 system
   5.554 real / 4.382 system

That's about a 33% speedup (mean real time drops from roughly 7.50 to 5.69,
a ratio of about 1.32) - more than I expected; apparently getting rid
of the serializing instruction makes the do_signal path much faster.
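
The divzero source is not attached, so the following is only a guess at
what such a microbenchmark might look like: a SIGFPE handler that jumps
back out of the faulting divide, repeated in a loop.  The names
on_sigfpe(), run_divzero(), divzero_env and divzero_traps are
hypothetical, and the sketch relies on integer divide by zero raising
SIGFPE, which holds on x86 but not on every architecture.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf divzero_env;
static volatile sig_atomic_t divzero_traps;

/* Returning normally would restart the faulting divide, so jump out. */
static void on_sigfpe(int sig)
{
	(void)sig;
	divzero_traps++;
	siglongjmp(divzero_env, 1);
}

/* Hypothetical driver: attempt `iterations` integer divides by zero and
 * return how many were delivered as SIGFPE (-1 if setup failed). */
static long run_divzero(long iterations)
{
	struct sigaction sa;
	volatile int zero = 0;
	volatile int sink = 0;
	volatile long i;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigfpe;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGFPE, &sa, NULL) != 0)
		return -1;

	divzero_traps = 0;
	for (i = 0; i < iterations; i++) {
		/* savemask = 1 so SIGFPE is unblocked again after the jump. */
		if (sigsetjmp(divzero_env, 1) == 0)
			sink = 1 / zero;	/* traps on x86 */
	}
	(void)sink;
	return (long)divzero_traps;
}
```

Calling run_divzero(3000000) from main() and running the binary under
time(1) should give real/system figures comparable in kind to the ones
above, though the sigsetjmp/siglongjmp overhead means the absolute
numbers will differ.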

Zachary Amsden (zach@vmware.com)

[-- Attachment #3: i386-fast-signal.patch --]
[-- Type: text/plain, Size: 561 bytes --]

--- linux-2.6.10-rc1/arch/i386/kernel/signal.c	2004-10-25 11:15:43.000000000 -0700
+++ linux-2.6.10-rc1-nsz/arch/i386/kernel/signal.c	2004-10-26 14:30:54.000000000 -0700
@@ -600,7 +600,9 @@
 		 * have been cleared if the watchpoint triggered
 		 * inside the kernel.
 		 */
-		__asm__("movl %0,%%db7"	: : "r" (current->thread.debugreg[7]));
+		if (unlikely(current->thread.debugreg[7])) {
+			__asm__("movl %0,%%db7"	: : "r" (current->thread.debugreg[7]));
+		}
 
 		/* Whee!  Actually deliver the signal.  */
 		handle_signal(signr, &info, &ka, oldset, regs);
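
For readers unfamiliar with unlikely(): the pattern in the patch can be
tried in user space with the sketch below.  It assumes GCC or Clang for
__builtin_expect; struct fake_thread, fake_dr7, dr7_writes, write_dr7()
and restore_debug_control() are hypothetical stand-ins, since the real
%db7 cannot be written outside the kernel.

```c
/* Roughly the shape of the kernel's unlikely(): a hint that the
 * condition is expected to be false, so the body can go out of line. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-in for current->thread in user space. */
struct fake_thread {
	unsigned long debugreg[8];
};

static unsigned long fake_dr7;		/* stands in for the real %db7 */
static unsigned long dr7_writes;	/* counts the "expensive" writes */

static void write_dr7(unsigned long value)
{
	fake_dr7 = value;
	dr7_writes++;
}

/* Mirror of the patched code: skip the write when debugreg[7] is zero,
 * i.e. when there is no breakpoint to re-enable. */
static void restore_debug_control(const struct fake_thread *t)
{
	if (unlikely(t->debugreg[7]))
		write_dr7(t->debugreg[7]);
}
```

With debugreg[7] left at zero, restore_debug_control() never reaches
write_dr7(), which corresponds to the common case the patch makes cheap.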


* Re: [PATCH] faster signal handling on x86
  2004-10-26 21:55 [PATCH] faster signal handling on x86 Zachary Amsden
@ 2004-10-26 23:15 ` Linus Torvalds
From: Linus Torvalds @ 2004-10-26 23:15 UTC
  To: Zachary Amsden; +Cc: linux-kernel, davej, hpa



On Tue, 26 Oct 2004, Zachary Amsden wrote:
>
> I noticed an unneeded write to dr7 in the signal handling path for x86.  
> We only need to write to dr7 if there is a breakpoint to re-enable, and 
> MOVDR is a serializing instruction, which is expensive.  Getting rid of 
> it gets a 33% faster signal delivery path (at least on Xeon - I didn't 
> test other CPUs, so your gain may vary).

I'm surprised it is _that_ slow, but sure, no problem, the patch just makes 
it match all the other paths. 

I suspect Xeon is alone in being _that_ slow - I bet Netburst flushes the 
whole trace cache on db7 writes.

		Linus
