From: "Alexander van Heukelum" <heukelum@fastmail.fm>
To: "Ingo Molnar" <mingo@elte.hu>
Cc: "Alexander van Heukelum" <heukelum@mailshack.com>,
	"LKML" <linux-kernel@vger.kernel.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	lguest@ozlabs.org, jeremy@xensource.com,
	"Steven Rostedt" <srostedt@redhat.com>,
	"Cyrill Gorcunov" <gorcunov@gmail.com>,
	"Mike Travis" <travis@sgi.com>,
	"Jeremy Fitzhardinge" <jeremy@goop.org>,
	"Andi Kleen" <andi@firstfloor.org>
Subject: Re: [PATCH RFC/RFB] x86_64, i386: interrupt dispatch changes
Date: Tue, 04 Nov 2008 17:23:09 +0100
Message-ID: <1225815789.30706.1282936457@webmail.messagingengine.com>
In-Reply-To: <20081104140030.GA16178@elte.hu>

On Tue, 4 Nov 2008 15:00:30 +0100, "Ingo Molnar" <mingo@elte.hu> said:
> 
> * Alexander van Heukelum <heukelum@fastmail.fm> wrote:
> 
> > On Tue, 4 Nov 2008 13:42:42 +0100, "Ingo Molnar" <mingo@elte.hu> said:
> > > 
> > > * Alexander van Heukelum <heukelum@mailshack.com> wrote:
> > > 
> > > > Hi all,
> > > > 
> > > > An x86 processor handles an interrupt (from an external source, 
> > > > software generated or due to an exception), depending on the 
> > > > contents of the IDT. Normally the IDT contains mostly interrupt 
> > > > gates. Linux points each interrupt gate to a unique function. Some 
> > > > are specific to some task (handling traps, IPI's, ...), the others 
> > > > are stubs that push the interrupt number to the stack and jump to 
> > > > 'common_interrupt'.
> > > > 
> > > > This patch removes the need for the stubs.
> > > 
> > > hm, the cost would be this new code:
> > > 
> > > > +.p2align
> > > > +ENTRY(maininterrupt)
> > > >  	RING0_INT_FRAME
> > > > -vector=0
> > > > -.rept NR_VECTORS
> > > > -	ALIGN
> > > > - .if vector
> > > > -	CFI_ADJUST_CFA_OFFSET -4
> > > > - .endif
> > > > -1:	pushl $~(vector)
> > > > -	CFI_ADJUST_CFA_OFFSET 4
> > > > +	push %eax
> > > > +	push %eax
> > > > +	mov %cs,%eax
> > > > +	shr $3,%eax
> > > > +	and $0xff,%eax
> > > > +	not %eax
> > > > +	mov %eax,4(%esp)
> > > > +	pop %eax
> > > >  	jmp common_interrupt
> > > 
> > > .. which we were able to avoid before. A couple of segment register 
> > > accesses, shifts, etc to calculate the vector - each of which can be 
> > > quite costly (especially the segment register access - this is a 
> > > relatively rare instruction pattern).
> > 
> > The way it is written now is just so I did not have to change 
> > common_interrupt (to keep changes small). All those accesses so 
> > close together will cost some cycles, but much can be avoided if it 
> > is integrated. If the precise content of the stack can be changed, 
> > this could be as simple as "push %cs". Even that can be delayed, 
> > because the content of the cs register will still be there.
> > 
> > Note that the specialized interrupts (including page fault, etc.) 
> > will not go via this path. As far as I understand now, it is only 
> > the interrupts from external devices that normally go via 
> > common_interrupt. There I think the overhead is really tiny compared 
> > to the rest of the handling of the interrupt.
> 
> no complaints from me about the cleanup/simplification effect - that's 
> really great. To make the reasoning all iron-clad please post timings 
> of "push %cs" costs measured via RDTSC or so - can be done in 
> user-space as well. (you can simulate the entry+exit sequence in 
> user-space as well and prove that the overhead is near zero.) In the 
> end it could all even be faster (perhaps), besides smaller.

I did some timings using the little program below (32-bit only), which
repeats the same sequence 1024 times between two RDTSC reads. TEST1
just pushes a constant onto the stack; TEST2 pushes the cs register;
TEST3 is the sequence from the patch that extracts the vector number
from the cs register.
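
For reference, here is a minimal C sketch of what the TEST3 instruction
sequence computes. It assumes, as the shift and mask in the sequence
imply, that the low 8 bits of the descriptor index in cs carry the
vector number. The names encoded_vector_from_cs and decode_vector are
hypothetical and appear neither in the patch nor in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Mirror of the TEST3 sequence: take a code segment selector, drop the
 * RPL (bits 0-1) and table indicator (bit 2) with a shift by 3, keep the
 * low 8 bits of the index, and invert the result, matching the old
 * stubs' "pushl $~(vector)".
 */
static int32_t encoded_vector_from_cs(uint16_t cs)
{
        uint32_t vector = ((uint32_t)cs >> 3) & 0xff;  /* shr $3 ; and $0xff */

        /* -v - 1 has the same two's complement bit pattern as ~v (not %eax). */
        return -(int32_t)vector - 1;
}

/* Illustration only: the inversion is reversible, so the vector can be recovered. */
static unsigned decode_vector(int32_t encoded)
{
        assert(encoded < 0 && encoded >= -256);
        return (unsigned)~encoded;
}
```

With a selector of 0x0100 (index 0x20, RPL 0) the encoded value is ~0x20,
and decoding it gives back 0x20.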

Opteron    (cycles): 1024 / 1157 / 3527
Xeon E5345 (cycles): 1092 / 1085 / 6622
Athlon XP  (cycles): 1028 / 1166 / 5192

I'd say that the cost of the push %cs itself is negligible.
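
To put the figures above on a per-sequence basis, the arithmetic can be
sketched as follows. This assumes each reported number is the total
cycle count for the 1024 repetitions, including whatever overhead the
two RDTSC reads add; cycles_per_rep is a hypothetical helper name.

```c
#include <assert.h>

/* Average cycles per repetition, given the total for `reps` repetitions. */
static double cycles_per_rep(unsigned total_cycles, unsigned reps)
{
        assert(reps > 0);
        return (double)total_cycles / reps;
}
```

On the Opteron this gives 1024/1024 = 1.00, 1157/1024 = about 1.13 and
3527/1024 = about 3.44 cycles for TEST1, TEST2 and TEST3. For TEST3 the
Xeon E5345 comes out at about 6.47 and the Athlon XP at about 5.07 cycles
per repetition.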

> ( another advantage is that the 6 bytes GDT descriptor is more 
>   compressed and hence uses up less L1/L2 cache footprint than the 
>   larger (~7 byte) trampolines we have at the moment. )

A GDT descriptor has to be read and processed anyhow... It might
just not be in cache. But at least it is aligned. The trampolines
are 7 bytes (irq#<128) or 10 bytes (irq#>127) on i386 and x86_64.
And one is data, and the other is code, which might also cause
different behaviour. It's just a bit too complicated to decide by
just reasoning about it ;).
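
The 7 and 10 byte figures can be reproduced with a small sketch. It
assumes each stub is "push $~vector" followed by a "jmp" with a 32-bit
displacement, and that the assembler picks the 2-byte "push imm8" form
whenever ~vector fits in a signed byte and the 5-byte "push imm32" form
otherwise. The function name stub_size is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed encodings: push imm8 = 2 bytes, push imm32 = 5 bytes, jmp rel32 = 5 bytes. */
static size_t stub_size(unsigned vector)
{
        assert(vector < 256);

        int imm = ~(int)vector;         /* the stub pushes ~vector: -1 .. -256 */
        size_t push_len = (imm >= -128 && imm <= 127) ? 2 : 5;

        return push_len + 5;            /* push + jmp rel32 */
}
```

Vectors 0 to 127 give ~vector in the range -1 to -128, which still fits
in a signed byte, so those stubs take 7 bytes; from vector 128 onwards
the immediate needs 32 bits and the stub grows to 10 bytes.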

> plus it's possible to observe the typical cost of irqs from user-space 
> as well: run a task on a single CPU and save away all the RDTSC deltas 
> that are larger than ~10 cycles - these will be the IRQ entry costs. 
> Print out these deltas after 60 seconds of runtime (or something like 
> that), and look at the histogram.

I'll see if I can do that. Maybe in a few days...
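
As a rough sketch of the measurement Ingo describes, the binning step
could look like the code below. It is not an implementation from this
thread: the timestamps are assumed to come from back-to-back RDTSC reads
taken by a task pinned to one CPU (for example via __rdtsc() from
<x86intrin.h> with GCC or Clang), which is left out to keep the sketch
portable. NBUCKETS, log2_floor and bin_large_deltas are hypothetical
names, and the power-of-two bucketing is just one possible choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 32     /* bucket b counts deltas in [2^b, 2^(b+1)) */

/* floor(log2(x)) for x > 0 */
static unsigned log2_floor(uint64_t x)
{
        unsigned n = 0;

        assert(x > 0);
        while (x >>= 1)
                n++;
        return n;
}

/*
 * Walk consecutive timestamps, ignore deltas at or below `threshold`
 * (ordinary loop overhead), and add the larger ones, which would be the
 * candidate interruptions, to a power-of-two histogram.
 * Returns the number of deltas that exceeded the threshold.
 */
static size_t bin_large_deltas(const uint64_t *ts, size_t n,
                               uint64_t threshold, size_t hist[NBUCKETS])
{
        size_t count = 0;

        for (size_t i = 1; i < n; i++) {
                uint64_t delta = ts[i] - ts[i - 1];
                unsigned b;

                if (delta <= threshold)
                        continue;
                b = log2_floor(delta);
                if (b >= NBUCKETS)
                        b = NBUCKETS - 1;       /* clamp very long gaps */
                hist[b]++;
                count++;
        }
        return count;
}
```

Note that a large delta is not necessarily an interrupt; preemption by
another task would show up the same way, so the histogram only gives an
upper bound on what can be attributed to IRQ entry and exit.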

Thanks,
    Alexander

> 	Ingo


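#include <stdio.h>

/*
 * Select the sequence to time (32-bit x86 only):
 *   1 = push a constant, 2 = push %cs,
 *   3 = vector extraction sequence from the patch
 */
#define TEST 3

int main(void)
{
        unsigned int i, ticks[1024];

        for (i=0; i<(sizeof(ticks)/sizeof(*ticks)); i++) {
                asm volatile (
                /* rdtsc overwrites %edx:%eax; save the registers we use */
                "push %%edx\n\t"
                "push %%ecx\n\t"
                /* start timestamp (low 32 bits) is kept in %ecx */
                "rdtsc\n\t"
                "mov %%eax,%%ecx\n\t"
                ".rept 1024\n\t"
#if TEST==1
                "push $-255\n\t"
#endif
#if TEST==2
                "push %%cs\n\t"
#endif
#if TEST==3
                "push %%eax\n\t"
                "push %%eax\n\t"
                "mov %%cs,%%eax\n\t"
                "shr $3,%%eax\n\t"
                "and $0xff,%%eax\n\t"
                "not %%eax\n\t"
                "mov %%eax,4(%%esp)\n\t"
                "pop %%eax\n\t"
#endif
                ".endr\n\t"
                /* end timestamp (low 32 bits) in %eax */
                "rdtsc\n\t"
                /* each repetition left one word on the stack; drop them */
                ".rept 1024\n\t"
                "pop %%edx\n\t"
                ".endr\n\t"
                /* elapsed cycles = end - start */
                "sub %%ecx,%%eax\n\t"
                "pop %%ecx\n\t"
                "pop %%edx"
                : "=a" (ticks[i])
                :
                : "memory", "cc" );
        }

        for (i=0; i<(sizeof(ticks)/sizeof(*ticks)); i++) {
                printf("%u\n", ticks[i]);
        }

        return 0;
}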
-- 
  Alexander van Heukelum
  heukelum@fastmail.fm


