From: Zachary Amsden <zach@vmware.com>
To: Andi Kleen <ak@suse.de>
Cc: linux-kernel@vger.kernel.org
Subject: Re: [PATCH] 3/5 explicit-iopl
Date: Thu, 04 Aug 2005 08:29:17 -0700
Message-ID: <42F2344D.1070209@vmware.com>
In-Reply-To: <p73k6j1rj1i.fsf@bragg.suse.de>

Andi Kleen wrote:

>zach@vmware.com writes:
>
>>Unfortunately, this added one field to the thread_struct.  But as a bonus, on
>>P4, the fastest time measured for switch_to() went from 312 to 260 cycles, a
>>win of about 17% in the fast case through this performance critical path.
>
>Cool! Definitely want this on x86-64 too.

Well... maybe.  On Opteron and/or Intel EMT it may not be a win.  The 
cost of the branch could overtake the cost of the POPF (that's the 
expensive one).  Grrr.

>Can we perhaps get rid of the PUSHF/POPF in the SYSENTER syscall path too?
>iirc they were only for single stepping. But SYSENTER doesn't disable
>single stepping, so the debug handler could detect this and set
>some magic flag that restores it on syscall exit.

A context switch requires IRET, which requires the flags to be saved, so 
you can't eliminate the pushf (*).  IIRC, the popf is already omitted.  
Many of these patches may be beneficial to x86-64, but unfortunately 
the performance deltas may not translate.  Let's hope they do!  
Unfortunately, that requires re-measuring the cost of switch_to(), which 
was quite amusing to do.  I can send you diffs if you're interested, but 
using printk around this path turned out to be a really bad idea ;)  I 
really would like to bring some of the cleanup and performance work I've 
done on i386 over to x86_64 as well, but that is still probably a couple 
of weeks out.  If you can't wait, you're welcome to port pieces you 
like!  Let me know.
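
For reference, the bookkeeping behind a "fastest time measured" figure can be sketched roughly as below.  This is only an illustration, not the diffs mentioned above: the names min_tracker, tracker_init, tracker_record and percent_saved are hypothetical, and the sketch assumes the cycle samples have already been collected by some other means (a timestamp-counter read on either side of the path, say) and stored for later reporting rather than printed from inside the path.

```c
#include <stdint.h>

/* Hypothetical helper: remembers the fastest (minimum) cycle count seen. */
struct min_tracker {
    uint64_t best;   /* smallest sample recorded so far */
    uint64_t count;  /* number of samples recorded */
};

static void tracker_init(struct min_tracker *t)
{
    t->best = UINT64_MAX;  /* any real sample will be smaller */
    t->count = 0;
}

/* Record one sample; only a compare and a store, no I/O in the hot path. */
static void tracker_record(struct min_tracker *t, uint64_t cycles)
{
    if (cycles < t->best)
        t->best = cycles;
    t->count++;
}

/* Relative saving of 'after' versus 'before', in percent of 'before'. */
static double percent_saved(uint64_t before, uint64_t after)
{
    return 100.0 * ((double)before - (double)after) / (double)before;
}
```

With the two minima quoted earlier in the thread, percent_saved(312, 260) comes out at 52/312, a little under 17%.  Tracking the minimum rather than the mean is a deliberate choice here: it filters out samples inflated by interrupts or cache misses, at the cost of describing only the fast case.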

(*) Well, you could.  It's just that system calls would have to clobber 
flags - hmm.. sysenter based calls already do. But I'm not 100% sure 
there isn't some bogon case where kernel preemption could cause you a 
problem.  Keeping around the fake IRET frame still appears to be a good 
thing to do just for the benefit of ptrace / debug functionality.  PUSHF 
is cheap on every core I have measured on.

Zach
