From: Avi Kivity <avi@redhat.com>
To: Valdis.Kletnieks@vt.edu
Cc: Ingo Molnar <mingo@elte.hu>, "H. Peter Anvin" <hpa@zytor.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/4] Really lazy fpu
Date: Mon, 14 Jun 2010 10:47:19 +0300
Message-ID: <4C15DE87.3080202@redhat.com>
In-Reply-To: <39727.1276461922@localhost>
On 06/13/2010 11:45 PM, Valdis.Kletnieks@vt.edu wrote:
> On Sun, 13 Jun 2010 18:03:43 +0300, Avi Kivity said:
>
>> Currently fpu management is only lazy in one direction. When we switch into
>> a task, we may avoid loading the fpu state in the hope that the task will
>> never use it. If we guess right we save an fpu load/save cycle; if not,
>> a Device not Available exception will remind us to load the fpu.
>>
>> However, in the other direction, fpu management is eager. When we switch out
>> of an fpu-using task, we always save its fpu state.
>>
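(For illustration, the status quo boils down to something like the sketch
below; the identifiers are invented for this mail and the real arch/x86
code differs in detail.)

/*
 * Simplified sketch of the pre-patch policy: eager save on switch-out,
 * lazy restore via the #NM trap.  All names here are made up.
 */
struct sketch_task {
	unsigned char fpu_state[512];	/* fxsave image, 16-byte aligned in reality */
	int used_fpu;
};

/* Switching out is eager: always spill the outgoing task's fpu state. */
static void sketch_switch_out(struct sketch_task *prev)
{
	if (prev->used_fpu)
		fpu_save(prev->fpu_state);	/* fxsave */
	stts();			/* set CR0.TS so the next fpu insn traps */
}

/* Switching in is lazy: load nothing and hope the task never uses the fpu. */

/* #NM (Device not Available): the hope was wrong, load the state now. */
static void sketch_device_not_available(struct sketch_task *next)
{
	clts();				/* clear CR0.TS */
	fpu_restore(next->fpu_state);	/* fxrstor */
	next->used_fpu = 1;
}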
> Does anybody have numbers on how many clocks it takes a modern CPU design
> to do a FPU state save or restore?
About 320 cycles for a back-to-back save/restore round trip. Presumably
less on more modern hardware, more if the state is not in cache, and more
again on newer hardware that also has to save the xsave header (8 bytes)
and the ymm state (256 bytes).
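A rough way to get a number on a particular machine is a userspace loop
like the one below (illustrative only; a careful measurement would pin
the thread, serialize rdtsc, and separate the cached and uncached cases):

/* Ballpark measurement of an fxsave+fxrstor round trip; build with
 * gcc -O2 fpubench.c -o fpubench on x86-64.  Numbers vary with cpu
 * model and cache state.
 */
#include <stdio.h>
#include <stdint.h>

static inline uint64_t rdtsc(void)
{
	uint32_t lo, hi;

	asm volatile("rdtsc" : "=a"(lo), "=d"(hi));
	return ((uint64_t)hi << 32) | lo;
}

int main(void)
{
	static unsigned char area[512] __attribute__((aligned(16)));
	const int iters = 100000;
	uint64_t start, end;
	int i;

	asm volatile("fxsave %0" : "=m"(*area));	/* initialize the save area */

	start = rdtsc();
	for (i = 0; i < iters; i++) {
		asm volatile("fxsave %0" : "=m"(*area));
		asm volatile("fxrstor %0" : : "m"(*area));
	}
	end = rdtsc();

	printf("~%llu cycles per save+restore round trip\n",
	       (unsigned long long)((end - start) / iters));
	return 0;
}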
> I know it must have been painful in the
> days before cache memory, having to make added trips out to RAM for 128-bit
> registers. But what's the impact today?
I'd estimate between 300 and 600 cycles depending on the factors above.
> (Yes, I see there's the potential
> for a painful IPI call - anything else?)
>
The IPI is only taken after a task migration, hopefully a rare event.
The patchset also adds the overhead of irq save/restore. I think I can
remove that at the cost of some complexity, but prefer to start with a
simple approach.
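(Roughly, the migration case looks like the sketch below: the new cpu
asks the cpu that still holds the registers to spill them first.
smp_call_function_single() is the real kernel API; the rest reuses the
invented sketch_task from above, extended with an fpu_cpu field that
records where the state last lived.)

/*
 * Sketch of the cross-cpu flush after a migration.  Only needed when the
 * task's fpu registers were left live on another cpu.
 */
static void remote_fpu_flush(void *info)
{
	struct sketch_task *t = info;

	fpu_save(t->fpu_state);		/* executes on t->fpu_cpu */
	t->fpu_cpu = -1;
}

static void fpu_pull_state(struct sketch_task *t)
{
	int cpu = t->fpu_cpu;

	if (cpu != -1 && cpu != smp_processor_id())
		smp_call_function_single(cpu, remote_fpu_flush, t, 1);
	fpu_restore(t->fpu_state);
	t->fpu_cpu = smp_processor_id();
}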
> Do we have any numbers on how many saves/restores this will save us when
> running the hypothetical "standard Gnome desktop" environment?
The potential scales with the number of context switches per second. On
a desktop I don't see much potential for a throughput improvement; the
gain would rather be a latency reduction, from making the crypto threads
preemptible and from shorter context switch times.
Servers with high context switch rates, esp. with real-time preemptible
kernels (due to threaded interrupts), will see throughput gains. And,
of course, kvm will benefit from not needing to switch the fpu when
going from guest to host userspace or to a host kernel thread (vhost-net).
> How common
> is the "we went all the way around to the original single FPU-using task" case?
>
When your context switch is due to an oversubscribed cpu, not very
common. When it is due to the need to service an event and go back to
sleep, very common.
--
error compiling committee.c: too many arguments to function
Thread overview: 19+ messages
2010-06-13 15:03 [PATCH 0/4] Really lazy fpu Avi Kivity
2010-06-13 15:03 ` [PATCH 1/4] x86, fpu: merge __save_init_fpu() implementations Avi Kivity
2010-06-13 15:03 ` [PATCH 2/4] x86, fpu: run device not available trap with interrupts enabled Avi Kivity
2010-06-13 15:03 ` [PATCH 3/4] x86, fpu: Let the fpu remember which cpu it is active on Avi Kivity
2010-06-13 15:03 ` [PATCH 4/4] x86, fpu: don't save fpu state when switching from a task Avi Kivity
2010-06-13 20:45 ` [PATCH 0/4] Really lazy fpu Valdis.Kletnieks
2010-06-14 7:47 ` Avi Kivity [this message]
2010-06-16 7:24 ` Avi Kivity
2010-06-16 7:32 ` H. Peter Anvin
2010-06-16 8:02 ` Avi Kivity
2010-06-16 8:39 ` Ingo Molnar
2010-06-16 9:01 ` Samuel Thibault
2010-06-16 9:43 ` Avi Kivity
2010-06-16 9:10 ` Nick Piggin
2010-06-16 9:30 ` Avi Kivity
2010-06-16 9:28 ` Avi Kivity
-- strict thread matches above, loose matches on Subject: below --
2010-06-16 11:32 George Spelvin
2010-06-16 11:46 ` Avi Kivity
2010-06-17 9:38 ` George Spelvin