public inbox for linux-kernel@vger.kernel.org
* [PATCH 0/3] Reduce length of the eagerfpu path during x86 context switches
@ 2014-08-06 12:55 Mel Gorman
From: Mel Gorman @ 2014-08-06 12:55 UTC
  To: H Peter Anvin, Suresh Siddha; +Cc: Mike Galbraith, Linux-X86, LKML, Mel Gorman

Eager FPU switching is used on CPUs that support xsave on the grounds
that they can optimise the switch with xsaveopt and xrstor instead of
updating cr0.TS, which has serialising semantics.

The path for eagerfpu is fatter than it needs to be because it still
maintains the fpu_counter for lazy FPU switches even though the information
is never used. This series splits the paths and optimises the eagerfpu path
a little. The benefit is marginal; it was just noticed when looking at why
integer-only workloads were spending time saving/restoring FPU states.
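
As a rough illustration of the split described above, here is a minimal
userspace sketch. It is not the kernel code: struct task_model, the
function names and PRELOAD_THRESHOLD are all hypothetical and exist only
for this model. The assumption is that the lazy path consults fpu_counter
to decide whether to preload FPU state, while the eager path preloads
whenever the task has FPU state and so has no reason to touch the counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative threshold for this model only; not taken from the patches. */
#define PRELOAD_THRESHOLD 5u

/* Hypothetical stand-in for per-task FPU bookkeeping. */
struct task_model {
	bool used_math;           /* task has FPU state worth restoring */
	unsigned int fpu_counter; /* recent FPU use, read only by the lazy path */
};

/* Lazy mode: record that the task touched the FPU after being switched in. */
static void lazy_note_fpu_use(struct task_model *t)
{
	t->used_math = true;
	t->fpu_counter++;
}

/* Lazy mode: preload only for tasks with a recent history of FPU use. */
static bool prepare_lazy(struct task_model *next)
{
	bool preload = next->used_math &&
		       next->fpu_counter > PRELOAD_THRESHOLD;

	if (preload)
		next->fpu_counter++;
	return preload;
}

/* Eager mode: the counter is never read, so it is not written either. */
static bool prepare_eager(const struct task_model *next)
{
	return next->used_math;
}

/* Small usage example exercising both paths. */
static void model_self_check(void)
{
	struct task_model t = { .used_math = true, .fpu_counter = 0 };

	assert(prepare_eager(&t));  /* eager always preloads used state */
	assert(t.fpu_counter == 0); /* and leaves the counter alone */
	assert(!prepare_lazy(&t));  /* lazy needs history first */
}
```

In this model the eager path reduces to a single read of used_math, which
is the kind of shortening the series is after; the real paths carry more
state than is shown here.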

 arch/x86/include/asm/fpu-internal.h | 46 +++++++++++++++++++++++++++++--------
 arch/x86/kernel/process_32.c        |  2 +-
 arch/x86/kernel/process_64.c        |  2 +-
 3 files changed, 38 insertions(+), 12 deletions(-)

-- 
1.8.4.5



Thread overview: 7+ messages
2014-08-06 12:55 [PATCH 0/3] Reduce length of the eagerfpu path during x86 context switches Mel Gorman
2014-08-06 12:55 ` [PATCH 1/3] x86, fpu: Do not copy fpu preload state Mel Gorman
2014-08-06 12:55 ` [PATCH 2/3] x86, fpu: Split FPU save state preparation into eagerfpu and !eagerfpu parts Mel Gorman
2014-08-06 12:55 ` [PATCH 3/3] x86, fpu: Do not update fpu_counter in the eagerfpu case Mel Gorman
2014-08-27 16:03 ` [PATCH 0/3] Reduce length of the eagerfpu path during x86 context switches Mel Gorman
2014-08-27 17:03   ` H. Peter Anvin