* experimental FPU context switch patch
@ 2002-03-04 19:51 Jun Sun
From: Jun Sun @ 2002-03-04 19:51 UTC (permalink / raw)
To: linux-mips@oss.sgi.com
I implemented a new FPU context saving/restoring patch, as previously
suggested by Kevin and Ralf. The major change is that we will save the FPU
context when we switch out a process, if necessary.
The goal is to guarantee that an off-line process always has its FPU context
saved in memory and is thus free to move to another CPU in an SMP system.
The initial experimental patch can be found at the following URL.
It is a quick hack to study the performance impact and should be
further optimized. It also needs to be extended so that it works
for all CPUs (including the ones without an FPU) and becomes truly SMP-safe
(getting rid of the global variable last_task_used_math).
http://linux.junsun.net/patches/oss.sgi.com/experiemental/020304-new-fpu-context-switch/patch
Here is the pseudo code version of the patch:
do_cpu() {
	if (current->used_math) {	/* Using the FPU again. */
-		lazy_fpu_switch(last_task_used_math);
+		restore_fp(current);	/* we don't need to save for the
+					   current proc */
	} else {			/* First time FPU user. */

r4xx0_resume()
	save non_scratch registers
+	if (current proc owns FPU) {	/* it used FPU in the current run */
+		make it turn off FPU for next run
+		save FPU context to current proc
+		(note we leave last_task_used_math alone)
	....
lmbench was run to compare performance on a UP system (NEC VR5500).
See the output at the following URL. The results labeled "orig" are
from the unpatched kernel.
http://linux.junsun.net/patches/oss.sgi.com/experiemental/020304-new-fpu-context-switch/performance
There is clearly not much performance difference, which is not
a surprise.
A couple of attributes of the patch:
1) it does not save the FPU context if the proc did not use the FPU in the
current run
2) when the proc uses the FPU again in its next run, we don't have to restore
the FPU context if the hardware context has not been used by another proc yet
(i.e., last_task_used_math == current)
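
The two attributes above can be sketched as a small user-space model. This is
only an illustration and not the patch itself: struct task, hw_fpu,
model_switch_out() and model_fpu_trap() are hypothetical names, a single double
stands in for the whole FPU register file, and the check against
last_task_used_math follows attribute 2 rather than the abbreviated pseudo code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a task; not the kernel's task_struct. */
struct task {
	bool used_math;		/* has ever used the FPU */
	bool owns_fpu;		/* used the FPU during the current run */
	double saved_ctx;	/* stand-in for the saved FPU registers */
};

double hw_fpu;				/* stand-in for the hardware FPU state */
struct task *last_task_used_math;	/* whose state is live in hw_fpu */
unsigned long n_saves, n_restores;	/* counters for the experiment */

/* Models the resume() path: called when prev is switched out. */
void model_switch_out(struct task *prev)
{
	if (prev->owns_fpu) {
		prev->saved_ctx = hw_fpu;	/* save context to the proc */
		prev->owns_fpu = false;		/* FPU off for its next run */
		n_saves++;
		/* last_task_used_math is deliberately left alone */
	}
}

/* Models the do_cpu() path: first FPU instruction of a run traps here. */
void model_fpu_trap(struct task *curr)
{
	if (curr->used_math) {
		/* Attribute 2: skip the restore if nobody else touched it. */
		if (last_task_used_math != curr) {
			hw_fpu = curr->saved_ctx;
			n_restores++;
		}
	} else {
		hw_fpu = 0.0;			/* first-time FPU user */
		curr->used_math = true;
	}
	curr->owns_fpu = true;
	last_task_used_math = curr;
}
```

In this model a proc that never touches the FPU costs only the owns_fpu test
in model_switch_out(), and a proc that is the sole FPU user pays for a save on
each switch-out but never for a restore.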
So
1) if no processes are actively using FPU, we don't see much overhead other
than a couple of load/branch instructions in resume
2) if most processes are actively using FPU, then we see the same overhead.
The saving of FPU context is necessary in this scenario, whether it is done
in resume() (as in the patch) or a little later in lazy_fpu_switch() as in
the current kernel.
3) The only pathological case which would make the patch bad is when you have
a process that actively uses FPU and it frequently switches context with
non-FPU-using processes. In this case, the saving of FPU context each
time the FPU-using proc is switched out is pure overhead.
If the FPU-using process runs through a full time slice each time, the
overhead is very small percentage-wise. It is the frequent context
switching in this case that would hurt.
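
To put rough numbers on case 3), here is a back-of-the-envelope sketch. The
function name and the cycle counts in the note below are made-up illustrations,
not measurements from the patch or from the VR5500.

```c
/*
 * Fraction of time spent saving FPU context, assuming one save per
 * switch-out. Both arguments are in CPU cycles and are hypothetical
 * inputs supplied by the caller.
 */
double fpu_save_overhead(double save_cycles, double run_cycles)
{
	return save_cycles / (run_cycles + save_cycles);
}
```

With an assumed save cost of 100 cycles, a run of 4,000,000 cycles between
switches gives an overhead of about 0.0025%, while a run of only 1,000 cycles
gives about 9%. The shorter the run between switches, the more the extra save
matters.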
I am interested in testing any benchmarks that would create case 3). Please
let me know if you know of any.
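
One possible way to provoke case 3) is a pipe ping-pong between an FPU-using
process and an integer-only one. The sketch below is an untested idea rather
than an established benchmark; fpu_pingpong() is a hypothetical name, and it
assumes a hard-float build so that the floating-point statement really
executes FPU instructions.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Bounce one byte between an FPU-using parent and a non-FPU child.
 * Each blocking read() switches the caller out, so the parent is
 * descheduled right after touching the FPU on every round.
 * Returns the number of completed round trips, or -1 on setup failure.
 */
long fpu_pingpong(long rounds)
{
	int to_child[2], to_parent[2];
	char byte = 'x';
	volatile double acc = 1.0;	/* volatile keeps the FP work alive */
	long done = 0;
	pid_t pid;

	if (pipe(to_child) < 0 || pipe(to_parent) < 0)
		return -1;

	pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {			/* child: integer-only echo loop */
		close(to_child[1]);
		close(to_parent[0]);
		while (read(to_child[0], &byte, 1) == 1)
			if (write(to_parent[1], &byte, 1) != 1)
				break;
		_exit(0);
	}

	close(to_child[0]);
	close(to_parent[1]);
	while (done < rounds) {
		acc = acc * 1.000001 + 0.5;	/* use the FPU in every run */
		if (write(to_child[1], &byte, 1) != 1 ||
		    read(to_parent[0], &byte, 1) != 1)
			break;
		done++;
	}
	close(to_child[1]);		/* EOF ends the child's loop */
	close(to_parent[0]);
	waitpid(pid, NULL, 0);
	return done;
}
```

Calling fpu_pingpong() with a large round count from main() and timing the run
on the patched and unpatched kernels should show whether the extra save per
switch-out is measurable.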
So much for rambling.
Jun