From: Carsten Langgaard <carstenl@mips.com>
To: Jun Sun <jsun@mvista.com>
Cc: Louis Hamilton <hamilton@redhat.com>,
linux-mips@oss.sgi.com, sandcraft-elinux-project@redhat.com
Subject: Re: Bug in Linux? fcr31 not being saved-restored
Date: Wed, 19 Jun 2002 08:55:26 +0200 [thread overview]
Message-ID: <3D102ADD.71178636@mips.com> (raw)
In-Reply-To: 3D0F7213.6000002@mvista.com
Jun Sun wrote:
> Your first chunk is not clear as to where it applies. Maybe you are using a
> different code base.
Here is my do_cpu handler:
asmlinkage void do_cpu(struct pt_regs *regs)
{
unsigned int cpid;
extern void lazy_fpu_switch(void *);
extern void save_fp(struct task_struct *);
extern void init_fpu(void);
void fpu_emulator_init_fpu(void);
int sig;
cpid = (regs->cp0_cause >> CAUSEB_CE) & 3;
if (cpid != 1)
goto bad_cid;
if (!(mips_cpu.options & MIPS_CPU_FPU))
goto fp_emul;
regs->cp0_status |= ST0_CU1;
if (last_task_used_math == current)
return;
if (current->used_math) { /* Using the FPU again. */
lazy_fpu_switch(last_task_used_math);
} else { /* First time FPU user. */
if (last_task_used_math != NULL)
{
__enable_fpu();
save_fp(last_task_used_math);
/* last_task_used_math loose fpu */
((struct pt_regs *)(__KSTK_TOS(last_task_used_math) -
sizeof(struct pt_regs)))->
cp0_status &= ~ST0_CU1;
}
init_fpu();
current->used_math = 1;
}
last_task_used_math = current;
return;
fp_emul:
if (last_task_used_math != current) {
if (!current->used_math) {
fpu_emulator_init_fpu();
current->used_math = 1;
}
}
sig = fpu_emulator_cop1Handler(0, regs, ¤t->thread.fpu.soft);
last_task_used_math = current;
if (sig)
{
/*
* Return EPC is not calculated in the FPU emulator, if
* a signal is being send. So we calculate it here.
*/
compute_return_epc(regs);
force_sig(sig, current);
}
return;
bad_cid:
#ifndef CONFIG_CPU_HAS_LLSC
switch (mips_cpu.cputype) {
case CPU_TX3927:
case CPU_TX39XX:
do_ri(regs);
return;
}
#endif
compute_return_epc(regs);
force_sig(SIGILL, current);
}
>
> The second chunck is not necessary. Although the FPU values in thread struct
> for the new thread are stale, the new program cannot assume the fpu register
> values and cannot use them without initialization anyway.
At least with the do_cpu handler above, you need this.
I'm not sure which tests failed without this, but I think it was some of the gcc selftests
and I think the problem was, that a thread inherited the previous thread's fsr (floating
point status register).
>
> I don't see such a code in x86's code. Current call to start_thread() is ok
> with or without this change. I am afraid future use of start_thread() may
> give undesired effect after we make this change.
Why ?
>
> As for the new fpu context switch code, I wrote a experiemental patch after
> much discussion with Ralf and Kevin. It is at
>
> http://linux.junsun.net/patches/oss.sgi.com/experimental/020304-half-fpu-context-switch
>
> I also did some performance study of this patch. Did not get much feedbacks
> when I asked last time. :-(
>
> Jun
>
> Carsten Langgaard wrote:
>
> > I believe this is the patch that solve the problem.
> > The lazy fpu stuff has cause so many problems, that we have decided (together
> > with Ralf) to get rid of it, and do the FPU context save and restore a little bit
> > differently.
> > We now have a solution here locally and we are testing it at the moment.
> >
> > /Carsten
> >
> >
> > Index: arch/mips/kernel/traps.c
> > ===================================================================
> > RCS file: /cvs/linux/arch/mips/kernel/traps.c,v
> > retrieving revision 1.99.2.14
> > diff -u -r1.99.2.14 traps.c
> > --- arch/mips/kernel/traps.c 2002/05/28 06:33:13 1.99.2.14
> > +++ arch/mips/kernel/traps.c 2002/06/18 07:53:47
> > + {
> > + __enable_fpu();
> > save_fp(last_task_used_math);
> > + /* last_task_used_math loose fpu */
> > + ((struct pt_regs *)(__KSTK_TOS(last_task_used_math) -
> > + sizeof(struct pt_regs)))->
> > + cp0_status &= ~ST0_CU1;
> > + }
> >
> > Index: include/asm-mips/processor.h
> > ===================================================================
> > RCS file: /cvs/linux/include/asm-mips/processor.h,v
> > retrieving revision 1.43.2.2
> > diff -u -r1.43.2.2 processor.h
> > --- include/asm-mips/processor.h 2002/05/28 06:11:56 1.43.2.2
> > +++ include/asm-mips/processor.h 2002/06/18 07:56:58
> > @@ -215,6 +215,7 @@
> > regs->cp0_epc = new_pc; \
> > regs->regs[29] = new_sp; \
> > current->thread.current_ds = USER_DS; \
> > + current->used_math = 0; \
> > } while (0)
> >
> > unsigned long get_wchan(struct task_struct *p);
> >
> >
> >
> > Jun Sun wrote:
> >
> >
> >>Carsten Langgaard wrote:
> >>
> >>
> >>>This is one of the bugs, among others, we have fixed.
> >>>I'm not sure, if Ralf have integrated the patches we send him yet.
> >>>
> >>>
> >>Carsten,
> >>
> >>Do you remember the cause and the fix? It appears to me the first ctc1
> >>instruction should trap into kernel and mark current process as fpu owner, and
> >>should not cause fcr31 corruption.
> >>
> >>Or somehow the ctc1 does not trap into kernel?
> >>
> >>Jun
> >>
> >>
> >>>/Carsten
> >>>
> >>>Louis Hamilton wrote:
> >>>
> >>>
> >>>
> >>>>We have a customer here testing a 2.4.16 mips kernel on an embedded
> >>>>Linux RM7000/SR71000 based system who has written a test that they
> >>>>believe has uncovered a bug in Linux. The FPU control register appears
> >>>>to not get saved and restored. I've reproduced the problem described
> >>>>below and find the results consistent with their description. The
> >>>>problem occurs on both the RM7000 and SR71000 cpus.
> >>>>
> >>>>It looks like save_fp_context and restore_fp_context are not being
> >>>>called since the kernel save-restore logic thinks the process is not
> >>>>using floating point math. If you do some fp math before calling the
> >>>>test routine below, it seems to works fine.
> >>>>
> >>>>Is this a known caveat? A true bug? Or a contorted corner case
> >>>>unlikely to be seen under typical end-user usage (see customer's
> >>>>last paragraph :-) ? If true bug, recommended remedy?
> >>>>
> >>>>TIA,
> >>>>Louis
> >>>>
> >>>>Louis Hamilton
> >>>>hamilton@redhat.com
> >>>>
> >>>>------ customer reports the following: ---------
> >>>>We found a bug in Linux. A ^C (control-C) typed into a shell (or a
> >>>>running program, it doesn't matter), causes the FCR (floating-point
> >>>>control register) to be corrupted in another, unrelated process. This
> >>>>is repeatable behavior.
> >>>>
> >>>>This can be reproduced with the following short assembly language
> >>>>program that loops forever, waiting for the FCR to change.
> >>>>
> >>>> .align 2
> >>>> .globl mips_float_debug_loop
> >>>>mips_float_debug_loop:
> >>>> li $9, 0xF000F02F
> >>>> ctc1 $9, $31 # set FCR to some non-zero value
> >>>> nop
> >>>>1: cfc1 $8, $31 # get FCR
> >>>> beq $8, $9, 1b # spin, waiting for FCR to change
> >>>> nop
> >>>> or $2, $0, $8
> >>>> jr $31
> >>>> nop
> >>>>
> >>>>You can call this function from a short C program and the return value
> >>>>is the (corrupted) FCR, which turns out to alwyas be: 0x00000002.
> >>>>
> >>>>Run the above loop in one window (connected to the board using telnet)
> >>>>and then in another window (connected to the same board) type ^C.
> >>>>
> >>>>I'm surprised this bug hasn't been encountered by other MIPS vendors.
> >>>>
> >>>><end>
> >>>>
> >>>--
> >>>_ _ ____ ___ Carsten Langgaard Mailto:carstenl@mips.com
> >>>|\ /|||___)(___ MIPS Denmark Direct: +45 4486 5527
> >>>| \/ ||| ____) Lautrupvang 4B Switch: +45 4486 5555
> >>> TECHNOLOGIES 2750 Ballerup Fax...: +45 4486 5556
> >>> Denmark http://www.mips.com
> >>>
> >>>
> >>>
> >>>
> >>>
> >
> > --
> > _ _ ____ ___ Carsten Langgaard Mailto:carstenl@mips.com
> > |\ /|||___)(___ MIPS Denmark Direct: +45 4486 5527
> > | \/ ||| ____) Lautrupvang 4B Switch: +45 4486 5555
> > TECHNOLOGIES 2750 Ballerup Fax...: +45 4486 5556
> > Denmark http://www.mips.com
> >
> >
> >
> >
--
_ _ ____ ___ Carsten Langgaard Mailto:carstenl@mips.com
|\ /|||___)(___ MIPS Denmark Direct: +45 4486 5527
| \/ ||| ____) Lautrupvang 4B Switch: +45 4486 5555
TECHNOLOGIES 2750 Ballerup Fax...: +45 4486 5556
Denmark http://www.mips.com
next prev parent reply other threads:[~2002-06-19 6:53 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2002-06-15 23:56 Bug in Linux? fcr31 not being saved-restored Louis Hamilton
2002-06-17 6:20 ` Carsten Langgaard
2002-06-17 19:30 ` Jun Sun
2002-06-18 8:07 ` Carsten Langgaard
2002-06-18 17:46 ` Jun Sun
2002-06-19 6:55 ` Carsten Langgaard [this message]
2002-06-20 18:03 ` Jun Sun
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3D102ADD.71178636@mips.com \
--to=carstenl@mips.com \
--cc=hamilton@redhat.com \
--cc=jsun@mvista.com \
--cc=linux-mips@oss.sgi.com \
--cc=sandcraft-elinux-project@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox