From: Jun Sun <jsun@mvista.com>
To: Carsten Langgaard <carstenl@mips.com>
Cc: Louis Hamilton <hamilton@redhat.com>,
linux-mips@oss.sgi.com, sandcraft-elinux-project@redhat.com
Subject: Re: Bug in Linux? fcr31 not being saved-restored
Date: Thu, 20 Jun 2002 11:03:54 -0700 [thread overview]
Message-ID: <3D12190A.9030607@mvista.com> (raw)
In-Reply-To: 3D102ADD.71178636@mips.com
I think this patch is OK.
Whatever your gcc test is, it seems like it should fail on all other archs
too, because a new program would always inherit the old fpu registers. It
would be interesting to do a test on those platforms.
I can see that having old values for general FPU registers is not a problem,
but having the old value in FPU status register might. So an alternative is
to clear fcr31 only in start_thread().
Note that without your patch, the secondary if (last_task_used_math != NULL)
branch is never taken. This is because init process will be the only process
that does not have used_math flag set and in that case last_task_use_math is
indeed NULL. All other processes will inherit the used_math flag from its parent.
Jun
Carsten Langgaard wrote:
> Jun Sun wrote:
>
>
>>Your first chunk is not clear as to where it applies. Maybe you are using a
>>different code base.
>>
>
> Here is my do_cpu handler:
>
> asmlinkage void do_cpu(struct pt_regs *regs)
> {
> unsigned int cpid;
> extern void lazy_fpu_switch(void *);
> extern void save_fp(struct task_struct *);
> extern void init_fpu(void);
> void fpu_emulator_init_fpu(void);
> int sig;
>
> cpid = (regs->cp0_cause >> CAUSEB_CE) & 3;
> if (cpid != 1)
> goto bad_cid;
>
> if (!(mips_cpu.options & MIPS_CPU_FPU))
> goto fp_emul;
>
> regs->cp0_status |= ST0_CU1;
> if (last_task_used_math == current)
> return;
>
> if (current->used_math) { /* Using the FPU again. */
> lazy_fpu_switch(last_task_used_math);
> } else { /* First time FPU user. */
> if (last_task_used_math != NULL)
> {
> __enable_fpu();
> save_fp(last_task_used_math);
> /* last_task_used_math loose fpu */
> ((struct pt_regs *)(__KSTK_TOS(last_task_used_math) -
> sizeof(struct pt_regs)))->
> cp0_status &= ~ST0_CU1;
> }
> init_fpu();
> current->used_math = 1;
> }
> last_task_used_math = current;
>
> return;
>
> fp_emul:
> if (last_task_used_math != current) {
> if (!current->used_math) {
> fpu_emulator_init_fpu();
> current->used_math = 1;
> }
> }
> sig = fpu_emulator_cop1Handler(0, regs, ¤t->thread.fpu.soft);
> last_task_used_math = current;
> if (sig)
> {
> /*
> * Return EPC is not calculated in the FPU emulator, if
> * a signal is being send. So we calculate it here.
> */
> compute_return_epc(regs);
> force_sig(sig, current);
> }
> return;
>
> bad_cid:
> #ifndef CONFIG_CPU_HAS_LLSC
> switch (mips_cpu.cputype) {
> case CPU_TX3927:
> case CPU_TX39XX:
> do_ri(regs);
> return;
> }
> #endif
> compute_return_epc(regs);
> force_sig(SIGILL, current);
> }
>
>
>
>>The second chunck is not necessary. Although the FPU values in thread struct
>>for the new thread are stale, the new program cannot assume the fpu register
>>values and cannot use them without initialization anyway.
>>
>
> At least with the do_cpu handler above, you need this.
> I'm not sure which tests failed without this, but I think it was some of the gcc selftests
> and I think the problem was, that a thread inherited the previous thread's fsr (floating
> point status register).
>
>
>>I don't see such a code in x86's code. Current call to start_thread() is ok
>>with or without this change. I am afraid future use of start_thread() may
>>give undesired effect after we make this change.
>>
>
> Why ?
>
>
>>As for the new fpu context switch code, I wrote a experiemental patch after
>>much discussion with Ralf and Kevin. It is at
>>
>>http://linux.junsun.net/patches/oss.sgi.com/experimental/020304-half-fpu-context-switch
>>
>>I also did some performance study of this patch. Did not get much feedbacks
>>when I asked last time. :-(
>>
>>Jun
>>
>>Carsten Langgaard wrote:
>>
>>
>>>I believe this is the patch that solve the problem.
>>>The lazy fpu stuff has cause so many problems, that we have decided (together
>>>with Ralf) to get rid of it, and do the FPU context save and restore a little bit
>>>differently.
>>>We now have a solution here locally and we are testing it at the moment.
>>>
>>>/Carsten
>>>
>>>
>>>Index: arch/mips/kernel/traps.c
>>>===================================================================
>>>RCS file: /cvs/linux/arch/mips/kernel/traps.c,v
>>>retrieving revision 1.99.2.14
>>>diff -u -r1.99.2.14 traps.c
>>>--- arch/mips/kernel/traps.c 2002/05/28 06:33:13 1.99.2.14
>>>+++ arch/mips/kernel/traps.c 2002/06/18 07:53:47
>>>+ {
>>>+ __enable_fpu();
>>> save_fp(last_task_used_math);
>>>+ /* last_task_used_math loose fpu */
>>>+ ((struct pt_regs *)(__KSTK_TOS(last_task_used_math) -
>>>+ sizeof(struct pt_regs)))->
>>>+ cp0_status &= ~ST0_CU1;
>>>+ }
>>>
>>>Index: include/asm-mips/processor.h
>>>===================================================================
>>>RCS file: /cvs/linux/include/asm-mips/processor.h,v
>>>retrieving revision 1.43.2.2
>>>diff -u -r1.43.2.2 processor.h
>>>--- include/asm-mips/processor.h 2002/05/28 06:11:56 1.43.2.2
>>>+++ include/asm-mips/processor.h 2002/06/18 07:56:58
>>>@@ -215,6 +215,7 @@
>>> regs->cp0_epc = new_pc; \
>>> regs->regs[29] = new_sp; \
>>> current->thread.current_ds = USER_DS; \
>>>+ current->used_math = 0; \
>>> } while (0)
>>>
>>> unsigned long get_wchan(struct task_struct *p);
>>>
>>>
>>>
>>>Jun Sun wrote:
>>>
>>>
>>>
>>>>Carsten Langgaard wrote:
>>>>
>>>>
>>>>
>>>>>This is one of the bugs, among others, we have fixed.
>>>>>I'm not sure, if Ralf have integrated the patches we send him yet.
>>>>>
>>>>>
>>>>>
>>>>Carsten,
>>>>
>>>>Do you remember the cause and the fix? It appears to me the first ctc1
>>>>instruction should trap into kernel and mark current process as fpu owner, and
>>>>should not cause fcr31 corruption.
>>>>
>>>>Or somehow the ctc1 does not trap into kernel?
>>>>
>>>>Jun
>>>>
>>>>
>>>>
>>>>>/Carsten
>>>>>
>>>>>Louis Hamilton wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>We have a customer here testing a 2.4.16 mips kernel on an embedded
>>>>>>Linux RM7000/SR71000 based system who has written a test that they
>>>>>>believe has uncovered a bug in Linux. The FPU control register appears
>>>>>>to not get saved and restored. I've reproduced the problem described
>>>>>>below and find the results consistent with their description. The
>>>>>>problem occurs on both the RM7000 and SR71000 cpus.
>>>>>>
>>>>>>It looks like save_fp_context and restore_fp_context are not being
>>>>>>called since the kernel save-restore logic thinks the process is not
>>>>>>using floating point math. If you do some fp math before calling the
>>>>>>test routine below, it seems to works fine.
>>>>>>
>>>>>>Is this a known caveat? A true bug? Or a contorted corner case
>>>>>>unlikely to be seen under typical end-user usage (see customer's
>>>>>>last paragraph :-) ? If true bug, recommended remedy?
>>>>>>
>>>>>>TIA,
>>>>>>Louis
>>>>>>
>>>>>>Louis Hamilton
>>>>>>hamilton@redhat.com
>>>>>>
>>>>>>------ customer reports the following: ---------
>>>>>>We found a bug in Linux. A ^C (control-C) typed into a shell (or a
>>>>>>running program, it doesn't matter), causes the FCR (floating-point
>>>>>>control register) to be corrupted in another, unrelated process. This
>>>>>>is repeatable behavior.
>>>>>>
>>>>>>This can be reproduced with the following short assembly language
>>>>>>program that loops forever, waiting for the FCR to change.
>>>>>>
>>>>>> .align 2
>>>>>> .globl mips_float_debug_loop
>>>>>>mips_float_debug_loop:
>>>>>> li $9, 0xF000F02F
>>>>>> ctc1 $9, $31 # set FCR to some non-zero value
>>>>>> nop
>>>>>>1: cfc1 $8, $31 # get FCR
>>>>>> beq $8, $9, 1b # spin, waiting for FCR to change
>>>>>> nop
>>>>>> or $2, $0, $8
>>>>>> jr $31
>>>>>> nop
>>>>>>
>>>>>>You can call this function from a short C program and the return value
>>>>>>is the (corrupted) FCR, which turns out to alwyas be: 0x00000002.
>>>>>>
>>>>>>Run the above loop in one window (connected to the board using telnet)
>>>>>>and then in another window (connected to the same board) type ^C.
>>>>>>
>>>>>>I'm surprised this bug hasn't been encountered by other MIPS vendors.
>>>>>>
>>>>>><end>
>>>>>>
>>>>>--
>>>>>_ _ ____ ___ Carsten Langgaard Mailto:carstenl@mips.com
>>>>>|\ /|||___)(___ MIPS Denmark Direct: +45 4486 5527
>>>>>| \/ ||| ____) Lautrupvang 4B Switch: +45 4486 5555
>>>>> TECHNOLOGIES 2750 Ballerup Fax...: +45 4486 5556
>>>>> Denmark http://www.mips.com
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>--
>>>_ _ ____ ___ Carsten Langgaard Mailto:carstenl@mips.com
>>>|\ /|||___)(___ MIPS Denmark Direct: +45 4486 5527
>>>| \/ ||| ____) Lautrupvang 4B Switch: +45 4486 5555
>>> TECHNOLOGIES 2750 Ballerup Fax...: +45 4486 5556
>>> Denmark http://www.mips.com
>>>
>>>
>>>
>>>
>>>
>
> --
> _ _ ____ ___ Carsten Langgaard Mailto:carstenl@mips.com
> |\ /|||___)(___ MIPS Denmark Direct: +45 4486 5527
> | \/ ||| ____) Lautrupvang 4B Switch: +45 4486 5555
> TECHNOLOGIES 2750 Ballerup Fax...: +45 4486 5556
> Denmark http://www.mips.com
>
>
>
>
prev parent reply other threads:[~2002-06-20 18:22 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2002-06-15 23:56 Bug in Linux? fcr31 not being saved-restored Louis Hamilton
2002-06-17 6:20 ` Carsten Langgaard
2002-06-17 19:30 ` Jun Sun
2002-06-18 8:07 ` Carsten Langgaard
2002-06-18 17:46 ` Jun Sun
2002-06-19 6:55 ` Carsten Langgaard
2002-06-20 18:03 ` Jun Sun [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3D12190A.9030607@mvista.com \
--to=jsun@mvista.com \
--cc=carstenl@mips.com \
--cc=hamilton@redhat.com \
--cc=linux-mips@oss.sgi.com \
--cc=sandcraft-elinux-project@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox