* CONFIG_PREEMPT causes corruption of application's FPU stack
@ 2008-05-17 16:31 Jürgen Mell
2008-05-18 15:07 ` Steven Rostedt
0 siblings, 1 reply; 20+ messages in thread
From: Jürgen Mell @ 2008-05-17 16:31 UTC (permalink / raw)
To: linux-kernel
I am running the Einstein@home application (version 4.35,
http://einstein.phys.uwm.edu). This application does lots of computations
mostly with FPU and SSE instructions.
After I started experimenting with real-time optimized kernels the
application began to crash with floating point errors like in the
following message:
APP DEBUG: Application caught signal 8.
FPU status word ffffa0e1, flags: ERR_SUMM STACK_FAULT PRECISION INVALID
Obtained 6 stack frames for this thread.
Use gdb command: 'info line *0xADDRESS' to print corresponding line
numbers.
einstein_S5R3_4.35_i686-pc-linux-gnu[0x8069e7e]
einstein_S5R3_4.35_i686-pc-linux-gnu[0x818d436]
einstein_S5R3_4.35_i686-pc-linux-gnu[0x805db8f]
einstein_S5R3_4.35_i686-pc-linux-gnu[0x806b11c]
/lib/libc.so.6(__libc_start_main+0xe0)[0xb7e14fe0]
einstein_S5R3_4.35_i686-pc-linux-gnu(shmat+0x59)[0x804bda1]
Stack trace of LAL functions in worker thread:
GetSemiCohToplist at line 3177 of
file /home/bema/einsteinathome/HierarchicalSearch/EaH_build_release_einstein_S5R3_4.35/extra_sources/lalapps-CVS/src/pulsar/hough/src2/HierarchicalSearch.c
At lowest level status code = 0, description: NO LAL ERROR REGISTERED
called boinc_finish
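The status word in the message above can be decoded by hand. The following is only an illustrative sketch, assuming the standard x87 status-word bit layout from the Intel manuals. The helper names are hypothetical, and the flag names other than the four printed in the log are labels chosen for this example.

```c
#include <stdint.h>
#include <string.h>

/* x87 status word bits (low 16 bits of the logged value). */
#define X87_SW_INVALID     0x0001u /* IE: invalid operation   */
#define X87_SW_DENORMAL    0x0002u /* DE: denormal operand    */
#define X87_SW_ZERO_DIVIDE 0x0004u /* ZE: divide by zero      */
#define X87_SW_OVERFLOW    0x0008u /* OE: numeric overflow    */
#define X87_SW_UNDERFLOW   0x0010u /* UE: numeric underflow   */
#define X87_SW_PRECISION   0x0020u /* PE: inexact result      */
#define X87_SW_STACK_FAULT 0x0040u /* SF: register stack fault */
#define X87_SW_ERR_SUMM    0x0080u /* ES: error summary       */
#define X87_SW_C1          0x0200u /* condition code C1       */

/* The top-of-stack index lives in bits 11..13. */
static unsigned x87_top(uint32_t sw)
{
    return (sw >> 11) & 7u;
}

/* Write the names of all set exception flags into buf, highest bit first. */
static char *x87_flags(uint32_t sw, char *buf, size_t len)
{
    static const struct { uint32_t bit; const char *name; } tbl[] = {
        { X87_SW_ERR_SUMM,    "ERR_SUMM"    },
        { X87_SW_STACK_FAULT, "STACK_FAULT" },
        { X87_SW_PRECISION,   "PRECISION"   },
        { X87_SW_UNDERFLOW,   "UNDERFLOW"   },
        { X87_SW_OVERFLOW,    "OVERFLOW"    },
        { X87_SW_ZERO_DIVIDE, "ZERO_DIVIDE" },
        { X87_SW_DENORMAL,    "DENORMAL"    },
        { X87_SW_INVALID,     "INVALID"     },
    };
    size_t i;

    if (len == 0)
        return buf;
    buf[0] = '\0';
    for (i = 0; i < sizeof tbl / sizeof tbl[0]; i++) {
        if (!(sw & tbl[i].bit))
            continue;
        if (buf[0] != '\0')
            strncat(buf, " ", len - strlen(buf) - 1);
        strncat(buf, tbl[i].name, len - strlen(buf) - 1);
    }
    return buf;
}

/* With STACK_FAULT set, C1 tells overflow (1) from underflow (0). */
static const char *x87_stack_fault_kind(uint32_t sw)
{
    if (!(sw & X87_SW_STACK_FAULT))
        return "none";
    return (sw & X87_SW_C1) ? "overflow" : "underflow";
}
```

For the logged value 0xffffa0e1 this gives the same four flags as the message and a top-of-stack index of 4. Since C1 is clear while STACK_FAULT is set, the fault reads as a stack underflow rather than an overflow.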
I tracked this down to a single kernel configuration option. If
CONFIG_PREEMPT is set to 'y' the application will start crashing. If
CONFIG_PREEMPT is replaced by CONFIG_PREEMPT_VOLUNTARY, the application
will run without errors.
The problem is reproducible in so far as the error always occurs when
CONFIG_PREEMPT is set, but the time to the first occurrence varies greatly
from some minutes up to more than 10 CPU hours.
I found this error first on an openSUSE kernel 2.6.22.17-0.1-rt. I verified
the problem on the following kernel versions:
openSUSE 2.6.22.17-0.1-default
openSUSE 2.6.23.17-ccj64-rt
kernel.org 2.6.26-rc1
kernel.org 2.6.26-rc2-git5
My CPU is an Intel Core2Duo 6420, running two of the Einstein applications
in 32-bit mode. From a discussion on the Einstein message boards I know
that other users of the application are also affected.
Please let me know if you need any additional information to track this
down.
Jürgen
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: CONFIG_PREEMPT causes corruption of application's FPU stack
2008-05-17 16:31 CONFIG_PREEMPT causes corruption of application's FPU stack Jürgen Mell
@ 2008-05-18 15:07 ` Steven Rostedt
2008-05-18 15:57 ` Jürgen Mell
0 siblings, 1 reply; 20+ messages in thread
From: Steven Rostedt @ 2008-05-18 15:07 UTC (permalink / raw)
To: Jürgen Mell; +Cc: linux-kernel
On Sat, May 17, 2008 at 06:31:08PM +0200, Jürgen Mell wrote:
>
> I tracked this down to a single kernel configuration option. If
> CONFIG_PREEMPT is set to 'y' the application will start crashing. If
> CONFIG_PREEMPT is replaced by CONFIG_PREEMPT_VOLUNTARY, the application
> will run without errors.
>
> The problem is reproducible in so far as the error always occurs when
> CONFIG_PREEMPT is set, but the time to the first occurrence varies greatly
> from some minutes up to more than 10 CPU hours.
>
> I found this error first on an openSUSE kernel 2.6.22.17-0.1-rt. I verified
> the problem on the following kernel versions:
>
> openSUSE 2.6.22.17-0.1-default
> openSUSE 2.6.23.17-ccj64-rt
> kernel.org 2.6.26-rc1
> kernel.org 2.6.26-rc2-git5
So you see this error in both the SuSE RT kernel, *and* mainline
kernel.org?
If you see it in the kernel.org kernel, can you please do a git-bisect
to see which commit caused the problem?
Thanks,
-- Steve
>
> My CPU is an Intel Core2Duo 6420, running two of the Einstein applications
> in 32-bit mode. From a discussion on the Einstein message boards I know
> that other users of the application are also affected.
>
* Re: CONFIG_PREEMPT causes corruption of application's FPU stack
2008-05-18 15:07 ` Steven Rostedt
@ 2008-05-18 15:57 ` Jürgen Mell
0 siblings, 0 replies; 20+ messages in thread
From: Jürgen Mell @ 2008-05-18 15:57 UTC (permalink / raw)
To: Steven Rostedt; +Cc: linux-kernel
On Sunday, 18 May 2008, you wrote:
> On Sat, May 17, 2008 at 06:31:08PM +0200, Jürgen Mell wrote:
> > I tracked this down to a single kernel configuration option. If
> > CONFIG_PREEMPT is set to 'y' the application will start crashing. If
> > CONFIG_PREEMPT is replaced by CONFIG_PREEMPT_VOLUNTARY, the
> > application will run without errors.
> >
> > The problem is reproducible in so far as the error always occurs when
> > CONFIG_PREEMPT is set, but the time to the first occurrence varies
> > greatly from some minutes up to more than 10 CPU hours.
> >
> > I found this error first on an openSUSE kernel 2.6.22.17-0.1-rt. I
> > verified the problem on the following kernel versions:
> >
> > openSUSE 2.6.22.17-0.1-default
> > openSUSE 2.6.23.17-ccj64-rt
> > kernel.org 2.6.26-rc1
> > kernel.org 2.6.26-rc2-git5
>
> So you see this error in both the SuSE RT kernel, *and* mainline
> kernel.org?
Yes, that is correct. The error is present from the 2.6.22 SUSE kernel up
to the most recent mainline kernel. It is also present in the standard
SUSE kernel, if I just modify *only* CONFIG_PREEMPT.
What makes me wonder: I am using the machine in a production environment
for programming, multi-media etc. Why does only the Einstein program catch
the SIGFPE? Normally I would expect other programs to crash, too, if the
problem is present. But up to now this never happened.
> If you see it in the kernel.org kernel, can you please do a git-bisect
> to see which commit caused the problem?
This is a bit of a problem. I do not know whether there was *ever* a kernel
version with CONFIG_PREEMPT and without this problem as I have not tried
any older kernel version yet. I will go back to SUSE 10.2 and try the
2.6.18 kernel that comes with it.
Bye,
Jürgen
* Re: CONFIG_PREEMPT causes corruption of application's FPU stack
@ 2008-05-24 18:52 j.mell
0 siblings, 0 replies; 20+ messages in thread
From: j.mell @ 2008-05-24 18:52 UTC (permalink / raw)
To: Steven Rostedt; +Cc: linux-kernel
On Sunday, 18 May 2008, you wrote:
> > On Sat, May 17, 2008 at 06:31:08PM +0200, Jürgen Mell wrote:
> > > I tracked this down to a single kernel configuration option. If
> > > CONFIG_PREEMPT is set to 'y' the application will start crashing. If
> > > CONFIG_PREEMPT is replaced by CONFIG_PREEMPT_VOLUNTARY, the
> > > application will run without errors.
> > If you see it in the kernel.org kernel, can you please do a git-bisect
> > to see which commit caused the problem?
> This is a bit of a problem. I do not know whether there was *ever* a
> kernel version with CONFIG_PREEMPT and without this problem as I have
> not tried any older kernel version yet. I will go back to SUSE 10.2 and
> try the 2.6.18 kernel that comes with it.
I found now that the problem was introduced somewhere between the
kernel.org kernels 2.6.19.7 and 2.6.20. I will start bisecting now.
Bye,
Jürgen
* Re: CONFIG_PREEMPT causes corruption of application's FPU stack
@ 2008-06-01 9:01 j.mell
2008-06-01 11:40 ` Andi Kleen
` (2 more replies)
0 siblings, 3 replies; 20+ messages in thread
From: j.mell @ 2008-06-01 9:01 UTC (permalink / raw)
To: Steven Rostedt; +Cc: linux-kernel, ak
Hi,
> On Sat, May 17, 2008 at 06:31:08PM +0200, Jürgen Mell wrote:
> I tracked this down to a single kernel configuration option. If
> CONFIG_PREEMPT is set to 'y' the application will start crashing.
> If CONFIG_PREEMPT is replaced by CONFIG_PREEMPT_VOLUNTARY, the
> application will run without errors.
With lots of help from Heinz-Bernd, Bernd and Oliver of the Einstein@Home
project I now found the following:
1. Einstein@home will crash with trap #8 if the problem is present. The
error occurs between some minutes after starting Einstein up to more than
10 hours after starting Einstein. This seems to depend on how many other
applications are used on the system (it takes much more time, if only the
Einstein processes are active on the system).
2. The error was introduced between kernel.org kernels 2.6.19.7 and 2.6.20.
It is still present in 2.6.26-rc4.
3. If I revert the patch
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=acc207616a91a413a50fdd8847a747c4a7324167
in 2.6.20, Einstein does not crash anymore (program was run for more than
30 hours while system was in normal use with programming, multi-media
etc.). Unfortunately git refuses to revert this patch in 2.6.26-rc4.
Now I need some help as I am not an expert in this area. What I assume is
that either the state of the FPU is not always restored (perhaps if the
process is swapped between the two cores?) or it is restored more than
once. Please keep in mind, that I am always running two Einstein processes
simultaneously on my two cores!
I am willing to do further testing of this problem if someone can give me a
hint how to continue.
Bye,
Jürgen
* Re: CONFIG_PREEMPT causes corruption of application's FPU stack
2008-06-01 9:01 j.mell
@ 2008-06-01 11:40 ` Andi Kleen
2008-06-01 16:47 ` Jürgen Mell
2008-06-01 12:12 ` Steven Rostedt
2008-06-01 17:11 ` Simon Holm Thøgersen
2 siblings, 1 reply; 20+ messages in thread
From: Andi Kleen @ 2008-06-01 11:40 UTC (permalink / raw)
To: j.mell; +Cc: Steven Rostedt, linux-kernel, suresh.b.siddha, arjan
j.mell@t-online.de writes:
> or it is restored more than
> once. Please keep in mind, that I am always running two Einstein processes
> simultaneously on my two cores!
> I am willing to do further testing of this problem if someone can give me a
> hint how to continue.
My bet would have been actually on aa283f49276e7d840a40fb01eee6de97eaa7e012
because it does some nasty things (enable interrupts in the middle
of __switch_to).
I looked through the old patchkit and couldn't find any specific
PREEMPT problems. All code it changes should run with preempt_off.
You could verify with sticking WARN_ON_ONCE(preemptible()) into
all the places acc207616a91a413a50fdd8847a747c4a7324167
changes (__unlazy_fpu, math_state_restore) and see if that triggers
anywhere.
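To make the suggested check concrete, here is a user-space model of it. This is a sketch only: the model_* variables are hypothetical stand-ins for kernel state, and the macro is a simplified statement-form imitation of the kernel's WARN_ON_ONCE(), which in the real kernel is an expression. preemptible() follows the CONFIG_PREEMPT definition (preempt count of zero and interrupts enabled).

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for kernel state; not real kernel symbols. */
static int  model_preempt_count;  /* > 0 means preemption is disabled */
static bool model_irqs_disabled;
static int  model_warnings;       /* number of warnings emitted so far */

/* Preemption is possible only if nothing holds it off. */
static bool preemptible(void)
{
    return model_preempt_count == 0 && !model_irqs_disabled;
}

/* Warn at most once per call site, like the kernel macro does. */
#define WARN_ON_ONCE(cond) do {                                        \
        static bool warned_;                                           \
        if ((cond) && !warned_) {                                      \
            warned_ = true;                                            \
            model_warnings++;                                          \
            fprintf(stderr, "WARNING at %s:%d\n", __FILE__, __LINE__); \
        }                                                              \
    } while (0)

/* Stand-in for an instrumented path such as math_state_restore(). */
static void instrumented_fpu_path(void)
{
    WARN_ON_ONCE(preemptible());
    /* ... FPU save/restore work would happen here ... */
}
```

If the instrumented path is only ever entered with preemption off, the counter stays at zero. One entry with preemption enabled is enough to produce exactly one warning for that call site.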
-Andi
* Re: CONFIG_PREEMPT causes corruption of application's FPU stack
2008-06-01 9:01 j.mell
2008-06-01 11:40 ` Andi Kleen
@ 2008-06-01 12:12 ` Steven Rostedt
2008-06-01 17:11 ` Simon Holm Thøgersen
2 siblings, 0 replies; 20+ messages in thread
From: Steven Rostedt @ 2008-06-01 12:12 UTC (permalink / raw)
To: j.mell; +Cc: LKML, Chuck Ebbert, Arjan van de Ven, Andrew Morton, Andi Kleen
[
Fixed Andi's email and added those that Signed off on the
problem commit.
]
On Sun, 1 Jun 2008 j.mell@t-online.de wrote:
>
> Hi,
>
> > On Sat, May 17, 2008 at 06:31:08PM +0200, Jürgen Mell wrote:
> > I tracked this down to a single kernel configuration option. If
> > CONFIG_PREEMPT is set to 'y' the application will start crashing.
> > If CONFIG_PREEMPT is replaced by CONFIG_PREEMPT_VOLUNTARY, the
> > application will run without errors.
>
> With lots of help from Heinz-Bernd, Bernd and Oliver of the Einstein@Home
> project I now found the following:
>
> 1. Einstein@home will crash with trap #8 if the problem is present. The
> error occurs between some minutes after starting Einstein up to more than
> 10 hours after starting Einstein. This seems to depend on how many other
> applications are used on the system (it takes much more time, if only the
> Einstein processes are active on the system).
>
> 2. The error was introduced between kernel.org kernels 2.6.19.7 and 2.6.20.
> It is still present in 2.6.26-rc4
>
> 3. If I revert the patch
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=acc207616a91a413a50fdd8847a747c4a7324167
Hi,
Thanks for bisecting this. I added the committer and those who signed off
on the problem commit. They are the ones who will need to help you solve
this.
-- Steve
>
> in 2.6.20, Einstein does not crash anymore (program was run for more than
> 30 hours while system was in normal use with programming, multi-media
> etc.). Unfortunately git refuses to revert this patch in 2.6.26-rc4.
>
> Now I need some help as I am not an expert in this area. What I assume is
> that either the state of the FPU is not always restored (perhaps if the
> process is swapped between the two cores?) or it is restored more than
> once. Please keep in mind, that I am always running two Einstein processes
> simultaneously on my two cores!
> I am willing to do further testing of this problem if someone can give me a
> hint how to continue.
>
> Bye,
>
> Jürgen
>
* Re: CONFIG_PREEMPT causes corruption of application's FPU stack
2008-06-01 11:40 ` Andi Kleen
@ 2008-06-01 16:47 ` Jürgen Mell
2008-06-02 21:37 ` Suresh Siddha
0 siblings, 1 reply; 20+ messages in thread
From: Jürgen Mell @ 2008-06-01 16:47 UTC (permalink / raw)
To: Andi Kleen; +Cc: Steven Rostedt, linux-kernel, suresh.b.siddha, arjan
On Sunday, 1 June 2008, Andi Kleen wrote:
> j.mell@t-online.de writes:
> > or it is restored more than
> > once. Please keep in mind, that I am always running two Einstein
> > processes simultaneously on my two cores!
> > I am willing to do further testing of this problem if someone can give
> > me a hint how to continue.
>
> My bet would have been actually on
> aa283f49276e7d840a40fb01eee6de97eaa7e012 because it does some nasty
> things (enable interrupts in the middle of __switch_to).
>
> I looked through the old patchkit and couldn't find any specific
> PREEMPT problems. All code it changes should run with preempt_off
>
> You could verify with sticking WARN_ON_ONCE(preemptible()) into
> all the places acc207616a91a413a50fdd8847a747c4a7324167
> changes (__unlazy_fpu, math_state_restore) and see if that triggers
> anywhere.
No, that did not trigger. I put the WARN_ON_ONCE into process.c, traps.c
and also into the __unlazy_fpu macro in i387.h but I got no messages
anywhere (dmesg, /var/log/messages, /var/log/warn) when the trap #8
occurred.
Meanwhile I am also running the tests on another machine to make sure it is
not a hardware-related problem.
Any new ideas are welcome!
Meanwhile I will go back to 2.6.20 and revert
aa283f49276e7d840a40fb01eee6de97eaa7e012. Maybe I got on a wrong track...
Bye,
Jürgen
* Re: CONFIG_PREEMPT causes corruption of application's FPU stack
2008-06-01 9:01 j.mell
2008-06-01 11:40 ` Andi Kleen
2008-06-01 12:12 ` Steven Rostedt
@ 2008-06-01 17:11 ` Simon Holm Thøgersen
2008-06-02 21:31 ` Suresh Siddha
2 siblings, 1 reply; 20+ messages in thread
From: Simon Holm Thøgersen @ 2008-06-01 17:11 UTC (permalink / raw)
To: j.mell; +Cc: Steven Rostedt, linux-kernel, ak
On Sun, 01 Jun 2008 at 11:01 +0200, j.mell@t-online.de wrote:
[...]
>
> 3. If I revert the patch
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=acc207616a91a413a50fdd8847a747c4a7324167
>
> in 2.6.20, Einstein does not crash anymore (program was run for more than
> 30 hours while system was in normal use with programming, multi-media
> etc.). Unfortunately git refuses to revert this patch in 2.6.26-rc4.
[...]
I don't think the bisected commit is responsible for anything, but
triggering a bug elsewhere with your workload. I've been chasing the
same problem I think, but with other symptoms.
I'm triggering the following by running an lguest guest, but I guess the
workload just needs to have the right scheduler intensity to trigger the
bug.
BUG: sleeping function called from invalid context at mm/slab.c:3052
in_atomic():1, irqs_disabled():0
Pid: 4771, comm: lguest Not tainted
2.6.26-rc4-debug-only-preemptible-00103-g1beee8d #3
[<c01146ee>] __might_sleep+0xe4/0xeb
[<c01605d9>] kmem_cache_alloc+0x22/0xb4
[<c0108479>] init_fpu+0xb0/0x14d
[<c0104768>] math_state_restore+0x26/0x5d
[<c01045ab>] device_not_available+0x43/0x48
[<c011007b>] ? handle_vm86_fault+0x213/0x6b8
[<c01029ad>] ? __switch_to+0x23/0x113
[<c02d6c9f>] schedule+0x221/0x2a4
[<c02d716a>] ? schedule_timeout+0x16/0x89
[<c016ed36>] ? __pollwait+0xaa/0xb0
[<c0168358>] ? pipe_poll+0x29/0x89
[<c016e79a>] ? do_select+0x478/0x4cd
[<c016ec8c>] ? __pollwait+0x0/0xb0
[<c0116657>] ? default_wake_function+0x0/0xd
[<c0116657>] ? default_wake_function+0x0/0xd
[<c0116657>] ? default_wake_function+0x0/0xd
[<c0116657>] ? default_wake_function+0x0/0xd
[<c0116657>] ? default_wake_function+0x0/0xd
[<c012f49d>] ? getnstimeofday+0x37/0xb7
[<c012d810>] ? ktime_get_ts+0x40/0x44
[<c012d827>] ? ktime_get+0x13/0x2f
[<c011406f>] ? hrtick_start_fair+0xd5/0x111
[<c01216f6>] ? internal_add_timer+0x8e/0x92
[<c01f27b7>] ? delay_tsc+0x4f/0x68
[<c025b05b>] ? ide_dma_intr+0x0/0x79
[<c01f274d>] ? __delay+0x9/0xb
[<c01f2766>] ? __const_udelay+0x17/0x19
[<c0256df7>] ? ide_execute_command+0x7b/0x95
[<c025a7a2>] ? ide_dma_start+0x24/0x36
[<c0259ebb>] ? do_rw_taskfile+0x1be/0x1cf
[<c025c31a>] ? ide_do_rw_disk+0x19a/0x1dd
[<c01f2766>] ? __const_udelay+0x17/0x19
[<c025554e>] ? ide_do_request+0x838/0x875
[<c0254941>] ? ide_end_request+0x7d/0x99
[<c016e9d8>] ? core_sys_select+0x1e9/0x2c7
[<c0146aa4>] ? find_lock_page+0xa1/0xbb
[<c01489a2>] ? filemap_fault+0x21c/0x382
[<c0146942>] ? unlock_page+0x24/0x27
[<c0151bac>] ? __do_fault+0x314/0x34c
[<c0153a56>] ? handle_mm_fault+0x291/0x65a
[<c016edcb>] ? sys_select+0x8f/0x143
[<c02d9dc8>] ? do_page_fault+0x33c/0x616
[<c0103a67>] ? sysenter_past_esp+0x78/0xb1
[<c02d0000>] ? packet_rcv+0x159/0x2c7
=======================
BUG: sleeping function called from invalid context at mm/slab.c:3052
in_atomic():1, irqs_disabled():0
Pid: 4771, comm: lguest Not tainted
2.6.26-rc4-debug-only-preemptible-00103-g1beee8d #3
[<c01146ee>] __might_sleep+0xe4/0xeb
[<c01605d9>] kmem_cache_alloc+0x22/0xb4
[<c012f49d>] ? getnstimeofday+0x37/0xb7
[<c012f49d>] ? getnstimeofday+0x37/0xb7
[<c0108479>] init_fpu+0xb0/0x14d
[<c0104768>] math_state_restore+0x26/0x5d
[<c01045ab>] device_not_available+0x43/0x48
[<c0110000>] ? handle_vm86_fault+0x198/0x6b8
[<c01029ad>] ? __switch_to+0x23/0x113
[<c02d6c9f>] schedule+0x221/0x2a4
[<c012a95b>] ? prepare_to_wait+0x6c/0x84
[<c0168bdd>] ? pipe_wait+0x53/0x72
[<c012a76f>] ? autoremove_wake_function+0x0/0x30
[<c01692b9>] ? pipe_read+0x29a/0x302
[<c012d6ae>] ? hrtimer_start+0xcc/0xf8
[<c0115efd>] ? hrtick_set+0xcc/0x140
[<c01630b0>] ? do_sync_read+0xba/0xf8
[<c012a76f>] ? autoremove_wake_function+0x0/0x30
[<c01638b8>] ? default_llseek+0xa7/0xb5
[<c0162ff6>] ? do_sync_read+0x0/0xf8
[<c0163795>] ? vfs_read+0x8a/0x106
[<c0163ada>] ? sys_read+0x3b/0x60
[<c0103a67>] ? sysenter_past_esp+0x78/0xb1
=======================
I'm getting the traces with CONFIG_DEBUG_PREEMPT=y and no
CONFIG_DEBUG_SPINLOCK, since otherwise I'd just get
BUG: sleeping function called from invalid context at mm/slab.c:3052
BUG: spinlock recursion on CPU#0, lguest/5428
and nothing further. Using CONFIG_DEBUG_PREEMPT=y,
CONFIG_DEBUG_SPINLOCK=y and booting with "lapic nmi_watchdog=2" I was
also able to grab the following over netconsole though
BUG: sleeping function called from invalid context at mm/slab.c:3052
BUG: spinlock recursion on CPU#0, lguest/5428
BUG: NMI Watchdog detected LOCKUP on CPU0, ip c01f1f76, registers:
Modules linked in: lg tun arc4 ecb crypto_blkcipher cryptomgr
crypto_algapi ieee80211_crypt_wep bridge llc snd_seq snd_seq_device
radeonfb uhci_hcd snd_intel8x0 ehci_hcd snd_ac97_codec ipw2200 usbcore
fb ac97_bus fb_ddc backlight snd_pcm ieee80211 firewire_ohci
firewire_core i2c_algo_bit snd_timer intel_agp cfbcopyarea snd agpgart
i2c_i801 ieee80211_crypt firmware_class cfbimgblt soundcore crc_itu_t
cfbfillrect rtc iTCO_wdt snd_page_alloc i2c_core
Pid: 5428, comm: lguest Not tainted
(2.6.26-rc4debug-locks-extended-00103-g1beee8d #2)
EIP: 0060:[<c01f1f76>] EFLAGS: 00000002 CPU: 0
EIP is at delay_tsc+0x2e/0x68
EAX: 100a8436 EBX: c03acd84 ECX: 00000000 EDX: 0000004c
ESI: 100a83fb EDI: 00000001 EBP: e9651738 ESP: e9651724
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process lguest (pid: 5428, ti=e9650000 task=efbc92c0 task.ti=e964c000)
Stack: 00000001 100a83fb c03acd84 02df69af 00000001 e9651740 c01f1f2d
e9651758
c01ff481 593ac9e8 c03acd84 00000092 000078cf e9651774 c02d4968
00000000
00000002 c01141c4 c03acd84 00000001 e965178c c01141c4 00000001
00007892
Call Trace:
[<c01f1f2d>] ? __delay+0x9/0xb
[<c01ff481>] ? _raw_spin_lock+0x83/0xcb
[<c02d4968>] ? _spin_lock_irqsave+0x46/0x4f
[<c01141c4>] ? __wake_up+0x15/0x3b
[<c01141c4>] ? __wake_up+0x15/0x3b
[<c0119cf0>] ? wake_up_klogd+0x2e/0x31
[<c0119ea0>] ? release_console_sem+0x1ad/0x1b5
[<c011a3d1>] ? vprintk+0x383/0x3c2
[<c0134276>] ? debug_check_no_locks_freed+0x111/0x13c
[<c0134276>] ? debug_check_no_locks_freed+0x111/0x13c
[<c011a425>] ? printk+0x15/0x17
[<c01ff286>] ? spin_bug+0x3f/0x80
[<c01ff432>] ? _raw_spin_lock+0x34/0xcb
[<c02d474e>] ? _spin_lock+0x2a/0x32
[<c011471b>] ? task_rq_lock+0x2a/0x31
[<c011471b>] ? task_rq_lock+0x2a/0x31
[<c0114dc8>] ? try_to_wake_up+0x15/0x81
[<c0114e3f>] ? default_wake_function+0xb/0xd
[<c012985f>] ? autoremove_wake_function+0xe/0x30
[<c0113626>] ? __wake_up_common+0x35/0x5b
[<c01141d7>] ? __wake_up+0x28/0x3b
[<c0119cf0>] ? wake_up_klogd+0x2e/0x31
[<c0119ea0>] ? release_console_sem+0x1ad/0x1b5
[<c011a3d1>] ? vprintk+0x383/0x3c2
[<c011a425>] ? printk+0x15/0x17
[<c0115232>] ? __might_sleep+0x8c/0x108
[<c0160c37>] ? kmem_cache_alloc+0x22/0xd6
[<c0108441>] ? init_fpu+0xb0/0x14d
[<c01047fd>] ? math_state_restore+0x2b/0x67
[<c010463a>] ? device_not_available+0x4e/0x53
[<c02d2b8a>] ? schedule+0x1fe/0x2a1
[<c011007b>] ? handle_vm86_fault+0x31f/0x6b8
[<c01029c1>] ? __switch_to+0x23/0x113
[<c02d2bc6>] ? schedule+0x23a/0x2a1
[<c0134130>] ? trace_hardirqs_on+0xe6/0x11b
[<c02d4c3f>] ? _spin_unlock_irqrestore+0x56/0x6c
[<c02d2db8>] ? schedule_timeout+0x16/0x89
[<c016ee08>] ? __pollwait+0xaa/0xb0
[<c0168604>] ? pipe_poll+0x29/0x89
[<c016e844>] ? do_select+0x4a6/0x4f8
[<c016ed5e>] ? __pollwait+0x0/0xb0
[<c0114e34>] ? default_wake_function+0x0/0xd
[<c0114e34>] ? default_wake_function+0x0/0xd
[<c0114e34>] ? default_wake_function+0x0/0xd
[<c0114e34>] ? default_wake_function+0x0/0xd
[<c0114e34>] ? default_wake_function+0x0/0xd
[<c0113dfa>] ? hrtick_start_fair+0x117/0x11f
[<c01607a8>] ? kfree+0xad/0xc0
[<c0160833>] ? kmem_cache_free+0x78/0x8a
[<c0134130>] ? trace_hardirqs_on+0xe6/0x11b
[<c0133fac>] ? mark_held_locks+0x4e/0x66
[<c01349ae>] ? __lock_acquire+0x488/0xb4c
[<c016eaad>] ? core_sys_select+0x217/0x2f2
[<c02d4d42>] ? _read_unlock_irq+0x36/0x4b
[<c0147fab>] ? unlock_page+0x24/0x27
[<c0152c7f>] ? __do_fault+0x31e/0x356
[<c0153fb3>] ? handle_mm_fault+0x2a0/0x637
[<c016ee9d>] ? sys_select+0x8f/0x143
[<c02d67d3>] ? do_page_fault+0x352/0x631
[<c0103b59>] ? restore_nocheck+0x12/0x15
[<c02d6481>] ? do_page_fault+0x0/0x631
[<c0134130>] ? trace_hardirqs_on+0xe6/0x11b
[<c0103a4f>] ? sysenter_past_esp+0x78/0xd1
=======================
Code: 57 56 53 83 ec 08 89 45 ec b8 01 00 00 00 e8 b8 4b 0e 00 0f 31 0f
1f 40 00 b9 00 00 00 00 89 c6 89 c8 09 f0 89 45 f0 f3 90 0f 31 <0f> 1f
40 00 b9 00 00 00 00 89 c6 89 c8 09 f0 2b 45 f0 3b 45 ec
Let us take a closer look at __switch_to. I'm using test data from yet
another trace (that I actually captured before all the other traces
presented here).
BUG: unable to handle kernel NULL pointer dereference at 000001ff
IP: [<c0102964>] __switch_to+0x19/0xff
That is the only part of the trace I have; nothing more was produced.
Anyway, for that particular kernel I used, __switch_to disassembled to
0xc0102955 <__switch_to+10>: mov 0x4(%eax),%eax
0xc0102958 <__switch_to+13>: testb $0x1,0xc(%eax)
0xc010295c <__switch_to+17>: je 0xc0102995 <__switch_to+74>
0xc010295e <__switch_to+19>: mov 0x26c(%edi),%eax
0xc0102964 <__switch_to+25>: fnsave (%eax)
with corresponding source code
struct task_struct * __switch_to(struct task_struct *prev_p,
				 struct task_struct *next_p)
{
	[...]
	__unlazy_fpu(prev_p);
	[...]
}

static inline void __unlazy_fpu(struct task_struct *tsk)
{
	if (task_thread_info(tsk)->status & TS_USEDFPU) {
		__save_init_fpu(tsk);
		stts();
	} else
		tsk->fpu_counter = 0;
}
where __save_init_fpu(tsk) accesses
tsk->thread.xstate->fxsave
The first mov, testb and je of the disassembly are the if statement, and
the second mov and fnsave are part of __save_init_fpu that access
tsk->thread.xstate->fxsave. %eax (initially) and %edi hold prev_p from
__switch_to.
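The mapping between the listing and the source can be restated as a small user-space model. This is a sketch under assumptions: the struct layout and the saves counter are invented for illustration, TS_USEDFPU is taken to be 0x0001 to match the "testb $0x1" in the listing, and clearing the flag after the save is assumed to be what __save_init_fpu() does.

```c
#include <stdint.h>

#define TS_USEDFPU 0x0001u /* assumed value, matching "testb $0x1" */

/* Hypothetical, stripped-down stand-ins for the kernel structures. */
struct lazy_thread_info {
    uint32_t status;
};

struct lazy_task {
    struct lazy_thread_info ti;
    unsigned fpu_counter; /* consecutive timeslices with FPU use */
    unsigned saves;       /* model only: how often state was saved */
};

/* Model of __unlazy_fpu(): save only if the task touched the FPU. */
static void model_unlazy_fpu(struct lazy_task *tsk)
{
    if (tsk->ti.status & TS_USEDFPU) {
        /* Stands in for __save_init_fpu() followed by stts(). */
        tsk->saves++;
        tsk->ti.status &= ~TS_USEDFPU;
    } else {
        /* No FPU use in this timeslice: restart the streak. */
        tsk->fpu_counter = 0;
    }
}
```

In this model a task that used the FPU gets its state saved once and keeps its fpu_counter. A second switch without FPU use in between takes the else branch and resets the counter.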
I hope that was readable and not too confusing. What follows is partly
guesswork on my side and could be wrong. As far as I can tell,
%eax (initially) and %edi / prev_p must hold 0xffffff93. I'm not sure why
the first mov is valid then, but the if statement must be true by
chance. It is not pure chance though, since prev_p always has the
same value (i.e. I consistently get "dereference at 000001ff"
across multiple runs).
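The 0xffffff93 figure follows from 32-bit wraparound. The sketch below only checks that arithmetic, assuming the faulting address 0x1ff is the one formed by the "mov 0x26c(%edi),%eax" load. It says nothing about which instruction actually faulted, and the function names are hypothetical.

```c
#include <stdint.h>

/* Offset used by "mov 0x26c(%edi),%eax" in the listing. */
#define XSTATE_OFFSET 0x26cu

/* Address read for prev_p->thread.xstate on 32-bit x86.
 * uint32_t arithmetic wraps modulo 2^32, as the hardware does. */
static uint32_t xstate_slot_addr(uint32_t prev_p)
{
    return (uint32_t)(prev_p + XSTATE_OFFSET);
}

/* Inverse: which prev_p makes that read land on fault_addr? */
static uint32_t prev_p_for_fault(uint32_t fault_addr)
{
    return (uint32_t)(fault_addr - XSTATE_OFFSET);
}
```

With these helpers, prev_p_for_fault(0x1ff) is 0xffffff93, and adding the offset back wraps around to 0x1ff.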
So I'd say this is some sort of corruption caused by very fast context
switches between the launcher and switcher (or whatever their names are)
processes in lguest.
That is as far as I have been able to debug this, suggestions are
welcome. I guess I should note that I haven't tried with preemption
disabled, but I don't think there's much point in it.
Simon Holm Thøgersen
* Re: CONFIG_PREEMPT causes corruption of application's FPU stack
2008-06-01 17:11 ` Simon Holm Thøgersen
@ 2008-06-02 21:31 ` Suresh Siddha
2008-06-03 13:23 ` Simon Holm Thøgersen
0 siblings, 1 reply; 20+ messages in thread
From: Suresh Siddha @ 2008-06-02 21:31 UTC (permalink / raw)
To: Simon Holm Thøgersen
Cc: j.mell, Steven Rostedt, linux-kernel, ak, mingo, hpa, tglx, arjan
On Sun, Jun 01, 2008 at 07:11:02PM +0200, Simon Holm Thøgersen wrote:
> On Sun, 01 Jun 2008 at 11:01 +0200, j.mell@t-online.de wrote:
> [...]
> >
> > 3. If I revert the patch
> >
> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=acc207616a91a413a50fdd8847a747c4a7324167
> >
> > in 2.6.20, Einstein does not crash anymore (program was run for more than
> > 30 hours while system was in normal use with programming, multi-media
> > etc.). Unfortunately git refuses to revert this patch in 2.6.26-rc4.
> [...]
>
> I don't think the bisected commit is responsible for anything, but
> triggering a bug elsewhere with your workload. I've been chasing the
> same problem I think, but with other symptoms.
Simon, there seem to be multiple issues here. The fpu corruption seems
to be a different problem from the issue you have encountered.
>
> I'm triggering the following by running an lguest guest, but I guess the
> workload just needs to have the right scheduler intensity to trigger the
> bug.
>
> BUG: sleeping function called from invalid context at mm/slab.c:3052
> in_atomic():1, irqs_disabled():0
> Pid: 4771, comm: lguest Not tainted
> 2.6.26-rc4-debug-only-preemptible-00103-g1beee8d #3
> [<c01146ee>] __might_sleep+0xe4/0xeb
> [<c01605d9>] kmem_cache_alloc+0x22/0xb4
> [<c0108479>] init_fpu+0xb0/0x14d
> [<c0104768>] math_state_restore+0x26/0x5d
> [<c01045ab>] device_not_available+0x43/0x48
> [<c011007b>] ? handle_vm86_fault+0x213/0x6b8
> [<c01029ad>] ? __switch_to+0x23/0x113
> [<c02d6c9f>] schedule+0x221/0x2a4
Simon, Can you please try the appended patch and see if it fixes this
issue? Thanks.
---
[patch] x86: fix blocking call (math_state_restore()) condition in __switch_to
Add tsk_used_math() checks to prevent calling math_state_restore()
which can sleep in the case of !tsk_used_math(). This prevents
making a blocking call in __switch_to().
Apparently "fpu_counter > 5" check is not enough, as in some signal handling
and fork/exec scenarios, fpu_counter > 5 and !tsk_used_math() is possible.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
---
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index f8476df..6d54833 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -649,8 +649,11 @@ struct task_struct * __switch_to(struct task_struct *prev_p, struct task_struct
 	/* If the task has used fpu the last 5 timeslices, just do a full
 	 * restore of the math state immediately to avoid the trap; the
 	 * chances of needing FPU soon are obviously high now
+	 *
+	 * tsk_used_math() checks prevent calling math_state_restore(),
+	 * which can sleep in the case of !tsk_used_math()
 	 */
-	if (next_p->fpu_counter > 5)
+	if (tsk_used_math(next_p) && next_p->fpu_counter > 5)
 		math_state_restore();
 
 	/*
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index e2319f3..ac54ff5 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -658,8 +658,11 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	/* If the task has used fpu the last 5 timeslices, just do a full
 	 * restore of the math state immediately to avoid the trap; the
 	 * chances of needing FPU soon are obviously high now
+	 *
+	 * tsk_used_math() checks prevent calling math_state_restore(),
+	 * which can sleep in the case of !tsk_used_math()
 	 */
-	if (next_p->fpu_counter>5)
+	if (tsk_used_math(next_p) && next_p->fpu_counter > 5)
 		math_state_restore();
 	return prev_p;
 }
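The effect of the changed condition can be seen in a small user-space model. This is a sketch under assumptions: struct sw_task and the helper names are hypothetical, and only the two conditions from the diff plus the patch description's statement that math_state_restore() can sleep when !tsk_used_math() are modelled.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the task fields involved. */
struct sw_task {
    bool used_math;       /* what tsk_used_math() would report */
    unsigned fpu_counter; /* consecutive timeslices with FPU use */
};

/* Eager-restore condition before the patch. */
static bool eager_restore_old(const struct sw_task *next)
{
    return next->fpu_counter > 5;
}

/* Eager-restore condition after the patch. */
static bool eager_restore_new(const struct sw_task *next)
{
    return next->used_math && next->fpu_counter > 5;
}

/* Per the patch description, math_state_restore() can sleep
 * when the task has no FPU state set up yet. */
static bool restore_may_sleep(const struct sw_task *next)
{
    return !next->used_math;
}
```

A task with fpu_counter at 6 and used_math false passes the old condition and would enter the sleeping path inside __switch_to(). The new condition rejects it, while a task with used_math true and the same counter is still restored eagerly.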
* Re: CONFIG_PREEMPT causes corruption of application's FPU stack
2008-06-01 16:47 ` Jürgen Mell
@ 2008-06-02 21:37 ` Suresh Siddha
2008-06-02 22:57 ` Suresh Siddha
0 siblings, 1 reply; 20+ messages in thread
From: Suresh Siddha @ 2008-06-02 21:37 UTC (permalink / raw)
To: Jürgen Mell
Cc: Andi Kleen, Steven Rostedt, linux-kernel, suresh.b.siddha, arjan,
mingo, hpa, tglx
On Sun, Jun 01, 2008 at 06:47:29PM +0200, Jürgen Mell wrote:
> On Sunday, 1 June 2008, Andi Kleen wrote:
> > j.mell@t-online.de writes:
> > > or it is restored more than
> > > once. Please keep in mind, that I am always running two Einstein
> > > processes simultaneously on my two cores!
> > > I am willing to do further testing of this problem if someone can give
> > > me a hint how to continue.
> >
> > My bet would have been actually on
> > aa283f49276e7d840a40fb01eee6de97eaa7e012 because it does some nasty
> > things (enable interrupts in the middle of __switch_to).
> >
> > I looked through the old patchkit and couldn't find any specific
> > PREEMPT problems. All code it changes should run with preempt_off
> >
> > You could verify with sticking WARN_ON_ONCE(preemptible()) into
> > all the places acc207616a91a413a50fdd8847a747c4a7324167
> > changes (__unlazy_fpu, math_state_restore) and see if that triggers
> > anywhere.
>
> No, that did not trigger. I put the WARN_ON_ONCE into process.c, traps.c
> and also into the __unlazy_fpu macro in i387.h but I got no messages
> anywhere (dmesg, /var/log/messages, /var/log/warn) when the trap #8
> occurred.
> Meanwhile I am also running the tests on another machine to make sure it is
> not a hardware-related problem.
>
> Any new ideas are welcome!
>
> Meanwhile I will go back to 2.6.20 and revert
> aa283f49276e7d840a40fb01eee6de97eaa7e012. Maybe I got on a wrong track...
2.6.20 doesn't have the commit 'aa283f49276e7d840a40fb01eee6de97eaa7e012'.
As you are seeing this corruption problem starting from 2.6.20,
at least the recent (2.6.26 series) fpu changes don't play a role in this.
I will try to reproduce your issue.
thanks,
suresh
* Re: CONFIG_PREEMPT causes corruption of application's FPU stack
2008-06-02 21:37 ` Suresh Siddha
@ 2008-06-02 22:57 ` Suresh Siddha
2008-06-03 6:02 ` Jürgen Mell
2008-06-04 7:44 ` Jürgen Mell
0 siblings, 2 replies; 20+ messages in thread
From: Suresh Siddha @ 2008-06-02 22:57 UTC (permalink / raw)
To: j.mell
Cc: Jürgen Mell, Andi Kleen, Steven Rostedt, linux-kernel, arjan,
mingo, hpa, tglx
On Mon, Jun 02, 2008 at 02:37:56PM -0700, Suresh Siddha wrote:
> On Sun, Jun 01, 2008 at 06:47:29PM +0200, Jürgen Mell wrote:
> > On Sunday, 1 June 2008, Andi Kleen wrote:
> > > j.mell@t-online.de writes:
> > > > or it is restored more than
> > > > once. Please keep in mind, that I am always running two Einstein
> > > > processes simultaneously on my two cores!
> > > > I am willing to do further testing of this problem if someone can give
> > > > me a hint how to continue.
> > >
> > > My bet would have been actually on
> > > aa283f49276e7d840a40fb01eee6de97eaa7e012 because it does some nasty
> > > things (enable interrupts in the middle of __switch_to).
> > >
> > > I looked through the old patchkit and couldn't find any specific
> > > PREEMPT problems. All code it changes should run with preempt_off
> > >
> > > You could verify with sticking WARN_ON_ONCE(preemptible()) into
> > > all the places acc207616a91a413a50fdd8847a747c4a7324167
> > > changes (__unlazy_fpu, math_state_restore) and see if that triggers
> > > anywhere.
> >
> > No, that did not trigger. I put the WARN_ON_ONCE into process.c, traps.c
> > and also into the __unlazy_fpu macro in i387.h but I got no messages
> > anywhere (dmesg, /var/log/messages, /var/log/warn) when the trap #8
> > occurred.
> > Meanwhile I am also running the tests on another machine to make sure it is
> > not a hardware-related problem.
> >
> > Any new ideas are welcome!
> >
> > Meanwhile I will go back to 2.6.20 and revert
> > aa283f49276e7d840a40fb01eee6de97eaa7e012. Maybe I got on a wrong track...
>
> 2.6.20 doesn't have the commit 'aa283f49276e7d840a40fb01eee6de97eaa7e012'
>
> As you are seeing this corruption problem starting from 2.6.20,
> at least the recent (2.6.26 series) FPU changes don't play a role in this.
>
> I will try to reproduce your issue.
Jürgen, I think I found the reason for your issue as well.
As you observed, it is probably coming from the commit
acc207616a91a413a50fdd8847a747c4a7324167, "i386: add sleazy FPU optimization".
It's a side effect, though. This is the failing scenario:
Process 'A' is in save_i387_ia32(), just after clear_used_math(),
when it gets an interrupt and is preempted out.
At the next context switch back to process 'A', the kernel tries to restore
the math state proactively and sees fpu_counter > 0 and !tsk_used_math().
This results in init_fpu() being called from __switch_to()'s
math_state_restore(), which corrupts the FPU state that will then be
saved/restored (save_i387_fxsave and restore_i387_fxsave) during the
remaining part of the signal handling after the context switch.
So in short, yes, the problem shows up on preempt-enabled kernels, and the
same patch I sent out 30 minutes ago (appended again) should fix your issue
as well. Can you please test this and check whether my theory is indeed
correct? If it fixes your issue as well, I will re-post the patch with
a new changelog and updated comments in the patch.
thanks,
suresh
---
[patch] x86: fix blocking call (math_state_restore()) condition in __switch_to
Add tsk_used_math() checks to prevent calling math_state_restore()
which can sleep in the case of !tsk_used_math(). This prevents
making a blocking call in __switch_to().
Apparently "fpu_counter > 5" check is not enough, as in some signal handling
and fork/exec scenarios, fpu_counter > 5 and !tsk_used_math() is possible.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
---
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index f8476df..6d54833 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -649,8 +649,11 @@ struct task_struct * __switch_to(struct task_struct *prev_p, struct task_struct
/* If the task has used fpu the last 5 timeslices, just do a full
* restore of the math state immediately to avoid the trap; the
* chances of needing FPU soon are obviously high now
+ *
+ * tsk_used_math() checks prevent calling math_state_restore(),
+ * which can sleep in the case of !tsk_used_math()
*/
- if (next_p->fpu_counter > 5)
+ if (tsk_used_math(next_p) && next_p->fpu_counter > 5)
math_state_restore();
/*
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index e2319f3..ac54ff5 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -658,8 +658,11 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
/* If the task has used fpu the last 5 timeslices, just do a full
* restore of the math state immediately to avoid the trap; the
* chances of needing FPU soon are obviously high now
+ *
+ * tsk_used_math() checks prevent calling math_state_restore(),
+ * which can sleep in the case of !tsk_used_math()
*/
- if (next_p->fpu_counter>5)
+ if (tsk_used_math(next_p) && next_p->fpu_counter > 5)
math_state_restore();
return prev_p;
* Re: CONFIG_PREEMPT causes corruption of application's FPU stack
2008-06-02 22:57 ` Suresh Siddha
@ 2008-06-03 6:02 ` Jürgen Mell
2008-06-04 7:44 ` Jürgen Mell
1 sibling, 0 replies; 20+ messages in thread
From: Jürgen Mell @ 2008-06-03 6:02 UTC (permalink / raw)
To: Suresh Siddha
Cc: Andi Kleen, Steven Rostedt, linux-kernel, arjan, mingo, hpa, tglx
On Tuesday, 3 June 2008, Suresh Siddha wrote:
> On Mon, Jun 02, 2008 at 02:37:56PM -0700, Suresh Siddha wrote:
> > On Sun, Jun 01, 2008 at 06:47:29PM +0200, Jürgen Mell wrote:
> > > On Sunday, 1 June 2008, Andi Kleen wrote:
> > > > j.mell@t-online.de writes:
> > > > > or it is restored more than
> > > > > once. Please keep in mind, that I am always running two Einstein
> > > > > processes simultaneously on my two cores!
> > > > > I am willing to do further testing of this problem if someone
> > > > > can give me a hint how to continue.
> > > >
> > > > My bet would have been actually on
> > > > aa283f49276e7d840a40fb01eee6de97eaa7e012 because it does some
> > > > nasty things (enable interrupts in the middle of __switch_to).
> > > >
> > > > I looked through the old patchkit and couldn't find any specific
> > > > PREEMPT problems. All code it changes should run with preempt_off
> > > >
> > > > You could verify with sticking WARN_ON_ONCE(preemptible()) into
> > > > all the places acc207616a91a413a50fdd8847a747c4a7324167
> > > > changes (__unlazy_fpu, math_state_restore) and see if that
> > > > triggers anywhere.
> > >
> > > No, that did not trigger. I put the WARN_ON_ONCE into process.c,
> > > traps.c and also into the __unlazy_fpu macro in i387.h but I got no
> > > messages anywhere (dmesg, /var/log/messages, /var/log/warn) when the
> > > trap #8 occurred.
> > > Meanwhile I am also running the tests on another machine to make
> > > sure it is not a hardware-related problem.
> > >
> > > Any new ideas are welcome!
> > >
> > > Meanwhile I will go back to 2.6.20 and revert
> > > aa283f49276e7d840a40fb01eee6de97eaa7e012. Maybe I got on a wrong
> > > track...
> >
> > 2.6.20 doesn't have the commit
> > 'aa283f49276e7d840a40fb01eee6de97eaa7e012'
> >
> > As you are seeing this corruption problem starting from 2.6.20,
> > at least the recent (2.6.26 series) FPU changes don't play a role in
> > this.
> >
> > I will try to reproduce your issue.
>
> Jürgen, I think I found the reason for your issue as well.
>
> As you observed, it is probably coming from the commit
> acc207616a91a413a50fdd8847a747c4a7324167, i386: add sleazy FPU
> optimization
>
> It's a side effect, though. This is the failing scenario:
>
> process 'A' in save_i387_ia32() just after clear_used_math()
>
> Got an interrupt and pre-empted out.
>
> At the next context switch to process 'A' again, kernel tries to restore
> the math state proactively and sees a fpu_counter > 0 and
> !tsk_used_math()
>
> This results in init_fpu() during the __switch_to()'s
> math_state_restore()
>
> And resulting in fpu corruption which will be saved/restored
> (save_i387_fxsave and restore_i387_fxsave) during the remaining
> part of the signal handling after the context switch.
>
> So in short, yes, the problem shows up on preempt-enabled kernels, and
> the same patch I sent out 30 mins back (appended again) should fix your
> issue as well. Can you please test this and check if my theory is indeed
> correct. If it fixes your issue as well, then I will re-post the patch
> with a new changelog and updated comments in the patch.
>
> thanks,
> suresh
Many thanks for the patch!
I will test this immediately, but since it takes a while to be sure that
the problem is really gone, it will be some time before I can report back.
Thanks,
Jürgen
* Re: CONFIG_PREEMPT causes corruption of application's FPU stack
2008-06-02 21:31 ` Suresh Siddha
@ 2008-06-03 13:23 ` Simon Holm Thøgersen
2008-06-03 19:43 ` Suresh Siddha
0 siblings, 1 reply; 20+ messages in thread
From: Simon Holm Thøgersen @ 2008-06-03 13:23 UTC (permalink / raw)
To: Suresh Siddha
Cc: j.mell, Steven Rostedt, linux-kernel, ak, mingo, hpa, tglx, arjan,
lguest
[CC lguest <lguest@ozlabs.org>]
On Mon, 2 Jun 2008 at 14:31 -0700, Suresh Siddha wrote:
> On Sun, Jun 01, 2008 at 07:11:02PM +0200, Simon Holm Thøgersen wrote:
> > On Sun, 1 Jun 2008 at 11:01 +0200, j.mell@t-online.de wrote:
> > [...]
> > >
> > > 3. If I revert the patch
> > >
> > > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=acc207616a91a413a50fdd8847a747c4a7324167
> > >
> > > in 2.6.20, Einstein does not crash anymore (program was run for more than
> > > 30 hours while system was in normal use with programming, multi-media
> > > etc.). Unfortunately git refuses to revert this patch in 2.6.26-rc4.
> > [...]
> >
> > I don't think the bisected commit is responsible for anything, but
> > triggering a bug elsewhere with your workload. I've been chasing the
> > same problem I think, but with other symptoms.
>
> Simon, There seems to be multiple issues here. fpu corruption seems
> to be a different problem compared to the issue you have encountered.
>
> >
> > I'm triggering the following by running an lguest guest, but I guess the
> > workload just need to have the right scheduler intensity to trigger the
> > bug.
> >
> > BUG: sleeping function called from invalid context at mm/slab.c:3052
> > in_atomic():1, irqs_disabled():0
> > Pid: 4771, comm: lguest Not tainted
> > 2.6.26-rc4-debug-only-preemptible-00103-g1beee8d #3
> > [<c01146ee>] __might_sleep+0xe4/0xeb
> > [<c01605d9>] kmem_cache_alloc+0x22/0xb4
> > [<c0108479>] init_fpu+0xb0/0x14d
> > [<c0104768>] math_state_restore+0x26/0x5d
> > [<c01045ab>] device_not_available+0x43/0x48
> > [<c011007b>] ? handle_vm86_fault+0x213/0x6b8
> > [<c01029ad>] ? __switch_to+0x23/0x113
> > [<c02d6c9f>] schedule+0x221/0x2a4
>
> Simon, Can you please try the appended patch and see if it fixes this
> issue? Thanks.
> ---
>
> [patch] x86: fix blocking call (math_state_restore()) condition in __switch_to
>
> Add tsk_used_math() checks to prevent calling math_state_restore()
> which can sleep in the case of !tsk_used_math(). This prevents
> making a blocking call in __switch_to().
>
> Apparently "fpu_counter > 5" check is not enough, as in some signal handling
> and fork/exec scenarios, fpu_counter > 5 and !tsk_used_math() is possible.
>
> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
> ---
Hi Suresh,
and thanks for looking into this. The patch did not fix the issue, but
I'm wondering if it is lguest calling math_state_restore in
drivers/lguest/x86/core.c that could be the problem?
Regardless of whether that is the issue, I think you (and everybody
else) will be able to reproduce the issue by running lguest on a 32-bit
system with CONFIG_PREEMPT=y and CONFIG_DEBUG_SPINLOCK_SLEEP=y (I'm
also using CONFIG_DEBUG_PREEMPT=y but I don't think that matters). If you
download http://xm-test.xensource.com/ramdisks/initrd-1.1-i386.img and
run
Documentation/lguest/lguest 64 vmlinux --block=initrd-1.1-i386.img
it will very likely trigger the backtraces I'm getting. Has anyone on
the lguest list tried running with CONFIG_PREEMPT?
Simon
* Re: CONFIG_PREEMPT causes corruption of application's FPU stack
2008-06-03 13:23 ` Simon Holm Thøgersen
@ 2008-06-03 19:43 ` Suresh Siddha
2008-06-03 21:08 ` Simon Holm Thøgersen
0 siblings, 1 reply; 20+ messages in thread
From: Suresh Siddha @ 2008-06-03 19:43 UTC (permalink / raw)
To: Simon Holm Thøgersen
Cc: Suresh Siddha, j.mell, Steven Rostedt, linux-kernel, mingo, hpa,
tglx, arjan, lguest, andi
On Tue, Jun 03, 2008 at 03:23:30PM +0200, Simon Holm Thøgersen wrote:
> > [patch] x86: fix blocking call (math_state_restore()) condition in __switch_to
> >
> > Add tsk_used_math() checks to prevent calling math_state_restore()
> > which can sleep in the case of !tsk_used_math(). This prevents
> > making a blocking call in __switch_to().
> >
> > Apparently "fpu_counter > 5" check is not enough, as in some signal handling
> > and fork/exec scenarios, fpu_counter > 5 and !tsk_used_math() is possible.
> >
> > Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
> > ---
> Hi Suresh,
>
> and thanks for looking into this. The patch did not fix the issue, but
Ok. You are probably running into a different issue (please see below).
Above patch fixes a real issue and I think it should fix the fpu
corruption issue encountered by Jürgen. I will wait for Jürgen's test
results before pushing the above patch.
> I'm wondering if it is lguest calling math_state_restore in
> drivers/lguest/x86/core.c that could be the problem?
I definitely see a problem. In lguest_arch_run_guest(), MSR_IA32_SYSENTER_CS
is not restored before making the math_state_restore() call. As
math_state_restore() can now block, this can cause issues. The appended
patch should fix this, although from your oops report it is not very
clear whether it will help with your issue or not. Can you
please try the patch appended below?
>
> Regardless of whether that is the issue, I think you (and everybody
> else) will be able to reproduce the issue by running lguest on a 32-bit
> system with CONFIG_PREEMPT=y and CONFIG_DEBUG_SPINLOCK_SLEEP=y (I'm
> also using CONFIG_DEBUG_PREEMPT=y but I don't think that matters). If you
> download http://xm-test.xensource.com/ramdisks/initrd-1.1-i386.img and
> run
>
> Documentation/lguest/lguest 64 vmlinux --block=initrd-1.1-i386.img
>
> it will very likely trigger the backtraces I'm getting.
If the below patch doesn't help fix your issue, then I will try to reproduce
it locally here.
thanks,
suresh
---
[patch] x86, lguest: Restore MSR_IA32_SYSENTER_CS before math_state_restore()
Restore MSR_IA32_SYSENTER_CS before making the blocking math_state_restore()
in lguest_arch_run_guest()
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
---
diff --git a/drivers/lguest/x86/core.c b/drivers/lguest/x86/core.c
index 5126d5d..9279ce7 100644
--- a/drivers/lguest/x86/core.c
+++ b/drivers/lguest/x86/core.c
@@ -191,6 +191,10 @@ void lguest_arch_run_guest(struct lg_cpu *cpu)
* was doing. */
run_guest_once(cpu, lguest_pages(raw_smp_processor_id()));
+ /* Restore SYSENTER if it's supposed to be on. */
+ if (boot_cpu_has(X86_FEATURE_SEP))
+ wrmsr(MSR_IA32_SYSENTER_CS, __KERNEL_CS, 0);
+
/* Note that the "regs" structure contains two extra entries which are
* not really registers: a trap number which says what interrupt or
* trap made the switcher code come back, and an error code which some
@@ -203,13 +207,10 @@ void lguest_arch_run_guest(struct lg_cpu *cpu)
if (cpu->regs->trapnum == 14)
cpu->arch.last_pagefault = read_cr2();
/* Similarly, if we took a trap because the Guest used the FPU,
- * we have to restore the FPU it expects to see. */
+ * we have to restore the FPU it expects to see. math_state_restore() can
+ * re-enable interrupts and block. */
else if (cpu->regs->trapnum == 7)
math_state_restore();
-
- /* Restore SYSENTER if it's supposed to be on. */
- if (boot_cpu_has(X86_FEATURE_SEP))
- wrmsr(MSR_IA32_SYSENTER_CS, __KERNEL_CS, 0);
}
/*H:130 Now we've examined the hypercall code; our Guest can make requests.
* Re: CONFIG_PREEMPT causes corruption of application's FPU stack
2008-06-03 19:43 ` Suresh Siddha
@ 2008-06-03 21:08 ` Simon Holm Thøgersen
0 siblings, 0 replies; 20+ messages in thread
From: Simon Holm Thøgersen @ 2008-06-03 21:08 UTC (permalink / raw)
To: Suresh Siddha
Cc: j.mell, Steven Rostedt, linux-kernel, mingo, hpa, tglx, arjan,
lguest, andi
On Tue, 3 Jun 2008 at 12:43 -0700, Suresh Siddha wrote:
> On Tue, Jun 03, 2008 at 03:23:30PM +0200, Simon Holm Thøgersen wrote:
> > > [patch] x86: fix blocking call (math_state_restore()) condition in __switch_to
> > >
> > > Add tsk_used_math() checks to prevent calling math_state_restore()
> > > which can sleep in the case of !tsk_used_math(). This prevents
> > > making a blocking call in __switch_to().
> > >
> > > Apparently "fpu_counter > 5" check is not enough, as in some signal handling
> > > and fork/exec scenarios, fpu_counter > 5 and !tsk_used_math() is possible.
> > >
> > > Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
> > > ---
> > Hi Suresh,
> >
> > and thanks for looking into this. The patch did not fix the issue, but
>
> Ok. You are probably running into a different issue (please see below).
> Above patch fixes a real issue and I think it should fix the fpu
> corruption issue encountered by Jürgen. I will wait for Jürgen's test
> results before pushing the above patch.
>
> > I'm wondering if it is lguest calling math_state_restore in
> > drivers/lguest/x86/core.c that could be the problem?
>
> I definitely see a problem. In lguest_arch_run_guest(), MSR_IA32_SYSENTER_CS
> is not restored before making the math_state_restore() call. As
> math_state_restore() can now block, this can cause issues. The appended
> patch should fix this, although from your oops report it is not very
> clear whether it will help with your issue or not. Can you
> please try the patch appended below?
>
> >
> > Regardless of whether that is the issue, I think you (and everybody
> > else) will be able to reproduce the issue by running lguest on a 32-bit
> > system with CONFIG_PREEMPT=y and CONFIG_DEBUG_SPINLOCK_SLEEP=y (I'm
> > also using CONFIG_DEBUG_PREEMPT=y but I don't think that matters). If you
> > download http://xm-test.xensource.com/ramdisks/initrd-1.1-i386.img and
> > run
> >
> > Documentation/lguest/lguest 64 vmlinux --block=initrd-1.1-i386.img
> >
> > it will very likely trigger the backtraces I'm getting.
>
> If the below patch doesn't help fix your issue, then I will try to reproduce
> it locally here.
>
It didn't, I'm afraid. I had both patches applied, and was able to
reproduce the trace fairly easily. The patches might have made the issue
slightly more difficult to provoke, but I'm not sure.
Simon
* Re: CONFIG_PREEMPT causes corruption of application's FPU stack
2008-06-02 22:57 ` Suresh Siddha
2008-06-03 6:02 ` Jürgen Mell
@ 2008-06-04 7:44 ` Jürgen Mell
2008-06-04 10:53 ` Ingo Molnar
1 sibling, 1 reply; 20+ messages in thread
From: Jürgen Mell @ 2008-06-04 7:44 UTC (permalink / raw)
To: Suresh Siddha
Cc: Andi Kleen, Steven Rostedt, linux-kernel, arjan, mingo, hpa, tglx,
Simon Holm Thøgersen
On Tuesday, 3rd June 2008, Suresh Siddha wrote:
> On Mon, Jun 02, 2008 at 02:37:56PM -0700, Suresh Siddha wrote:
> > On Sun, Jun 01, 2008 at 06:47:29PM +0200, Jürgen Mell wrote:
> > > On Sunday, 1 June 2008, Andi Kleen wrote:
> > > > j.mell@t-online.de writes:
> > > > > or it is restored more than
> > > > > once. Please keep in mind, that I am always running two Einstein
> > > > > processes simultaneously on my two cores!
> > > > > I am willing to do further testing of this problem if someone
> > > > > can give me a hint how to continue.
> > > >
> > > > My bet would have been actually on
> > > > aa283f49276e7d840a40fb01eee6de97eaa7e012 because it does some
> > > > nasty things (enable interrupts in the middle of __switch_to).
> > > >
> > > > I looked through the old patchkit and couldn't find any specific
> > > > PREEMPT problems. All code it changes should run with preempt_off
> > > >
> > > > You could verify with sticking WARN_ON_ONCE(preemptible()) into
> > > > all the places acc207616a91a413a50fdd8847a747c4a7324167
> > > > changes (__unlazy_fpu, math_state_restore) and see if that
> > > > triggers anywhere.
> > >
> > > No, that did not trigger. I put the WARN_ON_ONCE into process.c,
> > > traps.c and also into the __unlazy_fpu macro in i387.h but I got no
> > > messages anywhere (dmesg, /var/log/messages, /var/log/warn) when the
> > > trap #8 occurred.
> > > Meanwhile I am also running the tests on another machine to make
> > > sure it is not a hardware-related problem.
> > >
> > > Any new ideas are welcome!
> > >
> > > Meanwhile I will go back to 2.6.20 and revert
> > > aa283f49276e7d840a40fb01eee6de97eaa7e012. Maybe I got on a wrong
> > > track...
> >
> > 2.6.20 doesn't have the commit
> > 'aa283f49276e7d840a40fb01eee6de97eaa7e012'
> >
> > As you are seeing this corruption problem starting from 2.6.20,
> > at least the recent (2.6.26 series) FPU changes don't play a role in
> > this.
> >
> > I will try to reproduce your issue.
>
> Jürgen, I think I found the reason for your issue as well.
>
> As you observed, it is probably coming from the commit
> acc207616a91a413a50fdd8847a747c4a7324167, i386: add sleazy FPU
> optimization
>
> It's a side effect, though. This is the failing scenario:
>
> process 'A' in save_i387_ia32() just after clear_used_math()
>
> Got an interrupt and pre-empted out.
>
> At the next context switch to process 'A' again, kernel tries to restore
> the math state proactively and sees a fpu_counter > 0 and
> !tsk_used_math()
>
> This results in init_fpu() during the __switch_to()'s
> math_state_restore()
>
> And resulting in fpu corruption which will be saved/restored
> (save_i387_fxsave and restore_i387_fxsave) during the remaining
> part of the signal handling after the context switch.
>
> So in short, yes, the problem shows up on preempt-enabled kernels, and
> the same patch I sent out 30 mins back (appended again) should fix your
> issue as well. Can you please test this and check if my theory is indeed
> correct. If it fixes your issue as well, then I will re-post the patch
> with a new changelog and updated comments in the patch.
>
I have applied your patch to both an openSUSE 2.6.22.17 kernel and a
2.6.26-rc4 kernel.org kernel and run the test with Einstein@home on two
different machines. One machine has been running for 24 hours now, the
other for 18 hours.
During this time there were no faults on either machine.
As it never before took more than 12 hours until the first appearance of
the problem, I think your patch fixed it. Very good work!
I will continue running the test, but I believe we can call this fixed.
Thank you again!
Jürgen
* Re: CONFIG_PREEMPT causes corruption of application's FPU stack
2008-06-04 7:44 ` Jürgen Mell
@ 2008-06-04 10:53 ` Ingo Molnar
2008-06-04 12:55 ` Steven Rostedt
0 siblings, 1 reply; 20+ messages in thread
From: Ingo Molnar @ 2008-06-04 10:53 UTC (permalink / raw)
To: Jürgen Mell
Cc: Suresh Siddha, Andi Kleen, Steven Rostedt, linux-kernel, arjan,
hpa, tglx, Simon Holm Thøgersen
* Jürgen Mell <j.mell@t-online.de> wrote:
> > Jürgen, I think I found the reason for your issue as well.
> >
> > As you observed, it is probably coming from the commit
> > acc207616a91a413a50fdd8847a747c4a7324167, i386: add sleazy FPU
> > optimization
> >
> > It's a side effect, though. This is the failing scenario:
> >
> > process 'A' in save_i387_ia32() just after clear_used_math()
> >
> > Got an interrupt and pre-empted out.
> >
> > At the next context switch to process 'A' again, kernel tries to restore
> > the math state proactively and sees a fpu_counter > 0 and
> > !tsk_used_math()
> >
> > This results in init_fpu() during the __switch_to()'s
> > math_state_restore()
> >
> > And resulting in fpu corruption which will be saved/restored
> > (save_i387_fxsave and restore_i387_fxsave) during the remaining
> > part of the signal handling after the context switch.
> >
> > So in short, yes, the problem shows up on preempt-enabled kernels, and
> > the same patch I sent out 30 mins back (appended again) should fix your
> > issue as well. Can you please test this and check if my theory is indeed
> > correct. If it fixes your issue as well, then I will re-post the patch
> > with a new changelog and updated comments in the patch.
> >
>
> I have applied your patch to both an openSUSE 2.6.22.17 kernel and a
> 2.6.26-rc4 kernel.org kernel and run the test with Einstein@home on
> two different machines. One machine has been running for 24 hours now,
> the other for 18 hours.
>
> During this time there were no faults on either machine.
>
> As it never before took more than 12 hours until the first appearance
> of the problem, I think your patch fixed it. Very good work!
>
> I will continue running the test, but I believe we can call this
> fixed.
>
> Thank you again!
fix applied to tip/x86/urgent. Thanks everyone, nice find!
Ingo
* Re: CONFIG_PREEMPT causes corruption of application's FPU stack
2008-06-04 10:53 ` Ingo Molnar
@ 2008-06-04 12:55 ` Steven Rostedt
2008-06-04 13:02 ` Ingo Molnar
0 siblings, 1 reply; 20+ messages in thread
From: Steven Rostedt @ 2008-06-04 12:55 UTC (permalink / raw)
To: Ingo Molnar
Cc: Jürgen Mell, Suresh Siddha, Andi Kleen, linux-kernel, arjan,
hpa, tglx, Simon Holm Thøgersen
On Wed, 4 Jun 2008, Ingo Molnar wrote:
> * Jürgen Mell <j.mell@t-online.de> wrote:
> >
> > Thank you again!
No, thank YOU. What you gave was excellent feedback.
>
> fix applied to tip/x86/urgent. Thanks everyone, nice find!
>
Ingo, should this be forwarded to the stable branch?
-- Steve
* Re: CONFIG_PREEMPT causes corruption of application's FPU stack
2008-06-04 12:55 ` Steven Rostedt
@ 2008-06-04 13:02 ` Ingo Molnar
0 siblings, 0 replies; 20+ messages in thread
From: Ingo Molnar @ 2008-06-04 13:02 UTC (permalink / raw)
To: Steven Rostedt
Cc: Jürgen Mell, Suresh Siddha, Andi Kleen, linux-kernel, arjan,
hpa, tglx, Simon Holm Thøgersen, stable
* Steven Rostedt <rostedt@goodmis.org> wrote:
> > fix applied to tip/x86/urgent. Thanks everyone, nice find!
>
> Ingo, should this be forwarded to the stable branch?
yes indeed. Cc:-ed.
Ingo