* BUG on 2.6.20-rc1 when using gdb
@ 2006-12-18 1:55 Andrew J. Barr
2006-12-20 0:42 ` Andrew Morton
0 siblings, 1 reply; 12+ messages in thread
From: Andrew J. Barr @ 2006-12-18 1:55 UTC
To: linux-kernel
When I was using gdb to debug xchat-gnome, I got a kernel BUG and stack
trace as the program was running (e.g. I had typed 'run' in gdb):
WARNING at kernel/softirq.c:137 local_bh_enable()
[<c0103cd6>] dump_trace+0x68/0x1d9
[<c0103e5f>] show_trace_log_lvl+0x18/0x2c
[<c01044d3>] show_trace+0xf/0x11
[<c010455e>] dump_stack+0x12/0x14
[<c011cc7d>] local_bh_enable+0x44/0x94
[<c02871b9>] unix_release_sock+0x6e/0x1fe
[<c02887eb>] unix_stream_connect+0x3b4/0x3cf
[<c0232dee>] sys_connect+0x82/0xad
[<c0233641>] sys_socketcall+0xac/0x261
[<c0102d38>] syscall_call+0x7/0xb
[<b7f70822>] 0xb7f70822
=======================
------------[ cut here ]------------
kernel BUG at fs/buffer.c:1235!
invalid opcode: 0000 [#1]
PREEMPT
Modules linked in: binfmt_misc rfcomm l2cap i915 drm bluetooth nfs nfsd
exportfs lockd nfs_acl sunrpc nvram uinput ipv6 ppdev lp button ac
battery dm_crypt dm_snapshot dm_mirror dm_mod fuse cpufreq_conservative
cpufreq_ondemand cpufreq_performance cpufreq_powersave
speedstep_centrino freq_table ibm_acpi loop snd_intel8x0m snd_pcm_oss
snd_mixer_oss snd_intel8x0 snd_ac97_codec pcmcia ac97_bus irtty_sir
sir_dev ipw2200 snd_pcm snd_timer irda ieee80211 ieee80211_crypt
crc_ccitt rtc parport_pc parport 8250_pnp snd soundcore 8250_pci 8250
serial_core firmware_class i2c_i801 yenta_socket rsrc_nonstatic
pcmcia_core snd_page_alloc i2c_core intel_agp agpgart evdev tsdev joydev
ext3 jbd mbcache ide_cd cdrom ide_disk ide_generic e100 mii generic piix
ide_core ehci_hcd uhci_hcd usbcore
CPU: 0
EIP: 0060:[<c0179266>] Not tainted VLI
EFLAGS: 00010046 (2.6.20-rc1 #1)
EIP is at __find_get_block+0x1c/0x16f
eax: 00000086 ebx: 00000000 ecx: 00000000 edx: 0088a800
esi: 0088a800 edi: 00000000 ebp: dfffd040 esp: cad2dd30
ds: 007b es: 007b ss: 0068
Process xchat-gnome (pid: 4322, ti=cad2c000 task=d0cd3ab0
task.ti=cad2c000)
Stack: cad2dd58 c02caa0b 00000002 0000000e 0000000b 00000001 e8836580
0088a800
00000000 00000000 e8836610 00000000 c01793dc 00001000 c03ab3e0
f3cadd80
00000086 c90d41b0 0088a800 00000000 dfffd040 00008000 00000000
00000002
Call Trace:
[<c01793dc>] __getblk+0x23/0x268
[<f040d4c6>] ext3_getblk+0x10b/0x244 [ext3]
[<f040e364>] ext3_bread+0x19/0x70 [ext3]
[<f04106f3>] dx_probe+0x43/0x2c9 [ext3]
[<f04119b3>] ext3_htree_fill_tree+0x99/0x1ba [ext3]
[<f040ab77>] ext3_readdir+0x1d4/0x5ed [ext3]
[<c0167b29>] vfs_readdir+0x63/0x8d
[<c0167bb6>] sys_getdents64+0x63/0xa5
[<c0102d38>] syscall_call+0x7/0xb
[<b7f70822>] 0xb7f70822
=======================
Code: 8b 40 08 a8 08 74 05 e8 02 2f 11 00 5b 5e c3 55 89 c5 57 89 cf 56
89 d6 53 83 ec 20 9c 58 90 8d b4 26 00 00 00 00 f6 c4 02 75 04 <0f> 0b
eb fe 89 e0 25 00 e0 ff ff ff 40 14 31 c9 8b 1c 8d a0 74
EIP: [<c0179266>] __find_get_block+0x1c/0x16f SS:ESP 0068:cad2dd30
This happens on 2.6.20-rc1 but not 2.6.19.
Andrew Barr
andrew.james.barr@gmail.com

* Re: BUG on 2.6.20-rc1 when using gdb
2006-12-18 1:55 BUG on 2.6.20-rc1 when using gdb Andrew J. Barr
@ 2006-12-20 0:42 ` Andrew Morton
2006-12-20 0:53 ` Dave Airlie
2006-12-20 11:21 ` Jeremy Fitzhardinge
0 siblings, 2 replies; 12+ messages in thread
From: Andrew Morton @ 2006-12-20 0:42 UTC
To: Andrew J. Barr
Cc: linux-kernel, Jan Beulich, Andi Kleen, Eric W. Biederman

On Sun, 17 Dec 2006 20:55:18 -0500 "Andrew J. Barr" <andrew.james.barr@gmail.com> wrote:

> When I was using gdb to debug xchat-gnome, I got a kernel BUG and stack
> trace as the program was running (e.g. I had typed 'run' in gdb):
>
> WARNING at kernel/softirq.c:137 local_bh_enable()
> [...]
> kernel BUG at fs/buffer.c:1235!
> [...]
> EFLAGS: 00010046 (2.6.20-rc1 #1)
> EIP is at __find_get_block+0x1c/0x16f
> [...]
> This happens on 2.6.20-rc1 but not 2.6.19.

And it's repeatable, yes?

And you're sure that use of gdb triggers it?

Something is forgetting to reenable local interrupts.
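
The register dump seems to support the "interrupts left disabled" reading. A minimal sketch, assuming the standard x86 layout in which bit 9 of EFLAGS is the interrupt-enable flag (IF); the helper names below are made up for illustration and are not kernel APIs. By my reading, the bytes before the faulting `<0f> 0b` in the Code: line decode to `pushf; pop %eax; ... test $0x2,%ah; jne; ud2`, which would match a BUG_ON(irqs_disabled()) style check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 9 of EFLAGS is the interrupt-enable flag on x86 (local name). */
#define EFLAGS_IF_MASK 0x00000200u

/* True when a saved EFLAGS value has local interrupts enabled. */
static inline bool eflags_irqs_enabled(uint32_t eflags)
{
	return (eflags & EFLAGS_IF_MASK) != 0;
}

/*
 * Mirrors "test $0x2,%ah": AH holds bits 8..15 of EAX, so bit 1 of AH
 * is bit 9 of the value that pushf/pop left in EAX.
 */
static inline bool ah_if_bit_set(uint32_t eax)
{
	uint8_t ah = (uint8_t)((eax >> 8) & 0xffu);

	return (ah & 0x02u) != 0;
}
```

Fed the values from the oops, `eflags_irqs_enabled(0x00010046)` and `ah_if_bit_set(0x00000086)` both come out false, i.e. IF was clear when __find_get_block ran its check.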

* Re: BUG on 2.6.20-rc1 when using gdb
2006-12-20 0:42 ` Andrew Morton
@ 2006-12-20 0:53 ` Dave Airlie
2006-12-20 0:54 ` Dave Airlie
2006-12-20 11:21 ` Jeremy Fitzhardinge
1 sibling, 1 reply; 12+ messages in thread
From: Dave Airlie @ 2006-12-20 0:53 UTC
To: Andrew Morton
Cc: Andrew J. Barr, linux-kernel, Jan Beulich, Andi Kleen, Eric W. Biederman

On 12/20/06, Andrew Morton <akpm@osdl.org> wrote:
> > When I was using gdb to debug xchat-gnome, I got a kernel BUG and stack
> > trace as the program was running (e.g. I had typed 'run' in gdb):
> >
> > [...]
> > kernel BUG at fs/buffer.c:1235!
> > [...]
> > EIP is at __find_get_block+0x1c/0x16f
> > [...]
> > This happens on 2.6.20-rc1 but not 2.6.19.
>
> And it's repeatable, yes?
>
> And you're sure that use of gdb triggers it?
>
> Something is forgetting to reenable local interrupts.

I've managed to get nearly the same thing on a test system I built
yesterday, my app when running under gdb would also blow up in
__find_get_block.

I was using close to Linus's git head...

Dave.

* Re: BUG on 2.6.20-rc1 when using gdb
2006-12-20 0:53 ` Dave Airlie
@ 2006-12-20 0:54 ` Dave Airlie
0 siblings, 0 replies; 12+ messages in thread
From: Dave Airlie @ 2006-12-20 0:54 UTC
To: Andrew Morton
Cc: Andrew J. Barr, linux-kernel, Jan Beulich, Andi Kleen, Eric W. Biederman

On 12/20/06, Dave Airlie <airlied@gmail.com> wrote:
> On 12/20/06, Andrew Morton <akpm@osdl.org> wrote:
> > [...]
> > Something is forgetting to reenable local interrupts.
>
> I've managed to get nearly the same thing on a test system I built
> yesterday, my app when running under gdb would also blow up in
> __find_get_block.
>
> I was using close to Linus's git head...

And of course it was on a fresh 32-bit x86 with FC6 on it.

Dave.

* Re: BUG on 2.6.20-rc1 when using gdb
2006-12-20 0:42 ` Andrew Morton
2006-12-20 0:53 ` Dave Airlie
@ 2006-12-20 11:21 ` Jeremy Fitzhardinge
2006-12-20 18:35 ` [-mm patch] ptrace: Fix EFL_OFFSET value according to i386 pda changes (was Re: BUG on 2.6.20-rc1 when using gdb) Frederik Deweerdt
1 sibling, 1 reply; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2006-12-20 11:21 UTC
To: Andrew Morton
Cc: Andrew J. Barr, linux-kernel, Jan Beulich, Andi Kleen, Eric W. Biederman, walt

Andrew Morton wrote:
> On Sun, 17 Dec 2006 20:55:18 -0500
> "Andrew J. Barr" <andrew.james.barr@gmail.com> wrote:
>
>> When I was using gdb to debug xchat-gnome, I got a kernel BUG and stack
>> trace as the program was running (e.g. I had typed 'run' in gdb):
>>
>> [...]
>> kernel BUG at fs/buffer.c:1235!
>> [...]
>> EIP is at __find_get_block+0x1c/0x16f
>> [...]
>> This happens on 2.6.20-rc1 but not 2.6.19.
>
> And it's repeatable, yes?
>
> And you're sure that use of gdb triggers it?
>
> Something is forgetting to reenable local interrupts.

"walt" <w41ter@gmail.com> reported a similar problem which he bisected
down to the PDA changeset which touches ptrace
(66e10a44d724f1464b5e8b5a3eae1e2cbbc2cca6). I haven't managed to repro
the problem, but I guess there's something nasty going on in ptrace -
maybe it's screwing up eflags on the stack or something. Need to
double-check all the conversions from kernel<->usermode registers. Hm,
wonder if it's fixed with the %gs->%fs conversion patch applied?

J

* [-mm patch] ptrace: Fix EFL_OFFSET value according to i386 pda changes (was Re: BUG on 2.6.20-rc1 when using gdb)
2006-12-20 11:21 ` Jeremy Fitzhardinge
@ 2006-12-20 18:35 ` Frederik Deweerdt
2006-12-20 19:02 ` Andrew J. Barr
2006-12-20 19:21 ` Jeremy Fitzhardinge
0 siblings, 2 replies; 12+ messages in thread
From: Frederik Deweerdt @ 2006-12-20 18:35 UTC
To: Jeremy Fitzhardinge
Cc: Andrew Morton, Andrew J. Barr, linux-kernel, Jan Beulich, Andi Kleen, Eric W. Biederman, walt

On Wed, Dec 20, 2006 at 03:21:53AM -0800, Jeremy Fitzhardinge wrote:
> "walt" <w41ter@gmail.com> reported a similar problem which he bisected
> down to the PDA changeset which touches ptrace
> (66e10a44d724f1464b5e8b5a3eae1e2cbbc2cca6). I haven't managed to repro
> the problem, but I guess there's something nasty going on in ptrace -
> maybe it's screwing up eflags on the stack or something. Need to
> double-check all the conversions from kernel<->usermode registers. Hm,
> wonder if it's fixed with the %gs->%fs conversion patch applied?
>
Hi Jeremy,

Same problems here with 2.6.20-rc1-mm1 (i.e. with the %gs->%fs patch).
It seems to me that the problem comes from the EFL_OFFSET no longer
being accurate.
The following patch fixes the problem for me.

Regards,
Frederik

Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>

diff --git a/arch/i386/kernel/ptrace.c b/arch/i386/kernel/ptrace.c
index 7f7d830..00d8a5a 100644
--- a/arch/i386/kernel/ptrace.c
+++ b/arch/i386/kernel/ptrace.c
@@ -45,7 +45,7 @@
 /*
  * Offset of eflags on child stack..
  */
-#define EFL_OFFSET ((EFL-2)*4-sizeof(struct pt_regs))
+#define EFL_OFFSET ((EFL-1)*4-sizeof(struct pt_regs))
 
 static inline struct pt_regs *get_child_regs(struct task_struct *task)
 {
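
Why the one-slot change matters can be sketched in userspace. This is an illustration under assumptions, not the kernel's definitions: `mock_pt_regs` is a hypothetical stand-in for the post-PDA struct pt_regs, assumed to save one extra segment register (named xgs here) between xes and orig_eax, and the enum assumes the usual i386 ptrace register numbering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-visible ptrace register indices (i386 numbering, assumed). */
enum {
	EBX, ECX, EDX, ESI, EDI, EBP, EAX, DS, ES, FS, GS,
	ORIG_EAX, EIP, CS, EFL, UESP, SS
};

/* Hypothetical stand-in for struct pt_regs after the PDA change. */
struct mock_pt_regs {
	uint32_t ebx, ecx, edx, esi, edi, ebp, eax;
	uint32_t xds, xes;
	uint32_t xgs;		/* the newly saved segment slot (assumed) */
	uint32_t orig_eax, eip, xcs, eflags, esp, xss;
};

/* Byte offset into the frame given by the old and the patched formula. */
static inline size_t efl_slot_old(void) { return (size_t)(EFL - 2) * 4; }
static inline size_t efl_slot_new(void) { return (size_t)(EFL - 1) * 4; }

/* Under the assumed layout, only the patched formula lands on eflags. */
_Static_assert(offsetof(struct mock_pt_regs, eflags) == (size_t)(EFL - 1) * 4,
	       "patched formula should point at eflags");
```

With this layout the old expression resolves to the xcs slot, one word before eflags, so a ptrace write meant for eflags would hit the wrong saved register.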

* Re: [-mm patch] ptrace: Fix EFL_OFFSET value according to i386 pda changes (was Re: BUG on 2.6.20-rc1 when using gdb)
2006-12-20 18:35 ` [-mm patch] ptrace: Fix EFL_OFFSET value according to i386 pda changes (was Re: BUG on 2.6.20-rc1 when using gdb) Frederik Deweerdt
@ 2006-12-20 19:02 ` Andrew J. Barr
2006-12-20 19:21 ` Jeremy Fitzhardinge
1 sibling, 0 replies; 12+ messages in thread
From: Andrew J. Barr @ 2006-12-20 19:02 UTC
To: Frederik Deweerdt
Cc: Jeremy Fitzhardinge, Andrew Morton, linux-kernel, Jan Beulich, Andi Kleen, Eric W. Biederman, walt

On Wed, 2006-12-20 at 18:35 +0000, Frederik Deweerdt wrote:
> Same problems here with 2.6.20-rc1-mm1 (i.e. with the %gs->%fs patch).
> It seems to me that the problem comes from the EFL_OFFSET no longer
> being accurate.
> The following patch fixes the problem for me.

Me too. Thanks.

Andrew

> Regards,
> Frederik
>
> Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>
>
> [...]
> -#define EFL_OFFSET ((EFL-2)*4-sizeof(struct pt_regs))
> +#define EFL_OFFSET ((EFL-1)*4-sizeof(struct pt_regs))
> [...]

* Re: [-mm patch] ptrace: Fix EFL_OFFSET value according to i386 pda changes (was Re: BUG on 2.6.20-rc1 when using gdb)
2006-12-20 18:35 ` [-mm patch] ptrace: Fix EFL_OFFSET value according to i386 pda changes (was Re: BUG on 2.6.20-rc1 when using gdb) Frederik Deweerdt
2006-12-20 19:02 ` Andrew J. Barr
@ 2006-12-20 19:21 ` Jeremy Fitzhardinge
2006-12-20 20:37 ` walt
2006-12-20 20:42 ` Frederik Deweerdt
1 sibling, 2 replies; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2006-12-20 19:21 UTC
To: Frederik Deweerdt
Cc: Andrew Morton, Andrew J. Barr, linux-kernel, Jan Beulich, Andi Kleen, Eric W. Biederman, walt

Frederik Deweerdt wrote:
> Same problems here with 2.6.20-rc1-mm1 (i.e. with the %gs->%fs patch).
> It seems to me that the problem comes from the EFL_OFFSET no longer
> being accurate.
> The following patch fixes the problem for me.
>

Thanks Frederik; that's exactly the kind of thing I thought it might
be. I wonder if there's some way we can make this more robust
though... Does this work for you? I did a slightly larger cleanup
which should make it less fragile and more comprehensible.

J

diff -r e775f6e42258 arch/i386/kernel/ptrace.c
--- a/arch/i386/kernel/ptrace.c	Tue Dec 19 10:32:40 2006 -0800
+++ b/arch/i386/kernel/ptrace.c	Wed Dec 20 11:18:56 2006 -0800
@@ -45,7 +45,7 @@
 /*
  * Offset of eflags on child stack..
  */
-#define EFL_OFFSET ((EFL-2)*4-sizeof(struct pt_regs))
+#define EFL_OFFSET offsetof(struct pt_regs, eflags)
 
 static inline struct pt_regs *get_child_regs(struct task_struct *task)
 {
@@ -54,24 +54,24 @@ static inline struct pt_regs *get_child_
 }
 
 /*
- * this routine will get a word off of the processes privileged stack.
- * the offset is how far from the base addr as stored in the TSS.
- * this routine assumes that all the privileged stacks are in our
+ * This routine will get a word off of the processes privileged stack.
+ * the offset is bytes into the pt_regs structure on the stack.
+ * This routine assumes that all the privileged stacks are in our
  * data space.
  */
 static inline int get_stack_long(struct task_struct *task, int offset)
 {
 	unsigned char *stack;
 
-	stack = (unsigned char *)task->thread.esp0;
+	stack = (unsigned char *)task->thread.esp0 - sizeof(struct pt_regs);
 	stack += offset;
 	return (*((int *)stack));
 }
 
 /*
- * this routine will put a word on the processes privileged stack.
- * the offset is how far from the base addr as stored in the TSS.
- * this routine assumes that all the privileged stacks are in our
+ * This routine will put a word on the processes privileged stack.
+ * the offset is bytes into the pt_regs structure on the stack.
+ * This routine assumes that all the privileged stacks are in our
  * data space.
  */
 static inline int put_stack_long(struct task_struct *task, int offset,
@@ -79,7 +79,7 @@ static inline int put_stack_long(struct
 {
 	unsigned char * stack;
 
-	stack = (unsigned char *) task->thread.esp0;
+	stack = (unsigned char *)task->thread.esp0 - sizeof(struct pt_regs);
 	stack += offset;
 	*(unsigned long *) stack = data;
 	return 0;
@@ -114,7 +114,7 @@ static int putreg(struct task_struct *ch
 	}
 	if (regno > ES*4)
 		regno -= 1*4;
-	put_stack_long(child, regno - sizeof(struct pt_regs), value);
+	put_stack_long(child, regno, value);
 	return 0;
 }
 
@@ -137,7 +137,6 @@ static unsigned long getreg(struct task_
 		default:
 			if (regno > ES*4)
 				regno -= 1*4;
-			regno = regno - sizeof(struct pt_regs);
 			retval &= get_stack_long(child, regno);
 	}
 	return retval;
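
The idea behind the cleanup can be tried outside the kernel. A rough sketch, assuming a register frame that sits directly below the top of a stack buffer (as esp0 is used in the patch); `mock_pt_regs`, `get_frame_word` and `put_frame_word` are hypothetical names for this illustration only, and the struct layout is an assumption rather than the real struct pt_regs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the saved register frame. */
struct mock_pt_regs {
	uint32_t ebx, ecx, edx, esi, edi, ebp, eax;
	uint32_t xds, xes, xgs;
	uint32_t orig_eax, eip, xcs, eflags, esp, xss;
};

/*
 * Read the 32-bit word 'offset' bytes into the frame that ends at
 * stack_top. The base is computed once here, so callers pass plain
 * offsetof() values instead of offsets relative to the stack top.
 */
static inline uint32_t get_frame_word(const unsigned char *stack_top,
				      size_t offset)
{
	const unsigned char *frame = stack_top - sizeof(struct mock_pt_regs);
	uint32_t value;

	memcpy(&value, frame + offset, sizeof(value));
	return value;
}

/* Write a 32-bit word at 'offset' bytes into the same frame. */
static inline void put_frame_word(unsigned char *stack_top, size_t offset,
				  uint32_t value)
{
	unsigned char *frame = stack_top - sizeof(struct mock_pt_regs);

	memcpy(frame + offset, &value, sizeof(value));
}
```

Because the offset now comes from `offsetof(struct mock_pt_regs, eflags)`, adding or removing a field in the struct moves the offset with it, which is presumably the robustness the cleanup is after.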

* Re: [-mm patch] ptrace: Fix EFL_OFFSET value according to i386 pda changes (was Re: BUG on 2.6.20-rc1 when using gdb)
2006-12-20 19:21 ` Jeremy Fitzhardinge
@ 2006-12-20 20:37 ` walt
2006-12-20 20:42 ` Frederik Deweerdt
1 sibling, 0 replies; 12+ messages in thread
From: walt @ 2006-12-20 20:37 UTC
To: Jeremy Fitzhardinge
Cc: Frederik Deweerdt, Andrew Morton, Andrew J. Barr, linux-kernel, Jan Beulich, Andi Kleen, Eric W. Biederman

Jeremy Fitzhardinge wrote:
> Frederik Deweerdt wrote:
>> Same problems here with 2.6.20-rc1-mm1 (i.e. with the %gs->%fs patch).
>> It seems to me that the problem comes from the EFL_OFFSET no longer
>> being accurate.
>> The following patch fixes the problem for me.
>>
>
> Thanks Frederik; that's exactly the kind of thing I thought it might
> be. I wonder if there's some way we can make this more robust
> though... Does this work for you? I did a slightly larger cleanup
> which should make it less fragile and more comprehensible.

<patch snipped>

Hi Jeremy,

Your patch works fine for me. (I didn't try the first patch, but I
will if anyone wants.)

Thanks!

* Re: [-mm patch] ptrace: Fix EFL_OFFSET value according to i386 pda changes (was Re: BUG on 2.6.20-rc1 when using gdb)
2006-12-20 19:21 ` Jeremy Fitzhardinge
2006-12-20 20:37 ` walt
@ 2006-12-20 20:42 ` Frederik Deweerdt
2006-12-20 20:53 ` Jeremy Fitzhardinge
1 sibling, 1 reply; 12+ messages in thread
From: Frederik Deweerdt @ 2006-12-20 20:42 UTC
To: Jeremy Fitzhardinge
Cc: Andrew Morton, Andrew J. Barr, linux-kernel, Jan Beulich, Andi Kleen, Eric W. Biederman, walt

On Wed, Dec 20, 2006 at 11:21:50AM -0800, Jeremy Fitzhardinge wrote:
> Frederik Deweerdt wrote:
> > Same problems here with 2.6.20-rc1-mm1 (i.e. with the %gs->%fs patch).
> > It seems to me that the problem comes from the EFL_OFFSET no longer
> > being accurate.
> > The following patch fixes the problem for me.
> >
>
> Thanks Frederik; that's exactly the kind of thing I thought it might
> be. I wonder if there's some way we can make this more robust
> though... Does this work for you? I did a slightly larger cleanup
> which should make it less fragile and more comprehensible.
>
It works too, thanks. BTW, I wondered if the "case GS:" in getreg() made
sense now?

Frederik

> J
>
> diff -r e775f6e42258 arch/i386/kernel/ptrace.c
> [...]

* Re: [-mm patch] ptrace: Fix EFL_OFFSET value according to i386 pda changes (was Re: BUG on 2.6.20-rc1 when using gdb)
2006-12-20 20:42 ` Frederik Deweerdt
@ 2006-12-20 20:53 ` Jeremy Fitzhardinge
2006-12-20 21:07 ` Frederik Deweerdt
0 siblings, 1 reply; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2006-12-20 20:53 UTC
To: Frederik Deweerdt
Cc: Andrew Morton, Andrew J. Barr, linux-kernel, Jan Beulich, Andi Kleen, Eric W. Biederman, walt

Frederik Deweerdt wrote:
> It works too, thanks. BTW, I wondered if the "case GS:" in getreg() made
> sense now?

Sorry, what do you mean? It looks OK to me, but I'm not sure what
you're referring to.

J

* Re: [-mm patch] ptrace: Fix EFL_OFFSET value according to i386 pda changes (was Re: BUG on 2.6.20-rc1 when using gdb)
2006-12-20 20:53 ` Jeremy Fitzhardinge
@ 2006-12-20 21:07 ` Frederik Deweerdt
0 siblings, 0 replies; 12+ messages in thread
From: Frederik Deweerdt @ 2006-12-20 21:07 UTC
To: Jeremy Fitzhardinge
Cc: Andrew Morton, Andrew J. Barr, linux-kernel, Jan Beulich, Andi Kleen, Eric W. Biederman, walt

On Wed, Dec 20, 2006 at 12:53:33PM -0800, Jeremy Fitzhardinge wrote:
> Frederik Deweerdt wrote:
> > It works too, thanks. BTW, I wondered if the "case GS:" in getreg() made
> > sense now?
>
> Sorry, what do you mean? It looks OK to me, but I'm not sure what
> you're referring to.
My bad, that's the code I'm referring to:

	static unsigned long getreg(struct task_struct *child,
		unsigned long regno)
	[...]
		switch (regno >> 2) {
			case GS:
				retval = child->thread.gs;
				break;

What seems weird to me is that putreg(GS) will end up putting 'value' in:
	child->thread.esp0 - sizeof(struct pt_regs) + (GS - 1)*4
whereas getreg(GS) will return the value of
	child->thread.gs

I must be missing something, but the asymmetry seemed odd to me.

Regards,
Frederik
>
> J
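
The `(GS - 1)*4` figure in the message above follows from the "regno > ES*4" adjustment visible in the patch context. A small sketch of that mapping, assuming the usual i386 ptrace register numbering; `user_to_frame_offset` is a hypothetical helper written for this note, not a function in ptrace.c.

```c
#include <assert.h>

/* User-visible ptrace register indices (i386 numbering, assumed). */
enum {
	EBX, ECX, EDX, ESI, EDI, EBP, EAX, DS, ES, FS, GS,
	ORIG_EAX, EIP, CS, EFL, UESP, SS
};

/*
 * Translate a user-area byte offset into a byte offset inside the saved
 * register frame, the way the default: branch of getreg()/putreg() does
 * in the quoted patch: everything above ES moves down by one word because
 * one segment register is not stored in the frame.
 */
static inline unsigned long user_to_frame_offset(unsigned long regno)
{
	if (regno > (unsigned long)ES * 4)
		regno -= 1 * 4;
	return regno;
}
```

So a write to user offset `GS*4` lands at frame offset `(GS - 1)*4`, while the quoted getreg() reads child->thread.gs instead, which is the mismatch being asked about.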