* Re: 2.6.32-rc6 BUG at mm/slab.c:2869! [not found] <20090820015624.GE524@hash.localnet> @ 2009-08-20 5:08 ` Pekka Enberg 2009-08-20 11:19 ` Bob Copeland 0 siblings, 1 reply; 10+ messages in thread From: Pekka Enberg @ 2009-08-20 5:08 UTC (permalink / raw) To: Bob Copeland; +Cc: linux-kernel, Andrew Morton, linux-ext4, Vegard Nossum Hi Bob, On Thu, Aug 20, 2009 at 4:56 AM, Bob Copeland<me@bobcopeland.com> wrote: > Sorry I don't have a lot to go on here, but thought I'd report it anyway. > I was just starting firefox and got the following two stack traces in > succession. The kernel is based on 894ef820b10d77e2d6d717342fc408bdd9825139. > > Some kind of memory/filesystem corruption? Someone is stomping on SLAB internal data structures. Ext4 appears in both stack traces so I guess we should CC linux-ext4. How easy is it to reproduce this bug? One option is to try kmemcheck to see if it catches the problem (see Documentation/kmemcheck.txt for details). > [ 8904.113230] ------------[ cut here ]------------ > [ 8902.989751] kernel BUG at mm/slab.c:2869! > [ 8902.989755] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC > [ 8902.989762] last sysfs file: /sys/class/rfkill/rfkill0/state > [ 8902.989765] Modules linked in: fuse af_packet ipt_REJECT xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state iptable_filter ip_tables x_tables acpi_cpufreq binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath dm_mod kvm_intel kvm uinput arc4 ecb ath5k mac80211 i915 appletouch hid_apple usbhid snd_hda_codec_idt ath drm ohci1394 snd_hda_intel cfg80211 snd_hda_codec i2c_algo_bit ieee1394 snd_pcm video processor rfkill ehci_hcd sky2 snd_timer snd snd_page_alloc backlight uhci_hcd joydev sg ac battery thermal button output applesmc sr_mod cdrom input_polldev evdev unix [last unloaded: scsi_wait_scan] > [ 8902.989844] > [ 8902.989849] Pid: 253, comm: kswapd0 Not tainted (2.6.31-rc6 #179) MacBook1,1 > [ 8902.989852] EIP: 0060:[<c10995cc>] EFLAGS: 00010002 CPU: 0 > [ 8902.989860] EIP is at cache_free_debugcheck+0x1a9/0x28a > [ 8902.989863] EAX: 009d4e2e EBX: 8ee08ec0 ECX: 31286ea0 EDX: 009d4e2e > [ 8902.989867] ESI: d84156c5 EDI: f7034440 EBP: f590fcb0 ESP: f590fc80 > [ 8902.989870] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 > [ 8902.989874] Process kswapd0 (pid: 253, ti=f590e000 task=f72b0418 task.ti=f590e000) > [ 8902.989877] Stack: > [ 8902.989879] f590fc88 c104aac4 00002184 c008f000 c10b8f69 f4205880 009d4e2e c008fd60 > [ 8902.989890] <0> c008fd68 f706af30 f7034440 e77e4ea0 f590fccc c1099911 c008fd68 00000296 > [ 8902.989901] <0> c008fd68 00000001 e77e4ea0 f590fcd8 c10b8f69 c008fd68 f590fcf4 c10b92f3 > [ 8902.989914] Call Trace: > [ 8902.989920] [<c104aac4>] ? trace_hardirqs_off+0xb/0xd > [ 8902.989926] [<c10b8f69>] ? free_buffer_head+0x21/0x62 > [ 8902.989931] [<c1099911>] ? kmem_cache_free+0x4c/0xdf > [ 8902.989936] [<c10b8f69>] ? free_buffer_head+0x21/0x62 > [ 8902.989941] [<c10b92f3>] ? try_to_free_buffers+0x7f/0x8f > [ 8902.989947] [<c1122f6c>] ? jbd2_journal_try_to_free_buffers+0x137/0x13f > [ 8902.989953] [<c10fbce2>] ? ext4_releasepage+0x54/0x62 > [ 8902.989959] [<c10784a2>] ? try_to_release_page+0x35/0x44 > [ 8902.989965] [<c10827e1>] ? shrink_page_list+0x4b5/0x657 > [ 8902.989971] [<c1049d9d>] ? put_lock_stats+0xd/0x21 > [ 8902.989976] [<c104b8eb>] ? trace_hardirqs_on+0xb/0xd > [ 8902.989982] [<c1297839>] ? _spin_unlock_irq+0x32/0x47 > [ 8902.989987] [<c1082bd0>] ? shrink_list+0x24d/0x514 > [ 8902.989993] [<c1007cb3>] ? sched_clock+0x48/0x8d > [ 8902.990000] [<c1007cb3>] ? sched_clock+0x48/0x8d > [ 8902.990004] [<c1049d69>] ? get_lock_stats+0x11/0x38 > [ 8902.990009] [<c1049d9d>] ? put_lock_stats+0xd/0x21 > [ 8902.990014] [<c10830a8>] ? shrink_zone+0x211/0x2aa > [ 8902.990020] [<c1083682>] ? kswapd+0x3e2/0x587 > [ 8902.990021] [<c1081686>] ? isolate_pages_global+0x0/0x18b > [ 8902.990021] [<c103ca60>] ? autoremove_wake_function+0x0/0x34 > [ 8902.990021] [<c10832a0>] ? kswapd+0x0/0x587 > [ 8902.990021] [<c103c78b>] ? kthread+0x70/0x75 > [ 8902.990021] [<c103c71b>] ? kthread+0x0/0x75 > [ 8902.990021] [<c10034f7>] ? kernel_thread_helper+0x7/0x10 > [ 8902.990021] Code: 0d 8b 47 2c 8b 55 e0 8b 4d ec 89 54 01 fc 8b 4d dc 8b 59 0c 8b 4d ec 29 d9 89 c8 f7 67 30 3b 57 38 89 45 e4 89 d0 89 55 e8 72 04 <0f> 0b eb fe 8b 4f 2c 0f af ca 8d 14 0b 39 55 ec 74 04 0f 0b eb > [ 8902.990021] EIP: [<c10995cc>] cache_free_debugcheck+0x1a9/0x28a SS:ESP 0068:f590fc80 > [ 8902.990021] ---[ end trace 73f44c391fc78b0a ]--- > [ 8904.113230] ------------[ cut here ]------------ > [ 8904.113250] WARNING: at lib/list_debug.c:51 list_del+0x41/0x60() > [ 8904.113256] Hardware name: MacBook1,1 > [ 8904.113262] list_del corruption. next->prev should be c03c0000, but was 8ed88ec8 > [ 8904.113267] Modules linked in: fuse af_packet ipt_REJECT xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state iptable_filter ip_tables x_tables acpi_cpufreq binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath dm_mod kvm_intel kvm uinput arc4 ecb ath5k mac80211 i915 appletouch hid_apple usbhid snd_hda_codec_idt ath drm ohci1394 snd_hda_intel cfg80211 snd_hda_codec i2c_algo_bit ieee1394 snd_pcm video processor rfkill ehci_hcd sky2 snd_timer snd snd_page_alloc backlight uhci_hcd joydev sg ac battery thermal button output applesmc sr_mod cdrom input_polldev evdev unix [last unloaded: scsi_wait_scan] > [ 8904.113422] Pid: 6452, comm: firefox Tainted: G D 2.6.31-rc6 #179 > [ 8904.113425] Call Trace: > [ 8904.113433] [<c1029a35>] warn_slowpath_common+0x6a/0x81 > [ 8904.113438] [<c1156ff1>] ? list_del+0x41/0x60 > [ 8904.113443] [<c1029a8a>] warn_slowpath_fmt+0x29/0x2c > [ 8904.113447] [<c1156ff1>] list_del+0x41/0x60 > [ 8904.113453] [<c1099b3f>] free_block+0x7d/0x15a > [ 8904.113457] [<c10996f2>] ? cache_flusharray+0x45/0xf2 > [ 8904.113462] [<c1099743>] cache_flusharray+0x96/0xf2 > [ 8904.113467] [<c1099933>] kmem_cache_free+0x6e/0xdf > [ 8904.113473] [<c10b8f69>] free_buffer_head+0x21/0x62 > [ 8904.113477] [<c10b92f3>] try_to_free_buffers+0x7f/0x8f > [ 8904.113483] [<c1122f6c>] jbd2_journal_try_to_free_buffers+0x137/0x13f > [ 8904.113490] [<c1106acf>] bdev_try_to_free_page+0x4c/0x5a > [ 8904.113495] [<c1106a83>] ? bdev_try_to_free_page+0x0/0x5a > [ 8904.113500] [<c10bebd7>] blkdev_releasepage+0x2b/0x36 > [ 8904.113506] [<c10784a2>] try_to_release_page+0x35/0x44 > [ 8904.113512] [<c10827e1>] shrink_page_list+0x4b5/0x657 > [ 8904.113518] [<c104b8eb>] ? trace_hardirqs_on+0xb/0xd > [ 8904.113524] [<c1297839>] ? _spin_unlock_irq+0x32/0x47 > [ 8904.113529] [<c1082bd0>] shrink_list+0x24d/0x514 > [ 8904.113535] [<c103275f>] ? lock_timer_base+0x26/0x45 > [ 8904.113540] [<c107f258>] ? get_dirty_limits+0x21/0x28a > [ 8904.113562] [<faaa16af>] ? mmu_shrink+0x1c/0xf4 [kvm] > [ 8904.113566] [<c104d6fa>] ? print_lock_contention_bug+0x11/0xb2 > [ 8904.113573] [<c10830a8>] shrink_zone+0x211/0x2aa > [ 8904.113578] [<c1083cd3>] try_to_free_pages+0x1d1/0x2b2 > [ 8904.113583] [<c1081686>] ? isolate_pages_global+0x0/0x18b > [ 8904.113589] [<c107e254>] __alloc_pages_nodemask+0x2d5/0x457 > [ 8904.113594] [<c108001d>] __do_page_cache_readahead+0x10c/0x1c7 > [ 8904.113599] [<c10800f4>] ra_submit+0x1c/0x21 > [ 8904.113604] [<c1079897>] filemap_fault+0x1c5/0x39e > [ 8904.113609] [<c108a99e>] __do_fault+0x40/0x363 > [ 8904.113614] [<c1299c10>] ? do_page_fault+0x1da/0x313 > [ 8904.113620] [<c108c3a2>] handle_mm_fault+0x228/0x4ea > [ 8904.113625] [<c1299c10>] ? do_page_fault+0x1da/0x313 > [ 8904.113630] [<c1299d3a>] do_page_fault+0x304/0x313 > [ 8904.113635] [<c1299a36>] ? do_page_fault+0x0/0x313 > [ 8904.113639] [<c1298133>] error_code+0x6b/0x70 > [ 8904.113644] [<c1299a36>] ? do_page_fault+0x0/0x313 > [ 8904.113648] ---[ end trace 73f44c391fc78b0b ]--- > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: 2.6.32-rc6 BUG at mm/slab.c:2869! 2009-08-20 5:08 ` 2.6.32-rc6 BUG at mm/slab.c:2869! Pekka Enberg @ 2009-08-20 11:19 ` Bob Copeland 2009-08-20 12:02 ` Vegard Nossum 0 siblings, 1 reply; 10+ messages in thread From: Bob Copeland @ 2009-08-20 11:19 UTC (permalink / raw) To: Pekka Enberg; +Cc: linux-kernel, Andrew Morton, linux-ext4, Vegard Nossum On Thu, Aug 20, 2009 at 08:08:16AM +0300, Pekka Enberg wrote: > Someone is stomping on SLAB internal data structures. Ext4 appears in > both stack traces so I guess we should CC linux-ext4. How easy is it > to reproduce this bug? One option is to try kmemcheck to see if it > catches the problem (see Documentation/kmemcheck.txt for details). I don't yet have a way to reproduce at will but I've anecdotally hit this or some related bug 3 times in the last week or two without trying, I only now captured a stack trace. I can try some workload to exercise the filesystem a bit more to see what turns up. I'll try that and kmemcheck next. -- Bob Copeland %% www.bobcopeland.com ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: 2.6.32-rc6 BUG at mm/slab.c:2869! 2009-08-20 11:19 ` Bob Copeland @ 2009-08-20 12:02 ` Vegard Nossum 2009-08-21 2:06 ` Bob Copeland 0 siblings, 1 reply; 10+ messages in thread From: Vegard Nossum @ 2009-08-20 12:02 UTC (permalink / raw) To: Bob Copeland; +Cc: Pekka Enberg, linux-kernel, Andrew Morton, linux-ext4 2009/8/20 Bob Copeland <me@bobcopeland.com>: > On Thu, Aug 20, 2009 at 08:08:16AM +0300, Pekka Enberg wrote: >> Someone is stomping on SLAB internal data structures. Ext4 appears in >> both stack traces so I guess we should CC linux-ext4. How easy is it >> to reproduce this bug? One option is to try kmemcheck to see if it >> catches the problem (see Documentation/kmemcheck.txt for details). > > I don't yet have a way to reproduce at will but I've anecdotally hit > this or some related bug 3 times in the last week or two without trying, > I only now captured a stack trace. I can try some workload to exercise > the filesystem a bit more to see what turns up. > > I'll try that and kmemcheck next. Hm, I'm afraid kmemcheck gives some known false positives related to bitfields in ext4 code, so in the case that something turned up, it might be hard to distinguish it from those false positives. Sorry :-/ Vegard -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: 2.6.32-rc6 BUG at mm/slab.c:2869! 2009-08-20 12:02 ` Vegard Nossum @ 2009-08-21 2:06 ` Bob Copeland 2009-08-21 6:46 ` Vegard Nossum 0 siblings, 1 reply; 10+ messages in thread From: Bob Copeland @ 2009-08-21 2:06 UTC (permalink / raw) To: Vegard Nossum; +Cc: Pekka Enberg, linux-kernel, Andrew Morton, linux-ext4 On Thu, Aug 20, 2009 at 02:02:49PM +0200, Vegard Nossum wrote: > > I'll try that and kmemcheck next. > > Hm, I'm afraid kmemcheck gives some known false positives related to > bitfields in ext4 code, so in the case that something turned up, it > might be hard to distinguish it from those false positives. Well I didn't get anything from ext4 so far. I did hit one with fsnotify: WARNING: kmemcheck: Caught 32-bit read from freed memory (f34a443c) eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee008a06f700011000 a a a a a a a a a a a a a a a a a a a a a a a a f f f f f f f f ^ Pid: 2745, comm: fsck.ext4 Not tainted (2.6.31-rc6 #2) MacBook1,1 EIP: 0060:[<c10f3656>] EFLAGS: 00010217 CPU: 0 EIP is at inotify_handle_event+0x76/0xc0 EAX: f34a443c EBX: f34a4438 ECX: 00000000 EDX: f6732000 ESI: f6559764 EDI: 00000000 EBP: f6733f0c ESP: c1527450 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 CR0: 8005003b CR2: f6c046d4 CR3: 367fb000 CR4: 000026d0 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 DR6: ffff4ff0 DR7: 00000400 [<c10f0d78>] fsnotify+0xa8/0x130 [<c10c5e11>] __fput+0xb1/0x1e0 [<c10c5f55>] fput+0x15/0x20 [<c10c2ca7>] filp_close+0x47/0x80 [<c10c2d54>] sys_close+0x74/0xc0 [<c1002ec8>] sysenter_do_call+0x12/0x36 [<ffffffff>] 0xffffffff I think that is list_empty() here where %eax is list_head and event_list->next is the read location... which definitely doesn't look like a pointer, if I'm reading it correctly. inotify_fsnotify.o: /* did event_priv get attached? */ if (list_empty(&fsn_event_priv->event_list)) 143: 8d 43 04 lea 0x4(%ebx),%eax event_priv = kmem_cache_alloc(event_priv_cachep, GFP_KERNEL); if (unlikely(!event_priv)) return -ENOMEM; fsn_event_priv = &event_priv->fsnotify_event_priv_data; 146: 39 43 04 cmp %eax,0x4(%ebx) <=== read here 149: 74 1d je 168 <inotify_handle_event+0x98> -- Bob Copeland %% www.bobcopeland.com ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: 2.6.32-rc6 BUG at mm/slab.c:2869! 2009-08-21 2:06 ` Bob Copeland @ 2009-08-21 6:46 ` Vegard Nossum 2009-08-21 8:28 ` Vegard Nossum ` (3 more replies) 0 siblings, 4 replies; 10+ messages in thread From: Vegard Nossum @ 2009-08-21 6:46 UTC (permalink / raw) To: Bob Copeland Cc: Eric Paris, Pekka Enberg, linux-kernel, Andrew Morton, linux-ext4 2009/8/21 Bob Copeland <me@bobcopeland.com>: > On Thu, Aug 20, 2009 at 02:02:49PM +0200, Vegard Nossum wrote: >> > I'll try that and kmemcheck next. >> >> Hm, I'm afraid kmemcheck gives some known false positives related to >> bitfields in ext4 code, so in the case that something turned up, it >> might be hard to distinguish it from those false positives. > > Well I didn't get anything from ext4 so far. I did hit one with > fsnotify: > > WARNING: kmemcheck: Caught 32-bit read from freed memory (f34a443c) > eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee008a06f700011000 > a a a a a a a a a a a a a a a a a a a a a a a a f f f f f f f f > ^ > > Pid: 2745, comm: fsck.ext4 Not tainted (2.6.31-rc6 #2) MacBook1,1 > EIP: 0060:[<c10f3656>] EFLAGS: 00010217 CPU: 0 > EIP is at inotify_handle_event+0x76/0xc0 > EAX: f34a443c EBX: f34a4438 ECX: 00000000 EDX: f6732000 > ESI: f6559764 EDI: 00000000 EBP: f6733f0c ESP: c1527450 > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 > CR0: 8005003b CR2: f6c046d4 CR3: 367fb000 CR4: 000026d0 > DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 > DR6: ffff4ff0 DR7: 00000400 > [<c10f0d78>] fsnotify+0xa8/0x130 > [<c10c5e11>] __fput+0xb1/0x1e0 > [<c10c5f55>] fput+0x15/0x20 > [<c10c2ca7>] filp_close+0x47/0x80 > [<c10c2d54>] sys_close+0x74/0xc0 > [<c1002ec8>] sysenter_do_call+0x12/0x36 > [<ffffffff>] 0xffffffff > > I think that is list_empty() here where %eax is list_head > and event_list->next is the read location... which definitely > doesn't look like a pointer, if I'm reading it correctly. I think f34a443c is a valid pointer. On my machine, at least: [ 0.004000] lowmem : 0xc0000000 - 0xf73fe000 ( 883 MB) > > inotify_fsnotify.o: > > /* did event_priv get attached? */ > if (list_empty(&fsn_event_priv->event_list)) > 143: 8d 43 04 lea 0x4(%ebx),%eax > > event_priv = kmem_cache_alloc(event_priv_cachep, GFP_KERNEL); > if (unlikely(!event_priv)) > return -ENOMEM; > > fsn_event_priv = &event_priv->fsnotify_event_priv_data; > 146: 39 43 04 cmp %eax,0x4(%ebx) <=== read here > 149: 74 1d je 168 <inotify_handle_event+0x98> I can see somewhat of a race, I think: 1. userspace calls inotify_read(), where we wait for something to happen: 249 while (1) { 250 prepare_to_wait(&group->notification_waitq, &wait, TASK_INTERRUPTIBLE); 251 252 mutex_lock(&group->notification_mutex); 253 kevent = get_one_event(group, count); 254 mutex_unlock(&group->notification_mutex); 2. an event occurs, and inotify_handle_event() calls fsnotify_add_notify_event(): 64 ret = fsnotify_add_notify_event(group, event, fsn_event_priv); 65 /* EEXIST is not an error */ 66 if (ret == -EEXIST) 67 ret = 0; 3. fsnotify_add_notify_event() adds the fsn_event_priv to the event, and adds the event to the group, and finally wakes up anybody who is waiting on &group->notification_waitq: 230 fsnotify_get_event(event); 231 list_add_tail(&holder->event_list, list); 232 if (priv) 233 list_add_tail(&priv->event_list, &event->private_data_list); 234 spin_unlock(&event->lock); 235 mutex_unlock(&group->notification_mutex); 236 237 wake_up(&group->notification_waitq); 4. inotify_read() wakes up and frees the event: 253 kevent = get_one_event(group, count); 5. inotify_handle_event() now dereferences the event_priv pointer, which was already freed: 69 /* did event_priv get attached? */ 70 if (list_empty(&fsn_event_priv->event_list)) I think that's it. Any thoughts? I put Eric Paris on Cc. Vegard -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: 2.6.32-rc6 BUG at mm/slab.c:2869! 2009-08-21 6:46 ` Vegard Nossum @ 2009-08-21 8:28 ` Vegard Nossum 2009-08-21 13:00 ` Eric Paris 2009-08-21 12:34 ` Eric Paris ` (2 subsequent siblings) 3 siblings, 1 reply; 10+ messages in thread From: Vegard Nossum @ 2009-08-21 8:28 UTC (permalink / raw) To: Bob Copeland Cc: Eric Paris, Pekka Enberg, linux-kernel, Andrew Morton, linux-ext4 2009/8/21 Vegard Nossum <vegard.nossum@gmail.com>: > 2009/8/21 Bob Copeland <me@bobcopeland.com>: >> On Thu, Aug 20, 2009 at 02:02:49PM +0200, Vegard Nossum wrote: >>> > I'll try that and kmemcheck next. >>> >>> Hm, I'm afraid kmemcheck gives some known false positives related to >>> bitfields in ext4 code, so in the case that something turned up, it >>> might be hard to distinguish it from those false positives. >> >> Well I didn't get anything from ext4 so far. I did hit one with >> fsnotify: >> >> WARNING: kmemcheck: Caught 32-bit read from freed memory (f34a443c) >> eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee008a06f700011000 >> a a a a a a a a a a a a a a a a a a a a a a a a f f f f f f f f >> ^ >> >> Pid: 2745, comm: fsck.ext4 Not tainted (2.6.31-rc6 #2) MacBook1,1 >> EIP: 0060:[<c10f3656>] EFLAGS: 00010217 CPU: 0 >> EIP is at inotify_handle_event+0x76/0xc0 >> EAX: f34a443c EBX: f34a4438 ECX: 00000000 EDX: f6732000 >> ESI: f6559764 EDI: 00000000 EBP: f6733f0c ESP: c1527450 >> DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 >> CR0: 8005003b CR2: f6c046d4 CR3: 367fb000 CR4: 000026d0 >> DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 >> DR6: ffff4ff0 DR7: 00000400 >> [<c10f0d78>] fsnotify+0xa8/0x130 >> [<c10c5e11>] __fput+0xb1/0x1e0 >> [<c10c5f55>] fput+0x15/0x20 >> [<c10c2ca7>] filp_close+0x47/0x80 >> [<c10c2d54>] sys_close+0x74/0xc0 >> [<c1002ec8>] sysenter_do_call+0x12/0x36 >> [<ffffffff>] 0xffffffff >> >> I think that is list_empty() here where %eax is list_head >> and event_list->next is the read location... which definitely >> doesn't look like a pointer, if I'm reading it correctly. > > I think f34a443c is a valid pointer. On my machine, at least: > > [ 0.004000] lowmem : 0xc0000000 - 0xf73fe000 ( 883 MB) > >> >> inotify_fsnotify.o: >> >> /* did event_priv get attached? */ >> if (list_empty(&fsn_event_priv->event_list)) >> 143: 8d 43 04 lea 0x4(%ebx),%eax >> >> event_priv = kmem_cache_alloc(event_priv_cachep, GFP_KERNEL); >> if (unlikely(!event_priv)) >> return -ENOMEM; >> >> fsn_event_priv = &event_priv->fsnotify_event_priv_data; >> 146: 39 43 04 cmp %eax,0x4(%ebx) <=== read here >> 149: 74 1d je 168 <inotify_handle_event+0x98> > > I can see somewhat of a race, I think: > > 1. userspace calls inotify_read(), where we wait for something to happen: > > 249 while (1) { > 250 prepare_to_wait(&group->notification_waitq, &wait, > TASK_INTERRUPTIBLE); > 251 > 252 mutex_lock(&group->notification_mutex); > 253 kevent = get_one_event(group, count); > 254 mutex_unlock(&group->notification_mutex); > > 2. an event occurs, and inotify_handle_event() calls > fsnotify_add_notify_event(): > > 64 ret = fsnotify_add_notify_event(group, event, fsn_event_priv); > 65 /* EEXIST is not an error */ > 66 if (ret == -EEXIST) > 67 ret = 0; > > 3. fsnotify_add_notify_event() adds the fsn_event_priv to the event, > and adds the event to the group, and finally wakes up anybody who is > waiting on &group->notification_waitq: > > 230 fsnotify_get_event(event); > 231 list_add_tail(&holder->event_list, list); > 232 if (priv) > 233 list_add_tail(&priv->event_list, &event->private_data_list); > 234 spin_unlock(&event->lock); > 235 mutex_unlock(&group->notification_mutex); > 236 > 237 wake_up(&group->notification_waitq); > > 4. inotify_read() wakes up and frees the event: > > 253 kevent = get_one_event(group, count); > > 5. inotify_handle_event() now dereferences the event_priv pointer, > which was already freed: > > 69 /* did event_priv get attached? */ > 70 if (list_empty(&fsn_event_priv->event_list)) > > > I think that's it. Any thoughts? I put Eric Paris on Cc. I guess it was fixed by this recently posted patch: http://osdir.com/ml/linux-kernel/2009-08/msg05185.html Was kmemcheck by any chance used to discover this race in the first place? ;-) Vegard -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: 2.6.32-rc6 BUG at mm/slab.c:2869! 2009-08-21 8:28 ` Vegard Nossum @ 2009-08-21 13:00 ` Eric Paris 0 siblings, 0 replies; 10+ messages in thread From: Eric Paris @ 2009-08-21 13:00 UTC (permalink / raw) To: Vegard Nossum Cc: Bob Copeland, Pekka Enberg, linux-kernel, Andrew Morton, linux-ext4 On Fri, 2009-08-21 at 10:28 +0200, Vegard Nossum wrote: > 2009/8/21 Vegard Nossum <vegard.nossum@gmail.com>: > > 2009/8/21 Bob Copeland <me@bobcopeland.com>: > >> On Thu, Aug 20, 2009 at 02:02:49PM +0200, Vegard Nossum wrote: > >>> > I'll try that and kmemcheck next. > >>> > >>> Hm, I'm afraid kmemcheck gives some known false positives related to > >>> bitfields in ext4 code, so in the case that something turned up, it > >>> might be hard to distinguish it from those false positives. > >> > >> Well I didn't get anything from ext4 so far. I did hit one with > >> fsnotify: > >> > >> WARNING: kmemcheck: Caught 32-bit read from freed memory (f34a443c) > >> eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee008a06f700011000 > >> a a a a a a a a a a a a a a a a a a a a a a a a f f f f f f f f > >> ^ > >> > >> Pid: 2745, comm: fsck.ext4 Not tainted (2.6.31-rc6 #2) MacBook1,1 > >> EIP: 0060:[<c10f3656>] EFLAGS: 00010217 CPU: 0 > >> EIP is at inotify_handle_event+0x76/0xc0 > >> EAX: f34a443c EBX: f34a4438 ECX: 00000000 EDX: f6732000 > >> ESI: f6559764 EDI: 00000000 EBP: f6733f0c ESP: c1527450 > >> DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 > >> CR0: 8005003b CR2: f6c046d4 CR3: 367fb000 CR4: 000026d0 > >> DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 > >> DR6: ffff4ff0 DR7: 00000400 > >> [<c10f0d78>] fsnotify+0xa8/0x130 > >> [<c10c5e11>] __fput+0xb1/0x1e0 > >> [<c10c5f55>] fput+0x15/0x20 > >> [<c10c2ca7>] filp_close+0x47/0x80 > >> [<c10c2d54>] sys_close+0x74/0xc0 > >> [<c1002ec8>] sysenter_do_call+0x12/0x36 > >> [<ffffffff>] 0xffffffff > >> > >> I think that is list_empty() here where %eax is list_head > >> and event_list->next is the read location... which definitely > >> doesn't look like a pointer, if I'm reading it correctly. > > > > I think f34a443c is a valid pointer. On my machine, at least: > > > > [ 0.004000] lowmem : 0xc0000000 - 0xf73fe000 ( 883 MB) > > > >> > >> inotify_fsnotify.o: > >> > >> /* did event_priv get attached? */ > >> if (list_empty(&fsn_event_priv->event_list)) > >> 143: 8d 43 04 lea 0x4(%ebx),%eax > >> > >> event_priv = kmem_cache_alloc(event_priv_cachep, GFP_KERNEL); > >> if (unlikely(!event_priv)) > >> return -ENOMEM; > >> > >> fsn_event_priv = &event_priv->fsnotify_event_priv_data; > >> 146: 39 43 04 cmp %eax,0x4(%ebx) <=== read here > >> 149: 74 1d je 168 <inotify_handle_event+0x98> > > > > I can see somewhat of a race, I think: > > > > 1. userspace calls inotify_read(), where we wait for something to happen: > > > > 249 while (1) { > > 250 prepare_to_wait(&group->notification_waitq, &wait, > > TASK_INTERRUPTIBLE); > > 251 > > 252 mutex_lock(&group->notification_mutex); > > 253 kevent = get_one_event(group, count); > > 254 mutex_unlock(&group->notification_mutex); > > > > 2. an event occurs, and inotify_handle_event() calls > > fsnotify_add_notify_event(): > > > > 64 ret = fsnotify_add_notify_event(group, event, fsn_event_priv); > > 65 /* EEXIST is not an error */ > > 66 if (ret == -EEXIST) > > 67 ret = 0; > > > > 3. fsnotify_add_notify_event() adds the fsn_event_priv to the event, > > and adds the event to the group, and finally wakes up anybody who is > > waiting on &group->notification_waitq: > > > > 230 fsnotify_get_event(event); > > 231 list_add_tail(&holder->event_list, list); > > 232 if (priv) > > 233 list_add_tail(&priv->event_list, &event->private_data_list); > > 234 spin_unlock(&event->lock); > > 235 mutex_unlock(&group->notification_mutex); > > 236 > > 237 wake_up(&group->notification_waitq); > > > > 4. inotify_read() wakes up and frees the event: > > > > 253 kevent = get_one_event(group, count); > > > > 5. inotify_handle_event() now dereferences the event_priv pointer, > > which was already freed: > > > > 69 /* did event_priv get attached? */ > > 70 if (list_empty(&fsn_event_priv->event_list)) > > > > > > I think that's it. Any thoughts? I put Eric Paris on Cc. > > I guess it was fixed by this recently posted patch: > > http://osdir.com/ml/linux-kernel/2009-08/msg05185.html > > Was kmemcheck by any chance used to discover this race in the first place? ;-) No, it was found by Linus' stellar eye. I haven't tried kmemcheck since my last report that I couldn't get a vmware server machine to boot with kmemcheck=1 http://osdir.com/ml/linux-kernel/2009-08/msg03797.html I'll give it another shot today. -Eric ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: 2.6.32-rc6 BUG at mm/slab.c:2869! 2009-08-21 6:46 ` Vegard Nossum 2009-08-21 8:28 ` Vegard Nossum @ 2009-08-21 12:34 ` Eric Paris 2009-08-21 12:36 ` Bob Copeland 2009-08-21 12:39 ` Pekka Enberg 3 siblings, 0 replies; 10+ messages in thread From: Eric Paris @ 2009-08-21 12:34 UTC (permalink / raw) To: Vegard Nossum Cc: Bob Copeland, Pekka Enberg, linux-kernel, Andrew Morton, linux-ext4 On Fri, 2009-08-21 at 08:46 +0200, Vegard Nossum wrote: > 2009/8/21 Bob Copeland <me@bobcopeland.com>: > > On Thu, Aug 20, 2009 at 02:02:49PM +0200, Vegard Nossum wrote: > >> > I'll try that and kmemcheck next. > >> > >> Hm, I'm afraid kmemcheck gives some known false positives related to > >> bitfields in ext4 code, so in the case that something turned up, it > >> might be hard to distinguish it from those false positives. > > > > Well I didn't get anything from ext4 so far. I did hit one with > > fsnotify: > > > > WARNING: kmemcheck: Caught 32-bit read from freed memory (f34a443c) > > eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee008a06f700011000 > > a a a a a a a a a a a a a a a a a a a a a a a a f f f f f f f f > > ^ > > > > Pid: 2745, comm: fsck.ext4 Not tainted (2.6.31-rc6 #2) MacBook1,1 > > EIP: 0060:[<c10f3656>] EFLAGS: 00010217 CPU: 0 > > EIP is at inotify_handle_event+0x76/0xc0 > > EAX: f34a443c EBX: f34a4438 ECX: 00000000 EDX: f6732000 > > ESI: f6559764 EDI: 00000000 EBP: f6733f0c ESP: c1527450 > > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 > > CR0: 8005003b CR2: f6c046d4 CR3: 367fb000 CR4: 000026d0 > > DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 > > DR6: ffff4ff0 DR7: 00000400 > > [<c10f0d78>] fsnotify+0xa8/0x130 > > [<c10c5e11>] __fput+0xb1/0x1e0 > > [<c10c5f55>] fput+0x15/0x20 > > [<c10c2ca7>] filp_close+0x47/0x80 > > [<c10c2d54>] sys_close+0x74/0xc0 > > [<c1002ec8>] sysenter_do_call+0x12/0x36 > > [<ffffffff>] 0xffffffff > > > > I think that is list_empty() here where %eax is list_head > > and event_list->next is the read location... which definitely > > doesn't look like a pointer, if I'm reading it correctly. > > I think f34a443c is a valid pointer. On my machine, at least: > > [ 0.004000] lowmem : 0xc0000000 - 0xf73fe000 ( 883 MB) > > > > > inotify_fsnotify.o: > > > > /* did event_priv get attached? */ > > if (list_empty(&fsn_event_priv->event_list)) > > 143: 8d 43 04 lea 0x4(%ebx),%eax > > > > event_priv = kmem_cache_alloc(event_priv_cachep, GFP_KERNEL); > > if (unlikely(!event_priv)) > > return -ENOMEM; > > > > fsn_event_priv = &event_priv->fsnotify_event_priv_data; > > 146: 39 43 04 cmp %eax,0x4(%ebx) <=== read here > > 149: 74 1d je 168 <inotify_handle_event+0x98> > > I can see somewhat of a race, I think: > > 1. userspace calls inotify_read(), where we wait for something to happen: > > 249 while (1) { > 250 prepare_to_wait(&group->notification_waitq, &wait, > TASK_INTERRUPTIBLE); > 251 > 252 mutex_lock(&group->notification_mutex); > 253 kevent = get_one_event(group, count); > 254 mutex_unlock(&group->notification_mutex); > > 2. an event occurs, and inotify_handle_event() calls > fsnotify_add_notify_event(): > > 64 ret = fsnotify_add_notify_event(group, event, fsn_event_priv); > 65 /* EEXIST is not an error */ > 66 if (ret == -EEXIST) > 67 ret = 0; > > 3. fsnotify_add_notify_event() adds the fsn_event_priv to the event, > and adds the event to the group, and finally wakes up anybody who is > waiting on &group->notification_waitq: > > 230 fsnotify_get_event(event); > 231 list_add_tail(&holder->event_list, list); > 232 if (priv) > 233 list_add_tail(&priv->event_list, &event->private_data_list); > 234 spin_unlock(&event->lock); > 235 mutex_unlock(&group->notification_mutex); > 236 > 237 wake_up(&group->notification_waitq); > > 4. inotify_read() wakes up and frees the event: > > 253 kevent = get_one_event(group, count); > > 5. inotify_handle_event() now dereferences the event_priv pointer, > which was already freed: > > 69 /* did event_priv get attached? */ > 70 if (list_empty(&fsn_event_priv->event_list)) > > > I think that's it. Any thoughts? I put Eric Paris on Cc. Yes, that's a race. It's already fixed in Linus' git tree (Linus pointed out what an idiot I am on that one) -Eric ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: 2.6.32-rc6 BUG at mm/slab.c:2869! 2009-08-21 6:46 ` Vegard Nossum 2009-08-21 8:28 ` Vegard Nossum 2009-08-21 12:34 ` Eric Paris @ 2009-08-21 12:36 ` Bob Copeland 2009-08-21 12:39 ` Pekka Enberg 3 siblings, 0 replies; 10+ messages in thread From: Bob Copeland @ 2009-08-21 12:36 UTC (permalink / raw) To: Vegard Nossum Cc: Eric Paris, Pekka Enberg, linux-kernel, Andrew Morton, linux-ext4 On Fri, Aug 21, 2009 at 08:46:37AM +0200, Vegard Nossum wrote: > 2009/8/21 Bob Copeland <me@bobcopeland.com>: > > WARNING: kmemcheck: Caught 32-bit read from freed memory (f34a443c) > > eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee008a06f700011000 > > a a a a a a a a a a a a a a a a a a a a a a a a f f f f f f f f > > > > I think that is list_empty() here where %eax is list_head > > and event_list->next is the read location... which definitely > > doesn't look like a pointer, if I'm reading it correctly. > > I think f34a443c is a valid pointer. On my machine, at least: I meant list->next, which was 00100100 here. (By the way, my first time using kmemcheck, very nice tool!) [race condition explanation snipped] > I think that's it. Any thoughts? I put Eric Paris on Cc. Thanks! With a fresh pull containing the fsnotify fix the kmemcheck warning is gone. So unless I see anything else I'll assume that this one is already fixed, sorry for the false alarm. -- Bob Copeland %% www.bobcopeland.com ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: 2.6.32-rc6 BUG at mm/slab.c:2869! 2009-08-21 6:46 ` Vegard Nossum ` (2 preceding siblings ...) 2009-08-21 12:36 ` Bob Copeland @ 2009-08-21 12:39 ` Pekka Enberg 3 siblings, 0 replies; 10+ messages in thread From: Pekka Enberg @ 2009-08-21 12:39 UTC (permalink / raw) To: Vegard Nossum Cc: Bob Copeland, Eric Paris, linux-kernel, Andrew Morton, linux-ext4 On Fri, Aug 21, 2009 at 9:46 AM, Vegard Nossum<vegard.nossum@gmail.com> wrote: > I think that's it. Any thoughts? I put Eric Paris on Cc. Nice catch by kmemcheck! I don't think that explains the original slab corruption, though. Bob, you might want to try again with the fsnotify fix applied. Pekka ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2009-08-21 13:01 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20090820015624.GE524@hash.localnet>
2009-08-20 5:08 ` 2.6.32-rc6 BUG at mm/slab.c:2869! Pekka Enberg
2009-08-20 11:19 ` Bob Copeland
2009-08-20 12:02 ` Vegard Nossum
2009-08-21 2:06 ` Bob Copeland
2009-08-21 6:46 ` Vegard Nossum
2009-08-21 8:28 ` Vegard Nossum
2009-08-21 13:00 ` Eric Paris
2009-08-21 12:34 ` Eric Paris
2009-08-21 12:36 ` Bob Copeland
2009-08-21 12:39 ` Pekka Enberg
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).