* Oops with 2.6.32-rc6
From: Lucas C. Villa Real @ 2009-11-19 3:48 UTC
To: linux-kernel

Hi,

I recently decided to test 2.6.32-rc6 and I noticed that, whenever too
much disk activity happens, the system crashes. The error shown in the
traces below happened about 3 times in a week.

Do you have any suggestions?

Thanks,
Lucas

Nov 16 10:37:27 (none) kernel: BUG: unable to handle kernel paging request at 0000b2cb
Nov 16 10:37:33 (none) kernel: IP: [<c0198266>] __rmqueue+0x98/0x36c
Nov 16 10:37:33 (none) kernel: *pdpt = 0000000031dd2001 *pde = 0000000000000000
Nov 16 10:37:33 (none) kernel: Oops: 0002 [#1] PREEMPT SMP
Nov 16 10:37:33 (none) kernel: last sysfs file: /System/Kernel/Objects/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0003:00/power_supply/ADP1/online
Nov 16 10:37:33 (none) kernel: Modules linked in: ipv6 acpi_cpufreq snd_pcm_oss snd_mixer_oss hfsplus ndiswrapper fuse snd_hda_codec_realtek joydev isight_firmware snd_hda_intel sky2 uvcvideo snd_hda_codec videodev firewire_ohci video output snd_hwdep firewire_core v4l1_compat ac battery appletouch snd_pcm i2c_i801 applesmc led_class rtc_cmos thermal snd_timer processor shpchp i2c_core button ohci1394 intel_agp iTCO_wdt rtc_core rtc_lib input_polldev pcspkr snd snd_page_alloc iTCO_vendor_support pci_hotplug
Nov 16 10:37:33 (none) kernel:
Nov 16 10:37:33 (none) kernel: Pid: 1724, comm: tar Tainted: P (2.6.32-rc6-Gobo #3) MacBook3,1
Nov 16 10:37:33 (none) kernel: EIP: 0060:[<c0198266>] EFLAGS: 00010086 CPU: 0
Nov 16 10:37:33 (none) kernel: EIP is at __rmqueue+0x98/0x36c
Nov 16 10:37:33 (none) kernel: EAX: 000001b8 EBX: c1ad1000 ECX: 0000000a EDX: 0000b2c7
Nov 16 10:37:33 (none) kernel: ESI: c0b2cf40 EDI: c0b2d22c EBP: f0c8fc50 ESP: f0c8fc18
Nov 16 10:37:33 (none) kernel: DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Nov 16 10:37:33 (none) kernel: Process tar (pid: 1724, ti=f0c8f000 task=f307a610 task.ti=f0c8f000)
Nov 16 10:37:33 (none) kernel: Stack:
Nov 16 10:37:33 (none) kernel: c01c909e f24c7250 00000000 00000000 00000010 00000000 c0b2d218 c0b2d21c
Nov 16 10:37:33 (none) kernel: <0> 00000002 c1ad1018 00000010 c0b2cf40 c1ae0ff8 00000000 f0c8fca4 c0199335
Nov 16 10:37:33 (none) kernel: <0> 00000000 ffffffff 0000001f 0000003c 00000000 c0b2d744 c0b2cf7c c0bd4134
Nov 16 10:37:33 (none) kernel: Call Trace:
Nov 16 10:37:33 (none) kernel: [<c01c909e>] ? inode_get_bytes+0x48/0x54
Nov 16 10:37:33 (none) kernel: [<c0199335>] ? get_page_from_freelist+0x147/0x3ec
Nov 16 10:37:33 (none) kernel: [<c01996a0>] ? __alloc_pages_nodemask+0xc6/0x480
Nov 16 10:37:33 (none) kernel: [<c014e22f>] ? autoremove_wake_function+0x0/0x34
Nov 16 10:37:33 (none) kernel: [<c01946fe>] ? find_get_page+0x2d/0x9f
Nov 16 10:37:33 (none) kernel: [<c0194cdc>] ? grab_cache_page_write_begin+0x54/0x8e
Nov 16 10:37:33 (none) kernel: [<c0217b8b>] ? reiserfs_write_begin+0x81/0x1b2
Nov 16 10:37:33 (none) kernel: [<c0195588>] ? generic_file_buffered_write+0xd9/0x22f
Nov 16 10:37:33 (none) kernel: [<c0195c2e>] ? __generic_file_aio_write+0x3a2/0x3e3
Nov 16 10:37:33 (none) kernel: [<c019635a>] ? generic_file_aio_read+0x4cf/0x509
Nov 16 10:37:33 (none) kernel: [<c0195cd3>] ? generic_file_aio_write+0x64/0xab
Nov 16 10:37:33 (none) kernel: [<c01c5eea>] ? do_sync_write+0xb0/0xeb
Nov 16 10:37:33 (none) kernel: [<c014e22f>] ? autoremove_wake_function+0x0/0x34
Nov 16 10:37:33 (none) kernel: [<c014e22f>] ? autoremove_wake_function+0x0/0x34
Nov 16 10:37:33 (none) kernel: [<c01d81e9>] ? expand_files+0xe/0x201
Nov 16 10:37:33 (none) kernel: [<c01ebbce>] ? fsnotify+0xe/0xdb
Nov 16 10:37:33 (none) kernel: [<c021be86>] ? reiserfs_file_write+0x6e/0x77
Nov 16 10:37:33 (none) kernel: [<c01c68f3>] ? vfs_write+0x99/0x14c
Nov 16 10:37:33 (none) kernel: [<c021be18>] ? reiserfs_file_write+0x0/0x77
Nov 16 10:37:33 (none) kernel: [<c01c6a62>] ? sys_write+0x48/0x75
Nov 16 10:37:33 (none) kernel: [<c0103553>] ? sysenter_do_call+0x12/0x28
Nov 16 10:37:33 (none) kernel: Code: 39 5d f0 75 06 41 e9 a0 00 00 00 8b 55 e8 c1 e2 03 89 55 f0 01 c2 8b 94 16 24 01 00 00 89 d3 83 eb 18 89 55 ec 8b 7b 1c 8b 53 18 <89> 7a 04 89 17 c7 43 1c 00 02 20 00 c7 43 18 00 01 10 00 8b 7d
Nov 16 10:37:33 (none) kernel: EIP: [<c0198266>] __rmqueue+0x98/0x36c SS:ESP 0068:f0c8fc18
Nov 16 10:37:33 (none) kernel: CR2: 000000000000b2cb
Nov 16 10:37:33 (none) kernel: ---[ end trace 611dcee22abb0dec ]---
Nov 16 10:37:33 (none) kernel: note: tar[1724] exited with preempt_count 2
Nov 16 10:37:33 (none) kernel: BUG: scheduling while atomic: tar/1724/0x10000003
Nov 16 10:37:33 (none) kernel: Modules linked in: ipv6 acpi_cpufreq snd_pcm_oss snd_mixer_oss hfsplus ndiswrapper fuse snd_hda_codec_realtek joydev isight_firmware snd_hda_intel sky2 uvcvideo snd_hda_codec videodev firewire_ohci video output snd_hwdep firewire_core v4l1_compat ac battery appletouch snd_pcm i2c_i801 applesmc led_class rtc_cmos thermal snd_timer processor shpchp i2c_core button ohci1394 intel_agp iTCO_wdt rtc_core rtc_lib input_polldev pcspkr snd snd_page_alloc iTCO_vendor_support pci_hotplug
Nov 16 10:37:33 (none) kernel: Pid: 1724, comm: tar Tainted: P D 2.6.32-rc6-Gobo #3
Nov 16 10:37:33 (none) kernel: Call Trace:
Nov 16 10:37:33 (none) kernel: [<c012da92>] __schedule_bug+0x51/0x56
Nov 16 10:37:33 (none) kernel: [<c07f0fc9>] schedule+0x9f/0x993
Nov 16 10:37:33 (none) kernel: [<c019b750>] ? release_pages+0xe/0x165
Nov 16 10:37:33 (none) kernel: [<c01c4463>] ? lookup_page_cgroup+0x9/0x32
Nov 16 10:37:33 (none) kernel: [<c01c4463>] ? lookup_page_cgroup+0x9/0x32
Nov 16 10:37:33 (none) kernel: [<c07f1a47>] ? preempt_schedule+0x8/0x49
Nov 16 10:37:33 (none) kernel: [<c019bd73>] ? lru_add_drain+0x95/0x9b
Nov 16 10:37:33 (none) kernel: [<c01a569f>] ? __dec_zone_state+0xe/0x87
Nov 16 10:37:33 (none) kernel: [<c012e61e>] __cond_resched+0x1b/0x2b
Nov 16 10:37:33 (none) kernel: [<c07f19c1>] _cond_resched+0x20/0x2b
Nov 16 10:37:33 (none) kernel: [<c01a972c>] unmap_vmas+0x55f/0x6af
Nov 16 10:37:33 (none) kernel: [<c019b750>] ? release_pages+0xe/0x165
Nov 16 10:37:33 (none) kernel: [<c01ad4e4>] exit_mmap+0xaf/0x13e
Nov 16 10:37:33 (none) kernel: [<c0135b76>] mmput+0x3a/0xb0
Nov 16 10:37:33 (none) kernel: [<c01395bb>] exit_mm+0xea/0xf2
Nov 16 10:37:33 (none) kernel: [<c013ae64>] do_exit+0x1b3/0x5f6
Nov 16 10:37:33 (none) kernel: [<c07f0b7a>] ? printk+0x14/0x16
Nov 16 10:37:33 (none) kernel: [<c07f4039>] oops_end+0xa2/0xaa
Nov 16 10:37:33 (none) kernel: [<c011e1c0>] no_context+0x13b/0x145
Nov 16 10:37:33 (none) kernel: [<c011e2b6>] __bad_area_nosemaphore+0xec/0xf4
Nov 16 10:37:33 (none) kernel: [<c011e2d0>] bad_area_nosemaphore+0x12/0x15
Nov 16 10:37:33 (none) kernel: [<c07f52a7>] do_page_fault+0x200/0x34e
Nov 16 10:37:33 (none) kernel: [<c07f50a7>] ? do_page_fault+0x0/0x34e
Nov 16 10:37:33 (none) kernel: [<c07f36c3>] error_code+0x73/0x78
Nov 16 10:37:33 (none) kernel: [<c02200d8>] ? find_hash_out+0xc1/0x1dd
Nov 16 10:37:33 (none) kernel: [<c0198266>] ? __rmqueue+0x98/0x36c
Nov 16 10:37:33 (none) kernel: [<c01c909e>] ? inode_get_bytes+0x48/0x54
Nov 16 10:37:33 (none) kernel: [<c0199335>] get_page_from_freelist+0x147/0x3ec
Nov 16 10:37:33 (none) kernel: [<c01996a0>] __alloc_pages_nodemask+0xc6/0x480
Nov 16 10:37:33 (none) kernel: [<c014e22f>] ? autoremove_wake_function+0x0/0x34
Nov 16 10:37:33 (none) kernel: [<c01946fe>] ? find_get_page+0x2d/0x9f
Nov 16 10:37:33 (none) kernel: [<c0194cdc>] grab_cache_page_write_begin+0x54/0x8e
Nov 16 10:37:33 (none) kernel: [<c0217b8b>] reiserfs_write_begin+0x81/0x1b2
Nov 16 10:37:33 (none) kernel: [<c0195588>] generic_file_buffered_write+0xd9/0x22f
Nov 16 10:37:33 (none) kernel: [<c0195c2e>] __generic_file_aio_write+0x3a2/0x3e3
Nov 16 10:37:33 (none) kernel: [<c019635a>] ? generic_file_aio_read+0x4cf/0x509
Nov 16 10:37:33 (none) kernel: [<c0195cd3>] generic_file_aio_write+0x64/0xab
Nov 16 10:37:33 (none) kernel: [<c01c5eea>] do_sync_write+0xb0/0xeb
Nov 16 10:37:33 (none) kernel: [<c014e22f>] ? autoremove_wake_function+0x0/0x34
Nov 16 10:37:33 (none) kernel: [<c014e22f>] ? autoremove_wake_function+0x0/0x34
Nov 16 10:37:33 (none) kernel: [<c01d81e9>] ? expand_files+0xe/0x201
Nov 16 10:37:33 (none) kernel: [<c01ebbce>] ? fsnotify+0xe/0xdb
Nov 16 10:37:33 (none) kernel: [<c021be86>] reiserfs_file_write+0x6e/0x77
Nov 16 10:37:33 (none) kernel: [<c01c68f3>] vfs_write+0x99/0x14c
Nov 16 10:37:33 (none) kernel: [<c021be18>] ? reiserfs_file_write+0x0/0x77
Nov 16 10:37:33 (none) kernel: [<c01c6a62>] sys_write+0x48/0x75
Nov 16 10:37:33 (none) kernel: [<c0103553>] sysenter_do_call+0x12/0x28
Nov 16 10:37:33 (none) kernel: BUG: scheduling while atomic: tar/1724/0x00000003
Nov 16 10:37:33 (none) kernel: Modules linked in: ipv6 acpi_cpufreq snd_pcm_oss snd_mixer_oss hfsplus ndiswrapper fuse snd_hda_codec_realtek joydev isight_firmware snd_hda_intel sky2 uvcvideo snd_hda_codec videodev firewire_ohci video output snd_hwdep firewire_core v4l1_compat ac battery appletouch snd_pcm i2c_i801 applesmc led_class rtc_cmos thermal snd_timer processor shpchp i2c_core button ohci1394 intel_agp iTCO_wdt rtc_core rtc_lib input_polldev pcspkr snd snd_page_alloc iTCO_vendor_support pci_hotplug
Nov 16 10:37:33 (none) kernel: Pid: 1724, comm: tar Tainted: P D 2.6.32-rc6-Gobo #3
Nov 16 10:37:33 (none) kernel: Call Trace:
Nov 16 10:37:33 (none) kernel: [<c012da92>] __schedule_bug+0x51/0x56
Nov 16 10:37:34 (none) kernel: [<c07f0fc9>] schedule+0x9f/0x993
Nov 16 10:37:34 (none) kernel: [<c012422b>] ? idle_cpu+0x8/0x2a
Nov 16 10:37:34 (none) kernel: [<c013d77d>] ? irq_exit+0x3e/0x6b
Nov 16 10:37:34 (none) kernel: [<c0124c0e>] ? mutex_spin_on_owner+0x56/0x65
Nov 16 10:37:34 (none) kernel: [<c07f2066>] __mutex_lock_slowpath+0xc8/0x122
Nov 16 10:37:34 (none) kernel: [<c07f1f14>] mutex_lock+0x18/0x26
Nov 16 10:37:34 (none) kernel: [<c021bbe7>] reiserfs_file_release+0x116/0x303
Nov 16 10:37:34 (none) kernel: [<c01f39aa>] ? locks_remove_posix+0xc/0x8c
Nov 16 10:37:34 (none) kernel: [<c01ebbce>] ? fsnotify+0xe/0xdb
Nov 16 10:37:34 (none) kernel: [<c01c724d>] ? __fput+0x85/0x177
Nov 16 10:37:34 (none) kernel: [<c01c7297>] __fput+0xcf/0x177
Nov 16 10:37:34 (none) kernel: [<c01c7359>] fput+0x1a/0x1c
Nov 16 10:37:34 (none) kernel: [<c01c4811>] filp_close+0x56/0x60
Nov 16 10:37:34 (none) kernel: [<c0139747>] put_files_struct+0x5d/0xa1
Nov 16 10:37:34 (none) kernel: [<c01397c7>] exit_files+0x3c/0x41
Nov 16 10:37:34 (none) kernel: [<c013aebb>] do_exit+0x20a/0x5f6
Nov 16 10:37:34 (none) kernel: [<c07f0b7a>] ? printk+0x14/0x16
Nov 16 10:37:34 (none) kernel: [<c07f4039>] oops_end+0xa2/0xaa
Nov 16 10:37:34 (none) kernel: [<c011e1c0>] no_context+0x13b/0x145
Nov 16 10:37:34 (none) kernel: [<c011e2b6>] __bad_area_nosemaphore+0xec/0xf4
Nov 16 10:37:34 (none) kernel: [<c011e2d0>] bad_area_nosemaphore+0x12/0x15
Nov 16 10:37:34 (none) kernel: [<c07f52a7>] do_page_fault+0x200/0x34e
Nov 16 10:37:34 (none) kernel: [<c07f50a7>] ? do_page_fault+0x0/0x34e
Nov 16 10:37:34 (none) kernel: [<c07f36c3>] error_code+0x73/0x78
Nov 16 10:37:34 (none) kernel: [<c02200d8>] ? find_hash_out+0xc1/0x1dd
Nov 16 10:37:34 (none) kernel: [<c0198266>] ? __rmqueue+0x98/0x36c
Nov 16 10:37:34 (none) kernel: [<c01c909e>] ? inode_get_bytes+0x48/0x54
Nov 16 10:37:34 (none) kernel: [<c0199335>] get_page_from_freelist+0x147/0x3ec
Nov 16 10:37:34 (none) kernel: [<c01996a0>] __alloc_pages_nodemask+0xc6/0x480
Nov 16 10:37:34 (none) kernel: [<c014e22f>] ? autoremove_wake_function+0x0/0x34
Nov 16 10:37:34 (none) kernel: [<c01946fe>] ? find_get_page+0x2d/0x9f
Nov 16 10:37:34 (none) kernel: [<c0194cdc>] grab_cache_page_write_begin+0x54/0x8e
Nov 16 10:37:34 (none) kernel: [<c0217b8b>] reiserfs_write_begin+0x81/0x1b2
Nov 16 10:37:34 (none) kernel: [<c0195588>] generic_file_buffered_write+0xd9/0x22f
Nov 16 10:37:34 (none) kernel: [<c0195c2e>] __generic_file_aio_write+0x3a2/0x3e3
Nov 16 10:37:34 (none) kernel: [<c019635a>] ? generic_file_aio_read+0x4cf/0x509
Nov 16 10:37:34 (none) kernel: [<c0195cd3>] generic_file_aio_write+0x64/0xab
Nov 16 10:37:34 (none) kernel: [<c01c5eea>] do_sync_write+0xb0/0xeb
Nov 16 10:37:34 (none) kernel: [<c014e22f>] ? autoremove_wake_function+0x0/0x34
Nov 16 10:37:34 (none) kernel: [<c014e22f>] ? autoremove_wake_function+0x0/0x34
Nov 16 10:37:34 (none) kernel: [<c01d81e9>] ? expand_files+0xe/0x201
Nov 16 10:37:34 (none) kernel: [<c01ebbce>] ? fsnotify+0xe/0xdb
Nov 16 10:37:34 (none) kernel: [<c021be86>] reiserfs_file_write+0x6e/0x77
Nov 16 10:37:34 (none) kernel: [<c01c68f3>] vfs_write+0x99/0x14c
Nov 16 10:37:34 (none) kernel: [<c021be18>] ? reiserfs_file_write+0x0/0x77
Nov 16 10:37:34 (none) kernel: [<c01c6a62>] sys_write+0x48/0x75
Nov 16 10:37:34 (none) kernel: [<c0103553>] sysenter_do_call+0x12/0x28
* Re: Oops with 2.6.32-rc6
From: Lucas C. Villa Real @ 2010-01-19 4:50 UTC
To: linux-kernel

On Thu, Nov 19, 2009 at 1:48 AM, Lucas C. Villa Real
<lucasvr@gobolinux.org> wrote:
> Hi,
>
> I recently decided to test 2.6.32-rc6 and I noticed that, whenever too
> much disk activity happens, the system crashes. The error shown in the
> traces below happened about 3 times in a week.
>
> Do you have any suggestions?
>
> Thanks,
> Lucas
>

I just got a reproduction of the kernel oops with 2.6.33-rc4, whose
original report can be seen at
http://bugzilla.kernel.org/show_bug.cgi?id=14656.

I'm seeing this problem while I'm stressing a FUSE file system which
is sitting on top of ReiserFS 3. However, since some write operations
in this test-case also operate on the root filesystem I cannot tell if
FUSE has anything to do with this. Based on the stack trace I would
say no.

I have one complete message which shows the complete stack trace,
found below, and another partial one which includes some debugging
messages from CONFIG_DEBUG_LIST=y. The very line which is causing the
problem is a list_del() in __rmqueue:

(gdb) list *__rmqueue+0x98
0x963 is in __rmqueue (mm/page_alloc.c:730).
725                     continue;
726
727             page = list_entry(area->free_list[migratetype].next,
728                                                     struct page, lru);
729             list_del(&page->lru);
730             rmv_page_order(page);

"page" is a valid pointer, but it looks like the members of lru are
corrupted, as seen in the first trace below:

Jan 19 02:01:46 (none) kernel: ------------[ cut here ]------------
Jan 19 02:01:47 (none) kernel: WARNING: at lib/list_debug.c:51 list_del+0x41/0x60()
Jan 19 02:01:47 (none) kernel: Hardware name: MacBook3,1
Jan 19 02:01:47 (none) kernel: list_del corruption. next->prev should be c1b71018, but was 00005095
Jan 19 02:01:47 (none) kernel: Modules linked in: tun ipv6 acpi_cpufreq snd_pcm_oss snd_mixer_oss hfsplus ndiswrapper fuse snd_hda_codec_realtek snd_hda_intel snd_hda_codec joydev snd_hwdep sky2 applesmc led_class uvcvideo firewire_ohci rtc_cmos snd_pcm videodev firewire_core input_polldev rtc_core video output snd_timer v4l1_compat shpchp battery rtc_lib ac appletouch pcspkr snd thermal button processor ohci1394 pci_hotplug intel_agp snd_page_alloc iTCO_wdt i2c_i801 iTCO_vendor_support i2c_core
Jan 19 02:01:47 (none) kernel: Pid: 30559, comm: lt-ltfs Tainted: P M 2.6.33-rc4-Gobo #3
Jan 19 02:01:47 (none) kernel: Call Trace:
Jan 19 02:01:47 (none) kernel: [<c0137f28>] warn_slowpath_common+0x6a/0x81
Jan 19 02:01:47 (none) kernel: [<c0400811>] ? list_del+0x41/0x60


For reference, this is the complete stack trace which I got yesterday:

Jan 18 00:58:30 (none) kernel: BUG: unable to handle kernel NULL pointer dereference at 00000006
Jan 18 00:58:30 (none) kernel: IP: [<c019b505>] __rmqueue+0x98/0x36c
Jan 18 00:58:30 (none) kernel: *pdpt = 00000000298e7001 *pde = 0000000000000000
Jan 18 00:58:30 (none) kernel: Oops: 0002 [#1] PREEMPT SMP
Jan 18 00:58:30 (none) kernel: last sysfs file: /System/Kernel/Objects/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0003:00/power_supply/ADP1/online
Jan 18 00:58:30 (none) kernel: Modules linked in: cdc_ether usbnet mii cdc_acm tun kqemu ndiswrapper dvb_usb_dib0700 dib7000p dib0090 dib7000m dib0070 dvb_usb dib8000 dvb_core dib3000mc dibx000_common ipv6 acpi_cpufreq snd_pcm_oss snd_mixer_oss hfsplus fuse joydev snd_hda_codec_realtek applesmc led_class snd_hda_intel uvcvideo input_polldev snd_hda_codec videodev firewire_ohci video firewire_core output snd_hwdep v4l1_compat ac sky2 battery snd_pcm i2c_i801 ohci1394 appletouch button thermal processor snd_timer snd i2c_core intel_agp snd_page_alloc iTCO_wdt iTCO_vendor_support rtc_cmos pcspkr rtc_core rtc_lib shpchp pci_hotplug
Jan 18 00:58:30 (none) kernel:
Jan 18 00:58:30 (none) kernel: Pid: 10381, comm: lt-ltfs Tainted: P 2.6.33-rc4-Gobo #1 Mac-F22788C8/MacBook3,1
Jan 18 00:58:30 (none) kernel: EIP: 0060:[<c019b505>] EFLAGS: 00010086 CPU: 0
Jan 18 00:58:30 (none) kernel: EIP is at __rmqueue+0x98/0x36c
Jan 18 00:58:30 (none) kernel: EAX: 000001b8 EBX: c1b69000 ECX: 0000000a EDX: 00000002
Jan 18 00:58:30 (none) kernel: ESI: c0bb69c0 EDI: c0bb6ccc EBP: f011ec64 ESP: f011ec2c
Jan 18 00:58:30 (none) kernel: DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Jan 18 00:58:30 (none) kernel: Process lt-ltfs (pid: 10381, ti=f011e000 task=f004a610 task.ti=f011e000)
Jan 18 00:58:30 (none) kernel: Stack:
Jan 18 00:58:30 (none) kernel: c01cc35e e9130990 00000000 00000000 00000010 00000000 c0bb6cb8 c0bb6cbc
Jan 18 00:58:30 (none) kernel: <0> 00000002 c1b69018 00000010 c0bb69c0 c1b78ff8 00000000 f011ecbc c019cb28
Jan 18 00:58:30 (none) kernel: <0> 00000000 00000040 00000002 ffffffff 0000001f 00000020 00000000 c0bb7244
Jan 18 00:58:30 (none) kernel: Call Trace:
Jan 18 00:58:30 (none) kernel: [<c01cc35e>] ? inode_get_bytes+0x48/0x54
Jan 18 00:58:31 (none) kernel: [<c019cb28>] ? get_page_from_freelist+0x14c/0x3ea
Jan 18 00:58:31 (none) kernel: [<c019ce8c>] ? __alloc_pages_nodemask+0xc6/0x49a
Jan 18 00:58:31 (none) kernel: [<c01980ac>] ? find_get_page+0x2d/0xaf
Jan 18 00:58:31 (none) kernel: [<c01986af>] ? grab_cache_page_write_begin+0x54/0x8e
Jan 18 00:58:31 (none) kernel: [<c021b54b>] ? reiserfs_write_begin+0x7b/0x1cf
Jan 18 00:58:31 (none) kernel: [<c0197a2d>] ? generic_file_buffered_write+0xd2/0x1d2
Jan 18 00:58:31 (none) kernel: [<c019939d>] ? __generic_file_aio_write+0x39f/0x3e0
Jan 18 00:58:31 (none) kernel: [<c01d9380>] ? wake_up_inode+0x1c/0x1e
Jan 18 00:58:31 (none) kernel: [<c023531d>] ? reiserfs_write_unlock+0x37/0x39
Jan 18 00:58:31 (none) kernel: [<c0851fcf>] ? _raw_spin_unlock+0xd/0x25
Jan 18 00:58:31 (none) kernel: [<c0199442>] ? generic_file_aio_write+0x64/0xab
Jan 18 00:58:31 (none) kernel: [<c01c9179>] ? do_sync_write+0x8e/0xc9
Jan 18 00:58:31 (none) kernel: [<c01d3906>] ? do_filp_open+0x564/0xa44
Jan 18 00:58:31 (none) kernel: [<c021f466>] ? reiserfs_file_write+0x6e/0x77
Jan 18 00:58:31 (none) kernel: [<c01c9b3e>] ? vfs_write+0x99/0x14c
Jan 18 00:58:31 (none) kernel: [<c021f3f8>] ? reiserfs_file_write+0x0/0x77
Jan 18 00:58:31 (none) kernel: [<c01c9cad>] ? sys_write+0x48/0x75
Jan 18 00:58:31 (none) kernel: [<c010345f>] ? sysenter_do_call+0x12/0x28
Jan 18 00:58:31 (none) kernel: Code: 39 5d f0 75 06 41 e9 a0 00 00 00 8b 55 e8 c1 e2 03 89 55 f0 01 c2 8b 94 16 44 01 00 00 89 d3 83 eb 18 89 55 ec 8b 7b 1c 8b 53 18 <89> 7a 04 89 17 c7 43 1c 00 02 20 00 c7 43 18 00 01 10 00 8b 7d


Do you have any suggestions on things that I should try? The last
kernel version that I used which works just fine is 2.6.27.4, which is
a bit old to look for possible regressions.

Thanks,
Lucas
* Re: Oops with 2.6.32-rc6
From: Lucas C. Villa Real @ 2010-02-02 17:04 UTC
To: linux-kernel; +Cc: linux-fsdevel

On Tue, Jan 19, 2010 at 2:50 AM, Lucas C. Villa Real
<lucasvr@gobolinux.org> wrote:
>
> On Thu, Nov 19, 2009 at 1:48 AM, Lucas C. Villa Real
> <lucasvr@gobolinux.org> wrote:
> > Hi,
> >
> > I recently decided to test 2.6.32-rc6 and I noticed that, whenever too
> > much disk activity happens, the system crashes. The error shown in the
> > traces below happened about 3 times in a week.
> >
> > Do you have any suggestions?
> >
> > Thanks,
> > Lucas
> >
>
> I just got a reproduction of the kernel oops with 2.6.33-rc4, whose
> original report can be seen at
> http://bugzilla.kernel.org/show_bug.cgi?id=14656.
>
> I'm seeing this problem while I'm stressing a FUSE file system which
> is sitting on top of ReiserFS 3. However, since some write operations
> in this test-case also operate on the root filesystem I cannot tell if
> FUSE has anything to do with this. Based on the stack trace I would
> say no.
>
> I have one complete message which shows the complete stack trace,
> found below, and another partial one which includes some debugging
> messages from CONFIG_DEBUG_LIST=y. The very line which is causing the
> problem is a list_del() in __rmqueue:
>
> (gdb) list *__rmqueue+0x98
> 0x963 is in __rmqueue (mm/page_alloc.c:730).
> 725                     continue;
> 726
> 727             page = list_entry(area->free_list[migratetype].next,
> 728                                                     struct
> page, lru);
> 729             list_del(&page->lru);
> 730             rmv_page_order(page);
>
> "page" is a valid pointer, but it looks like the members of lru are
> corrupted, as seen in the first trace below:
>
> Jan 19 02:01:46 (none) kernel: ------------[ cut here ]------------
> Jan 19 02:01:47 (none) kernel: WARNING: at lib/list_debug.c:51
> list_del+0x41/0x60()
> Jan 19 02:01:47 (none) kernel: Hardware name: MacBook3,1
> Jan 19 02:01:47 (none) kernel: list_del corruption. next->prev should
> be c1b71018, but was 00005095
> Jan 19 02:01:47 (none) kernel: Modules linked in: tun ipv6
> acpi_cpufreq snd_pcm_oss snd_mixer_oss hfsplus ndiswrapper fuse
> snd_hda_codec_realtek snd_hda_intel
> snd_hda_codec joydev snd_hwdep sky2 applesmc led_class uvcvideo
> firewire_ohci rtc_cmos snd_pcm videodev firewire_core input_polldev
> rtc_core video
> output snd_timer v4l1_compat shpchp battery rtc_lib ac appletouch
> pcspkr snd thermal button processor ohci1394 pci_hotplug intel_agp
> snd_page_alloc iTCO_wdt
> i2c_i801 iTCO_vendor_support i2c_core
> Jan 19 02:01:47 (none) kernel: Pid: 30559, comm: lt-ltfs Tainted: P
> M 2.6.33-rc4-Gobo #3
> Jan 19 02:01:47 (none) kernel: Call Trace:
> Jan 19 02:01:47 (none) kernel: [<c0137f28>] warn_slowpath_common+0x6a/0x81
> Jan 19 02:01:47 (none) kernel: [<c0400811>] ? list_del+0x41/0x60
>
>
> For reference, this is the complete stack trace which I got yesterday:
>
> Jan 18 00:58:30 (none) kernel: BUG: unable to handle kernel NULL
> pointer dereference at 00000006
> Jan 18 00:58:30 (none) kernel: IP: [<c019b505>] __rmqueue+0x98/0x36c
> Jan 18 00:58:30 (none) kernel: *pdpt = 00000000298e7001 *pde = 0000000000000000
> Jan 18 00:58:30 (none) kernel: Oops: 0002 [#1] PREEMPT SMP
> Jan 18 00:58:30 (none) kernel: last sysfs file:
> /System/Kernel/Objects/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0003:00/power_supply/ADP1/online
> Jan 18 00:58:30 (none) kernel: Modules linked in: cdc_ether usbnet mii
> cdc_acm tun kqemu ndiswrapper dvb_usb_dib0700 dib7000p dib0090
> dib7000m dib0070 dvb_usb
> dib8000 dvb_core dib3000mc dibx000_common ipv6 acpi_cpufreq
> snd_pcm_oss snd_mixer_oss hfsplus fuse joydev snd_hda_codec_realtek
> applesmc led_class
> snd_hda_intel uvcvideo input_polldev snd_hda_codec videodev
> firewire_ohci video firewire_core output snd_hwdep v4l1_compat ac sky2
> battery snd_pcm i2c_i801
> ohci1394 appletouch button thermal processor snd_timer snd i2c_core
> intel_agp snd_page_alloc iTCO_wdt iTCO_vendor_support rtc_cmos pcspkr
> rtc_core rtc_lib
> shpchp pci_hotplug
> Jan 18 00:58:30 (none) kernel:
> Jan 18 00:58:30 (none) kernel: Pid: 10381, comm: lt-ltfs Tainted: P
> 2.6.33-rc4-Gobo #1 Mac-F22788C8/MacBook3,1
> Jan 18 00:58:30 (none) kernel: EIP: 0060:[<c019b505>] EFLAGS: 00010086 CPU: 0
> Jan 18 00:58:30 (none) kernel: EIP is at __rmqueue+0x98/0x36c
> Jan 18 00:58:30 (none) kernel: EAX: 000001b8 EBX: c1b69000 ECX:
> 0000000a EDX: 00000002
> Jan 18 00:58:30 (none) kernel: ESI: c0bb69c0 EDI: c0bb6ccc EBP:
> f011ec64 ESP: f011ec2c
> Jan 18 00:58:30 (none) kernel: DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> Jan 18 00:58:30 (none) kernel: Process lt-ltfs (pid: 10381,
> ti=f011e000 task=f004a610 task.ti=f011e000)
> Jan 18 00:58:30 (none) kernel: Stack:
> Jan 18 00:58:30 (none) kernel: c01cc35e e9130990 00000000 00000000
> 00000010 00000000 c0bb6cb8 c0bb6cbc
> Jan 18 00:58:30 (none) kernel: <0> 00000002 c1b69018 00000010 c0bb69c0
> c1b78ff8 00000000 f011ecbc c019cb28
> Jan 18 00:58:30 (none) kernel: <0> 00000000 00000040 00000002 ffffffff
> 0000001f 00000020 00000000 c0bb7244
> Jan 18 00:58:30 (none) kernel: Call Trace:
> Jan 18 00:58:30 (none) kernel: [<c01cc35e>] ? inode_get_bytes+0x48/0x54
> Jan 18 00:58:31 (none) kernel: [<c019cb28>] ?
> get_page_from_freelist+0x14c/0x3ea
> Jan 18 00:58:31 (none) kernel: [<c019ce8c>] ? __alloc_pages_nodemask+0xc6/0x49a
> Jan 18 00:58:31 (none) kernel: [<c01980ac>] ? find_get_page+0x2d/0xaf
> Jan 18 00:58:31 (none) kernel: [<c01986af>] ?
> grab_cache_page_write_begin+0x54/0x8e
> Jan 18 00:58:31 (none) kernel: [<c021b54b>] ? reiserfs_write_begin+0x7b/0x1cf
> Jan 18 00:58:31 (none) kernel: [<c0197a2d>] ?
> generic_file_buffered_write+0xd2/0x1d2
> Jan 18 00:58:31 (none) kernel: [<c019939d>] ?
> __generic_file_aio_write+0x39f/0x3e0
> Jan 18 00:58:31 (none) kernel: [<c01d9380>] ? wake_up_inode+0x1c/0x1e
> Jan 18 00:58:31 (none) kernel: [<c023531d>] ? reiserfs_write_unlock+0x37/0x39
> Jan 18 00:58:31 (none) kernel: [<c0851fcf>] ? _raw_spin_unlock+0xd/0x25
> Jan 18 00:58:31 (none) kernel: [<c0199442>] ? generic_file_aio_write+0x64/0xab
> Jan 18 00:58:31 (none) kernel: [<c01c9179>] ? do_sync_write+0x8e/0xc9
> Jan 18 00:58:31 (none) kernel: [<c01d3906>] ? do_filp_open+0x564/0xa44
> Jan 18 00:58:31 (none) kernel: [<c021f466>] ? reiserfs_file_write+0x6e/0x77
> Jan 18 00:58:31 (none) kernel: [<c01c9b3e>] ? vfs_write+0x99/0x14c
> Jan 18 00:58:31 (none) kernel: [<c021f3f8>] ? reiserfs_file_write+0x0/0x77
> Jan 18 00:58:31 (none) kernel: [<c01c9cad>] ? sys_write+0x48/0x75
> Jan 18 00:58:31 (none) kernel: [<c010345f>] ? sysenter_do_call+0x12/0x28
> Jan 18 00:58:31 (none) kernel: Code: 39 5d f0 75 06 41 e9 a0 00 00 00
> 8b 55 e8 c1 e2 03 89 55 f0 01 c2 8b 94 16 44 01 00 00 89 d3 83 eb 18
> 89 55 ec 8b 7b
> 1c 8b 53 18 <89> 7a 04 89 17 c7 43 1c 00 02 20 00 c7 43 18 00 01 10 00 8b 7d
>
>
> Do you have any suggestions on things that I should try? The last
> kernel version that I used which works just fine is 2.6.27.4, which is
> a bit old to look for possible regressions.

Hi, folks,

I compiled linux-2.6-stable from Git last night and just got a
reproduction of this oops. A few days ago I took a diff from 2.6.27.4,
which was the latest stable version I had installed, to 2.6.33-rc4.
All the significant changes involve locking operations, such as the
removal of the BKL and lock contention fixes.

I'm about to roll back a few of these, starting with the BKL ones, in
an attempt to find the culprit. However, I'd really like to have some
comments from some of you, as I'm not familiar with the ReiserFS code.

The new trace follows below.

Thanks,
Lucas

Feb 2 14:40:32 (none) kernel: ------------[ cut here ]------------
Feb 2 14:40:32 (none) kernel: WARNING: at lib/list_debug.c:51 list_del+0x41/0x60()
Feb 2 14:40:32 (none) kernel: Hardware name: MacBook3,1
Feb 2 14:40:32 (none) kernel: list_del corruption. next->prev should be c1b71018, but was 000056d5
Feb 2 14:40:32 (none) kernel: Modules linked in: ndiswrapper tun fuse ipv6 acpi_cpufreq snd_pcm_oss snd_mixer_oss hfsplus snd_hda_codec_realtek snd_hda_intel joydev sky2 snd_hda_codec uvcvideo applesmc led_class snd_hwdep rtc_cmos videodev video snd_pcm firewire_ohci firewire_core snd_timer input_polldev output v4l1_compat rtc_core battery snd ac shpchp appletouch thermal processor button rtc_lib ohci1394 intel_agp snd_page_alloc pci_hotplug pcspkr iTCO_wdt iTCO_vendor_support i2c_i801 i2c_core [last unloaded: fuse]
Feb 2 14:40:32 (none) kernel: Pid: 24395, comm: lnotes Tainted: P M 2.6.33-rc6-Gobo-00072-gab65832-dirty #1
Feb 2 14:40:32 (none) kernel: Call Trace:
Feb 2 14:40:32 (none) kernel: [<c0137f50>] warn_slowpath_common+0x6a/0x81
Feb 2 14:40:32 (none) kernel: [<c0400581>] ? list_del+0x41/0x60
Feb 2 14:40:32 (none) kernel: [<c0137fa5>] warn_slowpath_fmt+0x29/0x2c
Feb 2 14:40:32 (none) kernel: [<c0400581>] list_del+0x41/0x60
Feb 2 14:40:32 (none) kernel: [<c019c1ca>] __rmqueue+0x9f/0x38f
Feb 2 14:40:32 (none) kernel: [<c019d7a5>] get_page_from_freelist+0x151/0x3ea
Feb 2 14:40:32 (none) kernel: [<c019db04>] __alloc_pages_nodemask+0xc6/0x49a
Feb 2 14:40:32 (none) kernel: [<c01c61ad>] ? mem_cgroup_charge_statistics+0xad/0xc5
Feb 2 14:40:32 (none) kernel: [<c01c638d>] ? __mem_cgroup_commit_charge+0xc1/0xd8
Feb 2 14:40:32 (none) kernel: [<c0852ff1>] ? sub_preempt_count+0x8/0x74
Feb 2 14:40:32 (none) kernel: [<c019ff65>] ? __lru_cache_add+0x71/0x89
Feb 2 14:40:32 (none) kernel: [<c01aa9c7>] ? page_address+0xe/0xb5
Feb 2 14:40:32 (none) kernel: [<c019ffa7>] ? lru_cache_add_lru+0x2a/0x2c
Feb 2 14:40:32 (none) kernel: [<c01ad5dc>] handle_mm_fault+0x1ff/0x897
Feb 2 14:40:32 (none) kernel: [<c01d9249>] ? __d_lookup+0xf1/0x10d
Feb 2 14:40:32 (none) kernel: [<c0852fd3>] do_page_fault+0x350/0x366
Feb 2 14:40:32 (none) kernel: [<c0852c83>] ? do_page_fault+0x0/0x366
Feb 2 14:40:32 (none) kernel: [<c0850d53>] error_code+0x73/0x78
Feb 2 14:40:32 (none) kernel: [<c085007b>] ? _raw_spin_unlock+0x2b/0x2c
Feb 2 14:40:32 (none) kernel: [<c019894d>] ? file_read_actor+0x42/0xc6
Feb 2 14:40:32 (none) kernel: [<c019a5a3>] generic_file_aio_read+0x327/0x50c
Feb 2 14:40:32 (none) kernel: [<c01c9d86>] do_sync_read+0x8e/0xc9
Feb 2 14:40:32 (none) kernel: [<c019ffa7>] ? lru_cache_add_lru+0x2a/0x2c
Feb 2 14:40:32 (none) kernel: [<c011da3b>] ? native_set_pte_at+0xc/0x19
Feb 2 14:40:32 (none) kernel: [<c0852ff1>] ? sub_preempt_count+0x8/0x74
Feb 2 14:40:32 (none) kernel: [<c01c988a>] ? generic_file_llseek_unlocked+0xe/0x84
Feb 2 14:40:32 (none) kernel: [<c084ec93>] ? mutex_unlock+0x8/0x1b
Feb 2 14:40:32 (none) kernel: [<c01c9dd2>] ? rw_verify_area+0x11/0xa7
Feb 2 14:40:32 (none) kernel: [<c01ca8b5>] vfs_read+0x97/0x14a
Feb 2 14:40:32 (none) kernel: [<c01c9cf8>] ? do_sync_read+0x0/0xc9
Feb 2 14:40:33 (none) kernel: [<c01caa24>] sys_read+0x48/0x75
Feb 2 14:40:33 (none) kernel: [<c010345f>] sysenter_do_call+0x12/0x28
Feb 2 14:40:33 (none) kernel: ---[ end trace c8086567704fab22 ]---
Feb 2 14:40:33 (none) kernel: BUG: unable to handle kernel NULL pointer dereference at 00000006