* 3.0.3 kernel BUG at kernel/timer.c:1035
From: Frank van Maarseveen @ 2011-08-24 13:02 UTC
To: linux-kernel
Got several of these (logged via netconsole):
kernel BUG at kernel/timer.c:1035!
invalid opcode: 0000 [#1]
PREEMPT
SMP
Modules linked in:
[last unloaded: scsi_wait_scan]
Pid: 0, comm: swapper Not tainted 3.0.3-x263 #1
Dell Inc. OptiPlex GX620
/0F8098
EIP: 0060:[<c107adfe>] EFLAGS: 00010812 CPU: 0
EIP is at cascade+0x6e/0x70
EAX: 6b6b6b6a EBX: c1bbb480 ECX: c1ac2d50 EDX: f541335c
ESI: c1ac2d50 EDI: f600bf60 EBP: f600bf74 ESP: f600bf5c
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process swapper (pid: 0, ti=f600a000 task=c1aba320 task.ti=c1a94000)
Stack:
00000034
f541335c
c1ac2d50
c1bbb480
00000000
f600bfac
f600bfc0
c107af48
00000004
00000000
f600bfb8
c1069265
00000000
f600bfa8
c1bbc29c
c1bbc09c
c1bbbe9c
c106a28b
00000100
c1bbbc9c
c106a28b
00000100
00000041
c1a99a84
Call Trace:
[<c107af48>] run_timer_softirq+0x148/0x1e0
[<c1069265>] ? rebalance_domains+0x135/0x160
[<c106a28b>] ? get_parent_ip+0xb/0x40
[<c106a28b>] ? get_parent_ip+0xb/0x40
[<c1075098>] __do_softirq+0x78/0x100
[<c1075020>] ? local_bh_enable+0xa0/0xa0
<IRQ>
[<c10753ad>] ? irq_exit+0x5d/0x70
[<c104df53>] ? smp_apic_timer_interrupt+0x53/0x90
[<c178fa22>] ? apic_timer_interrupt+0x2a/0x30
[<c103d3ed>] ? mwait_idle+0x4d/0x80
[<c1034b0a>] ? cpu_idle+0x3a/0x80
[<c176dc3b>] ? rest_init+0x7b/0x80
[<c1b1471b>] ? start_kernel+0x2e2/0x2e8
[<c1b141c1>] ? loglevel+0x1a/0x1a
[<c1b140b3>] ? i386_start_kernel+0xb3/0xbb
Got one stack trace on 64 bit:
kernel BUG at kernel/timer.c:1035!
invalid opcode: 0000 [#1]
PREEMPT
SMP
CPU 1
Modules linked in:
vmthrottle
radeon
[last unloaded: scsi_wait_scan]
Pid: 4312, comm: qemu Not tainted 3.0.3-x263lm #1
Dell Inc. Dell DXP051
/0FJ030
RIP: 0010:[<ffffffff8109438b>]
[<ffffffff8109438b>] cascade+0x9b/0xa0
RSP: 0018:ffff8800dfc83e40 EFLAGS: 00210096
RAX: 6b6b6b6b6b6b6b6a RBX: ffff8800dfc83e40 RCX: ffff8800df0ad080
RDX: ffff8800dfc83e40 RSI: ffff8800daa7c838 RDI: ffff8800df0ac000
RBP: ffff8800dfc83e70 R08: ffff8800dfc8c640 R09: ffff8800dfc90df8
R10: 0000000000000001 R11: ffffffff8189c230 R12: ffff8800df0ac000
R13: ffff8800dfc83e40 R14: 0000000000000005 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffff8800dfc80000(0063) knlGS:00000000f760b770
CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 00000000080582b8 CR3: 00000000d1b44000 CR4: 00000000000026e0
DR0: 0000000000000001 DR1: 0000000000000002 DR2: 0000000000000001
DR3: 000000000000000a DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process qemu (pid: 4312, threadinfo ffff8800c6052000, task ffff8800d1ae9c80)
Stack:
ffff8800daa7c838
ffff8800daa7c838
0000000000000000
ffff8800df0ac000
0000000000000101
ffff8800dfc83eb0
ffff8800dfc83ef0
ffffffff81094653
ffff8800c6053fd8
ffff8800c6053fd8
ffff8800df0adc30
ffff8800df0ad830
Call Trace:
<IRQ>
[<ffffffff81094653>] run_timer_softirq+0x183/0x250
[<ffffffff81058398>] ? lapic_next_event+0x18/0x20
[<ffffffff810b35f7>] ? clockevents_program_event+0x57/0xa0
[<ffffffff8108d9da>] __do_softirq+0x9a/0x150
[<ffffffff8188625c>] call_softirq+0x1c/0x30
[<ffffffff8103ebe5>] do_softirq+0x65/0xa0
[<ffffffff8108d72d>] irq_exit+0x7d/0xa0
[<ffffffff81058c99>] smp_apic_timer_interrupt+0x69/0xa0
[<ffffffff81885d13>] apic_timer_interrupt+0x13/0x20
<EOI>
[<ffffffff810a4b99>] ? add_wait_queue+0x49/0x60
[<ffffffff81884914>] ? _raw_spin_unlock_irqrestore+0x44/0x50
[<ffffffff810a4b99>] ? add_wait_queue+0x49/0x60
[<ffffffff81132f2a>] __pollwait+0x7a/0x100
[<ffffffff8115eb97>] eventfd_poll+0x27/0x70
[<ffffffff81133ce6>] do_select+0x3d6/0x730
[<ffffffff81132eb0>] ? poll_freewait+0xc0/0xc0
[<ffffffff81132fb0>] ? __pollwait+0x100/0x100
last message repeated 5 times
[<ffffffff8108239d>] ? sub_preempt_count+0x9d/0xd0
[<ffffffff81081111>] ? get_parent_ip+0x11/0x50
[<ffffffff8108239d>] ? sub_preempt_count+0x9d/0xd0
[<ffffffff81882d53>] ? __mutex_lock_slowpath+0x2a3/0x350
[<ffffffff811663bc>] compat_core_sys_select+0x1fc/0x280
[<ffffffff81120ce1>] ? do_sync_read+0xd1/0x120
[<ffffffff81081111>] ? get_parent_ip+0x11/0x50
[<ffffffff81043ef6>] ? read_tsc+0x16/0x40
[<ffffffff810ae732>] ? ktime_get_ts+0xb2/0xe0
[<ffffffff811666fa>] compat_sys_select+0x4a/0x120
[<ffffffff810c382b>] ? compat_sys_gettimeofday+0xbb/0xd0
[<ffffffff8188631c>] sysenter_dispatch+0x7/0x32
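A side note on the register dumps: EAX (6b6b6b6a) and RAX (6b6b6b6b6b6b6b6a) both look like the slab free-poison byte 0x6b (POISON_FREE) repeated, with bit 0 cleared. That would be consistent with a timer whose memory had already been freed, assuming the check at kernel/timer.c:1035 compares the timer's base pointer after masking off a low flag bit (an assumption about the 3.0.x source, not verified here). A minimal sketch of the arithmetic; clear_low_bit is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: clear bit 0 of a value and print it in hex.
# This mirrors the masking that the timer code is ASSUMED to apply to
# timer->base before comparing it (assumption, not checked against 3.0.x).
clear_low_bit()
{
	printf '%x\n' $(( $1 & ~1 ))
}

# 0x6b is the poison byte written over freed slab objects.
clear_low_bit 0x6b6b6b6b            # 32-bit pattern, compare with EAX
clear_low_bit 0x6b6b6b6b6b6b6b6b    # 64-bit pattern, compare with RAX
```

If the printed values match the registers, the pointer that was read most likely came from freed, poisoned memory rather than from a live timer.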
In all these cases the issue was triggered by unplugging a mounted ext3
USB stick + an automated umount -l -f afterwards by udev using something
like the script below. A few seconds after the unplug+umount the system
crashed with the above traces, followed by a secondary
Kernel panic - not syncing: Fatal exception in interrupt
Unfortunately I'm unable to reproduce the issue right now so there must
be some unknown precondition or it is a race. Script:
--------
#!/bin/sh
#
# /etc/udev/rules.d/99-local.rules:
# SUBSYSTEM=="block", ACTION=="add|remove", RUN+="/usr/local/sbin/plugdev"
media_add()
{
	mkdir -p /media/$dev
	mount -t "$1" -o "$2" /dev/$dev /media/$dev
}
media_remove()
{
	umount -f -l /media/$dev
	rmdir /media/* 2>/dev/null
}
dev=`echo $DEVNAME|sed 's/.*\///'`
case "$ID_FS_TYPE.$ACTION.$dev" in
ext[234].add.?*)
	media_add $ID_FS_TYPE nodev,nosuid
	;;
vfat.add.?*)
	media_add vfat umask=0
	;;
*.remove.?*)
	media_remove
	;;
esac
--------
--
Frank
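The case statement in the script above dispatches on a key of the form "fstype.action.device". To see which branch a given udev event would take without mounting anything, something like the following could be used. It is only a sketch: plugdev_action is a hypothetical helper, and it prints a description instead of calling mount or umount.

```shell
#!/bin/sh
# Hypothetical dry-run helper mirroring the script's case statement.
# $1 = ID_FS_TYPE, $2 = ACTION, $3 = DEVNAME
plugdev_action()
{
	dev=${3##*/}	# strip leading directories, like the sed call
	case "$1.$2.$dev" in
	ext[234].add.?*)
		echo "mount $1 /dev/$dev nodev,nosuid"
		;;
	vfat.add.?*)
		echo "mount vfat /dev/$dev umask=0"
		;;
	*.remove.?*)
		echo "umount /media/$dev"
		;;
	*)
		echo "ignore"
		;;
	esac
}
```

For example, `plugdev_action ext3 add /dev/sdb1` selects the ext[234] branch. The remove branch matches even when the filesystem type is empty, so a remove event always reaches the lazy forced umount.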
* Re: 3.0.3 kernel BUG at kernel/timer.c:1035
From: Frank van Maarseveen @ 2011-08-30 9:08 UTC
To: linux-kernel
Still present in 3.0.4.
--
Frank
* Re: 3.0.3 kernel BUG at kernel/timer.c:1035
From: Andrew Morton @ 2011-09-02 8:10 UTC
To: Frank van Maarseveen; +Cc: linux-kernel
On Wed, 24 Aug 2011 15:02:38 +0200 Frank van Maarseveen <frankvm@frankvm.com> wrote:
> Got several of these (logged via netconsole):
>
> kernel BUG at kernel/timer.c:1035!
> invalid opcode: 0000 [#1]
> PREEMPT
> SMP
>
> ...
>
> Call Trace:
> [<c107af48>] run_timer_softirq+0x148/0x1e0
> [<c1069265>] ? rebalance_domains+0x135/0x160
> [<c106a28b>] ? get_parent_ip+0xb/0x40
> [<c106a28b>] ? get_parent_ip+0xb/0x40
> [<c1075098>] __do_softirq+0x78/0x100
> [<c1075020>] ? local_bh_enable+0xa0/0xa0
> <IRQ>
>
> [<c10753ad>] ? irq_exit+0x5d/0x70
> [<c104df53>] ? smp_apic_timer_interrupt+0x53/0x90
> [<c178fa22>] ? apic_timer_interrupt+0x2a/0x30
> [<c103d3ed>] ? mwait_idle+0x4d/0x80
> [<c1034b0a>] ? cpu_idle+0x3a/0x80
> [<c176dc3b>] ? rest_init+0x7b/0x80
> [<c1b1471b>] ? start_kernel+0x2e2/0x2e8
> [<c1b141c1>] ? loglevel+0x1a/0x1a
> [<c1b140b3>] ? i386_start_kernel+0xb3/0xbb
>
>
Could be that a timer was freed while still running.
Please ensure that all kernel debugging options are enabled.
Especially
CONFIG_DEBUG_OBJECTS=y
CONFIG_DEBUG_OBJECTS_SELFTEST=y
CONFIG_DEBUG_OBJECTS_FREE=y
CONFIG_DEBUG_OBJECTS_TIMERS=y
CONFIG_DEBUG_OBJECTS_WORK=y
CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
CONFIG_DEBUG_OBJECTS_PERCPU_COUNTER=y
CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT=1
CONFIG_SLUB_DEBUG_ON=y
CONFIG_DEBUG_OBJECTS_TIMERS might catch this one.
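One way to check a build against the options listed above is to grep the kernel config for them. A small sketch, assuming the config is available as a plain text file; the helper name missing_debug_options is hypothetical and the path you pass in is up to you:

```shell
#!/bin/sh
# Print every suggested debug option that is NOT set in the given
# kernel config file. $1 = path to a plain-text .config.
missing_debug_options()
{
	config=$1
	for opt in \
		CONFIG_DEBUG_OBJECTS=y \
		CONFIG_DEBUG_OBJECTS_SELFTEST=y \
		CONFIG_DEBUG_OBJECTS_FREE=y \
		CONFIG_DEBUG_OBJECTS_TIMERS=y \
		CONFIG_DEBUG_OBJECTS_WORK=y \
		CONFIG_DEBUG_OBJECTS_RCU_HEAD=y \
		CONFIG_DEBUG_OBJECTS_PERCPU_COUNTER=y \
		CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT=1 \
		CONFIG_SLUB_DEBUG_ON=y
	do
		# -q quiet, -x whole-line match, -F fixed string
		grep -qxF "$opt" "$config" || echo "$opt"
	done
}
```

Running it as `missing_debug_options .config` in the build tree prints nothing when all nine options are set as suggested.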
* Re: 3.0.3 kernel BUG at kernel/timer.c:1035
From: Frank van Maarseveen @ 2011-09-05 12:38 UTC
To: Andrew Morton; +Cc: linux-kernel
On Fri, Sep 02, 2011 at 01:10:58AM -0700, Andrew Morton wrote:
> On Wed, 24 Aug 2011 15:02:38 +0200 Frank van Maarseveen <frankvm@frankvm.com> wrote:
>
> > Got several of these (logged via netconsole):
> >
> > kernel BUG at kernel/timer.c:1035!
> > invalid opcode: 0000 [#1]
> > PREEMPT
> > SMP
> >
> > ...
> >
> > Call Trace:
> > [<c107af48>] run_timer_softirq+0x148/0x1e0
> > [<c1069265>] ? rebalance_domains+0x135/0x160
> > [<c106a28b>] ? get_parent_ip+0xb/0x40
> > [<c106a28b>] ? get_parent_ip+0xb/0x40
> > [<c1075098>] __do_softirq+0x78/0x100
> > [<c1075020>] ? local_bh_enable+0xa0/0xa0
> > <IRQ>
> >
> > [<c10753ad>] ? irq_exit+0x5d/0x70
> > [<c104df53>] ? smp_apic_timer_interrupt+0x53/0x90
> > [<c178fa22>] ? apic_timer_interrupt+0x2a/0x30
> > [<c103d3ed>] ? mwait_idle+0x4d/0x80
> > [<c1034b0a>] ? cpu_idle+0x3a/0x80
> > [<c176dc3b>] ? rest_init+0x7b/0x80
> > [<c1b1471b>] ? start_kernel+0x2e2/0x2e8
> > [<c1b141c1>] ? loglevel+0x1a/0x1a
> > [<c1b140b3>] ? i386_start_kernel+0xb3/0xbb
> >
> >
>
> Could be that a timer was freed while still running.
>
> Please ensure that all kernel debugging options are enabled.
> Especially
>
> CONFIG_DEBUG_OBJECTS=y
> CONFIG_DEBUG_OBJECTS_SELFTEST=y
> CONFIG_DEBUG_OBJECTS_FREE=y
> CONFIG_DEBUG_OBJECTS_TIMERS=y
> CONFIG_DEBUG_OBJECTS_WORK=y
> CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
> CONFIG_DEBUG_OBJECTS_PERCPU_COUNTER=y
> CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT=1
> CONFIG_SLUB_DEBUG_ON=y
>
> CONFIG_DEBUG_OBJECTS_TIMERS might catch this one.

Got something after enabling the above. Unplugging USB storage
before umount a couple of times produced the following trace:

usb 1-7: USB disconnect, device number 10
Buffer I/O error on device sdb1, logical block 0
lost page write due to I/O error on sdb1
------------[ cut here ]------------
WARNING: at lib/debugobjects.c:262 debug_print_object+0x85/0xa0()
Hardware name: OptiPlex GX620
ODEBUG: free active (active state 0) object type: timer_list hint: wakeup_timer_fn+0x0/0x50
Modules linked in: [last unloaded: scsi_wait_scan]
Pid: 5144, comm: umount Not tainted 3.0.4-y264 #1
Call Trace:
[<c106f59d>] warn_slowpath_common+0x6d/0xa0
[<c12b5d75>] ? debug_print_object+0x85/0xa0
[<c12b5d75>] ? debug_print_object+0x85/0xa0
[<c106f64e>] warn_slowpath_fmt+0x2e/0x30
[<c12b5d75>] debug_print_object+0x85/0xa0
[<c10d4bf0>] ? bdi_init+0x150/0x150
[<c12b5f0a>] __debug_check_no_obj_freed+0xda/0x180
[<c106a2fb>] ? get_parent_ip+0xb/0x40
[<c12b68c5>] debug_check_no_obj_freed+0x15/0x20
[<c10f07ba>] kmem_cache_free+0x9a/0xc0
[<c12a9f18>] ? prop_local_destroy_percpu+0x8/0x10
[<c10d4e3b>] ? bdi_destroy+0xdb/0x110
[<c1299d95>] ? blk_release_queue+0x45/0x50
[<c1299d95>] blk_release_queue+0x45/0x50
[<c12a756a>] kobject_release+0x3a/0x80
[<c12a7530>] ? kobject_del+0x60/0x60
[<c12a891d>] kref_put+0x2d/0x60
[<c12a747d>] kobject_put+0x1d/0x50
[<c1084568>] ? __cancel_work_timer+0x68/0x70
[<c129707d>] blk_put_queue+0xd/0x10
[<c1465953>] scsi_device_dev_release_usercontext+0xe3/0x120
[<c1465870>] ? scsi_device_cls_release+0x10/0x10
[<c108432c>] execute_in_process_context+0x5c/0x70
[<c1465853>] scsi_device_dev_release+0x13/0x20
[<c1440559>] device_release+0x19/0x80
[<c10f06f9>] ? kfree+0xc9/0xd0
[<c12a7575>] ? kobject_release+0x45/0x80
[<c12a756a>] kobject_release+0x3a/0x80
[<c12a7530>] ? kobject_del+0x60/0x60
[<c12a891d>] kref_put+0x2d/0x60
[<c12a747d>] kobject_put+0x1d/0x50
[<c12a747d>] ? kobject_put+0x1d/0x50
[<c14402ff>] put_device+0xf/0x20
[<c145b2f3>] scsi_device_put+0x33/0x50
[<c148375b>] scsi_disk_put+0x2b/0x40
[<c1483daf>] sd_release+0x2f/0x60
[<c11200f0>] __blkdev_put+0x120/0x160
[<c11200cd>] __blkdev_put+0xfd/0x160
[<c1120153>] blkdev_put+0x23/0x110
[<c10f67a0>] kill_block_super+0x40/0x70
[<c10f6c0d>] deactivate_locked_super+0x3d/0x60
[<c10f72a9>] deactivate_super+0x49/0x70
[<c110e1c6>] mntput_no_expire+0x86/0xc0
[<c110f10d>] sys_umount+0x5d/0xb0
[<c179141c>] sysenter_do_call+0x12/0x2c
---[ end trace ad6863f336beb434 ]---
--
Frank
* Re: 3.0.3 kernel BUG at kernel/timer.c:1035
From: Lin Ming @ 2011-09-06 14:48 UTC
To: Frank van Maarseveen; +Cc: Andrew Morton, linux-kernel
On Mon, Sep 5, 2011 at 8:38 PM, Frank van Maarseveen <frankvm@frankvm.com> wrote:
> On Fri, Sep 02, 2011 at 01:10:58AM -0700, Andrew Morton wrote:
>> On Wed, 24 Aug 2011 15:02:38 +0200 Frank van Maarseveen <frankvm@frankvm.com> wrote:
>>
>> > Got several of these (logged via netconsole):
>> >
>> > kernel BUG at kernel/timer.c:1035!
>> > invalid opcode: 0000 [#1]
>> > PREEMPT
>> > SMP
>> >
>> > ...
>> >
>> > Call Trace:
>> > [<c107af48>] run_timer_softirq+0x148/0x1e0
>> > [<c1069265>] ? rebalance_domains+0x135/0x160
>> > [<c106a28b>] ? get_parent_ip+0xb/0x40
>> > [<c106a28b>] ? get_parent_ip+0xb/0x40
>> > [<c1075098>] __do_softirq+0x78/0x100
>> > [<c1075020>] ? local_bh_enable+0xa0/0xa0
>> > <IRQ>
>> >
>> > [<c10753ad>] ? irq_exit+0x5d/0x70
>> > [<c104df53>] ? smp_apic_timer_interrupt+0x53/0x90
>> > [<c178fa22>] ? apic_timer_interrupt+0x2a/0x30
>> > [<c103d3ed>] ? mwait_idle+0x4d/0x80
>> > [<c1034b0a>] ? cpu_idle+0x3a/0x80
>> > [<c176dc3b>] ? rest_init+0x7b/0x80
>> > [<c1b1471b>] ? start_kernel+0x2e2/0x2e8
>> > [<c1b141c1>] ? loglevel+0x1a/0x1a
>> > [<c1b140b3>] ? i386_start_kernel+0xb3/0xbb
>> >
>> >
>>
>> Could be that a timer was freed while still running.
>>
>> Please ensure that all kernel debugging options are enabled.
>> Especially
>>
>> CONFIG_DEBUG_OBJECTS=y
>> CONFIG_DEBUG_OBJECTS_SELFTEST=y
>> CONFIG_DEBUG_OBJECTS_FREE=y
>> CONFIG_DEBUG_OBJECTS_TIMERS=y
>> CONFIG_DEBUG_OBJECTS_WORK=y
>> CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
>> CONFIG_DEBUG_OBJECTS_PERCPU_COUNTER=y
>> CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT=1
>> CONFIG_SLUB_DEBUG_ON=y
>>
>> CONFIG_DEBUG_OBJECTS_TIMERS might catch this one.
>
> Got something after enabling the above. Unplugging USB storage
> before umount a couple of times produced the following trace:
>
> usb 1-7: USB disconnect, device number 10
> Buffer I/O error on device sdb1, logical block 0
> lost page write due to I/O error on sdb1
> ------------[ cut here ]------------
> WARNING: at lib/debugobjects.c:262 debug_print_object+0x85/0xa0()
> Hardware name: OptiPlex GX620
> ODEBUG: free active (active state 0) object type: timer_list hint: wakeup_timer_fn+0x0/0x50

Does below patch help?

From a98b874437f871d5ecc3f6fe409b2b474b1f2731 Mon Sep 17 00:00:00 2001
From: Lin Ming <ming.m.lin@intel.com>
Date: Tue, 6 Sep 2011 22:45:43 +0800
Subject: [PATCH] block: delete bdi writeback wakup_timer in blk_cleanup_queue()

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
---
 block/blk-core.c | 1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 90e1ffd..22529a3 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -363,6 +363,7 @@ void blk_cleanup_queue(struct request_queue *q)
 	blk_sync_queue(q);

 	del_timer_sync(&q->backing_dev_info.laptop_mode_wb_timer);
+	del_timer_sync(&q->backing_dev_info.wb.wakeup_timer);
 	mutex_lock(&q->sysfs_lock);
 	queue_flag_set_unlocked(QUEUE_FLAG_DEAD, q);
 	mutex_unlock(&q->sysfs_lock);
--
1.7.2.3

Regards,
Lin Ming

> Modules linked in: [last unloaded: scsi_wait_scan]
> Pid: 5144, comm: umount Not tainted 3.0.4-y264 #1
> Call Trace:
> [<c106f59d>] warn_slowpath_common+0x6d/0xa0
> [<c12b5d75>] ? debug_print_object+0x85/0xa0
> [<c12b5d75>] ? debug_print_object+0x85/0xa0
> [<c106f64e>] warn_slowpath_fmt+0x2e/0x30
> [<c12b5d75>] debug_print_object+0x85/0xa0
> [<c10d4bf0>] ? bdi_init+0x150/0x150
> [<c12b5f0a>] __debug_check_no_obj_freed+0xda/0x180
> [<c106a2fb>] ? get_parent_ip+0xb/0x40
> [<c12b68c5>] debug_check_no_obj_freed+0x15/0x20
> [<c10f07ba>] kmem_cache_free+0x9a/0xc0
> [<c12a9f18>] ? prop_local_destroy_percpu+0x8/0x10
> [<c10d4e3b>] ? bdi_destroy+0xdb/0x110
> [<c1299d95>] ? blk_release_queue+0x45/0x50
> [<c1299d95>] blk_release_queue+0x45/0x50
> [<c12a756a>] kobject_release+0x3a/0x80
> [<c12a7530>] ? kobject_del+0x60/0x60
> [<c12a891d>] kref_put+0x2d/0x60
> [<c12a747d>] kobject_put+0x1d/0x50
> [<c1084568>] ? __cancel_work_timer+0x68/0x70
> [<c129707d>] blk_put_queue+0xd/0x10
> [<c1465953>] scsi_device_dev_release_usercontext+0xe3/0x120
> [<c1465870>] ? scsi_device_cls_release+0x10/0x10
> [<c108432c>] execute_in_process_context+0x5c/0x70
> [<c1465853>] scsi_device_dev_release+0x13/0x20
> [<c1440559>] device_release+0x19/0x80
> [<c10f06f9>] ? kfree+0xc9/0xd0
> [<c12a7575>] ? kobject_release+0x45/0x80
> [<c12a756a>] kobject_release+0x3a/0x80
> [<c12a7530>] ? kobject_del+0x60/0x60
> [<c12a891d>] kref_put+0x2d/0x60
> [<c12a747d>] kobject_put+0x1d/0x50
> [<c12a747d>] ? kobject_put+0x1d/0x50
> [<c14402ff>] put_device+0xf/0x20
> [<c145b2f3>] scsi_device_put+0x33/0x50
> [<c148375b>] scsi_disk_put+0x2b/0x40
> [<c1483daf>] sd_release+0x2f/0x60
> [<c11200f0>] __blkdev_put+0x120/0x160
> [<c11200cd>] __blkdev_put+0xfd/0x160
> [<c1120153>] blkdev_put+0x23/0x110
> [<c10f67a0>] kill_block_super+0x40/0x70
> [<c10f6c0d>] deactivate_locked_super+0x3d/0x60
> [<c10f72a9>] deactivate_super+0x49/0x70
> [<c110e1c6>] mntput_no_expire+0x86/0xc0
> [<c110f10d>] sys_umount+0x5d/0xb0
> [<c179141c>] sysenter_do_call+0x12/0x2c
> ---[ end trace ad6863f336beb434 ]---
>
>
> --
> Frank
* Re: 3.0.3 kernel BUG at kernel/timer.c:1035
From: Frank van Maarseveen @ 2011-09-07 10:24 UTC
To: Lin Ming; +Cc: Andrew Morton, linux-kernel
On Tue, Sep 06, 2011 at 10:48:38PM +0800, Lin Ming wrote:
> Does below patch help?
>
> From a98b874437f871d5ecc3f6fe409b2b474b1f2731 Mon Sep 17 00:00:00 2001
> From: Lin Ming <ming.m.lin@intel.com>
> Date: Tue, 6 Sep 2011 22:45:43 +0800
> Subject: [PATCH] block: delete bdi writeback wakup_timer in blk_cleanup_queue()
>
> Signed-off-by: Lin Ming <ming.m.lin@intel.com>
> ---
>  block/blk-core.c | 1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 90e1ffd..22529a3 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -363,6 +363,7 @@ void blk_cleanup_queue(struct request_queue *q)
>  	blk_sync_queue(q);
>
>  	del_timer_sync(&q->backing_dev_info.laptop_mode_wb_timer);
> +	del_timer_sync(&q->backing_dev_info.wb.wakeup_timer);
>  	mutex_lock(&q->sysfs_lock);
>  	queue_flag_set_unlocked(QUEUE_FLAG_DEAD, q);
>  	mutex_unlock(&q->sysfs_lock);
> --
> 1.7.2.3
>

No, bug still present. Stack trace is the same and I double checked that
it was the new kernel (this time with a lot more debug enabled).
--
Frank
* Re: 3.0.3 kernel BUG at kernel/timer.c:1035
From: Lin Ming @ 2011-09-07 12:36 UTC
To: Frank van Maarseveen; +Cc: Andrew Morton, linux-kernel@vger.kernel.org
On Wed, 2011-09-07 at 18:24 +0800, Frank van Maarseveen wrote:
> On Tue, Sep 06, 2011 at 10:48:38PM +0800, Lin Ming wrote:
> > Does below patch help?
> >
> > From a98b874437f871d5ecc3f6fe409b2b474b1f2731 Mon Sep 17 00:00:00 2001
> > From: Lin Ming <ming.m.lin@intel.com>
> > Date: Tue, 6 Sep 2011 22:45:43 +0800
> > Subject: [PATCH] block: delete bdi writeback wakup_timer in blk_cleanup_queue()
> >
> > Signed-off-by: Lin Ming <ming.m.lin@intel.com>
> > ---
> >  block/blk-core.c | 1 +
> >  1 files changed, 1 insertions(+), 0 deletions(-)
> >
> > diff --git a/block/blk-core.c b/block/blk-core.c
> > index 90e1ffd..22529a3 100644
> > --- a/block/blk-core.c
> > +++ b/block/blk-core.c
> > @@ -363,6 +363,7 @@ void blk_cleanup_queue(struct request_queue *q)
> >  	blk_sync_queue(q);
> >
> >  	del_timer_sync(&q->backing_dev_info.laptop_mode_wb_timer);
> > +	del_timer_sync(&q->backing_dev_info.wb.wakeup_timer);
> >  	mutex_lock(&q->sysfs_lock);
> >  	queue_flag_set_unlocked(QUEUE_FLAG_DEAD, q);
> >  	mutex_unlock(&q->sysfs_lock);
> > --
> > 1.7.2.3
> >
>
> No, bug still present. Stack trace is the same and I double checked that
> it was the new kernel (this time with a lot more debug enabled).

Thanks for test.
I'll try to reproduce this bug.

Lin Ming
* Re: 3.0.3 kernel BUG at kernel/timer.c:1035
From: Andrew Morton @ 2011-09-07 21:30 UTC
To: Lin Ming; +Cc: Frank van Maarseveen, linux-kernel@vger.kernel.org, Jens Axboe
On Wed, 07 Sep 2011 20:36:19 +0800
Lin Ming <ming.m.lin@intel.com> wrote:
> On Wed, 2011-09-07 at 18:24 +0800, Frank van Maarseveen wrote:
> > On Tue, Sep 06, 2011 at 10:48:38PM +0800, Lin Ming wrote:
> > > Does below patch help?
> > >
> > > From a98b874437f871d5ecc3f6fe409b2b474b1f2731 Mon Sep 17 00:00:00 2001
> > > From: Lin Ming <ming.m.lin@intel.com>
> > > Date: Tue, 6 Sep 2011 22:45:43 +0800
> > > Subject: [PATCH] block: delete bdi writeback wakup_timer in blk_cleanup_queue()
> > >
> > > Signed-off-by: Lin Ming <ming.m.lin@intel.com>
> > > ---
> > >  block/blk-core.c | 1 +
> > >  1 files changed, 1 insertions(+), 0 deletions(-)
> > >
> > > diff --git a/block/blk-core.c b/block/blk-core.c
> > > index 90e1ffd..22529a3 100644
> > > --- a/block/blk-core.c
> > > +++ b/block/blk-core.c
> > > @@ -363,6 +363,7 @@ void blk_cleanup_queue(struct request_queue *q)
> > >  	blk_sync_queue(q);
> > >
> > >  	del_timer_sync(&q->backing_dev_info.laptop_mode_wb_timer);
> > > +	del_timer_sync(&q->backing_dev_info.wb.wakeup_timer);
> > >  	mutex_lock(&q->sysfs_lock);
> > >  	queue_flag_set_unlocked(QUEUE_FLAG_DEAD, q);
> > >  	mutex_unlock(&q->sysfs_lock);
> > > --
> > > 1.7.2.3
> > >
> >
> > No, bug still present. Stack trace is the same and I double checked that
> > it was the new kernel (this time with a lot more debug enabled).
>
> Thanks for test.
> I'll try to reproduce this bug.

Probably this will "fix" it:

--- a/block/blk-sysfs.c~a
+++ a/block/blk-sysfs.c
@@ -4,6 +4,7 @@
 #include <linux/kernel.h>
 #include <linux/slab.h>
 #include <linux/module.h>
+#include <linux/timer.h>
 #include <linux/bio.h>
 #include <linux/blkdev.h>
 #include <linux/blktrace_api.h>
@@ -486,7 +487,7 @@ static void blk_release_queue(struct kob
 	__blk_queue_free_tags(q);

 	blk_trace_shutdown(q);
-
+	del_timer_sync(&q->backing_dev_info.wb.wakeup_timer);
 	bdi_destroy(&q->backing_dev_info);
 	kmem_cache_free(blk_requestq_cachep, q);
 }
_

Jens, can you please take a look at this regression?
blk_release_queue() is freeing a pending timer.
* Re: 3.0.3 kernel BUG at kernel/timer.c:1035
From: Lin Ming @ 2011-09-09 14:21 UTC
To: Andrew Morton; +Cc: Lin Ming, Frank van Maarseveen, linux-kernel@vger.kernel.org, Jens Axboe
On Thu, Sep 8, 2011 at 5:30 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Wed, 07 Sep 2011 20:36:19 +0800
> Lin Ming <ming.m.lin@intel.com> wrote:
>
>> On Wed, 2011-09-07 at 18:24 +0800, Frank van Maarseveen wrote:
>> > On Tue, Sep 06, 2011 at 10:48:38PM +0800, Lin Ming wrote:
>> > > Does below patch help?
>> > >
>> > > From a98b874437f871d5ecc3f6fe409b2b474b1f2731 Mon Sep 17 00:00:00 2001
>> > > From: Lin Ming <ming.m.lin@intel.com>
>> > > Date: Tue, 6 Sep 2011 22:45:43 +0800
>> > > Subject: [PATCH] block: delete bdi writeback wakup_timer in blk_cleanup_queue()
>> > >
>> > > Signed-off-by: Lin Ming <ming.m.lin@intel.com>
>> > > ---
>> > >  block/blk-core.c | 1 +
>> > >  1 files changed, 1 insertions(+), 0 deletions(-)
>> > >
>> > > diff --git a/block/blk-core.c b/block/blk-core.c
>> > > index 90e1ffd..22529a3 100644
>> > > --- a/block/blk-core.c
>> > > +++ b/block/blk-core.c
>> > > @@ -363,6 +363,7 @@ void blk_cleanup_queue(struct request_queue *q)
>> > >  	blk_sync_queue(q);
>> > >
>> > >  	del_timer_sync(&q->backing_dev_info.laptop_mode_wb_timer);
>> > > +	del_timer_sync(&q->backing_dev_info.wb.wakeup_timer);
>> > >  	mutex_lock(&q->sysfs_lock);
>> > >  	queue_flag_set_unlocked(QUEUE_FLAG_DEAD, q);
>> > >  	mutex_unlock(&q->sysfs_lock);
>> > > --
>> > > 1.7.2.3
>> > >
>> >
>> > No, bug still present. Stack trace is the same and I double checked that
>> > it was the new kernel (this time with a lot more debug enabled).
>>
>> Thanks for test.
>> I'll try to reproduce this bug.
>
> Probably this will "fix" it:

Frank,

Does Andrew's patch help?

Lin Ming

>
> --- a/block/blk-sysfs.c~a
> +++ a/block/blk-sysfs.c
> @@ -4,6 +4,7 @@
>  #include <linux/kernel.h>
>  #include <linux/slab.h>
>  #include <linux/module.h>
> +#include <linux/timer.h>
>  #include <linux/bio.h>
>  #include <linux/blkdev.h>
>  #include <linux/blktrace_api.h>
> @@ -486,7 +487,7 @@ static void blk_release_queue(struct kob
>  	__blk_queue_free_tags(q);
>
>  	blk_trace_shutdown(q);
> -
> +	del_timer_sync(&q->backing_dev_info.wb.wakeup_timer);
>  	bdi_destroy(&q->backing_dev_info);
>  	kmem_cache_free(blk_requestq_cachep, q);
>  }
> _
>
> Jens, can you please take a look at this regression?
> blk_release_queue() is freeing a pending timer.
* Re: 3.0.3 kernel BUG at kernel/timer.c:1035
From: Frank van Maarseveen @ 2011-09-10 8:41 UTC
To: Lin Ming; +Cc: Andrew Morton, Lin Ming, linux-kernel@vger.kernel.org, Jens Axboe
On Fri, Sep 09, 2011 at 10:21:36PM +0800, Lin Ming wrote:
> On Thu, Sep 8, 2011 at 5:30 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
(...)
> >
> > Probably this will "fix" it:
>
> Frank,
>
> Does Andrew's patch help?
>
> Lin Ming
>
> >
> > --- a/block/blk-sysfs.c~a
> > +++ a/block/blk-sysfs.c
> > @@ -4,6 +4,7 @@
> >  #include <linux/kernel.h>
> >  #include <linux/slab.h>
> >  #include <linux/module.h>
> > +#include <linux/timer.h>
> >  #include <linux/bio.h>
> >  #include <linux/blkdev.h>
> >  #include <linux/blktrace_api.h>
> > @@ -486,7 +487,7 @@ static void blk_release_queue(struct kob
> >  	__blk_queue_free_tags(q);
> >
> >  	blk_trace_shutdown(q);
> > -
> > +	del_timer_sync(&q->backing_dev_info.wb.wakeup_timer);
> >  	bdi_destroy(&q->backing_dev_info);
> >  	kmem_cache_free(blk_requestq_cachep, q);
> >  }
> > _
> >

I have no physical access to the machines showing the issue for the next
10 days or so, and it does not appear to be very reproducible on my own
machines: I got only one trace so far (with a pristine 3.0.4 tree). I will
try to make it more reproducible before trying out the above patch.
So, it will take some time before I have conclusive information on this.
--
Frank
* Re: 3.0.3 kernel BUG at kernel/timer.c:1035
From: Frank van Maarseveen @ 2011-09-20 13:16 UTC
To: Andrew Morton; +Cc: Lin Ming, linux-kernel@vger.kernel.org, Jens Axboe
On Wed, Sep 07, 2011 at 02:30:06PM -0700, Andrew Morton wrote:
> On Wed, 07 Sep 2011 20:36:19 +0800
> Lin Ming <ming.m.lin@intel.com> wrote:
>
> > On Wed, 2011-09-07 at 18:24 +0800, Frank van Maarseveen wrote:
> > > On Tue, Sep 06, 2011 at 10:48:38PM +0800, Lin Ming wrote:
> > > > Does below patch help?
> > > >
> > > > From a98b874437f871d5ecc3f6fe409b2b474b1f2731 Mon Sep 17 00:00:00 2001
> > > > From: Lin Ming <ming.m.lin@intel.com>
> > > > Date: Tue, 6 Sep 2011 22:45:43 +0800
> > > > Subject: [PATCH] block: delete bdi writeback wakup_timer in blk_cleanup_queue()
> > > >
> > > > Signed-off-by: Lin Ming <ming.m.lin@intel.com>
> > > > ---
> > > >  block/blk-core.c | 1 +
> > > >  1 files changed, 1 insertions(+), 0 deletions(-)
> > > >
> > > > diff --git a/block/blk-core.c b/block/blk-core.c
> > > > index 90e1ffd..22529a3 100644
> > > > --- a/block/blk-core.c
> > > > +++ b/block/blk-core.c
> > > > @@ -363,6 +363,7 @@ void blk_cleanup_queue(struct request_queue *q)
> > > >  	blk_sync_queue(q);
> > > >
> > > >  	del_timer_sync(&q->backing_dev_info.laptop_mode_wb_timer);
> > > > +	del_timer_sync(&q->backing_dev_info.wb.wakeup_timer);
> > > >  	mutex_lock(&q->sysfs_lock);
> > > >  	queue_flag_set_unlocked(QUEUE_FLAG_DEAD, q);
> > > >  	mutex_unlock(&q->sysfs_lock);
> > > > --
> > > > 1.7.2.3
> > > >
> > >
> > > No, bug still present. Stack trace is the same and I double checked that
> > > it was the new kernel (this time with a lot more debug enabled).
> >
> > Thanks for test.
> > I'll try to reproduce this bug.
>
> Probably this will "fix" it:
>
> --- a/block/blk-sysfs.c~a
> +++ a/block/blk-sysfs.c
> @@ -4,6 +4,7 @@
>  #include <linux/kernel.h>
>  #include <linux/slab.h>
>  #include <linux/module.h>
> +#include <linux/timer.h>
>  #include <linux/bio.h>
>  #include <linux/blkdev.h>
>  #include <linux/blktrace_api.h>
> @@ -486,7 +487,7 @@ static void blk_release_queue(struct kob
>  	__blk_queue_free_tags(q);
>
>  	blk_trace_shutdown(q);
> -
> +	del_timer_sync(&q->backing_dev_info.wb.wakeup_timer);
>  	bdi_destroy(&q->backing_dev_info);
>  	kmem_cache_free(blk_requestq_cachep, q);
>  }
> _
>
> Jens, can you please take a look at this regression?
> blk_release_queue() is freeing a pending timer.

Yep, this fixes it. This is the recipe I used for triggering the issue
on 3.0.4 (it can probably be simplified):

- mount an ext[34] formatted USB stick read-write on /mnt, preloaded
  with a (64k) file "bar" in subdirectory "foo".
- cat /mnt/foo/bar >/dev/null
- sleep 30 # or more
- unplug USB stick
- issue an "umount -l -f /mnt"

After playing with the 30 second delay parameter (to get more details)
the kernel somehow ended up in a state where the bug was no longer
reproducible. A reboot made the recipe work again.

Thanks,
--
Frank
end of thread, other threads:[~2011-09-20 13:16 UTC | newest]
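The recipe from the last message can be written down as a small script. This is only a sketch: the device name (/dev/sdX1) is a placeholder, the run and reproduce helpers are hypothetical, and the unplug step still has to be done by hand. With DRY_RUN=1 (the default here) the commands are printed to stderr instead of being executed.

```shell
#!/bin/sh
# Sketch of the reproduction recipe. DEV is a placeholder (assumption);
# the stick is expected to hold a 64k file foo/bar on ext3 or ext4.
DEV=${DEV:-/dev/sdX1}
MNT=${MNT:-/mnt}
DRY_RUN=${DRY_RUN:-1}

# Print the command to stderr in dry-run mode, otherwise execute it.
run()
{
	if [ "$DRY_RUN" = 1 ]; then
		echo "+ $*" >&2
	else
		"$@"
	fi
}

reproduce()
{
	run mount -o rw "$DEV" "$MNT"
	run cat "$MNT/foo/bar" >/dev/null
	run sleep 30
	echo "unplug the USB stick now, then press Enter"
	[ "$DRY_RUN" = 1 ] || read -r _
	run umount -l -f "$MNT"
}
```

Calling `reproduce` after sourcing the file prints the steps; setting DRY_RUN=0 and DEV to the real partition runs them as root. On an affected kernel the crash was reported a few seconds after the final umount.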