* 3.0.3 kernel BUG at kernel/timer.c:1035
@ 2011-08-24 13:02 Frank van Maarseveen
2011-08-30 9:08 ` Frank van Maarseveen
2011-09-02 8:10 ` Andrew Morton
0 siblings, 2 replies; 11+ messages in thread
From: Frank van Maarseveen @ 2011-08-24 13:02 UTC (permalink / raw)
To: linux-kernel
Got several of these (logged via netconsole):
kernel BUG at kernel/timer.c:1035!
invalid opcode: 0000 [#1]
PREEMPT
SMP
Modules linked in:
[last unloaded: scsi_wait_scan]
Pid: 0, comm: swapper Not tainted 3.0.3-x263 #1
Dell Inc. OptiPlex GX620
/0F8098
EIP: 0060:[<c107adfe>] EFLAGS: 00010812 CPU: 0
EIP is at cascade+0x6e/0x70
EAX: 6b6b6b6a EBX: c1bbb480 ECX: c1ac2d50 EDX: f541335c
ESI: c1ac2d50 EDI: f600bf60 EBP: f600bf74 ESP: f600bf5c
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process swapper (pid: 0, ti=f600a000 task=c1aba320 task.ti=c1a94000)
Stack:
00000034
f541335c
c1ac2d50
c1bbb480
00000000
f600bfac
f600bfc0
c107af48
00000004
00000000
f600bfb8
c1069265
00000000
f600bfa8
c1bbc29c
c1bbc09c
c1bbbe9c
c106a28b
00000100
c1bbbc9c
c106a28b
00000100
00000041
c1a99a84
Call Trace:
[<c107af48>] run_timer_softirq+0x148/0x1e0
[<c1069265>] ? rebalance_domains+0x135/0x160
[<c106a28b>] ? get_parent_ip+0xb/0x40
[<c106a28b>] ? get_parent_ip+0xb/0x40
[<c1075098>] __do_softirq+0x78/0x100
[<c1075020>] ? local_bh_enable+0xa0/0xa0
<IRQ>
[<c10753ad>] ? irq_exit+0x5d/0x70
[<c104df53>] ? smp_apic_timer_interrupt+0x53/0x90
[<c178fa22>] ? apic_timer_interrupt+0x2a/0x30
[<c103d3ed>] ? mwait_idle+0x4d/0x80
[<c1034b0a>] ? cpu_idle+0x3a/0x80
[<c176dc3b>] ? rest_init+0x7b/0x80
[<c1b1471b>] ? start_kernel+0x2e2/0x2e8
[<c1b141c1>] ? loglevel+0x1a/0x1a
[<c1b140b3>] ? i386_start_kernel+0xb3/0xbb
Got one stack trace on 64 bit:
kernel BUG at kernel/timer.c:1035!
invalid opcode: 0000 [#1]
PREEMPT
SMP
CPU 1
Modules linked in:
vmthrottle
radeon
[last unloaded: scsi_wait_scan]
Pid: 4312, comm: qemu Not tainted 3.0.3-x263lm #1
Dell Inc. Dell DXP051
/0FJ030
RIP: 0010:[<ffffffff8109438b>]
[<ffffffff8109438b>] cascade+0x9b/0xa0
RSP: 0018:ffff8800dfc83e40 EFLAGS: 00210096
RAX: 6b6b6b6b6b6b6b6a RBX: ffff8800dfc83e40 RCX: ffff8800df0ad080
RDX: ffff8800dfc83e40 RSI: ffff8800daa7c838 RDI: ffff8800df0ac000
RBP: ffff8800dfc83e70 R08: ffff8800dfc8c640 R09: ffff8800dfc90df8
R10: 0000000000000001 R11: ffffffff8189c230 R12: ffff8800df0ac000
R13: ffff8800dfc83e40 R14: 0000000000000005 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffff8800dfc80000(0063) knlGS:00000000f760b770
CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 00000000080582b8 CR3: 00000000d1b44000 CR4: 00000000000026e0
DR0: 0000000000000001 DR1: 0000000000000002 DR2: 0000000000000001
DR3: 000000000000000a DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process qemu (pid: 4312, threadinfo ffff8800c6052000, task ffff8800d1ae9c80)
Stack:
ffff8800daa7c838
ffff8800daa7c838
0000000000000000
ffff8800df0ac000
0000000000000101
ffff8800dfc83eb0
ffff8800dfc83ef0
ffffffff81094653
ffff8800c6053fd8
ffff8800c6053fd8
ffff8800df0adc30
ffff8800df0ad830
Call Trace:
<IRQ>
[<ffffffff81094653>] run_timer_softirq+0x183/0x250
[<ffffffff81058398>] ? lapic_next_event+0x18/0x20
[<ffffffff810b35f7>] ? clockevents_program_event+0x57/0xa0
[<ffffffff8108d9da>] __do_softirq+0x9a/0x150
[<ffffffff8188625c>] call_softirq+0x1c/0x30
[<ffffffff8103ebe5>] do_softirq+0x65/0xa0
[<ffffffff8108d72d>] irq_exit+0x7d/0xa0
[<ffffffff81058c99>] smp_apic_timer_interrupt+0x69/0xa0
[<ffffffff81885d13>] apic_timer_interrupt+0x13/0x20
<EOI>
[<ffffffff810a4b99>] ? add_wait_queue+0x49/0x60
[<ffffffff81884914>] ? _raw_spin_unlock_irqrestore+0x44/0x50
[<ffffffff810a4b99>] ? add_wait_queue+0x49/0x60
[<ffffffff81132f2a>] __pollwait+0x7a/0x100
[<ffffffff8115eb97>] eventfd_poll+0x27/0x70
[<ffffffff81133ce6>] do_select+0x3d6/0x730
[<ffffffff81132eb0>] ? poll_freewait+0xc0/0xc0
[<ffffffff81132fb0>] ? __pollwait+0x100/0x100
last message repeated 5 times
[<ffffffff8108239d>] ? sub_preempt_count+0x9d/0xd0
[<ffffffff81081111>] ? get_parent_ip+0x11/0x50
[<ffffffff8108239d>] ? sub_preempt_count+0x9d/0xd0
[<ffffffff81882d53>] ? __mutex_lock_slowpath+0x2a3/0x350
[<ffffffff811663bc>] compat_core_sys_select+0x1fc/0x280
[<ffffffff81120ce1>] ? do_sync_read+0xd1/0x120
[<ffffffff81081111>] ? get_parent_ip+0x11/0x50
[<ffffffff81043ef6>] ? read_tsc+0x16/0x40
[<ffffffff810ae732>] ? ktime_get_ts+0xb2/0xe0
[<ffffffff811666fa>] compat_sys_select+0x4a/0x120
[<ffffffff810c382b>] ? compat_sys_gettimeofday+0xbb/0xd0
[<ffffffff8188631c>] sysenter_dispatch+0x7/0x32
In all these cases the issue was triggered by unplugging a mounted ext3
USB stick + an automated umount -l -f afterwards by udev using something
like the script below. A few seconds after the unplug+umount the system
crashed with the above traces, followed by a secondary
Kernel panic - not syncing: Fatal exception in interrupt
Unfortunately I'm unable to reproduce the issue right now so there must
be some unknown precondition or it is a race. Script:
--------
#!/bin/sh
#
# /etc/udev/rules.d/99-local.rules:
# SUBSYSTEM=="block", ACTION=="add|remove", RUN+="/usr/local/sbin/plugdev"
media_add()
{
mkdir -p /media/$dev
mount -t "$1" -o "$2" /dev/$dev /media/$dev
}
media_remove()
{
umount -f -l /media/$dev
rmdir /media/* 2>/dev/null
}
dev=`echo $DEVNAME|sed 's/.*\///'`
case "$ID_FS_TYPE.$ACTION.$dev" in
ext[234].add.?*)
media_add $ID_FS_TYPE nodev,nosuid
;;
vfat.add.?*)
media_add vfat umask=0
;;
*.remove.?*)
media_remove
;;
esac
--------
--
Frank
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 3.0.3 kernel BUG at kernel/timer.c:1035
2011-08-24 13:02 3.0.3 kernel BUG at kernel/timer.c:1035 Frank van Maarseveen
@ 2011-08-30 9:08 ` Frank van Maarseveen
2011-09-02 8:10 ` Andrew Morton
1 sibling, 0 replies; 11+ messages in thread
From: Frank van Maarseveen @ 2011-08-30 9:08 UTC (permalink / raw)
To: linux-kernel
Still present in 3.0.4.
--
Frank
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 3.0.3 kernel BUG at kernel/timer.c:1035
2011-08-24 13:02 3.0.3 kernel BUG at kernel/timer.c:1035 Frank van Maarseveen
2011-08-30 9:08 ` Frank van Maarseveen
@ 2011-09-02 8:10 ` Andrew Morton
2011-09-05 12:38 ` Frank van Maarseveen
1 sibling, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2011-09-02 8:10 UTC (permalink / raw)
To: Frank van Maarseveen; +Cc: linux-kernel
On Wed, 24 Aug 2011 15:02:38 +0200 Frank van Maarseveen <frankvm@frankvm.com> wrote:
> Got several of these (logged via netconsole):
>
> kernel BUG at kernel/timer.c:1035!
> invalid opcode: 0000 [#1]
> PREEMPT
> SMP
>
> ...
>
> Call Trace:
> [<c107af48>] run_timer_softirq+0x148/0x1e0
> [<c1069265>] ? rebalance_domains+0x135/0x160
> [<c106a28b>] ? get_parent_ip+0xb/0x40
> [<c106a28b>] ? get_parent_ip+0xb/0x40
> [<c1075098>] __do_softirq+0x78/0x100
> [<c1075020>] ? local_bh_enable+0xa0/0xa0
> <IRQ>
>
> [<c10753ad>] ? irq_exit+0x5d/0x70
> [<c104df53>] ? smp_apic_timer_interrupt+0x53/0x90
> [<c178fa22>] ? apic_timer_interrupt+0x2a/0x30
> [<c103d3ed>] ? mwait_idle+0x4d/0x80
> [<c1034b0a>] ? cpu_idle+0x3a/0x80
> [<c176dc3b>] ? rest_init+0x7b/0x80
> [<c1b1471b>] ? start_kernel+0x2e2/0x2e8
> [<c1b141c1>] ? loglevel+0x1a/0x1a
> [<c1b140b3>] ? i386_start_kernel+0xb3/0xbb
>
>
Could be that a timer was freed while still running.
Please ensure that all kernel debugging options are enabled.
Especially
CONFIG_DEBUG_OBJECTS=y
CONFIG_DEBUG_OBJECTS_SELFTEST=y
CONFIG_DEBUG_OBJECTS_FREE=y
CONFIG_DEBUG_OBJECTS_TIMERS=y
CONFIG_DEBUG_OBJECTS_WORK=y
CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
CONFIG_DEBUG_OBJECTS_PERCPU_COUNTER=y
CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT=1
CONFIG_SLUB_DEBUG_ON=y
CONFIG_DEBUG_OBJECTS_TIMERS might catch this one.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 3.0.3 kernel BUG at kernel/timer.c:1035
2011-09-02 8:10 ` Andrew Morton
@ 2011-09-05 12:38 ` Frank van Maarseveen
[not found] ` <CAF1ivSYui_=tHbxHiy15a9wfiphHcpoY+J1MmAJ=dMQsAfEVLw@mail.gmail.com>
0 siblings, 1 reply; 11+ messages in thread
From: Frank van Maarseveen @ 2011-09-05 12:38 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel
On Fri, Sep 02, 2011 at 01:10:58AM -0700, Andrew Morton wrote:
> On Wed, 24 Aug 2011 15:02:38 +0200 Frank van Maarseveen <frankvm@frankvm.com> wrote:
>
> > Got several of these (logged via netconsole):
> >
> > kernel BUG at kernel/timer.c:1035!
> > invalid opcode: 0000 [#1]
> > PREEMPT
> > SMP
> >
> > ...
> >
> > Call Trace:
> > [<c107af48>] run_timer_softirq+0x148/0x1e0
> > [<c1069265>] ? rebalance_domains+0x135/0x160
> > [<c106a28b>] ? get_parent_ip+0xb/0x40
> > [<c106a28b>] ? get_parent_ip+0xb/0x40
> > [<c1075098>] __do_softirq+0x78/0x100
> > [<c1075020>] ? local_bh_enable+0xa0/0xa0
> > <IRQ>
> >
> > [<c10753ad>] ? irq_exit+0x5d/0x70
> > [<c104df53>] ? smp_apic_timer_interrupt+0x53/0x90
> > [<c178fa22>] ? apic_timer_interrupt+0x2a/0x30
> > [<c103d3ed>] ? mwait_idle+0x4d/0x80
> > [<c1034b0a>] ? cpu_idle+0x3a/0x80
> > [<c176dc3b>] ? rest_init+0x7b/0x80
> > [<c1b1471b>] ? start_kernel+0x2e2/0x2e8
> > [<c1b141c1>] ? loglevel+0x1a/0x1a
> > [<c1b140b3>] ? i386_start_kernel+0xb3/0xbb
> >
> >
>
> Could be that a timer was freed while still running.
>
> Please ensure that all kernel debugging options are enabled.
> Especially
>
> CONFIG_DEBUG_OBJECTS=y
> CONFIG_DEBUG_OBJECTS_SELFTEST=y
> CONFIG_DEBUG_OBJECTS_FREE=y
> CONFIG_DEBUG_OBJECTS_TIMERS=y
> CONFIG_DEBUG_OBJECTS_WORK=y
> CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
> CONFIG_DEBUG_OBJECTS_PERCPU_COUNTER=y
> CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT=1
> CONFIG_SLUB_DEBUG_ON=y
>
> CONFIG_DEBUG_OBJECTS_TIMERS might catch this one.
Got something after enabling the above. Unplugging USB storage
before umount a couple of times produced the following trace:
usb 1-7: USB disconnect, device number 10
Buffer I/O error on device sdb1, logical block 0
lost page write due to I/O error on sdb1
------------[ cut here ]------------
WARNING: at lib/debugobjects.c:262 debug_print_object+0x85/0xa0()
Hardware name: OptiPlex GX620
ODEBUG: free active (active state 0) object type: timer_list hint: wakeup_timer_fn+0x0/0x50
Modules linked in: [last unloaded: scsi_wait_scan]
Pid: 5144, comm: umount Not tainted 3.0.4-y264 #1
Call Trace:
[<c106f59d>] warn_slowpath_common+0x6d/0xa0
[<c12b5d75>] ? debug_print_object+0x85/0xa0
[<c12b5d75>] ? debug_print_object+0x85/0xa0
[<c106f64e>] warn_slowpath_fmt+0x2e/0x30
[<c12b5d75>] debug_print_object+0x85/0xa0
[<c10d4bf0>] ? bdi_init+0x150/0x150
[<c12b5f0a>] __debug_check_no_obj_freed+0xda/0x180
[<c106a2fb>] ? get_parent_ip+0xb/0x40
[<c12b68c5>] debug_check_no_obj_freed+0x15/0x20
[<c10f07ba>] kmem_cache_free+0x9a/0xc0
[<c12a9f18>] ? prop_local_destroy_percpu+0x8/0x10
[<c10d4e3b>] ? bdi_destroy+0xdb/0x110
[<c1299d95>] ? blk_release_queue+0x45/0x50
[<c1299d95>] blk_release_queue+0x45/0x50
[<c12a756a>] kobject_release+0x3a/0x80
[<c12a7530>] ? kobject_del+0x60/0x60
[<c12a891d>] kref_put+0x2d/0x60
[<c12a747d>] kobject_put+0x1d/0x50
[<c1084568>] ? __cancel_work_timer+0x68/0x70
[<c129707d>] blk_put_queue+0xd/0x10
[<c1465953>] scsi_device_dev_release_usercontext+0xe3/0x120
[<c1465870>] ? scsi_device_cls_release+0x10/0x10
[<c108432c>] execute_in_process_context+0x5c/0x70
[<c1465853>] scsi_device_dev_release+0x13/0x20
[<c1440559>] device_release+0x19/0x80
[<c10f06f9>] ? kfree+0xc9/0xd0
[<c12a7575>] ? kobject_release+0x45/0x80
[<c12a756a>] kobject_release+0x3a/0x80
[<c12a7530>] ? kobject_del+0x60/0x60
[<c12a891d>] kref_put+0x2d/0x60
[<c12a747d>] kobject_put+0x1d/0x50
[<c12a747d>] ? kobject_put+0x1d/0x50
[<c14402ff>] put_device+0xf/0x20
[<c145b2f3>] scsi_device_put+0x33/0x50
[<c148375b>] scsi_disk_put+0x2b/0x40
[<c1483daf>] sd_release+0x2f/0x60
[<c11200f0>] __blkdev_put+0x120/0x160
[<c11200cd>] __blkdev_put+0xfd/0x160
[<c1120153>] blkdev_put+0x23/0x110
[<c10f67a0>] kill_block_super+0x40/0x70
[<c10f6c0d>] deactivate_locked_super+0x3d/0x60
[<c10f72a9>] deactivate_super+0x49/0x70
[<c110e1c6>] mntput_no_expire+0x86/0xc0
[<c110f10d>] sys_umount+0x5d/0xb0
[<c179141c>] sysenter_do_call+0x12/0x2c
---[ end trace ad6863f336beb434 ]---
--
Frank
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 3.0.3 kernel BUG at kernel/timer.c:1035
[not found] ` <CAF1ivSYui_=tHbxHiy15a9wfiphHcpoY+J1MmAJ=dMQsAfEVLw@mail.gmail.com>
@ 2011-09-06 14:48 ` Lin Ming
2011-09-07 10:24 ` Frank van Maarseveen
0 siblings, 1 reply; 11+ messages in thread
From: Lin Ming @ 2011-09-06 14:48 UTC (permalink / raw)
To: Frank van Maarseveen; +Cc: Andrew Morton, linux-kernel
On Mon, Sep 5, 2011 at 8:38 PM, Frank van Maarseveen <frankvm@frankvm.com> wrote:
> On Fri, Sep 02, 2011 at 01:10:58AM -0700, Andrew Morton wrote:
>> On Wed, 24 Aug 2011 15:02:38 +0200 Frank van Maarseveen <frankvm@frankvm.com> wrote:
>>
>> > Got several of these (logged via netconsole):
>> >
>> > kernel BUG at kernel/timer.c:1035!
>> > invalid opcode: 0000 [#1]
>> > PREEMPT
>> > SMP
>> >
>> > ...
>> >
>> > Call Trace:
>> > [<c107af48>] run_timer_softirq+0x148/0x1e0
>> > [<c1069265>] ? rebalance_domains+0x135/0x160
>> > [<c106a28b>] ? get_parent_ip+0xb/0x40
>> > [<c106a28b>] ? get_parent_ip+0xb/0x40
>> > [<c1075098>] __do_softirq+0x78/0x100
>> > [<c1075020>] ? local_bh_enable+0xa0/0xa0
>> > <IRQ>
>> >
>> > [<c10753ad>] ? irq_exit+0x5d/0x70
>> > [<c104df53>] ? smp_apic_timer_interrupt+0x53/0x90
>> > [<c178fa22>] ? apic_timer_interrupt+0x2a/0x30
>> > [<c103d3ed>] ? mwait_idle+0x4d/0x80
>> > [<c1034b0a>] ? cpu_idle+0x3a/0x80
>> > [<c176dc3b>] ? rest_init+0x7b/0x80
>> > [<c1b1471b>] ? start_kernel+0x2e2/0x2e8
>> > [<c1b141c1>] ? loglevel+0x1a/0x1a
>> > [<c1b140b3>] ? i386_start_kernel+0xb3/0xbb
>> >
>> >
>>
>> Could be that a timer was freed while still running.
>>
>> Please ensure that all kernel debugging options are enabled.
>> Especially
>>
>> CONFIG_DEBUG_OBJECTS=y
>> CONFIG_DEBUG_OBJECTS_SELFTEST=y
>> CONFIG_DEBUG_OBJECTS_FREE=y
>> CONFIG_DEBUG_OBJECTS_TIMERS=y
>> CONFIG_DEBUG_OBJECTS_WORK=y
>> CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
>> CONFIG_DEBUG_OBJECTS_PERCPU_COUNTER=y
>> CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT=1
>> CONFIG_SLUB_DEBUG_ON=y
>>
>> CONFIG_DEBUG_OBJECTS_TIMERS might catch this one.
>
> Got something after enabling the above. Unplugging USB storage
> before umount a couple of times produced the following trace:
>
> usb 1-7: USB disconnect, device number 10
> Buffer I/O error on device sdb1, logical block 0
> lost page write due to I/O error on sdb1
> ------------[ cut here ]------------
> WARNING: at lib/debugobjects.c:262 debug_print_object+0x85/0xa0()
> Hardware name: OptiPlex GX620
> ODEBUG: free active (active state 0) object type: timer_list hint: wakeup_timer_fn+0x0/0x50
Does below patch help?
>From a98b874437f871d5ecc3f6fe409b2b474b1f2731 Mon Sep 17 00:00:00 2001
From: Lin Ming <ming.m.lin@intel.com>
Date: Tue, 6 Sep 2011 22:45:43 +0800
Subject: [PATCH] block: delete bdi writeback wakup_timer in blk_cleanup_queue()
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
---
block/blk-core.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index 90e1ffd..22529a3 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -363,6 +363,7 @@ void blk_cleanup_queue(struct request_queue *q)
blk_sync_queue(q);
del_timer_sync(&q->backing_dev_info.laptop_mode_wb_timer);
+ del_timer_sync(&q->backing_dev_info.wb.wakeup_timer);
mutex_lock(&q->sysfs_lock);
queue_flag_set_unlocked(QUEUE_FLAG_DEAD, q);
mutex_unlock(&q->sysfs_lock);
--
1.7.2.3
Regards,
Lin Ming
> Modules linked in: [last unloaded: scsi_wait_scan]
> Pid: 5144, comm: umount Not tainted 3.0.4-y264 #1
> Call Trace:
> [<c106f59d>] warn_slowpath_common+0x6d/0xa0
> [<c12b5d75>] ? debug_print_object+0x85/0xa0
> [<c12b5d75>] ? debug_print_object+0x85/0xa0
> [<c106f64e>] warn_slowpath_fmt+0x2e/0x30
> [<c12b5d75>] debug_print_object+0x85/0xa0
> [<c10d4bf0>] ? bdi_init+0x150/0x150
> [<c12b5f0a>] __debug_check_no_obj_freed+0xda/0x180
> [<c106a2fb>] ? get_parent_ip+0xb/0x40
> [<c12b68c5>] debug_check_no_obj_freed+0x15/0x20
> [<c10f07ba>] kmem_cache_free+0x9a/0xc0
> [<c12a9f18>] ? prop_local_destroy_percpu+0x8/0x10
> [<c10d4e3b>] ? bdi_destroy+0xdb/0x110
> [<c1299d95>] ? blk_release_queue+0x45/0x50
> [<c1299d95>] blk_release_queue+0x45/0x50
> [<c12a756a>] kobject_release+0x3a/0x80
> [<c12a7530>] ? kobject_del+0x60/0x60
> [<c12a891d>] kref_put+0x2d/0x60
> [<c12a747d>] kobject_put+0x1d/0x50
> [<c1084568>] ? __cancel_work_timer+0x68/0x70
> [<c129707d>] blk_put_queue+0xd/0x10
> [<c1465953>] scsi_device_dev_release_usercontext+0xe3/0x120
> [<c1465870>] ? scsi_device_cls_release+0x10/0x10
> [<c108432c>] execute_in_process_context+0x5c/0x70
> [<c1465853>] scsi_device_dev_release+0x13/0x20
> [<c1440559>] device_release+0x19/0x80
> [<c10f06f9>] ? kfree+0xc9/0xd0
> [<c12a7575>] ? kobject_release+0x45/0x80
> [<c12a756a>] kobject_release+0x3a/0x80
> [<c12a7530>] ? kobject_del+0x60/0x60
> [<c12a891d>] kref_put+0x2d/0x60
> [<c12a747d>] kobject_put+0x1d/0x50
> [<c12a747d>] ? kobject_put+0x1d/0x50
> [<c14402ff>] put_device+0xf/0x20
> [<c145b2f3>] scsi_device_put+0x33/0x50
> [<c148375b>] scsi_disk_put+0x2b/0x40
> [<c1483daf>] sd_release+0x2f/0x60
> [<c11200f0>] __blkdev_put+0x120/0x160
> [<c11200cd>] __blkdev_put+0xfd/0x160
> [<c1120153>] blkdev_put+0x23/0x110
> [<c10f67a0>] kill_block_super+0x40/0x70
> [<c10f6c0d>] deactivate_locked_super+0x3d/0x60
> [<c10f72a9>] deactivate_super+0x49/0x70
> [<c110e1c6>] mntput_no_expire+0x86/0xc0
> [<c110f10d>] sys_umount+0x5d/0xb0
> [<c179141c>] sysenter_do_call+0x12/0x2c
> ---[ end trace ad6863f336beb434 ]---
>
>
> --
> Frank
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: 3.0.3 kernel BUG at kernel/timer.c:1035
2011-09-06 14:48 ` Lin Ming
@ 2011-09-07 10:24 ` Frank van Maarseveen
2011-09-07 12:36 ` Lin Ming
0 siblings, 1 reply; 11+ messages in thread
From: Frank van Maarseveen @ 2011-09-07 10:24 UTC (permalink / raw)
To: Lin Ming; +Cc: Andrew Morton, linux-kernel
On Tue, Sep 06, 2011 at 10:48:38PM +0800, Lin Ming wrote:
> Does below patch help?
>
> >From a98b874437f871d5ecc3f6fe409b2b474b1f2731 Mon Sep 17 00:00:00 2001
> From: Lin Ming <ming.m.lin@intel.com>
> Date: Tue, 6 Sep 2011 22:45:43 +0800
> Subject: [PATCH] block: delete bdi writeback wakup_timer in blk_cleanup_queue()
>
> Signed-off-by: Lin Ming <ming.m.lin@intel.com>
> ---
> block/blk-core.c | 1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 90e1ffd..22529a3 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -363,6 +363,7 @@ void blk_cleanup_queue(struct request_queue *q)
> blk_sync_queue(q);
>
> del_timer_sync(&q->backing_dev_info.laptop_mode_wb_timer);
> + del_timer_sync(&q->backing_dev_info.wb.wakeup_timer);
> mutex_lock(&q->sysfs_lock);
> queue_flag_set_unlocked(QUEUE_FLAG_DEAD, q);
> mutex_unlock(&q->sysfs_lock);
> --
> 1.7.2.3
>
No, bug still present. Stack trace is the same and I double checked that
it was the new kernel (this time with a lot more debug enabled).
--
Frank
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 3.0.3 kernel BUG at kernel/timer.c:1035
2011-09-07 10:24 ` Frank van Maarseveen
@ 2011-09-07 12:36 ` Lin Ming
2011-09-07 21:30 ` Andrew Morton
0 siblings, 1 reply; 11+ messages in thread
From: Lin Ming @ 2011-09-07 12:36 UTC (permalink / raw)
To: Frank van Maarseveen; +Cc: Andrew Morton, linux-kernel@vger.kernel.org
On Wed, 2011-09-07 at 18:24 +0800, Frank van Maarseveen wrote:
> On Tue, Sep 06, 2011 at 10:48:38PM +0800, Lin Ming wrote:
> > Does below patch help?
> >
> > >From a98b874437f871d5ecc3f6fe409b2b474b1f2731 Mon Sep 17 00:00:00 2001
> > From: Lin Ming <ming.m.lin@intel.com>
> > Date: Tue, 6 Sep 2011 22:45:43 +0800
> > Subject: [PATCH] block: delete bdi writeback wakup_timer in blk_cleanup_queue()
> >
> > Signed-off-by: Lin Ming <ming.m.lin@intel.com>
> > ---
> > block/blk-core.c | 1 +
> > 1 files changed, 1 insertions(+), 0 deletions(-)
> >
> > diff --git a/block/blk-core.c b/block/blk-core.c
> > index 90e1ffd..22529a3 100644
> > --- a/block/blk-core.c
> > +++ b/block/blk-core.c
> > @@ -363,6 +363,7 @@ void blk_cleanup_queue(struct request_queue *q)
> > blk_sync_queue(q);
> >
> > del_timer_sync(&q->backing_dev_info.laptop_mode_wb_timer);
> > + del_timer_sync(&q->backing_dev_info.wb.wakeup_timer);
> > mutex_lock(&q->sysfs_lock);
> > queue_flag_set_unlocked(QUEUE_FLAG_DEAD, q);
> > mutex_unlock(&q->sysfs_lock);
> > --
> > 1.7.2.3
> >
>
> No, bug still present. Stack trace is the same and I double checked that
> it was the new kernel (this time with a lot more debug enabled).
Thanks for test.
I'll try to reproduce this bug.
Lin Ming
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 3.0.3 kernel BUG at kernel/timer.c:1035
2011-09-07 12:36 ` Lin Ming
@ 2011-09-07 21:30 ` Andrew Morton
2011-09-09 14:21 ` Lin Ming
2011-09-20 13:16 ` Frank van Maarseveen
0 siblings, 2 replies; 11+ messages in thread
From: Andrew Morton @ 2011-09-07 21:30 UTC (permalink / raw)
To: Lin Ming; +Cc: Frank van Maarseveen, linux-kernel@vger.kernel.org, Jens Axboe
On Wed, 07 Sep 2011 20:36:19 +0800
Lin Ming <ming.m.lin@intel.com> wrote:
> On Wed, 2011-09-07 at 18:24 +0800, Frank van Maarseveen wrote:
> > On Tue, Sep 06, 2011 at 10:48:38PM +0800, Lin Ming wrote:
> > > Does below patch help?
> > >
> > > >From a98b874437f871d5ecc3f6fe409b2b474b1f2731 Mon Sep 17 00:00:00 2001
> > > From: Lin Ming <ming.m.lin@intel.com>
> > > Date: Tue, 6 Sep 2011 22:45:43 +0800
> > > Subject: [PATCH] block: delete bdi writeback wakup_timer in blk_cleanup_queue()
> > >
> > > Signed-off-by: Lin Ming <ming.m.lin@intel.com>
> > > ---
> > > block/blk-core.c | 1 +
> > > 1 files changed, 1 insertions(+), 0 deletions(-)
> > >
> > > diff --git a/block/blk-core.c b/block/blk-core.c
> > > index 90e1ffd..22529a3 100644
> > > --- a/block/blk-core.c
> > > +++ b/block/blk-core.c
> > > @@ -363,6 +363,7 @@ void blk_cleanup_queue(struct request_queue *q)
> > > blk_sync_queue(q);
> > >
> > > del_timer_sync(&q->backing_dev_info.laptop_mode_wb_timer);
> > > + del_timer_sync(&q->backing_dev_info.wb.wakeup_timer);
> > > mutex_lock(&q->sysfs_lock);
> > > queue_flag_set_unlocked(QUEUE_FLAG_DEAD, q);
> > > mutex_unlock(&q->sysfs_lock);
> > > --
> > > 1.7.2.3
> > >
> >
> > No, bug still present. Stack trace is the same and I double checked that
> > it was the new kernel (this time with a lot more debug enabled).
>
> Thanks for test.
> I'll try to reproduce this bug.
Probably this will "fix" it:
--- a/block/blk-sysfs.c~a
+++ a/block/blk-sysfs.c
@@ -4,6 +4,7 @@
#include <linux/kernel.h>
#include <linux/slab.h>
#include <linux/module.h>
+#include <linux/timer.h>
#include <linux/bio.h>
#include <linux/blkdev.h>
#include <linux/blktrace_api.h>
@@ -486,7 +487,7 @@ static void blk_release_queue(struct kob
__blk_queue_free_tags(q);
blk_trace_shutdown(q);
-
+ del_timer_sync(&q->backing_dev_info.wb.wakeup_timer);
bdi_destroy(&q->backing_dev_info);
kmem_cache_free(blk_requestq_cachep, q);
}
_
Jens, can you please take a look at this regression?
blk_release_queue() is freeing a pending timer.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 3.0.3 kernel BUG at kernel/timer.c:1035
2011-09-07 21:30 ` Andrew Morton
@ 2011-09-09 14:21 ` Lin Ming
2011-09-10 8:41 ` Frank van Maarseveen
2011-09-20 13:16 ` Frank van Maarseveen
1 sibling, 1 reply; 11+ messages in thread
From: Lin Ming @ 2011-09-09 14:21 UTC (permalink / raw)
To: Andrew Morton
Cc: Lin Ming, Frank van Maarseveen, linux-kernel@vger.kernel.org,
Jens Axboe
On Thu, Sep 8, 2011 at 5:30 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Wed, 07 Sep 2011 20:36:19 +0800
> Lin Ming <ming.m.lin@intel.com> wrote:
>
>> On Wed, 2011-09-07 at 18:24 +0800, Frank van Maarseveen wrote:
>> > On Tue, Sep 06, 2011 at 10:48:38PM +0800, Lin Ming wrote:
>> > > Does below patch help?
>> > >
>> > > >From a98b874437f871d5ecc3f6fe409b2b474b1f2731 Mon Sep 17 00:00:00 2001
>> > > From: Lin Ming <ming.m.lin@intel.com>
>> > > Date: Tue, 6 Sep 2011 22:45:43 +0800
>> > > Subject: [PATCH] block: delete bdi writeback wakup_timer in blk_cleanup_queue()
>> > >
>> > > Signed-off-by: Lin Ming <ming.m.lin@intel.com>
>> > > ---
>> > > block/blk-core.c | 1 +
>> > > 1 files changed, 1 insertions(+), 0 deletions(-)
>> > >
>> > > diff --git a/block/blk-core.c b/block/blk-core.c
>> > > index 90e1ffd..22529a3 100644
>> > > --- a/block/blk-core.c
>> > > +++ b/block/blk-core.c
>> > > @@ -363,6 +363,7 @@ void blk_cleanup_queue(struct request_queue *q)
>> > > blk_sync_queue(q);
>> > >
>> > > del_timer_sync(&q->backing_dev_info.laptop_mode_wb_timer);
>> > > + del_timer_sync(&q->backing_dev_info.wb.wakeup_timer);
>> > > mutex_lock(&q->sysfs_lock);
>> > > queue_flag_set_unlocked(QUEUE_FLAG_DEAD, q);
>> > > mutex_unlock(&q->sysfs_lock);
>> > > --
>> > > 1.7.2.3
>> > >
>> >
>> > No, bug still present. Stack trace is the same and I double checked that
>> > it was the new kernel (this time with a lot more debug enabled).
>>
>> Thanks for test.
>> I'll try to reproduce this bug.
>
> Probably this will "fix" it:
Frank,
Does Andrew's patch help?
Lin Ming
>
> --- a/block/blk-sysfs.c~a
> +++ a/block/blk-sysfs.c
> @@ -4,6 +4,7 @@
> #include <linux/kernel.h>
> #include <linux/slab.h>
> #include <linux/module.h>
> +#include <linux/timer.h>
> #include <linux/bio.h>
> #include <linux/blkdev.h>
> #include <linux/blktrace_api.h>
> @@ -486,7 +487,7 @@ static void blk_release_queue(struct kob
> __blk_queue_free_tags(q);
>
> blk_trace_shutdown(q);
> -
> + del_timer_sync(&q->backing_dev_info.wb.wakeup_timer);
> bdi_destroy(&q->backing_dev_info);
> kmem_cache_free(blk_requestq_cachep, q);
> }
> _
>
> Jens, can you please take a look at this regression?
> blk_release_queue() is freeing a pending timer.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 3.0.3 kernel BUG at kernel/timer.c:1035
2011-09-09 14:21 ` Lin Ming
@ 2011-09-10 8:41 ` Frank van Maarseveen
0 siblings, 0 replies; 11+ messages in thread
From: Frank van Maarseveen @ 2011-09-10 8:41 UTC (permalink / raw)
To: Lin Ming
Cc: Andrew Morton, Lin Ming, linux-kernel@vger.kernel.org, Jens Axboe
On Fri, Sep 09, 2011 at 10:21:36PM +0800, Lin Ming wrote:
> On Thu, Sep 8, 2011 at 5:30 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
(...)
> >
> > Probably this will "fix" it:
>
> Frank,
>
> Does Andrew's patch help?
>
> Lin Ming
>
> >
> > --- a/block/blk-sysfs.c~a
> > +++ a/block/blk-sysfs.c
> > @@ -4,6 +4,7 @@
> > #include <linux/kernel.h>
> > #include <linux/slab.h>
> > #include <linux/module.h>
> > +#include <linux/timer.h>
> > #include <linux/bio.h>
> > #include <linux/blkdev.h>
> > #include <linux/blktrace_api.h>
> > @@ -486,7 +487,7 @@ static void blk_release_queue(struct kob
> > __blk_queue_free_tags(q);
> >
> > blk_trace_shutdown(q);
> > -
> > + del_timer_sync(&q->backing_dev_info.wb.wakeup_timer);
> > bdi_destroy(&q->backing_dev_info);
> > kmem_cache_free(blk_requestq_cachep, q);
> > }
> > _
> >
I have no physical access to the machines showing the issue for a day
or 10 and it appears to be not very reproducable on my own machines:
I got only one trace so far (with a pristine 3.0.4 tree). I try to make
it more reproducable before trying out the above patch. So, it will take
some time before I have conclusive information on this.
--
Frank
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 3.0.3 kernel BUG at kernel/timer.c:1035
2011-09-07 21:30 ` Andrew Morton
2011-09-09 14:21 ` Lin Ming
@ 2011-09-20 13:16 ` Frank van Maarseveen
1 sibling, 0 replies; 11+ messages in thread
From: Frank van Maarseveen @ 2011-09-20 13:16 UTC (permalink / raw)
To: Andrew Morton; +Cc: Lin Ming, linux-kernel@vger.kernel.org, Jens Axboe
On Wed, Sep 07, 2011 at 02:30:06PM -0700, Andrew Morton wrote:
> On Wed, 07 Sep 2011 20:36:19 +0800
> Lin Ming <ming.m.lin@intel.com> wrote:
>
> > On Wed, 2011-09-07 at 18:24 +0800, Frank van Maarseveen wrote:
> > > On Tue, Sep 06, 2011 at 10:48:38PM +0800, Lin Ming wrote:
> > > > Does below patch help?
> > > >
> > > > >From a98b874437f871d5ecc3f6fe409b2b474b1f2731 Mon Sep 17 00:00:00 2001
> > > > From: Lin Ming <ming.m.lin@intel.com>
> > > > Date: Tue, 6 Sep 2011 22:45:43 +0800
> > > > Subject: [PATCH] block: delete bdi writeback wakup_timer in blk_cleanup_queue()
> > > >
> > > > Signed-off-by: Lin Ming <ming.m.lin@intel.com>
> > > > ---
> > > > block/blk-core.c | 1 +
> > > > 1 files changed, 1 insertions(+), 0 deletions(-)
> > > >
> > > > diff --git a/block/blk-core.c b/block/blk-core.c
> > > > index 90e1ffd..22529a3 100644
> > > > --- a/block/blk-core.c
> > > > +++ b/block/blk-core.c
> > > > @@ -363,6 +363,7 @@ void blk_cleanup_queue(struct request_queue *q)
> > > > blk_sync_queue(q);
> > > >
> > > > del_timer_sync(&q->backing_dev_info.laptop_mode_wb_timer);
> > > > + del_timer_sync(&q->backing_dev_info.wb.wakeup_timer);
> > > > mutex_lock(&q->sysfs_lock);
> > > > queue_flag_set_unlocked(QUEUE_FLAG_DEAD, q);
> > > > mutex_unlock(&q->sysfs_lock);
> > > > --
> > > > 1.7.2.3
> > > >
> > >
> > > No, bug still present. Stack trace is the same and I double checked that
> > > it was the new kernel (this time with a lot more debug enabled).
> >
> > Thanks for test.
> > I'll try to reproduce this bug.
>
> Probably this will "fix" it:
>
> --- a/block/blk-sysfs.c~a
> +++ a/block/blk-sysfs.c
> @@ -4,6 +4,7 @@
> #include <linux/kernel.h>
> #include <linux/slab.h>
> #include <linux/module.h>
> +#include <linux/timer.h>
> #include <linux/bio.h>
> #include <linux/blkdev.h>
> #include <linux/blktrace_api.h>
> @@ -486,7 +487,7 @@ static void blk_release_queue(struct kob
> __blk_queue_free_tags(q);
>
> blk_trace_shutdown(q);
> -
> + del_timer_sync(&q->backing_dev_info.wb.wakeup_timer);
> bdi_destroy(&q->backing_dev_info);
> kmem_cache_free(blk_requestq_cachep, q);
> }
> _
>
> Jens, can you please take a look at this regression?
> blk_release_queue() is freeing a pending timer.
Yep, this fixes it. This is the recipe I used for triggering the issue
on 3.0.4 (it can probably be simplified):
- mount an ext[34] formatted USB stick read-write on /mnt, preloaded
with a (64k) file "bar" in subdirectory "foo".
- cat /mnt/foo/bar >/dev/null
- sleep 30 # or more
- unplug USB stick
- issue an "umount -l -f /mnt"
After playing with the 30 second delay parameter (to get more details) the
kernel somehow ended in a state where the bug was no longer reproducable.
A reboot made the recipe work again.
Thanks,
--
Frank
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2011-09-20 13:16 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-08-24 13:02 3.0.3 kernel BUG at kernel/timer.c:1035 Frank van Maarseveen
2011-08-30 9:08 ` Frank van Maarseveen
2011-09-02 8:10 ` Andrew Morton
2011-09-05 12:38 ` Frank van Maarseveen
[not found] ` <CAF1ivSYui_=tHbxHiy15a9wfiphHcpoY+J1MmAJ=dMQsAfEVLw@mail.gmail.com>
2011-09-06 14:48 ` Lin Ming
2011-09-07 10:24 ` Frank van Maarseveen
2011-09-07 12:36 ` Lin Ming
2011-09-07 21:30 ` Andrew Morton
2011-09-09 14:21 ` Lin Ming
2011-09-10 8:41 ` Frank van Maarseveen
2011-09-20 13:16 ` Frank van Maarseveen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox