Hi Hong and Jens, FYI, this patch still has the error that impacts the latest linux-next. git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master commit 2c575026fae6e63771bd2a4c1d407214a8096a89 Author: Hong Zhiguo AuthorDate: Wed Nov 20 10:35:05 2013 -0700 Commit: Jens Axboe CommitDate: Wed Nov 20 15:33:04 2013 -0700 Update of blkg_stat and blkg_rwstat may happen in bh context. While u64_stats_fetch_retry is only preempt_disable on 32bit UP system. This is not enough to avoid preemption by bh and may read strange 64 bit value. Signed-off-by: Hong Zhiguo Acked-by: Tejun Heo Cc: stable@kernel.org Signed-off-by: Jens Axboe +-----------------------------------------------------------------------+------------+------------+ | | 82023bb7f7 | 2c575026fa | +-----------------------------------------------------------------------+------------+------------+ | boot_successes | 496 | 0 | | boot_failures | 494 | 330 | | WARNING:CPU:PID:at_arch/x86/mm/ioremap.c:__early_ioremap() | 493 | 177 | | WARNING:CPU:PID:at_kernel/trace/ring_buffer.c:rb_reserve_next_event() | 493 | 177 | | backtrace:acpi_initialize_tables | 493 | 177 | | backtrace:acpi_table_init | 493 | 177 | | backtrace:acpi_boot_table_init | 493 | 177 | | backtrace:ring_buffer_producer_thread | 493 | 177 | | BUG:unable_to_handle_kernel_NULL_pointer_dereference | 3 | 2 | | Oops | 3 | 2 | | EIP_is_at_strlen | 3 | 2 | | Kernel_panic-not_syncing:Fatal_exception_in_interrupt | 2 | | | Kernel_panic-not_syncing:Fatal_exception | 1 | 2 | | backtrace:vfs_write | 1 | 2 | | backtrace:SyS_write | 1 | 2 | | WARNING:CPU:PID:at_kernel/softirq.c:local_bh_enable() | 0 | 330 | | inconsistent_IN-SOFTIRQ-W-SOFTIRQ-ON-W_usage | 0 | 330 | | backtrace:do_mount | 0 | 330 | | backtrace:SyS_mount | 0 | 330 | | backtrace:smpboot_thread_fn | 0 | 182 | +-----------------------------------------------------------------------+------------+------------+ [ 7.266963] scsi_id (235) used greatest stack depth: 6008 bytes left [ 7.403676] ------------[ cut here ]------------ [ 7.404033] WARNING: CPU: 0 PID: 264 at kernel/softirq.c:156 local_bh_enable+0x9c/0x1e0() [ 7.404033] CPU: 0 PID: 264 Comm: mount Tainted: G W 3.12.0-02795-g2c57502 #16 [ 7.404033] 00000001 511d1a58 420d4200 511d1a88 4109f3dd 426e5c40 00000000 00000108 [ 7.404033] 426e5fb0 0000009c 410a68dc 410a68dc 00000001 4183189d 00000001 511d1a98 [ 7.404033] 4109f4c2 00000009 00000000 511d1aac 410a68dc 51c9f008 51c9f23c 511d1ad8 [ 7.404033] Call Trace: [ 7.404033] [<420d4200>] dump_stack+0x16/0x18 [ 7.404033] [<4109f3dd>] warn_slowpath_common+0x8d/0xb0 [ 7.404033] [<410a68dc>] ? local_bh_enable+0x9c/0x1e0 [ 7.404033] [<410a68dc>] ? local_bh_enable+0x9c/0x1e0 [ 7.404033] [<4183189d>] ? cfqg_stats_update_avg_queue_size+0x2d/0x100 [ 7.404033] [<4109f4c2>] warn_slowpath_null+0x22/0x30 [ 7.404033] [<410a68dc>] local_bh_enable+0x9c/0x1e0 [ 7.404033] [<4183189d>] cfqg_stats_update_avg_queue_size+0x2d/0x100 [ 7.404033] [<41833f4a>] __cfq_set_active_queue+0x15a/0x210 [ 7.404033] [<418300d9>] ? cfq_group_service_tree_add+0x199/0x260 [ 7.404033] [<41831f84>] ? cfq_service_tree_add+0x404/0x4f0 [ 7.404033] [<418320a9>] ? cfq_resort_rr_list+0x39/0x40 [ 7.404033] [<41832fff>] ? cfq_add_cfqq_rr+0x16f/0x1c0 [ 7.404033] [<4183a014>] ? cfq_update_idle_window.isra.78+0x84/0x3a0 [ 7.404033] [<41836b4c>] cfq_select_queue+0x7ec/0xa90 [ 7.404033] [<4183988f>] cfq_dispatch_requests+0x2bf/0x9c0 [ 7.404033] [<410833ec>] ? pvclock_clocksource_read+0xfc/0x240 [ 7.404033] [<410822f3>] ? kvm_clock_read+0x13/0x20 [ 7.404033] [<4183a702>] ? cfq_insert_request+0x3d2/0x8b0 [ 7.404033] [<410e0fc3>] ? sched_clock_local.constprop.2+0x43/0x190 [ 7.404033] [<410f5bd3>] ? __lock_acquire+0x1113/0x11a0 [ 7.404033] [<410f2fcb>] ? trace_hardirqs_off+0xb/0x10 [ 7.404033] [<4180f572>] blk_peek_request+0x302/0x450 [ 7.404033] [<4183fa06>] ? kobject_get+0xd6/0x100 [ 7.404033] [<419f1e08>] scsi_request_fn+0x68/0x8d0 [ 7.404033] [<4180b322>] __blk_run_queue_uncond+0x42/0x50 [ 7.404033] [<4180b370>] __blk_run_queue+0x40/0x50 [ 7.404033] [<418104ed>] blk_queue_bio+0x3dd/0x4e0 [ 7.404033] [<4180d1b2>] generic_make_request+0x102/0x180 [ 7.404033] [<4180d3a7>] submit_bio+0x177/0x250 [ 7.404033] [<412459da>] ? __find_get_block+0x2da/0x390 [ 7.404033] [<4124b46d>] ? bio_alloc_bioset+0xfd/0x340 [ 7.404033] [<4124673c>] _submit_bh+0x2dc/0x320 [ 7.404033] [<4124a595>] __bread+0x85/0x1f0 [ 7.404033] [<41301fb9>] ext3_fill_super+0x1a9/0x1d10 [ 7.404033] [<4184c428>] ? snprintf+0x18/0x20 [ 7.404033] [<41301e10>] ? ext3_setup_super+0x340/0x340 [ 7.404033] [<4120090d>] mount_bdev+0x1ed/0x2c0 [ 7.404033] [<41301e10>] ? ext3_setup_super+0x340/0x340 [ 7.404033] [<411debb0>] ? __kmalloc_track_caller+0xd0/0x2a0 [ 7.404033] [<412fc2bf>] ext3_mount+0x1f/0x30 [ 7.404033] [<41301e10>] ? ext3_setup_super+0x340/0x340 [ 7.404033] [<41200d27>] mount_fs+0x47/0x1f0 [ 7.404033] [<4122a0a5>] ? alloc_vfsmnt+0x175/0x1c0 [ 7.404033] [<4122b633>] vfs_kern_mount+0xa3/0x1a0 [ 7.404033] [<420e8122>] ? _raw_read_unlock+0x22/0x30 [ 7.404033] [<4122e786>] do_mount+0xdb6/0x11c0 [ 7.404033] [<4122d9ac>] ? copy_mount_string+0x5c/0x80 [ 7.404033] [<4122f003>] SyS_mount+0xf3/0x120 [ 7.404033] [<420e8e8c>] syscall_call+0x7/0xb [ 7.404033] ---[ end trace 05e0c07eb1c663a9 ]--- [ 7.404033] [ 7.404033] ================================= [ 7.404033] [ INFO: inconsistent lock state ] [ 7.404033] 3.12.0-02795-g2c57502 #16 Tainted: G W [ 7.404033] --------------------------------- [ 7.404033] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. [ 7.404033] mount/264 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 7.404033] (&(&q->__queue_lock)->rlock){+.?...}, at: [<418104cb>] blk_queue_bio+0x3bb/0x4e0 [ 7.404033] {IN-SOFTIRQ-W} state was registered at: [ 7.404033] [<410f501b>] __lock_acquire+0x55b/0x11a0 [ 7.404033] [<410f63f3>] lock_acquire+0xd3/0x110 [ 7.404033] [<420e7b01>] _raw_spin_lock+0x41/0x90 [ 7.404033] [<419f2fa6>] scsi_device_unbusy+0x76/0xf0 [ 7.404033] [<419ea282>] scsi_finish_command+0x22/0x120 [ 7.404033] [<419f320f>] scsi_softirq_done+0xef/0x170 [ 7.404033] [<4181826b>] blk_done_softirq+0x7b/0x90 [ 7.404033] [<410a6214>] __do_softirq+0x134/0x450 [ 7.404033] [<410a656c>] run_ksoftirqd+0x3c/0x80 [ 7.404033] [<410d89fc>] smpboot_thread_fn+0x1dc/0x250 [ 7.404033] [<410cdec9>] kthread+0xf9/0x100 [ 7.404033] [<420e949b>] ret_from_kernel_thread+0x1b/0x30 [ 7.404033] irq event stamp: 2012 [ 7.404033] hardirqs last enabled at (2009): [<420e7dd7>] _raw_spin_unlock_irq+0x27/0x40 [ 7.404033] hardirqs last disabled at (2010): [<420e7c0a>] _raw_spin_lock_irq+0x1a/0x90 [ 7.404033] softirqs last enabled at (2012): [<4183189d>] cfqg_stats_update_avg_queue_size+0x2d/0x100 [ 7.404033] softirqs last disabled at (2011): [<41831888>] cfqg_stats_update_avg_queue_size+0x18/0x100 [ 7.404033] [ 7.404033] other info that might help us debug this: [ 7.404033] Possible unsafe locking scenario: [ 7.404033] [ 7.404033] CPU0 [ 7.404033] ---- [ 7.404033] lock(&(&q->__queue_lock)->rlock); [ 7.404033] [ 7.404033] lock(&(&q->__queue_lock)->rlock); [ 7.404033] [ 7.404033] *** DEADLOCK *** [ 7.404033] [ 7.404033] 2 locks held by mount/264: [ 7.404033] #0: (&type->s_umount_key#19/1){+.+.+.}, at: [<411ff563>] sget+0x2f3/0x530 [ 7.404033] #1: (&(&q->__queue_lock)->rlock){+.?...}, at: [<418104cb>] blk_queue_bio+0x3bb/0x4e0 [ 7.404033] [ 7.404033] stack backtrace: [ 7.404033] CPU: 0 PID: 264 Comm: mount Tainted: G W 3.12.0-02795-g2c57502 #16 [ 7.404033] 43159a40 511d19d4 420d4200 511d1a10 420b10dc 426e18ee 426e1c9c 00000108 [ 7.404033] 00000000 00000000 00000000 00000000 00000001 00000001 426e1c9c 00000006 [ 7.404033] 51208484 410f16f0 511d1a40 410f26bd 00000006 00000001 0000009c 511d1a44 [ 7.404033] Call Trace: [ 7.404033] [<420d4200>] dump_stack+0x16/0x18 [ 7.404033] [<420b10dc>] print_usage_bug+0x1d1/0x1db [ 7.404033] [<410f16f0>] ? check_usage_forwards+0xe0/0xe0 [ 7.404033] [<410f26bd>] mark_lock+0x18d/0x2e0 [ 7.404033] [<410f2911>] mark_held_locks+0x101/0x110 [ 7.404033] [<4109f33f>] ? print_oops_end_marker+0x2f/0x40 [ 7.404033] [<4109f3ec>] ? warn_slowpath_common+0x9c/0xb0 [ 7.404033] [<410a693b>] ? local_bh_enable+0xfb/0x1e0 [ 7.404033] [<410f2c8c>] trace_hardirqs_on_caller+0x17c/0x220 [ 7.404033] [<4183189d>] ? cfqg_stats_update_avg_queue_size+0x2d/0x100 [ 7.404033] [<410f2d3b>] trace_hardirqs_on+0xb/0x10 [ 7.404033] [<410a693b>] local_bh_enable+0xfb/0x1e0 [ 7.404033] [<4183189d>] cfqg_stats_update_avg_queue_size+0x2d/0x100 [ 7.404033] [<41833f4a>] __cfq_set_active_queue+0x15a/0x210 [ 7.404033] [<418300d9>] ? cfq_group_service_tree_add+0x199/0x260 [ 7.404033] [<41831f84>] ? cfq_service_tree_add+0x404/0x4f0 [ 7.404033] [<418320a9>] ? cfq_resort_rr_list+0x39/0x40 [ 7.404033] [<41832fff>] ? cfq_add_cfqq_rr+0x16f/0x1c0 [ 7.404033] [<4183a014>] ? cfq_update_idle_window.isra.78+0x84/0x3a0 [ 7.404033] [<41836b4c>] cfq_select_queue+0x7ec/0xa90 [ 7.404033] [<4183988f>] cfq_dispatch_requests+0x2bf/0x9c0 [ 7.404033] [<410833ec>] ? pvclock_clocksource_read+0xfc/0x240 [ 7.404033] [<410822f3>] ? kvm_clock_read+0x13/0x20 [ 7.404033] [<4183a702>] ? cfq_insert_request+0x3d2/0x8b0 [ 7.404033] [<410e0fc3>] ? sched_clock_local.constprop.2+0x43/0x190 [ 7.404033] [<410f5bd3>] ? __lock_acquire+0x1113/0x11a0 [ 7.404033] [<410f2fcb>] ? trace_hardirqs_off+0xb/0x10 [ 7.404033] [<4180f572>] blk_peek_request+0x302/0x450 [ 7.404033] [<4183fa06>] ? kobject_get+0xd6/0x100 [ 7.404033] [<419f1e08>] scsi_request_fn+0x68/0x8d0 [ 7.404033] [<4180b322>] __blk_run_queue_uncond+0x42/0x50 [ 7.404033] [<4180b370>] __blk_run_queue+0x40/0x50 [ 7.404033] [<418104ed>] blk_queue_bio+0x3dd/0x4e0 [ 7.404033] [<4180d1b2>] generic_make_request+0x102/0x180 [ 7.404033] [<4180d3a7>] submit_bio+0x177/0x250 [ 7.404033] [<412459da>] ? __find_get_block+0x2da/0x390 [ 7.404033] [<4124b46d>] ? bio_alloc_bioset+0xfd/0x340 [ 7.404033] [<4124673c>] _submit_bh+0x2dc/0x320 [ 7.404033] [<4124a595>] __bread+0x85/0x1f0 [ 7.404033] [<41301fb9>] ext3_fill_super+0x1a9/0x1d10 [ 7.404033] [<4184c428>] ? snprintf+0x18/0x20 [ 7.404033] [<41301e10>] ? ext3_setup_super+0x340/0x340 [ 7.404033] [<4120090d>] mount_bdev+0x1ed/0x2c0 [ 7.404033] [<41301e10>] ? ext3_setup_super+0x340/0x340 [ 7.404033] [<411debb0>] ? __kmalloc_track_caller+0xd0/0x2a0 [ 7.404033] [<412fc2bf>] ext3_mount+0x1f/0x30 [ 7.404033] [<41301e10>] ? ext3_setup_super+0x340/0x340 [ 7.404033] [<41200d27>] mount_fs+0x47/0x1f0 [ 7.404033] [<4122a0a5>] ? alloc_vfsmnt+0x175/0x1c0 [ 7.404033] [<4122b633>] vfs_kern_mount+0xa3/0x1a0 [ 7.404033] [<420e8122>] ? _raw_read_unlock+0x22/0x30 [ 7.404033] [<4122e786>] do_mount+0xdb6/0x11c0 [ 7.404033] [<4122d9ac>] ? copy_mount_string+0x5c/0x80 [ 7.404033] [<4122f003>] SyS_mount+0xf3/0x120 [ 7.404033] [<420e8e8c>] syscall_call+0x7/0xb [ 7.789066] qnx6: unable to read the first superblock [ 7.791067] UDF-fs: warning (device sda): udf_fill_super: No partition found (2) [ 7.795064] NILFS: Can't find nilfs on dev sda. [ 7.795738] BeFS(sda): No write support. Marking filesystem read-only [ 7.797064] BeFS(sda): invalid magic header git bisect start v3.13 v3.12 -- git bisect good 3bad8bb5cd3048a67df43ac6b1e2f191f19d9ff0 # 07:29 330+ 207 Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6 git bisect bad dd0508093b79141e0044ca02f0acb6319f69f546 # 07:33 266- 234 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip git bisect bad 3f02ff5c2c69753666787ed125708d283a823ffb # 07:45 103- 70 Merge tag 'tty-3.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty git bisect good 1ab231b274ba51a54acebec23c6aded0f3cdf54e # 07:59 330+ 0 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip git bisect bad 5ee540613db504a10e15fafaf4c08cac96aa1823 # 08:18 53- 1 Merge branch 'for-linus' of git://git.kernel.dk/linux-block git bisect good 53c6de50262a8edd6932bb59a32db7b9d92f8d67 # 08:24 330+ 0 Merge branch 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip git bisect good 59fb2f0e9e30ad99a8bab0ff1efaf8f4a3b7105f # 08:58 330+ 130 Merge tag 'fbdev-fixes-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux git bisect good ef1e4e32d595d3e6c9a6d3d2956f087d5886c5e5 # 09:06 330+ 87 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs git bisect good 29be6345bbaec8502a70c4e2204d5818b48c4e8f # 09:10 330+ 163 Merge tag 'nfs-for-3.13-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs git bisect bad e345d767f6530ec9cb0aabab7ea248072a9c6975 # 09:18 228- 125 Merge branch 'stable/for-jens-3.13-take-two' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip into for-linus git bisect bad c170bbb45febc03ac4d34ba2b8bb55e06104b7e7 # 09:21 0- 20 block: submit_bio_wait() conversions git bisect bad 2c575026fae6e63771bd2a4c1d407214a8096a89 # 09:24 0- 1 Update of blkg_stat and blkg_rwstat may happen in bh context. While u64_stats_fetch_retry is only preempt_disable on 32bit UP system. This is not enough to avoid preemption by bh and may read strange 64 bit value. # first bad commit: [2c575026fae6e63771bd2a4c1d407214a8096a89] Update of blkg_stat and blkg_rwstat may happen in bh context. While u64_stats_fetch_retry is only preempt_disable on 32bit UP system. This is not enough to avoid preemption by bh and may read strange 64 bit value. git bisect good 82023bb7f75b0052f40d3e74169d191c3e4e6286 # 09:32 990+ 494 Merge tag 'pm+acpi-2-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm git bisect bad 62287f766a9198a878852be74b35cc8a979c6b25 # 09:32 0- 11 0day head guard for 'devel-hourly-2014082110' git bisect bad 5317821c08533e5f42f974e4e68e092beaf099b1 # 09:42 43- 40 Merge branch 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata git bisect bad deb9705745a63948e0a147713d39ed2aaaac97d7 # 09:47 19- 16 Add linux-next specific files for 20140822 This script may reproduce the error. ---------------------------------------------------------------------------- #!/bin/bash kernel=$1 initrd=yocto-minimal-i386.cgz wget --no-clobber https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd kvm=( qemu-system-x86_64 -cpu kvm64 -enable-kvm -kernel $kernel -initrd $initrd -m 320 -smp 1 -net nic,vlan=1,model=e1000 -net user,vlan=1 -boot order=nc -no-reboot -watchdog i6300esb -rtc base=localtime -serial stdio -display none -monitor null ) append=( hung_task_panic=1 earlyprintk=ttyS0,115200 debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100 panic=-1 softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 prompt_ramdisk=0 console=ttyS0,115200 console=tty0 vga=normal root=/dev/ram0 rw drbd.minor_count=8 ) "${kvm[@]}" --append "${append[*]}" ---------------------------------------------------------------------------- Thanks, Fengguang