From mboxrd@z Thu Jan 1 00:00:00 1970
From: Shaohua Li
Subject: Re: [BUG] MD/RAID1 hung forever on bitmap_startwrite+0x122
Date: Tue, 29 Nov 2016 16:08:38 -0800
Message-ID: <20161130000838.ahkicokl6ccbrf5x@kernel.org>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To:
Sender: linux-raid-owner@vger.kernel.org
To: Jinpu Wang
Cc: linux-raid@vger.kernel.org, NeilBrown
List-Id: linux-raid.ids

On Mon, Nov 28, 2016 at 09:45:07AM +0100, Jinpu Wang wrote:
> Hi folks,
>
> We hit another hung task with our test machines, this time in
> bitmap_startwrite.
>
> We build MD/RAID1 over two block devices exported via IB, with
> bitmap=internal. The KVM guests run on top of the RAID1 on compute
> nodes, while the disks live on remote storage nodes. When one storage
> node crashed/rebooted, KVM tasks on multiple RAID1 arrays across
> multiple compute nodes ran into this hung task:
>
> [106204.343870] INFO: task kvm:37669 blocked for more than 180 seconds.
> [106204.344138]       Tainted: G IO    4.4.28-1-pserver #1
> [106204.344385] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [106204.344798] kvm             D ffff882037723710     0 37669      1 0x00000000
> [106204.344805]  ffff882037723710 ffff882038f08d00 ffff882029770d00 ffff8820377236d8
> [106204.344809]  ffff8820377236d8 ffff882037724000 0000000000308648 0000000000000008
> [106204.344813]  ffff880f9bd9e8c0 ffff882037723768 ffff882037723728 ffffffff81811c60
> [106204.344818] Call Trace:
> [106204.344831]  [] schedule+0x30/0x80
> [106204.344841]  [] bitmap_startwrite+0x122/0x190 [md_mod]
> [106204.344848]  [] ? bio_clone_bioset+0x11b/0x310
> [106204.344853]  [] ? wait_woken+0x80/0x80
> [106204.344859]  [] 0xffffffffa0cc5127
> [106204.344865]  [] md_set_array_sectors+0xac3/0xe20 [md_mod]
> [106204.344871]  [] ? generic_make_request_checks+0x234/0x4c0
> [106204.344875]  [] blk_prologue_bio+0x91/0xc0
> [106204.344879]  [] generic_make_request+0xfe/0x1e0
> [106204.344883]  [] submit_bio+0x62/0x150
> [106204.344892]  [] do_blockdev_direct_IO+0x2317/0x2ba0
> [106204.344897]  [] ? __remove_hrtimer+0x89/0xa0
> [106204.344903]  [] ? udp_poll+0x1f/0xb0
> [106204.344908]  [] ? sock_poll+0x57/0x120
> [106204.344913]  [] ? I_BDEV+0x10/0x10
> [106204.344918]  [] __blockdev_direct_IO+0x3e/0x40
> [106204.344922]  [] blkdev_direct_IO+0x47/0x50
> [106204.344930]  [] generic_file_direct_write+0xb0/0x170
> [106204.344934]  [] __generic_file_write_iter+0xcd/0x1f0
> [106204.344943]  [] ? kmem_cache_free+0x78/0x190
> [106204.344948]  [] ? bd_unlink_disk_holder+0xf0/0xf0
> [106204.344952]  [] blkdev_write_iter+0x87/0x110
> [106204.344956]  [] ? bd_unlink_disk_holder+0xf0/0xf0
> [106204.344962]  [] aio_run_iocb+0x236/0x2a0
> [106204.344966]  [] ? eventfd_ctx_read+0x53/0x200
> [106204.344973]  [] ? __fget_light+0x1f/0x60
> [106204.344976]  [] ? __fdget+0xe/0x10
> [106204.344980]  [] do_io_submit+0x23a/0x4d0
> [106204.344985]  [] SyS_io_submit+0xb/0x10
> [106204.344989]  [] entry_SYSCALL_64_fastpath+0x12/0x6a
>
> The same report repeats at 106384 with an identical trace.
>
> (gdb) l *bitmap_startwrite+0x122
> 0x121d2 is in bitmap_startwrite (drivers/md/bitmap.c:1396).
>
> 1394                 if (unlikely(COUNTER(*bmc) == COUNTER_MAX)) {
> 1395                         DEFINE_WAIT(__wait);
> 1396                         /* note that it is safe to do the prepare_to_wait
> 1397                          * after the test as long as we do it before dropping
> 1398                          * the spinlock.
> 1399                          */
> 1400                         prepare_to_wait(&bitmap->overflow_wait, &__wait,
> 1401                                         TASK_UNINTERRUPTIBLE);
> 1402                         spin_unlock_irq(&bitmap->counts.lock);
> 1403                         schedule();
> 1404                         finish_wait(&bitmap->overflow_wait, &__wait);
> 1405                         continue;
> 1406                 }
>
> So it seems KVM is waiting on the overflow_wait queue, but somehow
> nobody wakes it up. While the storage node was rebooting, the RAID1
> saw a lot of IO errors, so I guess some part of the error handling is
> broken.
>
> I don't have a reproducer yet; I just want to report this to the
> community in case it is a known bug, or anyone already has a patch :)

Does the kernel report that the raid disk is faulty and gets removed?
Is this a real hang? E.g., maybe we are just waiting for the IO errors
to be reported.
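
For context on where the matching wake-up should come from: the sleeper
above is only woken by the end-of-write path, where bitmap_endwrite()
decrements the per-chunk counter and calls wake_up on
bitmap->overflow_wait when the counter was at COUNTER_MAX. Below is a
minimal userspace sketch of that pairing (pthreads stand in for the
kernel waitqueue and spinlock; the names and the COUNTER_MAX value are
illustrative, this is not the md code itself). It shows why a write
whose completion never runs, e.g. because it is lost on an error path,
would leave every later writer on that chunk asleep forever:

        /* build with: gcc -pthread sketch.c */
        #include <pthread.h>
        #include <stdio.h>

        #define COUNTER_MAX 4  /* tiny on purpose; the real counter is wider */

        struct chunk_counter {
                pthread_mutex_t lock;          /* stands in for bitmap->counts.lock */
                pthread_cond_t  overflow_wait; /* stands in for bitmap->overflow_wait */
                unsigned int    count;         /* writes in flight for this chunk */
        };

        static void start_write(struct chunk_counter *c)
        {
                pthread_mutex_lock(&c->lock);
                /* the userspace analogue of the schedule() loop at bitmap.c:1394-1406 */
                while (c->count == COUNTER_MAX)
                        pthread_cond_wait(&c->overflow_wait, &c->lock);
                c->count++;
                pthread_mutex_unlock(&c->lock);
        }

        static void end_write(struct chunk_counter *c)
        {
                pthread_mutex_lock(&c->lock);
                /*
                 * This is the only wake-up start_write() ever gets.  A bio
                 * whose end-of-write never arrives leaves the counter stuck
                 * at COUNTER_MAX and the waiters asleep forever.
                 */
                if (c->count-- == COUNTER_MAX)
                        pthread_cond_broadcast(&c->overflow_wait);
                pthread_mutex_unlock(&c->lock);
        }

        int main(void)
        {
                struct chunk_counter c = {
                        .lock          = PTHREAD_MUTEX_INITIALIZER,
                        .overflow_wait = PTHREAD_COND_INITIALIZER,
                        .count         = 0,
                };
                int i;

                for (i = 0; i < COUNTER_MAX; i++)
                        start_write(&c); /* fill the chunk counter */
                end_write(&c);           /* drop this line and the next call hangs */
                start_write(&c);         /* proceeds only because of the wake-up above */
                printf("writes in flight: %u\n", c.count);
                return 0;
        }

If the hang turns out to be real, one place to look is whether every
bio that passed through bitmap_startwrite() during the error storm also
reached bitmap_endwrite().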