From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mike Snitzer
Subject: [REGRESSION][BISECTED] virtio-blk serial attribute causes guest to hang [Was: Re: [PATCH UPDATED 4/5] dm: implement REQ_FLUSH/FUA support for request-based dm]
Date: Thu, 9 Sep 2010 11:26:58 -0400
Message-ID: <20100909152658.GA8118@redhat.com>
References: <20100902032246.GA31484@redhat.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="7AUc2qLy4jB3hD7Z"
Return-path:
Content-Disposition: inline
In-Reply-To: <20100902032246.GA31484@redhat.com>
Sender: kvm-owner@vger.kernel.org
To: Tejun Heo
Cc: Mikulas Patocka, dm-devel@redhat.com, Vivek Goyal, ryanh@us.ibm.com, john.cooper@redhat.com, rusty@rustcorp.com.au, hch@infradead.org, kvm@vger.kernel.org
List-Id: dm-devel.ids

--7AUc2qLy4jB3hD7Z
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Wed, Sep 01 2010 at 11:22pm -0400,
Mike Snitzer wrote:

> On Wed, Sep 01 2010 at 2:59pm -0400,
> Mike Snitzer wrote:
>
> > My hope was that the request-based deadlock I'm seeing would disappear
> > if that relaxed ordering patch wasn't applied.  Unfortunately, I still
> > see the hang.
>
> Turns out I can reproduce the hang on a stock 2.6.36-rc3 (without _any_
> FLUSH+FUA patches)!
>
> I'll try to pin-point the root cause but I think my test is somehow
> exposing a bug in my virt setup.

[my virt setup == single kvm guest (RHEL6) with F13 host]

My gut turned out to be correct.  I finally tracked down the regression
point to the following commit (cc'ing appropriate people):

commit a5eb9e4ff18a33e43557d44b205f953b0c1efade
Author: Ryan Harper
Date:   Wed Jun 23 22:19:57 2010 -0500

    virtio_blk: Add 'serial' attribute to virtio-blk devices (v2)

    Create a new attribute for virtio-blk devices that will fetch the
    serial number of the block device.  This attribute can be used by
    udev to create disk/by-id symlinks for devices that don't have a
    UUID (filesystem) associated with them.

    ATA_IDENTIFY strings are special in that they can be up to 20 chars
    long and aren't required to be nul-terminated.  The buffer is also
    zero-padded meaning that if the serial is 19 chars or less that we
    get a nul-terminated string.  When copying this value into a string
    buffer, we must be careful to copy up to the nul (if it present) and
    only 20 if it is longer and not to attempt to nul terminate; this
    isn't needed.

    Changes since v1:
    - Added BUILD_BUG_ON() for PAGE_SIZE check
    - Removed min() since BUILD_BUG_ON() handles the check
    - Replaced serial_sysfs() by copying id directly to buffer

    Signed-off-by: Ryan Harper
    Signed-off-by: john cooper
    Signed-off-by: Rusty Russell

So the first released kernel to have this regression is 2.6.36-rc1.

Some background:
I have been working with Tejun to test the barrier to FLUSH+FUA
conversion patchset.  I crafted the attached script to test the DM
changes that are part of the FLUSH+FUA patchset.

Using this script with:
  while true ; do ./test_dm_discard_mpath_scsi_debug.sh ; done

I can reliably trigger the following hang, always on the 5th iteration
in my testing, IFF commit a5eb9e4ff18a33e43557d44b205f953b0c1efade is
applied:

INFO: task lvcreate:2484 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
lvcreate D 0000000100064871 4960 2484 2350 0x00000080
 ffff88007b87b978 0000000000000046 ffff88007b87b8e8 ffff880000000000
 ffff88007b87bfd8 ffff8800724fa400 00000000001d4040 ffff88007b87bfd8
 00000000001d4040 00000000001d4040 00000000001d4040 00000000001d4040
Call Trace:
 [] io_schedule+0x73/0xb5
 [] get_request_wait+0xf2/0x180
 [] ? autoremove_wake_function+0x0/0x39
 [] __make_request+0x310/0x434
 [] generic_make_request+0x2f1/0x36e
 [] ? cpu_clock+0x43/0x5e
 [] submit_bio+0xde/0xfb
 [] ? trace_hardirqs_on+0xd/0xf
 [] dio_bio_submit+0x7b/0x9c
 [] dio_send_cur_page+0x4a/0xb0
 [] __blockdev_direct_IO_newtrunc+0x7c5/0x97d
 [] blkdev_direct_IO+0x57/0x59
 [] ? blkdev_get_blocks+0x0/0x90
 [] generic_file_aio_read+0xed/0x5b4
 [] ? might_fault+0x5c/0xac
 [] ? pvclock_clocksource_read+0x50/0xb9
 [] do_sync_read+0xcb/0x108
 [] ? __mutex_unlock_slowpath+0x119/0x12b
 [] ? trace_hardirqs_on_caller+0x11d/0x141
 [] ? trace_hardirqs_on+0xd/0xf
 [] ? security_file_permission+0x16/0x18
 [] vfs_read+0xab/0x108
 [] ? trace_hardirqs_on_caller+0x11d/0x141
 [] sys_read+0x4a/0x6e
 [] system_call_fastpath+0x16/0x1b
no locks held by lvcreate/2484.

lvcreate is just the first victim (sometimes it is vgcreate instead).
If the guest is left running, other new processes hang with comparable
traces (blocked in get_request_wait), until eventually the guest is
completely unresponsive.

Mike

--7AUc2qLy4jB3hD7Z
Content-Type: application/x-sh
Content-Disposition: attachment; filename="test_dm_discard_mpath_scsi_debug.sh"
Content-Transfer-Encoding: 7bit

#!/bin/bash

set -xv

MNTPT=/mnt/test

umount $MNTPT
vgchange -an
sleep 1
multipath -F
/etc/init.d/multipathd stop

modprobe -r scsi_debug
sleep 1

# 4K/4K -- UNMAP
modprobe scsi_debug dev_size_mb=100 unmap_max_desc=16 unmap_granularity=2048 sector_size=4096
# 512K
#modprobe scsi_debug dev_size_mb=100 unmap_max_desc=16 unmap_granularity=2048

/etc/init.d/multipathd restart
sleep 3
multipath -ll

DEVICE=/dev/mapper/`multipath -ll | grep scsi_debug | cut -d' ' -f1`

lvremove discard_vg/lv
vgremove discard_vg
pvremove $DEVICE

pvcreate $DEVICE
vgcreate discard_vg $DEVICE
lvcreate -L 96M -n lv discard_vg
lvchange -ay discard_vg/lv
mkfs.ext4 -b 4096 /dev/discard_vg/lv
mount -o discard /dev/discard_vg/lv $MNTPT

sync

dd if=/dev/zero of=${MNTPT}/hmm bs=384k oflag=direct count=1
sync; sync
filefrag -v ${MNTPT}/hmm

rm -f ${MNTPT}/hmm; sync; sync

sleep 5

--7AUc2qLy4jB3hD7Z--
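
The reproduction loop and the hung-task report quoted in the message can be wrapped in two small helpers. This is only a minimal sketch and not part of the original post: the function names `run_counted` and `hung_tasks` are hypothetical, and the iteration cap is an assumption (the original loop ran forever). Printing the iteration number before each run shows on which pass the guest stalls, since a hung script never returns. `hung_tasks` extracts the task name and PID from kernel log lines of the form shown in the report.

```shell
# Sketch only: helper names are hypothetical, not from the original post.

# Run a test script up to $2 times, printing the iteration number
# before each run so the last line printed identifies the pass that hung.
# $1 = script (or command) to run, $2 = maximum number of iterations.
run_counted() {
    rc_script=$1
    rc_max=$2
    rc_i=1
    while [ "$rc_i" -le "$rc_max" ]; do
        echo "iteration $rc_i"
        "$rc_script"
        rc_i=$((rc_i + 1))
    done
}

# Read kernel log text on stdin and print "name pid" for every
# hung-task report, i.e. lines of the form:
#   INFO: task lvcreate:2484 blocked for more than 120 seconds.
hung_tasks() {
    sed -n 's/.*INFO: task \(.*\):\([0-9]*\) blocked for more than [0-9]* seconds\..*/\1 \2/p'
}
```

One possible use is to start `run_counted ./test_dm_discard_mpath_scsi_debug.sh 10` in one terminal and run `dmesg | hung_tasks` in another once the loop stops making progress.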