* [Qemu-devel] Data corruption in Qemu 2.7.1 @ 2017-01-13 10:44 Peter Lieven 2017-01-17 6:40 ` Fam Zheng 2017-01-17 7:33 ` [Qemu-devel] " Alexandre DERUMIER 0 siblings, 2 replies; 15+ messages in thread From: Peter Lieven @ 2017-01-13 10:44 UTC (permalink / raw) To: qemu-devel@nongnu.org; +Cc: qemu-stable Hi, i currently facing a problem in our testing environment where I see file system corruption with 2.7.1 on iSCSI and Local Storage (LVM). Trying to bisect, but has anyone observed this before? Thanks, Peter ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] Data corruption in Qemu 2.7.1 2017-01-13 10:44 [Qemu-devel] Data corruption in Qemu 2.7.1 Peter Lieven @ 2017-01-17 6:40 ` Fam Zheng 2017-01-17 10:14 ` [Qemu-devel] [Qemu-stable] " Peter Lieven 2017-01-17 7:33 ` [Qemu-devel] " Alexandre DERUMIER 1 sibling, 1 reply; 15+ messages in thread From: Fam Zheng @ 2017-01-17 6:40 UTC (permalink / raw) To: Peter Lieven; +Cc: qemu-devel@nongnu.org, qemu-stable On Fri, 01/13 11:44, Peter Lieven wrote: > Hi, > > i currently facing a problem in our testing environment where I see file > system corruption with 2.7.1 on iSCSI and Local Storage (LVM). > Trying to bisect, but has anyone observed this before? The information here is too scarce to tell but a file corruption is more often a result of two writers modifying the disk concurrently. Have you ruled that out? Is the corruption reproducible? Fam ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [Qemu-stable] Data corruption in Qemu 2.7.1 2017-01-17 6:40 ` Fam Zheng @ 2017-01-17 10:14 ` Peter Lieven 0 siblings, 0 replies; 15+ messages in thread From: Peter Lieven @ 2017-01-17 10:14 UTC (permalink / raw) To: Fam Zheng; +Cc: qemu-devel@nongnu.org, qemu-stable Am 17.01.2017 um 07:40 schrieb Fam Zheng: > On Fri, 01/13 11:44, Peter Lieven wrote: >> Hi, >> >> i currently facing a problem in our testing environment where I see file >> system corruption with 2.7.1 on iSCSI and Local Storage (LVM). >> Trying to bisect, but has anyone observed this before? > The information here is too scarce to tell but a file corruption is more often a > result of two writers modifying the disk concurrently. Have you ruled that out? > Is the corruption reproducible? My issue was primary with iSCSI and I cut bisect it. I already send a patch to the list: commit 0bd57e907311be6e4f97394cfd9afebe271457e2 Author: Peter Lieven <pl@kamp.de> Date: Mon Jan 16 16:10:26 2017 +0100 block/iscsi: avoid data corruption with cache=writeback nb_cls_shrunk in iscsi_allocmap_update can become -1 if the request starts and ends within the same cluster. This results in passing -1 to bitmap_set and bitmap_clear and they don't handle negative values properly. In the end this leads to data corruption. Fixes: e1123a3b40a1a9a625a29c8ed4debb7e206ea690 Cc: qemu-stable@nongnu.org Signed-off-by: Peter Lieven <pl@kamp.de> However, one user also reported corruption with LVM. I will check if he uses virtio-scsi. Thanks, Peter ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] Data corruption in Qemu 2.7.1 2017-01-13 10:44 [Qemu-devel] Data corruption in Qemu 2.7.1 Peter Lieven 2017-01-17 6:40 ` Fam Zheng @ 2017-01-17 7:33 ` Alexandre DERUMIER 2017-01-17 8:03 ` [Qemu-devel] [Qemu-stable] " Fabian Grünbichler 1 sibling, 1 reply; 15+ messages in thread From: Alexandre DERUMIER @ 2017-01-17 7:33 UTC (permalink / raw) To: Peter Lieven; +Cc: qemu-devel, qemu-stable Hi, proxmox users have reported recently corruption with qemu 2.7 and scsi-block (with passing physical /dev/sdX to virtio-scsi). working fine with qemu 2.6. qemu 2.7 + scsi-hd works fine https://forum.proxmox.com/threads/proxmox-4-4-virtio_scsi-regression.31471/page-2 ----- Mail original ----- De: "Peter Lieven" <pl@kamp.de> À: "qemu-devel" <qemu-devel@nongnu.org> Cc: "qemu-stable" <qemu-stable@nongnu.org> Envoyé: Vendredi 13 Janvier 2017 11:44:32 Objet: [Qemu-devel] Data corruption in Qemu 2.7.1 Hi, i currently facing a problem in our testing environment where I see file system corruption with 2.7.1 on iSCSI and Local Storage (LVM). Trying to bisect, but has anyone observed this before? Thanks, Peter ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [Qemu-stable] Data corruption in Qemu 2.7.1 2017-01-17 7:33 ` [Qemu-devel] " Alexandre DERUMIER @ 2017-01-17 8:03 ` Fabian Grünbichler 2017-01-17 10:41 ` Paolo Bonzini 0 siblings, 1 reply; 15+ messages in thread From: Fabian Grünbichler @ 2017-01-17 8:03 UTC (permalink / raw) To: Alexandre DERUMIER; +Cc: Peter Lieven, qemu-devel, qemu-stable, Paolo Bonzini On Tue, Jan 17, 2017 at 08:33:46AM +0100, Alexandre DERUMIER wrote: > Hi, > > proxmox users have reported recently corruption with qemu 2.7 and scsi-block (with passing physical /dev/sdX to virtio-scsi). > > working fine with qemu 2.6. > > qemu 2.7 + scsi-hd works fine > > https://forum.proxmox.com/threads/proxmox-4-4-virtio_scsi-regression.31471/page-2 > I am fairly sure this is a separate issue. Commit 8fdc7839e40f43a426bc7e858cf1dbfe315a3804 (first included in 2.7.0)[1] changed the behaviour of scsi-block passthrough. Previously this worked with SATA disks, now it doesn't anymore. A bisect run confirmed this, scsi-block with a SATA disk passed through via virtio-scsi-single corrupts on writes since that commit, scsi-hd and scsi-disk work fine (scsi-generic corrupts as well). PVE's detection logic for passthrough just differentiated between disks and tape drives, and unfortunately the SG_IO ioctl says SATA disks are disks as well.. we probably need to default to scsi-hd or scsi-disk instead of scsi-block, and only when we explicitly detect a "real" SCSI disk we are allowed to use scsi-block? @Paolo: was the old behaviour just an accident and the new bevaviour intentional? documentation is quite sparse, or maybe I am looking in the wrong places.. 1: scsi-block: always use SG_IO Using pread/pwrite or io_submit has the advantage of eliminating the bounce buffer, but drops the SCSI status. This keeps the guest from seeing unit attention codes, as well as statuses such as RESERVATION CONFLICT. Because we know scsi-block operates on an SBC device we can still use the DMA helpers with SG_IO; just remember to patch the CDBs if the transfer is split into multiple segments. This means that scsi-block will always use the thread-pool unfortunately, instead of respecting aio=native. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [Qemu-stable] Data corruption in Qemu 2.7.1 2017-01-17 8:03 ` [Qemu-devel] [Qemu-stable] " Fabian Grünbichler @ 2017-01-17 10:41 ` Paolo Bonzini 2017-01-17 11:22 ` Fabian Grünbichler 0 siblings, 1 reply; 15+ messages in thread From: Paolo Bonzini @ 2017-01-17 10:41 UTC (permalink / raw) To: Fabian Grünbichler, Alexandre DERUMIER Cc: Peter Lieven, qemu-devel, qemu-stable On 17/01/2017 09:03, Fabian Grünbichler wrote: > Commit 8fdc7839e40f43a426bc7e858cf1dbfe315a3804 (first included in > 2.7.0)[1] changed the behaviour of scsi-block passthrough. Previously > this worked with SATA disks, now it doesn't anymore. A bisect run > confirmed this, scsi-block with a SATA disk passed through via > virtio-scsi-single corrupts on writes since that commit, scsi-hd and > scsi-disk work fine (scsi-generic corrupts as well). > > PVE's detection logic for passthrough just differentiated between disks > and tape drives, and unfortunately the SG_IO ioctl says SATA disks are > disks as well.. we probably need to default to scsi-hd or scsi-disk > instead of scsi-block, and only when we explicitly detect a "real" SCSI > disk we are allowed to use scsi-block? > > @Paolo: was the old behaviour just an accident and the new bevaviour > intentional? documentation is quite sparse, or maybe I am looking in the > wrong places.. No, it would be a bug (QEMU or kernel). Do you have an easy reproducer with dd, as suggested at https://forum.proxmox.com/threads/proxmox-4-4-virtio_scsi-regression.31471/page-2? Paolo ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [Qemu-stable] Data corruption in Qemu 2.7.1 2017-01-17 10:41 ` Paolo Bonzini @ 2017-01-17 11:22 ` Fabian Grünbichler 2017-01-17 15:03 ` Paolo Bonzini 0 siblings, 1 reply; 15+ messages in thread From: Fabian Grünbichler @ 2017-01-17 11:22 UTC (permalink / raw) To: Paolo Bonzini; +Cc: Alexandre DERUMIER, Peter Lieven, qemu-devel, qemu-stable On Tue, Jan 17, 2017 at 11:41:44AM +0100, Paolo Bonzini wrote: > > > On 17/01/2017 09:03, Fabian Grünbichler wrote: > > Commit 8fdc7839e40f43a426bc7e858cf1dbfe315a3804 (first included in > > 2.7.0)[1] changed the behaviour of scsi-block passthrough. Previously > > this worked with SATA disks, now it doesn't anymore. A bisect run > > confirmed this, scsi-block with a SATA disk passed through via > > virtio-scsi-single corrupts on writes since that commit, scsi-hd and > > scsi-disk work fine (scsi-generic corrupts as well). > > > > PVE's detection logic for passthrough just differentiated between disks > > and tape drives, and unfortunately the SG_IO ioctl says SATA disks are > > disks as well.. we probably need to default to scsi-hd or scsi-disk > > instead of scsi-block, and only when we explicitly detect a "real" SCSI > > disk we are allowed to use scsi-block? > > > > @Paolo: was the old behaviour just an accident and the new bevaviour > > intentional? documentation is quite sparse, or maybe I am looking in the > > wrong places.. > > No, it would be a bug (QEMU or kernel). > > Do you have an easy reproducer with dd, as suggested at > https://forum.proxmox.com/threads/proxmox-4-4-virtio_scsi-regression.31471/page-2? > > Paolo > setup 1) dd 1G of (u)random data to a file on a file system on a SATA disk on the host (I tested with ext2/3/4, ZFS and btrfs, btrfs produces corruption the fastest here, but raw writes to the device should already trigger the kernel error messages) 2) calculate the md5sum of this file, and store it on the same FS test (with current qemu master, using PVE's 4.4.35 kernel as host kernel 1) start Ubuntu 16.04 VM (see full commandline at the end, I tested with three disks, one for each of the file systems above) - the same behaviour can also be observed when booting from an Alpine Linux iso 2) mount FS 3) dd random file from above to new file on same FS 4) sync 5) check md5sum of resulting file 6) repeat 3-5 until md5sum does not match, kernel spews error messages, or you are convinced that everything is OK sample kernel message (for ext3): Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 Sense Key : Illegal Request [current] Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 Add. Sense: Invalid field in cdb Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 CDB: Write(10) 2a 00 0f 3a 90 00 00 07 d8 00 Jan 17 11:39:32 ubuntu kernel: blk_update_request: critical target error, dev sda, sector 255496192 Jan 17 11:39:32 ubuntu kernel: EXT4-fs warning (device sda1): ext4_end_bio:329: I/O error -121 writing to inode 125 (offset 0 size 0 starting block 31937280) Jan 17 11:39:32 ubuntu kernel: buffer_io_error: 246 callbacks suppressed Jan 17 11:39:32 ubuntu kernel: Buffer I/O error on device sda1, logical block 31936768 Jan 17 11:39:32 ubuntu kernel: Buffer I/O error on device sda1, logical block 31936769 Jan 17 11:39:32 ubuntu kernel: Buffer I/O error on device sda1, logical block 31936770 Jan 17 11:39:32 ubuntu kernel: Buffer I/O error on device sda1, logical block 31936771 Jan 17 11:39:32 ubuntu kernel: Buffer I/O error on device sda1, logical block 31936772 Jan 17 11:39:32 ubuntu kernel: Buffer I/O error on device sda1, logical block 31936773 Jan 17 11:39:32 ubuntu kernel: Buffer I/O error on device sda1, logical block 31936774 Jan 17 11:39:32 ubuntu kernel: Buffer I/O error on device sda1, logical block 31936775 Jan 17 11:39:32 ubuntu kernel: Buffer I/O error on device sda1, logical block 31936776 Jan 17 11:39:32 ubuntu kernel: Buffer I/O error on device sda1, logical block 31936777 Jan 17 11:39:39 ubuntu kernel: JBD2: Detected IO errors while flushing file data on sda1-8 Jan 17 11:39:41 ubuntu kernel: JBD2: Detected IO errors while flushing file data on sda1-8 Jan 17 11:39:55 ubuntu kernel: JBD2: Detected IO errors while flushing file data on sda1-8 Jan 17 11:39:56 ubuntu kernel: JBD2: Detected IO errors while flushing file data on sda1-8 Jan 17 11:40:07 ubuntu kernel: JBD2: Detected IO errors while flushing file data on sda1-8 Jan 17 11:40:08 ubuntu kernel: JBD2: Detected IO errors while flushing file data on sda1-8 Jan 17 11:40:15 ubuntu kernel: JBD2: Detected IO errors while flushing file data on sda1-8 Jan 17 11:40:22 ubuntu kernel: JBD2: Detected IO errors while flushing file data on sda1-8 qemu commandline, originally generated by PVE, feel free to adapt or minimize.. sdb, sdc and sde are the three SATA disks from above, vm-101-disk-1 is the Ubuntu rootfs on LVM-thin. /root/qemu/build/x86_64-softmmu/qemu-system-x86_64 \ -enable-kvm \ -chardev 'socket,id=qmp,path=/var/run/qemu-server/101.qmp,server,nowait' \ -mon 'chardev=qmp,mode=control' \ -pidfile /var/run/qemu-server/101.pid \ -smbios 'type=1,uuid=3e550136-9d7f-4a12-9d98-e25a5d2d5806' \ -name diskpassthroughtest \ -smp '8,sockets=1,cores=8,maxcpus=8' \ -nodefaults \ -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \ -vga cirrus \ -vnc unix:/var/run/qemu-server/101.vnc,x509,password \ -cpu kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,enforce \ -m 4096 \ -k de \ -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \ -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \ -device 'pci-bridge,id=pci.3,chassis_nr=3,bus=pci.0,addr=0x5' \ -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \ -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \ -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \ -iscsi 'initiator-name=iqn.1993-08.org.debian:01:c6676e1b1f72' \ -drive 'file=/mnt/pve/iso/template/iso/ubuntu-16.04.1-server-amd64.iso,if=none,id=drive-ide2,media=cdrom,aio=threads' \ -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' \ -device 'virtio-scsi-pci,id=virtioscsi0,bus=pci.3,addr=0x1' \ -drive 'file=/dev/sdb,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on' \ -device 'scsi-block,bus=virtioscsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0' \ -device 'virtio-scsi-pci,id=virtioscsi1,bus=pci.3,addr=0x2' \ -drive 'file=/dev/sdc,if=none,id=drive-scsi1,format=raw,cache=none,aio=native,detect-zeroes=on' \ -device 'scsi-block,bus=virtioscsi1.0,channel=0,scsi-id=0,lun=1,drive=drive-scsi1,id=scsi1' \ -device 'virtio-scsi-pci,id=virtioscsi3,bus=pci.3,addr=0x4' \ -drive 'file=/dev/sde,if=none,id=drive-scsi3,format=raw,cache=none,aio=native,detect-zeroes=on' \ -device 'scsi-block,bus=virtioscsi3.0,channel=0,scsi-id=0,lun=3,drive=drive-scsi3,id=scsi3' \ -drive 'file=/dev/pve/vm-101-disk-1,if=none,id=drive-virtio0,format=raw,cache=none,aio=native,detect-zeroes=on' \ -device 'virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,bootindex=103' \ -netdev 'type=tap,id=net0,ifname=tap101i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' \ -device 'virtio-net-pci,mac=46:83:4B:92:EC:88,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [Qemu-stable] Data corruption in Qemu 2.7.1 2017-01-17 11:22 ` Fabian Grünbichler @ 2017-01-17 15:03 ` Paolo Bonzini 2017-01-17 16:24 ` Paolo Bonzini 0 siblings, 1 reply; 15+ messages in thread From: Paolo Bonzini @ 2017-01-17 15:03 UTC (permalink / raw) To: Fabian Grünbichler Cc: Alexandre DERUMIER, Peter Lieven, qemu-devel, qemu-stable On 17/01/2017 12:22, Fabian Grünbichler wrote: > 6) repeat 3-5 until md5sum does not match, kernel spews error > messages, or you are convinced that everything is OK > > sample kernel message (for ext3): > Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 Sense Key : Illegal Request [current] > Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 Add. Sense: Invalid field in cdb > Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 CDB: Write(10) 2a 00 0f 3a 90 00 00 07 d8 00 > Jan 17 11:39:32 ubuntu kernel: blk_update_request: critical target error, dev sda, sector 255496192 Can you reproduce it if QEMU runs under "strace -e ioctl -ff" in the host? Or also using this systemtap script. The important bit would be the lines with a nonzero status, but the others can be useful to see what the surroundings look like. # example output for "sudo stap -v strace.stp -c 'sg_opcodes /dev/sda'" # | sg_opcodes[3444] 00000000 12 00 00 00 24 00 00 00 be 91 # | sg_opcodes[3444] 08100002 a3 0c 00 00 00 00 00 00 20 00 global cdbs% global reqs% global names% function check_pid() { return target() == 0 || pid() == target(); } probe kernel.function("blk_fill_sghdr_rq") { if (!check_pid()) next; names[$rq]=sprintf("%s[%d]", execname(), tid()) cdbs[$rq]=sprintf("%02x %02x %02x %02x %02x %02x %02x %02x %02x %02x", $hdr->cmdp[0],$hdr->cmdp[1],$hdr->cmdp[2],$hdr->cmdp[3],$hdr->cmdp[4], $hdr->cmdp[6],$hdr->cmdp[5],$hdr->cmdp[7],$hdr->cmdp[8],$hdr->cmdp[9]) } probe kernel.function("scsi_setup_cmnd") { if (!($req in cdbs)) next; reqs[$req->special] = $req; } probe kernel.function("scsi_finish_command") { if (!($cmd in reqs)) next; rq = reqs[$cmd]; printf("%s %08x %s\n", names[rq], $cmd->result, cdbs[rq]); delete reqs[$cmd] delete cdbs[rq] } ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [Qemu-stable] Data corruption in Qemu 2.7.1 2017-01-17 15:03 ` Paolo Bonzini @ 2017-01-17 16:24 ` Paolo Bonzini 2017-01-18 11:50 ` Fabian Grünbichler 0 siblings, 1 reply; 15+ messages in thread From: Paolo Bonzini @ 2017-01-17 16:24 UTC (permalink / raw) To: Fabian Grünbichler Cc: Peter Lieven, qemu-devel, Alexandre DERUMIER, qemu-stable On 17/01/2017 16:03, Paolo Bonzini wrote: > > > On 17/01/2017 12:22, Fabian Grünbichler wrote: >> 6) repeat 3-5 until md5sum does not match, kernel spews error >> messages, or you are convinced that everything is OK >> >> sample kernel message (for ext3): >> Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE >> Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 Sense Key : Illegal Request [current] >> Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 Add. Sense: Invalid field in cdb >> Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 CDB: Write(10) 2a 00 0f 3a 90 00 00 07 d8 00 >> Jan 17 11:39:32 ubuntu kernel: blk_update_request: critical target error, dev sda, sector 255496192 > > Can you reproduce it if QEMU runs under "strace -e ioctl -ff" in the > host? Or also using this systemtap script. > > The important bit would be the lines with a nonzero status, but the > others can be useful to see what the surroundings look like. > > # example output for "sudo stap -v strace.stp -c 'sg_opcodes /dev/sda'" > # | sg_opcodes[3444] 00000000 12 00 00 00 24 00 00 00 be 91 > # | sg_opcodes[3444] 08100002 a3 0c 00 00 00 00 00 00 20 00 > > global cdbs% > global reqs% > global names% > > function check_pid() { > return target() == 0 || pid() == target(); > } > > probe kernel.function("blk_fill_sghdr_rq") { > if (!check_pid()) next; > > names[$rq]=sprintf("%s[%d]", execname(), tid()) > cdbs[$rq]=sprintf("%02x %02x %02x %02x %02x %02x %02x %02x %02x %02x", > $hdr->cmdp[0],$hdr->cmdp[1],$hdr->cmdp[2],$hdr->cmdp[3],$hdr->cmdp[4], > $hdr->cmdp[6],$hdr->cmdp[5],$hdr->cmdp[7],$hdr->cmdp[8],$hdr->cmdp[9]) > } > > probe kernel.function("scsi_setup_cmnd") { > if (!($req in cdbs)) next; > reqs[$req->special] = $req; > } > > probe kernel.function("scsi_finish_command") { > if (!($cmd in reqs)) next; > rq = reqs[$cmd]; > printf("%s %08x %s\n", names[rq], $cmd->result, cdbs[rq]); > delete reqs[$cmd] > delete cdbs[rq] Please add a "delete names[rq]" here too! Paolo > } > > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [Qemu-stable] Data corruption in Qemu 2.7.1 2017-01-17 16:24 ` Paolo Bonzini @ 2017-01-18 11:50 ` Fabian Grünbichler 2017-01-18 16:19 ` Fabian Grünbichler 0 siblings, 1 reply; 15+ messages in thread From: Fabian Grünbichler @ 2017-01-18 11:50 UTC (permalink / raw) To: Paolo Bonzini; +Cc: qemu-devel, Alexandre DERUMIER, qemu-stable On 17/01/2017 16:03, Paolo Bonzini wrote: > On 17/01/2017 12:22, Fabian Grünbichler wrote: >> 6) repeat 3-5 until md5sum does not match, kernel spews error >> messages, or you are convinced that everything is OK >> >> sample kernel message (for ext3): >> Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE >> Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 Sense Key : Illegal Request [current] >> Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 Add. Sense: Invalid field in cdb >> Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 CDB: Write(10) 2a 00 0f 3a 90 00 00 07 d8 00 >> Jan 17 11:39:32 ubuntu kernel: blk_update_request: critical target error, dev sda, sector 255496192 > > Can you reproduce it if QEMU runs under "strace -e ioctl -ff" in the > host? Or also using this systemtap script. > > The important bit would be the lines with a nonzero status, but the > others can be useful to see what the surroundings look like. > OT: systemtap is not working with your script under Debian Jessie (or maybe in general under Debian Jessie? not sure). after some further testing it seems like this change in Qemu exposes some subtle issue with our specific kernel (it works fine with the upstream Ubuntu 4.4 one which ours is based on). I am currently debugging further to narrow down potential causes - if I need further input from your side or if I suspect Qemu to be at fault I'll resurrect this thread (and include the strace output). thanks for your quick reaction anyhow! ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [Qemu-stable] Data corruption in Qemu 2.7.1 2017-01-18 11:50 ` Fabian Grünbichler @ 2017-01-18 16:19 ` Fabian Grünbichler 2017-01-18 16:30 ` Paolo Bonzini 0 siblings, 1 reply; 15+ messages in thread From: Fabian Grünbichler @ 2017-01-18 16:19 UTC (permalink / raw) To: Paolo Bonzini; +Cc: qemu-devel, Alexandre DERUMIER, qemu-stable [-- Attachment #1: Type: text/plain, Size: 5664 bytes --] On Wed, Jan 18, 2017 at 12:50:50PM +0100, Fabian Grünbichler wrote: > On 17/01/2017 16:03, Paolo Bonzini wrote: > > On 17/01/2017 12:22, Fabian Grünbichler wrote: > >> 6) repeat 3-5 until md5sum does not match, kernel spews error > >> messages, or you are convinced that everything is OK > >> > >> sample kernel message (for ext3): > >> Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > >> Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 Sense Key : Illegal Request [current] > >> Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 Add. Sense: Invalid field in cdb > >> Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 CDB: Write(10) 2a 00 0f 3a 90 00 00 07 d8 00 > >> Jan 17 11:39:32 ubuntu kernel: blk_update_request: critical target error, dev sda, sector 255496192 > > > > Can you reproduce it if QEMU runs under "strace -e ioctl -ff" in the > > host? Or also using this systemtap script. > > > > The important bit would be the lines with a nonzero status, but the > > others can be useful to see what the surroundings look like. > > > > OT: systemtap is not working with your script under Debian Jessie (or > maybe in general under Debian Jessie? not sure). > > after some further testing it seems like this change in Qemu exposes > some subtle issue with our specific kernel (it works fine with the > upstream Ubuntu 4.4 one which ours is based on). I am currently > debugging further to narrow down potential causes - if I need further > input from your side or if I suspect Qemu to be at fault I'll resurrect > this thread (and include the strace output). > > thanks for your quick reaction anyhow! > okay, so this looks like either a bug in Qemu or the upstream kernel. disabling THP on the hypervisor host with # echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled allows reproducing the bug very reliably, shutting the VM down, then enabling THP (with 'always') and trying again makes it go away. Qemu was compiled with: ../configure --with-confsuffix=/kvm --target-list=x86_64-softmmu --disable-xen --enable-gnutls --enable-sdl --enable-uuid --enable-linux-aio --enable-libiscsi --disable-smartcard --audio-drv-list=alsa --enable-spice --enable-usb-redir --enable-libusb --disable-gtk --enable-xfsctl --enable-numa --disable-strip --enable-jemalloc --disable-libnfs --disable-fdt attached is an strace with qemu master and mainline 4.9 running on Debian Jessie - I will try to test it with Fedora or CentOS tomorrow. journal in the VM says the following: Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 Sense Key : Illegal Request [current] Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 Add. Sense: Invalid field in cdb Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 CDB: Write(10) 2a 00 0d d6 51 48 00 08 00 00 Jan 18 17:07:51 ubuntu kernel: blk_update_request: critical target error, dev sda, sector 232149320 Jan 18 17:07:51 ubuntu kernel: EXT4-fs warning (device sda1): ext4_end_bio:329: I/O error -121 writing to inode 125 (offset 0 size 0 starting block 29018921) Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018409 Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018410 Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018411 Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018412 Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018413 Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018414 Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018415 Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018416 Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018417 Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018418 Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 Sense Key : Illegal Request [current] Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 Add. Sense: Invalid field in cdb Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 CDB: Write(10) 2a 00 0d d6 59 48 00 08 00 00 Jan 18 17:07:51 ubuntu kernel: blk_update_request: critical target error, dev sda, sector 232151368 Jan 18 17:07:51 ubuntu kernel: EXT4-fs warning (device sda1): ext4_end_bio:329: I/O error -121 writing to inode 125 (offset 0 size 0 starting block 29019177) Jan 18 17:07:52 ubuntu kernel: JBD2: Detected IO errors while flushing file data on sda1-8 Jan 18 17:07:58 ubuntu kernel: JBD2: Detected IO errors while flushing file data on sda1-8 strace (with some random grep-ing): [pid 1794] ioctl(19, SG_IO, {'S', SG_DXFER_TO_DEV, cmd[10]=[2a, 00, 0d, d6, 51, 48, 00, 08, 00, 00], mx_sb_len=252, iovec_count=17, dxfer_len=1048576, timeout=4294967295, flags=0x1, data[1048576]=["\0`\235=c\177\0\0\0\0\1\0\0\0\0\0\0`\236=c\177\0\0\0\0\1\0\0\0\0\0"...]}) = -1 EINVAL (Invalid argument) [pid 1794] ioctl(19, SG_IO, {'S', SG_DXFER_TO_DEV, cmd[10]=[2a, 00, 0d, d6, 59, 48, 00, 08, 00, 00], mx_sb_len=252, iovec_count=16, dxfer_len=1048576, timeout=4294967295, flags=0x1, data[1048576]=["\0`-=c\177\0\0\0\0\1\0\0\0\0\0\0`.=c\177\0\0\0\0\1\0\0\0\0\0"...]}) = -1 EINVAL (Invalid argument) [-- Attachment #2: host-strace.gz --] [-- Type: application/gzip, Size: 314401 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [Qemu-stable] Data corruption in Qemu 2.7.1 2017-01-18 16:19 ` Fabian Grünbichler @ 2017-01-18 16:30 ` Paolo Bonzini 2017-01-18 17:17 ` Fabian Grünbichler 2017-01-19 11:59 ` Fabian Grünbichler 0 siblings, 2 replies; 15+ messages in thread From: Paolo Bonzini @ 2017-01-18 16:30 UTC (permalink / raw) To: Fabian Grünbichler; +Cc: qemu-devel, Alexandre DERUMIER, qemu-stable On 18/01/2017 17:19, Fabian Grünbichler wrote: > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 Sense Key : Illegal Request [current] > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 Add. Sense: Invalid field in cdb > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 CDB: Write(10) 2a 00 0d d6 51 48 00 08 00 00 > Jan 18 17:07:51 ubuntu kernel: blk_update_request: critical target error, dev sda, sector 232149320 > Jan 18 17:07:51 ubuntu kernel: EXT4-fs warning (device sda1): ext4_end_bio:329: I/O error -121 writing to inode 125 (offset 0 size 0 starting block 29018921) > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018409 > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018410 > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018411 > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018412 > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018413 > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018414 > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018415 > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018416 > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018417 > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018418 > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 Sense Key : Illegal Request [current] > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 Add. Sense: Invalid field in cdb > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 CDB: Write(10) 2a 00 0d d6 59 48 00 08 00 00 > Jan 18 17:07:51 ubuntu kernel: blk_update_request: critical target error, dev sda, sector 232151368 > Jan 18 17:07:51 ubuntu kernel: EXT4-fs warning (device sda1): ext4_end_bio:329: I/O error -121 writing to inode 125 (offset 0 size 0 starting block 29019177) > Jan 18 17:07:52 ubuntu kernel: JBD2: Detected IO errors while flushing file data on sda1-8 > Jan 18 17:07:58 ubuntu kernel: JBD2: Detected IO errors while flushing file data on sda1-8 > > > strace (with some random grep-ing): > [pid 1794] ioctl(19, SG_IO, {'S', SG_DXFER_TO_DEV, cmd[10]=[2a, 00, 0d, d6, 51, 48, 00, 08, 00, 00], mx_sb_len=252, iovec_count=17, dxfer_len=1048576, timeout=4294967295, flags=0x1, data[1048576]=["\0`\235=c\177\0\0\0\0\1\0\0\0\0\0\0`\236=c\177\0\0\0\0\1\0\0\0\0\0"...]}) = -1 EINVAL (Invalid argument) > [pid 1794] ioctl(19, SG_IO, {'S', SG_DXFER_TO_DEV, cmd[10]=[2a, 00, 0d, d6, 59, 48, 00, 08, 00, 00], mx_sb_len=252, iovec_count=16, dxfer_len=1048576, timeout=4294967295, flags=0x1, data[1048576]=["\0`-=c\177\0\0\0\0\1\0\0\0\0\0\0`.=c\177\0\0\0\0\1\0\0\0\0\0"...]}) = -1 EINVAL (Invalid argument) This is useful, thanks. I suspect blk_rq_map_user_iov is failing, meaning that the scatter/gather list has too many segments for the HBA in the host. (The limit can be found in /sys/block/sda/queue/max_segments). This is consistent with your finding here: > disabling THP on the hypervisor host with > > # echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled > > allows reproducing the bug very reliably, shutting the VM down, then > enabling THP (with 'always') and trying again makes it go away. because no THP means more memory fragmentation and thus more segments. I'm not sure how to fix it, unfortunately. :( Paolo ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [Qemu-stable] Data corruption in Qemu 2.7.1 2017-01-18 16:30 ` Paolo Bonzini @ 2017-01-18 17:17 ` Fabian Grünbichler 2017-01-19 11:59 ` Fabian Grünbichler 1 sibling, 0 replies; 15+ messages in thread From: Fabian Grünbichler @ 2017-01-18 17:17 UTC (permalink / raw) To: Paolo Bonzini; +Cc: qemu-devel, Alexandre DERUMIER, qemu-stable > Paolo Bonzini <pbonzini@redhat.com> hat am 18. Januar 2017 um 17:30 geschrieben: > > > > > On 18/01/2017 17:19, Fabian Grünbichler wrote: > > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 Sense Key : Illegal Request [current] > > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 Add. Sense: Invalid field in cdb > > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 CDB: Write(10) 2a 00 0d d6 51 48 00 08 00 00 > > Jan 18 17:07:51 ubuntu kernel: blk_update_request: critical target error, dev sda, sector 232149320 > > Jan 18 17:07:51 ubuntu kernel: EXT4-fs warning (device sda1): ext4_end_bio:329: I/O error -121 writing to inode 125 (offset 0 size 0 starting block 29018921) > > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018409 > > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018410 > > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018411 > > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018412 > > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018413 > > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018414 > > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018415 > > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018416 > > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018417 > > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018418 > > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 Sense Key : Illegal Request [current] > > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 Add. Sense: Invalid field in cdb > > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 CDB: Write(10) 2a 00 0d d6 59 48 00 08 00 00 > > Jan 18 17:07:51 ubuntu kernel: blk_update_request: critical target error, dev sda, sector 232151368 > > Jan 18 17:07:51 ubuntu kernel: EXT4-fs warning (device sda1): ext4_end_bio:329: I/O error -121 writing to inode 125 (offset 0 size 0 starting block 29019177) > > Jan 18 17:07:52 ubuntu kernel: JBD2: Detected IO errors while flushing file data on sda1-8 > > Jan 18 17:07:58 ubuntu kernel: JBD2: Detected IO errors while flushing file data on sda1-8 > > > > > > strace (with some random grep-ing): > > [pid 1794] ioctl(19, SG_IO, {'S', SG_DXFER_TO_DEV, cmd[10]=[2a, 00, 0d, d6, 51, 48, 00, 08, 00, 00], mx_sb_len=252, iovec_count=17, dxfer_len=1048576, timeout=4294967295, flags=0x1, data[1048576]=["\0`\235=c\177\0\0\0\0\1\0\0\0\0\0\0`\236=c\177\0\0\0\0\1\0\0\0\0\0"...]}) = -1 EINVAL (Invalid argument) > > [pid 1794] ioctl(19, SG_IO, {'S', SG_DXFER_TO_DEV, cmd[10]=[2a, 00, 0d, d6, 59, 48, 00, 08, 00, 00], mx_sb_len=252, iovec_count=16, dxfer_len=1048576, timeout=4294967295, flags=0x1, data[1048576]=["\0`-=c\177\0\0\0\0\1\0\0\0\0\0\0`.=c\177\0\0\0\0\1\0\0\0\0\0"...]}) = -1 EINVAL (Invalid argument) > > This is useful, thanks. I suspect blk_rq_map_user_iov is failing, > meaning that the scatter/gather list has too many segments for the HBA > in the host. (The limit can be found in /sys/block/sda/queue/max_segments). I can try to get some more info tomorrow.. > > This is consistent with your finding here: > > > disabling THP on the hypervisor host with > > > > # echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled > > > > allows reproducing the bug very reliably, shutting the VM down, then > > enabling THP (with 'always') and trying again makes it go away. > > because no THP means more memory fragmentation and thus more segments. > > I'm not sure how to fix it, unfortunately. :( Well at least this means we have a (potentially too conservative) check for deciding when to use scsi-disk instead of scsi-block (maybe this could be detected in qemu as well?). Seems especially troublesome since the (hypervisor) admin can change it at runtime, and it seems like there are widespread recommendations to disable THP for e.g., DB use cases.. > > Paolo > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [Qemu-stable] Data corruption in Qemu 2.7.1 2017-01-18 16:30 ` Paolo Bonzini 2017-01-18 17:17 ` Fabian Grünbichler @ 2017-01-19 11:59 ` Fabian Grünbichler 2017-01-24 9:35 ` Paolo Bonzini 1 sibling, 1 reply; 15+ messages in thread From: Fabian Grünbichler @ 2017-01-19 11:59 UTC (permalink / raw) To: Paolo Bonzini; +Cc: qemu-devel, Alexandre DERUMIER, qemu-stable On Wed, Jan 18, 2017 at 05:30:17PM +0100, Paolo Bonzini wrote: > > > On 18/01/2017 17:19, Fabian Grünbichler wrote: > > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 Sense Key : Illegal Request [current] > > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 Add. Sense: Invalid field in cdb > > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 CDB: Write(10) 2a 00 0d d6 51 48 00 08 00 00 > > Jan 18 17:07:51 ubuntu kernel: blk_update_request: critical target error, dev sda, sector 232149320 > > Jan 18 17:07:51 ubuntu kernel: EXT4-fs warning (device sda1): ext4_end_bio:329: I/O error -121 writing to inode 125 (offset 0 size 0 starting block 29018921) > > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018409 > > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018410 > > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018411 > > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018412 > > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018413 > > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018414 > > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018415 > > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018416 > > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018417 > > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018418 > > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 Sense Key : Illegal Request [current] > > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 Add. Sense: Invalid field in cdb > > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 CDB: Write(10) 2a 00 0d d6 59 48 00 08 00 00 > > Jan 18 17:07:51 ubuntu kernel: blk_update_request: critical target error, dev sda, sector 232151368 > > Jan 18 17:07:51 ubuntu kernel: EXT4-fs warning (device sda1): ext4_end_bio:329: I/O error -121 writing to inode 125 (offset 0 size 0 starting block 29019177) > > Jan 18 17:07:52 ubuntu kernel: JBD2: Detected IO errors while flushing file data on sda1-8 > > Jan 18 17:07:58 ubuntu kernel: JBD2: Detected IO errors while flushing file data on sda1-8 > > > > > > strace (with some random grep-ing): > > [pid 1794] ioctl(19, SG_IO, {'S', SG_DXFER_TO_DEV, cmd[10]=[2a, 00, 0d, d6, 51, 48, 00, 08, 00, 00], mx_sb_len=252, iovec_count=17, dxfer_len=1048576, timeout=4294967295, flags=0x1, data[1048576]=["\0`\235=c\177\0\0\0\0\1\0\0\0\0\0\0`\236=c\177\0\0\0\0\1\0\0\0\0\0"...]}) = -1 EINVAL (Invalid argument) > > [pid 1794] ioctl(19, SG_IO, {'S', SG_DXFER_TO_DEV, cmd[10]=[2a, 00, 0d, d6, 59, 48, 00, 08, 00, 00], mx_sb_len=252, iovec_count=16, dxfer_len=1048576, timeout=4294967295, flags=0x1, data[1048576]=["\0`-=c\177\0\0\0\0\1\0\0\0\0\0\0`.=c\177\0\0\0\0\1\0\0\0\0\0"...]}) = -1 EINVAL (Invalid argument) > > This is useful, thanks. I suspect blk_rq_map_user_iov is failing, > meaning that the scatter/gather list has too many segments for the HBA > in the host. (The limit can be found in /sys/block/sda/queue/max_segments). limit is 168 for all the disks I tested with. > > This is consistent with your finding here: > > > disabling THP on the hypervisor host with > > > > # echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled > > > > allows reproducing the bug very reliably, shutting the VM down, then > > enabling THP (with 'always') and trying again makes it go away. > > because no THP means more memory fragmentation and thus more segments. it is also very easily reproducible with both THP enable and defrag set to madvise or always, if tested in fragmented- or low-memory conditions. my test host has 64G of memory, my test VM 4G, huge pages are 2k big if I simulate some memory load by repeatedly reserving 50G memory using stress-ng: # stress-ng --vm 50 --vm-bytes=1G --vm-hang 30 and then start the test VM and the dd-ing, I can see the big chunk of AnonHugePages allocated to the VM system grow: # grep -E 'AnonHugePages:[[:space:]]+[0-9]{5,} kB' /proc/$(pidof qemu-system-x86_64)/smaps up to about 3G (of 4G), and hit the issue. without the additional load and fragmentation using stress-ng, the AnonHugePages allocated to the qemu process grow to the expected 4G, and the issue does not occur. > > I'm not sure how to fix it, unfortunately. :( so this means either use non-transparent huge pages when using scsi-block (haven't verified but should work?), or use aggressive THP settings and/or always leave enough memory reserves? :-/ this is very unfortunate IMHO (and probably also not a very realistic usage scenario?) > > Paolo > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [Qemu-stable] Data corruption in Qemu 2.7.1 2017-01-19 11:59 ` Fabian Grünbichler @ 2017-01-24 9:35 ` Paolo Bonzini 0 siblings, 0 replies; 15+ messages in thread From: Paolo Bonzini @ 2017-01-24 9:35 UTC (permalink / raw) To: Fabian Grünbichler; +Cc: qemu-devel, Alexandre DERUMIER, qemu-stable On 19/01/2017 12:59, Fabian Grünbichler wrote: > > I'm not sure how to fix it, unfortunately. :( > > so this means either use non-transparent huge pages when using > scsi-block (haven't verified but should work?), or use aggressive THP > settings and/or always leave enough memory reserves? :-/ this is very > unfortunate IMHO (and probably also not a very realistic usage > scenario?) Yes, I agree. Unfortunately there is no API in Linux that lets you verify how many segments an iov is split into. Paolo ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2017-01-24 9:35 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2017-01-13 10:44 [Qemu-devel] Data corruption in Qemu 2.7.1 Peter Lieven 2017-01-17 6:40 ` Fam Zheng 2017-01-17 10:14 ` [Qemu-devel] [Qemu-stable] " Peter Lieven 2017-01-17 7:33 ` [Qemu-devel] " Alexandre DERUMIER 2017-01-17 8:03 ` [Qemu-devel] [Qemu-stable] " Fabian Grünbichler 2017-01-17 10:41 ` Paolo Bonzini 2017-01-17 11:22 ` Fabian Grünbichler 2017-01-17 15:03 ` Paolo Bonzini 2017-01-17 16:24 ` Paolo Bonzini 2017-01-18 11:50 ` Fabian Grünbichler 2017-01-18 16:19 ` Fabian Grünbichler 2017-01-18 16:30 ` Paolo Bonzini 2017-01-18 17:17 ` Fabian Grünbichler 2017-01-19 11:59 ` Fabian Grünbichler 2017-01-24 9:35 ` Paolo Bonzini
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).