From: Paolo Bonzini <pbonzini@redhat.com>
To: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
Cc: qemu-devel <qemu-devel@nongnu.org>,
Alexandre DERUMIER <aderumier@odiso.com>,
qemu-stable <qemu-stable@nongnu.org>
Subject: Re: [Qemu-devel] [Qemu-stable] Data corruption in Qemu 2.7.1
Date: Wed, 18 Jan 2017 17:30:17 +0100
Message-ID: <5de45f2f-d83b-64d2-6406-dc09fd4c455f@redhat.com>
In-Reply-To: <20170118161941.g72fzppyyc3pdwkf@nora.maurer-it.com>

On 18/01/2017 17:19, Fabian Grünbichler wrote:
> Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 Sense Key : Illegal Request [current]
> Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 Add. Sense: Invalid field in cdb
> Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 CDB: Write(10) 2a 00 0d d6 51 48 00 08 00 00
> Jan 18 17:07:51 ubuntu kernel: blk_update_request: critical target error, dev sda, sector 232149320
> Jan 18 17:07:51 ubuntu kernel: EXT4-fs warning (device sda1): ext4_end_bio:329: I/O error -121 writing to inode 125 (offset 0 size 0 starting block 29018921)
> Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018409
> Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018410
> Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018411
> Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018412
> Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018413
> Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018414
> Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018415
> Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018416
> Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018417
> Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018418
> Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 Sense Key : Illegal Request [current]
> Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 Add. Sense: Invalid field in cdb
> Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 CDB: Write(10) 2a 00 0d d6 59 48 00 08 00 00
> Jan 18 17:07:51 ubuntu kernel: blk_update_request: critical target error, dev sda, sector 232151368
> Jan 18 17:07:51 ubuntu kernel: EXT4-fs warning (device sda1): ext4_end_bio:329: I/O error -121 writing to inode 125 (offset 0 size 0 starting block 29019177)
> Jan 18 17:07:52 ubuntu kernel: JBD2: Detected IO errors while flushing file data on sda1-8
> Jan 18 17:07:58 ubuntu kernel: JBD2: Detected IO errors while flushing file data on sda1-8
>
>
> strace (with some random grep-ing):
> [pid 1794] ioctl(19, SG_IO, {'S', SG_DXFER_TO_DEV, cmd[10]=[2a, 00, 0d, d6, 51, 48, 00, 08, 00, 00], mx_sb_len=252, iovec_count=17, dxfer_len=1048576, timeout=4294967295, flags=0x1, data[1048576]=["\0`\235=c\177\0\0\0\0\1\0\0\0\0\0\0`\236=c\177\0\0\0\0\1\0\0\0\0\0"...]}) = -1 EINVAL (Invalid argument)
> [pid 1794] ioctl(19, SG_IO, {'S', SG_DXFER_TO_DEV, cmd[10]=[2a, 00, 0d, d6, 59, 48, 00, 08, 00, 00], mx_sb_len=252, iovec_count=16, dxfer_len=1048576, timeout=4294967295, flags=0x1, data[1048576]=["\0`-=c\177\0\0\0\0\1\0\0\0\0\0\0`.=c\177\0\0\0\0\1\0\0\0\0\0"...]}) = -1 EINVAL (Invalid argument)
This is useful, thanks. I suspect blk_rq_map_user_iov is failing, which
would mean that the scatter/gather list has too many segments for the HBA
in the host (the limit can be found in /sys/block/sda/queue/max_segments).
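
For anyone who wants to check this on their own host, here is a minimal
Python sketch that reads the queue limits from sysfs. The helper name
read_queue_limits is made up for illustration, and printing
max_segment_size and max_sectors_kb alongside max_segments is my own
choice; the device name sda is simply the one used above and may differ
on your host.

```python
from pathlib import Path


def read_queue_limits(device="sda", sysfs_root="/sys/block"):
    """Return selected block-queue limits for `device` as a dict.

    Hypothetical helper: attributes that are missing or unreadable
    are reported as None instead of raising.
    """
    queue = Path(sysfs_root) / device / "queue"
    limits = {}
    for name in ("max_segments", "max_segment_size", "max_sectors_kb"):
        try:
            limits[name] = int((queue / name).read_text().strip())
        except (OSError, ValueError):
            limits[name] = None
    return limits


if __name__ == "__main__":
    # "sda" is an assumption taken from the path mentioned above.
    for name, value in read_queue_limits("sda").items():
        print(f"{name}: {value}")
```

If max_segments turns out to be small, a 1 MiB request built from many
4 KiB pages is a plausible candidate for exceeding it.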
This is consistent with your finding here:
> disabling THP on the hypervisor host with
>
> # echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled
>
> allows reproducing the bug very reliably, shutting the VM down, then
> enabling THP (with 'always') and trying again makes it go away.
because without THP the memory is more fragmented, and thus the same
request needs more segments.
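
To put rough numbers on that: the failing WRITE(10) in the strace carries
0x0800 blocks, which at 512 bytes per block matches the dxfer_len=1048576
shown there. The sketch below decodes the CDB and counts how many pages a
buffer of that size touches, which is an upper bound on the segment count
if no two neighbouring pages happen to be physically contiguous. This is
a simplified model and not what the kernel computes: the helper names are
made up, the 512-byte block size is inferred from the strace, the buffer
is assumed to be page-aligned, and real merging also depends on limits
such as max_segment_size.

```python
import struct


def parse_write10(cdb: bytes):
    """Return (lba, blocks) from a 10-byte SCSI WRITE(10) CDB."""
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a WRITE(10) CDB")
    (lba,) = struct.unpack_from(">I", cdb, 2)     # bytes 2..5, big-endian
    (blocks,) = struct.unpack_from(">H", cdb, 7)  # bytes 7..8, big-endian
    return lba, blocks


def worst_case_segments(length, page_size, offset=0):
    """Number of pages touched by a buffer of `length` bytes at `offset`.

    Upper bound on segments under the assumption that adjacent pages
    are never physically contiguous (simplified model).
    """
    if length == 0:
        return 0
    first = offset // page_size
    last = (offset + length - 1) // page_size
    return last - first + 1


# First failing CDB from the strace above.
cdb = bytes.fromhex("2a000dd6514800080000")
lba, blocks = parse_write10(cdb)
nbytes = blocks * 512  # assumed 512-byte logical blocks

print(lba, nbytes)                                   # 232149320 1048576
print(worst_case_segments(nbytes, 4096))             # 256
print(worst_case_segments(nbytes, 2 * 1024 * 1024))  # 1
```

The decoded LBA matches the sector in the blk_update_request line of the
guest log. In this model the 4 KiB case needs up to 256 segments, while a
buffer that sits inside one 2 MiB huge page is physically contiguous.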
I'm not sure how to fix it, unfortunately. :(
Paolo