From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:37162)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1Upg74-0006kR-2E for qemu-devel@nongnu.org;
 Thu, 20 Jun 2013 10:45:33 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1Upg6w-00079u-M0 for qemu-devel@nongnu.org;
 Thu, 20 Jun 2013 10:45:25 -0400
Received: from mx1.redhat.com ([209.132.183.28]:42321)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1Upg6w-00079Z-F4 for qemu-devel@nongnu.org;
 Thu, 20 Jun 2013 10:45:18 -0400
Message-ID: <51C3160A.3030003@redhat.com>
Date: Thu, 20 Jun 2013 16:47:38 +0200
From: Laszlo Ersek
MIME-Version: 1.0
References: <20130616234827.23764.98763.malonedeb@wampee.canonical.com>
 <20130618180122.22327.47349.malone@gac.canonical.com>
 <51C0B6D8.5090900@redhat.com> <51C17724.5040309@redhat.com>
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [Bug 1191606] Re: qemu crashes with iscsi initiator (libiscsi) when using virtio
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: ronnie sahlberg
Cc: Bug 1191606 <1191606@bugs.launchpad.net>, qemu-devel

On 06/20/13 15:33, ronnie sahlberg wrote:
> http://pastebin.com/EuwZPna1
>
> Last few thousand lines from the log with your patch.
>
>
> The crash happens immediately after qemu has called out to iscsi_ioctl
> with SG_IO to read the serial numbers vpd page.
> We get the reply back from the target but as soon as ioctl_cb returns we crash.
> If you comment out SG_IO in iscsi_ioctl then the crash does not happen
> (but the qemu does not get serial number either)
>
>
> I will look more into it tonight.

  virtqueue_map_sg: mapped gpa=00000000790a9000 at hva=0x7f0cb10a9000 for length=4, is_write=1 (out: data)
  virtqueue_map_sg: mapped gpa=000000007726fc70 at hva=0x7f0caf26fc70 for length=96, is_write=1 (out: sense)
  virtqueue_map_sg: mapped gpa=00000000764e5aa0 at hva=0x7f0cae4e5aa0 for length=16, is_write=1 (out: errors, data_len, sense_len, residual)
  virtqueue_map_sg: mapped gpa=00000000764e5adc at hva=0x7f0cae4e5adc for length=1, is_write=1 (out: status)
  virtqueue_map_sg: mapped gpa=00000000764e5a90 at hva=0x7f0cae4e5a90 for length=16, is_write=0 (in: type, ioprio, sector)
  virtqueue_map_sg: mapped gpa=000000007ab80578 at hva=0x7f0cb2b80578 for length=6, is_write=0 (in: cmd)
  virtio_blk_handle_request: type=0x00000002
  virtqueue_fill: unmapping hva=0x7f0c24008000 for length=4, access_len=1, is_write=1
  Bad ram pointer 0x7f0c24008000

This looks related, in virtio_blk_handle_scsi():

    } else if (req->elem.in_num > 3) {
        /*
         * If we have more than 3 input segments the guest wants to actually
         * read data.
         */
        hdr.dxfer_direction = SG_DXFER_FROM_DEV;
        hdr.iovec_count = req->elem.in_num - 3;
        for (i = 0; i < hdr.iovec_count; i++)
            hdr.dxfer_len += req->elem.in_sg[i].iov_len;

        hdr.dxferp = req->elem.in_sg;
    } else {

This sets
- "hdr.iovec_count" to 1,
- "hdr.dxfer_len" to 4,
- "hdr.dxferp" as shown above,

For "struct sg_io_hdr" (which is the type of "hdr"), the typedef &
documentation are in :

    unsigned short iovec_count; /* [i] 0 implies no scatter gather */
    void __user *dxferp;        /* [i], [*io] points to data transfer memory
                                   or scatter gather list */

Now what we're seeing is a corruption of "req->elem.in_sg[0].iov_base",
whose address equals that of "req->elem.in_sg" (it's at offset 0 in the
struct at subscript #0 in the array).

  virtqueue_map_sg: mapped gpa=00000000790a9000 at hva=0x7f0cb10a9000 for length=4, is_write=1
  [...]
  virtio_blk_handle_request: type=0x00000002
  virtqueue_fill: unmapping hva=0x7f0c24008000 for length=4, access_len=1, is_write=1
  Bad ram pointer 0x7f0c24008000

First I don't understand how access_len can only be "1". But, in any
case, if the "req->elem.in_sg[0].iov_base" pointer is stored in
little-endian order, and the kernel (or iscsi_scsi_command_async()?) for
whatever reason misinterprets "hdr.dxferp" to point at an actual receive
buffer (instead of an iovec array), that would be consistent with the
symptoms:

  0x7f0cb10a9000 <--- original value of req->elem.in_sg[0].iov_base
  0x7f0c24008000 <--- corrupted value
        ^^^^^^^^ <--- 4 low bytes overwritten by SCSI data

Laszlo
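
P.S. To make the hypothesis above concrete, here is a minimal standalone
sketch (not QEMU or libiscsi code). It assumes a 64-bit little-endian host
where "iov_base" is the first member of "struct iovec". The helper names
"misdirected_write" and "corrupted_iov_base" are made up for illustration,
and the 4 payload bytes are chosen only so that the result reproduces the
value seen in the log; I have not verified what the target actually sent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>   /* struct iovec */

/*
 * Hypothetical consumer that wrongly treats "dxferp" as a flat receive
 * buffer and copies the payload straight to that address.
 */
static void misdirected_write(void *dxferp, const unsigned char *payload,
                              size_t len)
{
    memcpy(dxferp, payload, len);
}

/*
 * Builds a one-element iovec array whose iov_base holds "original", lets
 * the bogus consumer write 4 payload bytes to the array's own address, and
 * returns the resulting (corrupted) iov_base value.
 */
static uint64_t corrupted_iov_base(uint64_t original,
                                   const unsigned char payload[4])
{
    struct iovec in_sg[1];

    /* Assumption: iov_base sits at offset 0, so &in_sg[0].iov_base == in_sg. */
    assert(offsetof(struct iovec, iov_base) == 0);
    /* Assumption: 64-bit pointers. */
    assert(sizeof(void *) == 8);

    in_sg[0].iov_base = (void *)(uintptr_t)original;
    in_sg[0].iov_len = 4;

    /* Like hdr.dxferp = req->elem.in_sg, but consumed as a data buffer. */
    misdirected_write(in_sg, payload, 4);

    return (uint64_t)(uintptr_t)in_sg[0].iov_base;
}
```

With the payload bytes 00 80 00 24 (in memory order), calling
corrupted_iov_base(0x7f0cb10a9000, payload) on such a host should yield
0x7f0c24008000: the upper 4 bytes of the pointer survive and the lower 4
are replaced, which matches the "Bad ram pointer" value above.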