From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:37162)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <lersek@redhat.com>) id 1Upg74-0006kR-2E
	for qemu-devel@nongnu.org; Thu, 20 Jun 2013 10:45:33 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <lersek@redhat.com>) id 1Upg6w-00079u-M0
	for qemu-devel@nongnu.org; Thu, 20 Jun 2013 10:45:25 -0400
Received: from mx1.redhat.com ([209.132.183.28]:42321)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <lersek@redhat.com>) id 1Upg6w-00079Z-F4
	for qemu-devel@nongnu.org; Thu, 20 Jun 2013 10:45:18 -0400
Message-ID: <51C3160A.3030003@redhat.com>
Date: Thu, 20 Jun 2013 16:47:38 +0200
From: Laszlo Ersek <lersek@redhat.com>
MIME-Version: 1.0
References: <20130616234827.23764.98763.malonedeb@wampee.canonical.com>
	<20130618180122.22327.47349.malone@gac.canonical.com>
	<51C0B6D8.5090900@redhat.com>
	<CAN05THSA98Nk3x_5rwi9iEFSGs+wS9FJiCHGN3dKfPM20rzbeg@mail.gmail.com>
	<51C17724.5040309@redhat.com>
	<CAN05THQ8D6DyUJNAZga0KHj031GWLtZETYhBVBrBwVp2dZFmQw@mail.gmail.com>
In-Reply-To: <CAN05THQ8D6DyUJNAZga0KHj031GWLtZETYhBVBrBwVp2dZFmQw@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [Bug 1191606] Re: qemu crashes with iscsi
 initiator (libiscsi) when using virtio
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: ronnie sahlberg <ronniesahlberg@gmail.com>
Cc: Bug 1191606 <1191606@bugs.launchpad.net>, qemu-devel <qemu-devel@nongnu.org>

On 06/20/13 15:33, ronnie sahlberg wrote:
> http://pastebin.com/EuwZPna1
> 
> Last few thousand lines from the log with your patch.
> 
> 
> The crash happens immediately after qemu has called out to iscsi_ioctl
> with SG_IO to read the serial numbers vpd page.
> We get the reply back from the target but as soon as ioctl_cb returns we crash.
> If you comment out SG_IO in iscsi_ioctl then the crash does not happen
> (but then qemu does not get the serial number either)
> 
> 
> I will look more into it tonight.

  virtqueue_map_sg: mapped gpa=00000000790a9000 at hva=0x7f0cb10a9000 for length=4, is_write=1  (out: data)
  virtqueue_map_sg: mapped gpa=000000007726fc70 at hva=0x7f0caf26fc70 for length=96, is_write=1 (out: sense)
  virtqueue_map_sg: mapped gpa=00000000764e5aa0 at hva=0x7f0cae4e5aa0 for length=16, is_write=1 (out: errors, data_len, sense_len, residual)
  virtqueue_map_sg: mapped gpa=00000000764e5adc at hva=0x7f0cae4e5adc for length=1, is_write=1  (out: status)
  virtqueue_map_sg: mapped gpa=00000000764e5a90 at hva=0x7f0cae4e5a90 for length=16, is_write=0 (in: type, ioprio, sector)
  virtqueue_map_sg: mapped gpa=000000007ab80578 at hva=0x7f0cb2b80578 for length=6, is_write=0  (in: cmd)
  virtio_blk_handle_request: type=0x00000002
  virtqueue_fill: unmapping hva=0x7f0c24008000 for length=4, access_len=1, is_write=1
  Bad ram pointer 0x7f0c24008000

This looks related, in virtio_blk_handle_scsi():

    } else if (req->elem.in_num > 3) {
        /*
         * If we have more than 3 input segments the guest wants to actually
         * read data.
         */
        hdr.dxfer_direction = SG_DXFER_FROM_DEV;
        hdr.iovec_count = req->elem.in_num - 3;
        for (i = 0; i < hdr.iovec_count; i++)
            hdr.dxfer_len += req->elem.in_sg[i].iov_len;

        hdr.dxferp = req->elem.in_sg;
    } else {

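For the request logged above (four writable segments, the first being
the 4-byte data buffer), this sets
- "hdr.iovec_count" to 1,
- "hdr.dxfer_len" to 4,
- "hdr.dxferp" to "req->elem.in_sg", that is, to the iovec array itself
  rather than to a data buffer.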
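
As a rough sketch of that arithmetic (not the QEMU code itself): the
struct "xfer_fields" and the function "fill_read_fields" below are
hypothetical stand-ins for the three "hdr" fields and for the branch
quoted above. It assumes, as the branch does, that the last three
writable segments carry sense and status information rather than data.

```c
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical stand-in for the three sg_io_hdr fields discussed here. */
struct xfer_fields {
    unsigned short iovec_count;
    unsigned int dxfer_len;
    void *dxferp;
};

/*
 * Mirrors the "in_num > 3" branch: every writable segment except the
 * trailing three is treated as part of the data transfer.
 * The caller must pass in_num > 3.
 */
static struct xfer_fields fill_read_fields(struct iovec *in_sg,
                                           unsigned int in_num)
{
    struct xfer_fields f = { 0, 0, NULL };
    unsigned int i;

    f.iovec_count = (unsigned short)(in_num - 3);
    for (i = 0; i < f.iovec_count; i++) {
        f.dxfer_len += (unsigned int)in_sg[i].iov_len;
    }
    /* Points at the iovec array, not at the data buffer. */
    f.dxferp = in_sg;
    return f;
}
```

With the four segment lengths from the log (4, 96, 16, 1), this yields
an iovec count of 1 and a transfer length of 4.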

For "struct sg_io_hdr" (which is the type of "hdr"), the typedef &
documentation are in <include/scsi/sg.h>:

    unsigned short iovec_count; /* [i] 0 implies no scatter gather */

    void __user *dxferp;        /* [i], [*io] points to data transfer memory
                                              or scatter gather list */
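
Read that way, whoever consumes the header has to check "iovec_count"
before touching "dxferp". A minimal sketch of such a consumer follows;
"deliver_data" is a hypothetical helper written for illustration, not
kernel, QEMU, or libiscsi code.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Hypothetical consumer: copies up to dxfer_len bytes of response data
 * to wherever dxferp says it should go.
 *   iovec_count == 0: dxferp is the receive buffer itself.
 *   iovec_count  > 0: dxferp is an array of iovec_count struct iovec.
 * Returns the number of bytes copied.
 */
static size_t deliver_data(void *dxferp, unsigned short iovec_count,
                           size_t dxfer_len, const void *data, size_t len)
{
    const unsigned char *src = data;
    const struct iovec *iov;
    size_t done = 0;
    unsigned short i;

    if (len > dxfer_len) {
        len = dxfer_len;
    }
    if (iovec_count == 0) {
        memcpy(dxferp, src, len);
        return len;
    }
    iov = dxferp;
    for (i = 0; i < iovec_count && done < len; i++) {
        size_t n = len - done;
        if (n > iov[i].iov_len) {
            n = iov[i].iov_len;
        }
        /* Write through iov_base; the iovec entries stay untouched. */
        memcpy(iov[i].iov_base, src + done, n);
        done += n;
    }
    return done;
}
```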

Now what we're seeing is a corruption of "req->elem.in_sg[0].iov_base",
whose address equals that of "req->elem.in_sg" (it's at offset 0 in the
struct at subscript #0 in the array).

  virtqueue_map_sg: mapped gpa=00000000790a9000 at hva=0x7f0cb10a9000 for length=4, is_write=1
  [...]
  virtio_blk_handle_request: type=0x00000002
  virtqueue_fill: unmapping hva=0x7f0c24008000 for length=4, access_len=1, is_write=1
  Bad ram pointer 0x7f0c24008000

First, I don't understand how access_len can be only "1". But, in any
case, if the "req->elem.in_sg[0].iov_base" pointer is stored in
little-endian order, and the kernel (or iscsi_scsi_command_async()?) for
whatever reason misinterprets "hdr.dxferp" to point at an actual receive
buffer (instead of an iovec array), that would be consistent with the
symptoms:

  0x7f0cb10a9000 <--- original value of req->elem.in_sg[0].iov_base
  0x7f0c24008000 <--- corrupted value
        ^^^^^^^^ <--- 4 low bytes overwritten by SCSI data
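
The hypothesis can be reproduced in isolation. The sketch below is only
an illustration under stated assumptions: a 64-bit little-endian host,
a hypothetical consumer ("misread_as_flat_buffer") that ignores the
iovec count, and a 4-byte payload of 00 80 00 24 chosen because it
reproduces the logged value. Those bytes would also be consistent with
the start of a Unit Serial Number VPD page (page code 0x80), but that
is a guess, not something the log proves.

```c
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Deliberately wrong, hypothetical consumer: treats dxferp as the
 * receive buffer even though it points at an iovec array.
 */
static void misread_as_flat_buffer(void *dxferp, const void *data,
                                   size_t len)
{
    memcpy(dxferp, data, len);
}

/*
 * Builds a one-element iovec array, lets the bad consumer write 4 bytes
 * over its start, and returns the resulting in_sg[0].iov_base.
 * Assumes a 64-bit little-endian host, where iov_base sits at offset 0
 * and its low 4 bytes come first in memory.
 */
static uintptr_t clobbered_iov_base(uintptr_t original_base,
                                    const unsigned char data[4])
{
    struct iovec in_sg[1];

    in_sg[0].iov_base = (void *)original_base;
    in_sg[0].iov_len = 4;
    misread_as_flat_buffer(in_sg, data, 4);
    return (uintptr_t)in_sg[0].iov_base;
}
```

Calling clobbered_iov_base(0x7f0cb10a9000, ...) with the payload above
gives 0x7f0c24008000 on such a host: the upper bytes survive and the
low 4 bytes are replaced, matching the "Bad ram pointer" in the log.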

Laszlo