qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
To: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Cc: kwolf@redhat.com, fujita.tomonori@lab.ntt.co.jp,
	qemu-devel@nongnu.org, hch@lst.de
Subject: Re: [Qemu-devel] [PATCH] This patch adds a new block driver : iSCSI
Date: Mon, 12 Sep 2011 10:14:08 +0100	[thread overview]
Message-ID: <20110912091408.GA3465@stefanha-thinkpad.localdomain> (raw)
In-Reply-To: <1315628610-28222-2-git-send-email-ronniesahlberg@gmail.com>

On Sat, Sep 10, 2011 at 02:23:30PM +1000, Ronnie Sahlberg wrote:

Looking good.  I think this is worth merging because it does offer
benefits over host iSCSI.

> +static void
> +iscsi_abort_task_cb(struct iscsi_context *iscsi, int status, void *command_data,
> +                    void *private_data)
> +{
> +}
> +
> +static void
> +iscsi_aio_cancel(BlockDriverAIOCB *blockacb)
> +{
> +    IscsiAIOCB *acb = (IscsiAIOCB *)blockacb;
> +    IscsiLun *iscsilun = acb->iscsilun;
> +
> +    acb->status = -ECANCELED;
> +    acb->common.cb(acb->common.opaque, acb->status);
> +    acb->canceled = 1;
> +
> +    iscsi_task_mgmt_abort_task_async(iscsilun->iscsi, acb->task,
> +                                     iscsi_abort_task_cb, NULL);
> +}

The asynchronous abort task call looks odd.  If a caller allocates a
buffer and issues a read request, then we need to make sure that the
request is really aborted by the time .bdrv_aio_cancel() returns.

If I understand the code correctly, iscsi_aio_cancel() returns
immediately but the read request will still be in progress.  That means
the caller could now free their data buffer and the read request will
overwrite that unallocated memory.

> +static void
> +iscsi_aio_write10_cb(struct iscsi_context *iscsi, int status,
> +                     void *command_data, void *opaque)
> +{
> +    IscsiAIOCB *acb = opaque;
> +
> +    trace_iscsi_aio_write10_cb(iscsi, status, acb, acb->canceled);
> +
> +    if (acb->buf != NULL) {
> +        free(acb->buf);
> +    }

Please just use g_free(acb->buf).  g_free(NULL) is defined as a nop so
the check isn't necessary.  Also, this code uses free(3) when it should
use g_free(3).

> +
> +    if (acb->canceled != 0) {
> +        qemu_aio_release(acb);
> +        scsi_free_scsi_task(acb->task);
> +        acb->task = NULL;
> +        return;
> +    }
> +
> +    acb->status = 0;
> +    if (status < 0) {
> +        error_report("Failed to write10 data to iSCSI lun. %s",
> +                     iscsi_get_error(iscsi));
> +        acb->status = -EIO;
> +    }
> +
> +    iscsi_schedule_bh(iscsi_readv_writev_bh_cb, acb);
> +    scsi_free_scsi_task(acb->task);
> +    acb->task = NULL;
> +}
> +
> +static int64_t sector_qemu2lun(int64_t sector, IscsiLun *iscsilun)
> +{
> +    return sector * BDRV_SECTOR_SIZE / iscsilun->block_size;
> +}
> +
> +static BlockDriverAIOCB *
> +iscsi_aio_writev(BlockDriverState *bs, int64_t sector_num,
> +                 QEMUIOVector *qiov, int nb_sectors,
> +                 BlockDriverCompletionFunc *cb,
> +                 void *opaque)
> +{
> +    IscsiLun *iscsilun = bs->opaque;
> +    struct iscsi_context *iscsi = iscsilun->iscsi;
> +    IscsiAIOCB *acb;
> +    size_t size;
> +    int fua = 0;
> +
> +    /* set FUA on writes when cache mode is write through */
> +    if (!(bs->open_flags & (BDRV_O_CACHE_WB|BDRV_O_NOCACHE))) {
> +        fua = 1;
> +    }

FUA needs to reflect the guest semantics - does this disk have an
enabled write cache?  When bs->open_flags has BDRV_O_CACHE_WB, then the
guest knows it needs to send flushes because there is a write cache:

if (!(bs->open_flags & BDRV_O_CACHE_WB)) {
    fua = 1;
}

BDRV_O_NOCACHE is just for local files and sets the O_DIRECT hint.  It
doesn't affect the cache semantics that the guest sees.

> +/*
> + * We support iscsi url's on the form
> + * iscsi://[<username>%<password>@]<host>[:<port>]/<targetname>/<lun>
> + */
> +static int iscsi_open(BlockDriverState *bs, const char *filename, int flags)
> +{
> +    IscsiLun *iscsilun = bs->opaque;
> +    struct iscsi_context *iscsi = NULL;
> +    struct iscsi_url *iscsi_url = NULL;
> +    struct IscsiTask task;
> +    int ret;
> +
> +    if ((BDRV_SECTOR_SIZE % 512) != 0) {
> +        error_report("iSCSI: Invalid BDRV_SECTOR_SIZE. "
> +                     "BDRV_SECTOR_SIZE(%lld) is not a multiple "
> +                     "of 512", BDRV_SECTOR_SIZE);
> +        return -EINVAL;
> +    }

Another way of saying this is:
QEMU_BUILD_BUG_ON(BDRV_SECTOR_SIZE % 512 != 0);

The advantage is that the build fails instead of waiting until iscsi is
used at runtime until the failure is detected.

What will happen if BDRV_SECTOR_SIZE is not a multiple of 512?

> +
> +    memset(iscsilun, 0, sizeof(IscsiLun));
> +
> +    /* Should really append the KVM name after the ':' here */
> +    iscsi = iscsi_create_context("iqn.2008-11.org.linux-kvm:");
> +    if (iscsi == NULL) {
> +        error_report("iSCSI: Failed to create iSCSI context.");
> +        ret = -ENOMEM;
> +        goto failed;
> +    }
> +
> +    iscsi_url = iscsi_parse_full_url(iscsi, filename);
> +    if (iscsi_url == NULL) {
> +        error_report("Failed to parse URL : %s %s", filename,
> +                     iscsi_get_error(iscsi));
> +        ret = -ENOMEM;

-EINVAL?

> +static BlockDriver bdrv_iscsi = {
> +    .format_name     = "iscsi",
> +    .protocol_name   = "iscsi",
> +
> +    .instance_size   = sizeof(IscsiLun),
> +    .bdrv_file_open  = iscsi_open,
> +    .bdrv_close      = iscsi_close,
> +
> +    .bdrv_getlength  = iscsi_getlength,
> +
> +    .bdrv_aio_readv  = iscsi_aio_readv,
> +    .bdrv_aio_writev = iscsi_aio_writev,
> +    .bdrv_aio_flush  = iscsi_aio_flush,

block.c does not emulate .bdrv_flush() using your .bdrv_aio_flush()
implementation.  I have sent a patch to add this emulation.  This will
turn bdrv_flush(iscsi_bs) into an actual operation instead of a nop :).

> diff --git a/trace-events b/trace-events
> index 3fdd60f..4e461e5 100644
> --- a/trace-events
> +++ b/trace-events
> @@ -494,3 +494,10 @@ escc_sunkbd_event_in(int ch) "Untranslated keycode %2.2x"
>  escc_sunkbd_event_out(int ch) "Translated keycode %2.2x"
>  escc_kbd_command(int val) "Command %d"
>  escc_sunmouse_event(int dx, int dy, int buttons_state) "dx=%d dy=%d buttons=%01x"
> +
> +# block/iscsi.c
> +disable iscsi_aio_write10_cb(void *iscsi, int status, void *acb, int canceled) "iscsi %p status %d acb %p canceled %d"
> +disable iscsi_aio_writev(void *iscsi, int64_t sector_num, int nb_sectors, void *opaque, void *acb) "iscsi %p sector_num %"PRId64" nb_sectors %d opaque %p acb %p"
> +disable iscsi_aio_read10_cb(void *iscsi, int status, void *acb, int canceled) "iscsi %p status %d acb %p canceled %d"
> +disable iscsi_aio_readv(void *iscsi, int64_t sector_num, int nb_sectors, void *opaque, void *acb) "iscsi %p sector_num %"PRId64" nb_sectors %d opaque %p acb %p"

It is no longer necessary to prefix trace event definitions with
"disable".

The stderr and simple backends now start up with all events disabled by
default.  The set of events can be set using the -trace
events=<events-file> option or on the QEMU monitor using set trace-event
<name> on.

Stefan

  reply	other threads:[~2011-09-12  9:14 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-09-10  4:23 [Qemu-devel] [PATCH] Add iSCSI support for QEMU Ronnie Sahlberg
2011-09-10  4:23 ` [Qemu-devel] [PATCH] This patch adds a new block driver : iSCSI Ronnie Sahlberg
2011-09-12  9:14   ` Stefan Hajnoczi [this message]
2011-09-14 14:36     ` Christoph Hellwig
2011-09-14 15:50       ` Stefan Hajnoczi
2011-09-16 15:53         ` Christoph Hellwig
2011-09-17  7:11           ` Stefan Hajnoczi
2011-09-14 22:51       ` ronnie sahlberg
2011-09-15  8:02         ` Daniel P. Berrange
2011-09-15  9:03         ` Kevin Wolf
2011-09-14 23:08     ` ronnie sahlberg
2011-09-15  6:04       ` Paolo Bonzini
2011-09-15  8:48         ` Dor Laor
2011-09-15  9:11           ` Paolo Bonzini
2011-09-15 11:27             ` ronnie sahlberg
2011-09-15 11:42             ` Dor Laor
2011-09-15 11:46               ` Christoph Hellwig
2011-09-15 12:01                 ` Dor Laor
2011-09-15 12:04                   ` Paolo Bonzini
2011-09-15 11:58               ` Paolo Bonzini
2011-09-15 12:34                 ` Orit Wasserman
2011-09-15 12:58                   ` Paolo Bonzini
2011-09-15 16:59                     ` Orit Wasserman
2011-09-15  9:44           ` Daniel P. Berrange
2011-09-15  9:10         ` Kevin Wolf
2011-09-15  9:39           ` Paolo Bonzini
2011-09-21  9:48     ` ronnie sahlberg
2011-09-23  9:15   ` Mark Wu
2011-09-23 10:16     ` Paolo Bonzini
2011-09-12  8:56 ` [Qemu-devel] [PATCH] Add iSCSI support for QEMU Kevin Wolf
2011-09-14 12:24   ` Orit Wasserman
2011-09-14 14:33     ` Christoph Hellwig
2011-09-14 14:37     ` Christoph Hellwig
2011-09-14 15:35       ` Stefan Hajnoczi
2011-09-14 15:40         ` Christoph Hellwig
2011-09-14 15:51           ` Stefan Hajnoczi
2011-09-14 16:36             ` Orit Wasserman
2011-09-15  6:06               ` Paolo Bonzini
2011-09-15  9:52                 ` Orit Wasserman
2011-09-15  9:55                   ` Paolo Bonzini
2011-09-15 10:10                     ` Kevin Wolf
2011-09-17 19:08                 ` Laurent Vivier
2011-09-18  7:43                   ` Paolo Bonzini
2011-09-14 16:37             ` Paolo Bonzini
2011-09-14 22:46     ` ronnie sahlberg
  -- strict thread matches above, loose matches on Subject: below --
2011-09-21  9:37 Ronnie Sahlberg
2011-09-21  9:37 ` [Qemu-devel] [PATCH] This patch adds a new block driver : iSCSI Ronnie Sahlberg
2011-09-21  9:45   ` Paolo Bonzini
2011-09-21  9:52     ` ronnie sahlberg
2011-09-27 20:08       ` ronnie sahlberg
2011-09-28  5:54         ` Paolo Bonzini
2011-09-29  6:54   ` Stefan Hajnoczi
2011-10-09 20:46     ` ronnie sahlberg
2011-10-13  9:46       ` ronnie sahlberg
2011-10-13  9:48         ` Paolo Bonzini
2011-10-13  9:54         ` Stefan Hajnoczi
2011-10-13 10:01         ` Daniel P. Berrange
2011-10-13 10:55           ` Daniel P. Berrange
2011-10-13 10:52         ` Kevin Wolf
2011-10-24 13:33   ` Kevin Wolf
2011-10-25  8:04     ` ronnie sahlberg
2011-10-25  8:17       ` Kevin Wolf
2011-10-25  8:23         ` ronnie sahlberg
2011-10-25  8:46           ` Kevin Wolf
2011-10-28 10:46   ` Zhi Yong Wu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110912091408.GA3465@stefanha-thinkpad.localdomain \
    --to=stefanha@linux.vnet.ibm.com \
    --cc=fujita.tomonori@lab.ntt.co.jp \
    --cc=hch@lst.de \
    --cc=kwolf@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=ronniesahlberg@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).