qemu-devel.nongnu.org archive mirror
From: Paolo Bonzini <pbonzini@redhat.com>
To: Ming Lin <mlin@kernel.org>
Cc: qemu-devel@nongnu.org, Christoph Hellwig <hch@lst.de>,
	linux-nvme@lists.infradead.org,
	virtualization@lists.linux-foundation.org
Subject: Re: [Qemu-devel] [RFC PATCH 0/9] vhost-nvme: new qemu nvme backend using nvme target
Date: Mon, 23 Nov 2015 15:14:55 +0100
Message-ID: <56531F5F.3050709@redhat.com>
In-Reply-To: <1448266667.18175.5.camel@hasee>



On 23/11/2015 09:17, Ming Lin wrote:
> On Sat, 2015-11-21 at 14:11 +0100, Paolo Bonzini wrote:
>>
>> On 20/11/2015 01:20, Ming Lin wrote:
>>> One improvement could be to use Google's NVMe vendor extension that
>>> I sent in another thread, also here:
>>> https://git.kernel.org/cgit/linux/kernel/git/mlin/linux.git/log/?h=nvme-google-ext
>>>
>>> Qemu side:
>>> http://www.minggr.net/cgit/cgit.cgi/qemu/log/?h=vhost-nvme.0
>>> Kernel side also here:
>>> https://git.kernel.org/cgit/linux/kernel/git/mlin/linux.git/log/?h=vhost-nvme.0
>>
>> How much do you get with vhost-nvme plus vendor extension, compared to
>> 190 MB/s for QEMU?
> 
> There are still some bugs. I'll update.

Sure.

>> Note that in all likelihood, QEMU can actually do better than 190 MB/s,
>> and gain more parallelism too, by moving the processing of the
>> ioeventfds to a separate thread.  This is similar to
>> hw/block/dataplane/virtio-blk.c.
>>
>> It's actually pretty easy to do.  Even though
>> hw/block/dataplane/virtio-blk.c is still using some old APIs, all memory
>> access in QEMU is now thread-safe.  I have pending patches for 2.6 that
>> cut that file down to a mere 200 lines of code; NVMe would probably be
>> about the same.
> 
> Is there a git tree for your patches?

No, not yet.  I'll post them today or tomorrow and will make sure to Cc you.

> Did you mean something like the pseudocode below?
> 1. Do I need an iothread for each cq/sq?
> 2. Do I need an AioContext for each cq/sq?
> 
>  hw/block/nvme.c | 32 ++++++++++++++++++++++++++++++--
>  hw/block/nvme.h |  8 ++++++++
>  2 files changed, 38 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index f27fd35..fed4827 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -28,6 +28,8 @@
>  #include "sysemu/sysemu.h"
>  #include "qapi/visitor.h"
>  #include "sysemu/block-backend.h"
> +#include "sysemu/iothread.h"
> +#include "qom/object_interfaces.h"
>  
>  #include "nvme.h"
>  
> @@ -558,9 +560,22 @@ static void nvme_init_cq_eventfd(NvmeCQueue *cq)
>      uint16_t offset = (cq->cqid*2+1) * (4 << NVME_CAP_DSTRD(n->bar.cap));
>  
>      event_notifier_init(&cq->notifier, 0);
> -    event_notifier_set_handler(&cq->notifier, nvme_cq_notifier);
>      memory_region_add_eventfd(&n->iomem,
>          0x1000 + offset, 4, false, 0, &cq->notifier);
> +
> +    object_initialize(&cq->internal_iothread_obj,
> +                      sizeof(cq->internal_iothread_obj),
> +                      TYPE_IOTHREAD);
> +    user_creatable_complete(OBJECT(&cq->internal_iothread_obj), &error_abort);

For now, you have to use one iothread for all cq/sq of a single NVMe
device; multiqueue block layer is planned for 2.7 or 2.8.  Otherwise
yes, it's very close to just these changes.
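
As a rough illustration of that single-iothread model, the sketch below
has one worker thread service the doorbell notifiers of several queues.
It is not QEMU code: plain Linux eventfd(2), poll(2) and pthreads stand
in for EventNotifier, AioContext and IOThread, and every name in it
(DemoCtrl, demo_iothread, demo_kick, demo_run, NUM_QUEUES) is
hypothetical.

```c
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/eventfd.h>
#include <unistd.h>

#define NUM_QUEUES 4                /* hypothetical: e.g. 2 SQs + 2 CQs */

typedef struct {
    int fds[NUM_QUEUES];            /* one eventfd per queue doorbell */
    int stop_fd;                    /* signalled to shut the worker down */
    uint64_t kicks[NUM_QUEUES];     /* doorbell writes seen per queue */
} DemoCtrl;

/* The single worker thread that services every queue notifier. */
static void *demo_iothread(void *opaque)
{
    DemoCtrl *c = opaque;
    struct pollfd pfd[NUM_QUEUES + 1];

    for (int i = 0; i < NUM_QUEUES; i++) {
        pfd[i].fd = c->fds[i];
        pfd[i].events = POLLIN;
    }
    pfd[NUM_QUEUES].fd = c->stop_fd;
    pfd[NUM_QUEUES].events = POLLIN;

    for (;;) {
        if (poll(pfd, NUM_QUEUES + 1, -1) < 0) {
            if (errno == EINTR) {
                continue;
            }
            return NULL;
        }
        /* Sample the stop request first, then drain every doorbell, so
         * kicks written before the stop request are never lost. */
        int stopping = pfd[NUM_QUEUES].revents & POLLIN;
        for (int i = 0; i < NUM_QUEUES; i++) {
            uint64_t n;
            /* Non-blocking read: fails with EAGAIN when the queue is idle. */
            if (read(c->fds[i], &n, sizeof(n)) == sizeof(n)) {
                c->kicks[i] += n;   /* a real handler would process the queue */
            }
        }
        if (stopping) {
            return NULL;
        }
    }
}

/* Ring a doorbell, as a guest MMIO write would through an ioeventfd. */
static void demo_kick(int fd)
{
    uint64_t one = 1;

    if (write(fd, &one, sizeof(one)) != sizeof(one)) {
        abort();
    }
}

/* Kick every queue per_queue times, stop the worker, return total kicks. */
static uint64_t demo_run(unsigned per_queue)
{
    DemoCtrl c = { .stop_fd = -1 };
    pthread_t tid;
    uint64_t total = 0;

    for (int i = 0; i < NUM_QUEUES; i++) {
        c.fds[i] = eventfd(0, EFD_NONBLOCK);
        if (c.fds[i] < 0) {
            abort();
        }
    }
    c.stop_fd = eventfd(0, EFD_NONBLOCK);
    if (c.stop_fd < 0 || pthread_create(&tid, NULL, demo_iothread, &c) != 0) {
        abort();
    }

    for (unsigned k = 0; k < per_queue; k++) {
        for (int i = 0; i < NUM_QUEUES; i++) {
            demo_kick(c.fds[i]);
        }
    }
    demo_kick(c.stop_fd);
    pthread_join(tid, NULL);

    for (int i = 0; i < NUM_QUEUES; i++) {
        total += c.kicks[i];
        close(c.fds[i]);
    }
    close(c.stop_fd);
    return total;
}
```

Calling demo_run(3) from a test program kicks each of the four queues
three times and returns the number of kicks the worker observed.  In
QEMU the equivalent wiring would go through aio_set_event_notifier() on
the iothread's AioContext, as in the quoted diff.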

If you use "-object iothread,id=NN" and an iothread property, you can
also use an N:M model with multiple disks attached to the same iothread.
Defining the iothread property looks like this:

	object_property_add_link(obj, "iothread", TYPE_IOTHREAD,
				(Object **)&s->conf.iothread,
				qdev_prop_allow_set_link_before_realize,
				OBJ_PROP_LINK_UNREF_ON_RELEASE, NULL);

Thanks,

Paolo

> +    cq->iothread = &cq->internal_iothread_obj;
> +    cq->ctx = iothread_get_aio_context(cq->iothread);
> +    //Question: Need a conf.blk for each cq/sq???
> +    //blk_set_aio_context(cq->conf->conf.blk, cq->ctx);
> +    aio_context_acquire(cq->ctx);
> +    aio_set_event_notifier(cq->ctx, &cq->notifier, true,
> +                           nvme_cq_notifier);
> +    aio_context_release(cq->ctx);
>  }
>  
>  static void nvme_sq_notifier(EventNotifier *e)
> @@ -578,9 +593,22 @@ static void nvme_init_sq_eventfd(NvmeSQueue *sq)
>      uint16_t offset = sq->sqid * 2 * (4 << NVME_CAP_DSTRD(n->bar.cap));
>  
>      event_notifier_init(&sq->notifier, 0);
> -    event_notifier_set_handler(&sq->notifier, nvme_sq_notifier);
>      memory_region_add_eventfd(&n->iomem,
>          0x1000 + offset, 4, false, 0, &sq->notifier);
> +
> +    object_initialize(&sq->internal_iothread_obj,
> +                      sizeof(sq->internal_iothread_obj),
> +                      TYPE_IOTHREAD);
> +    user_creatable_complete(OBJECT(&sq->internal_iothread_obj), &error_abort);
> +    sq->iothread = &sq->internal_iothread_obj;
> +    sq->ctx = iothread_get_aio_context(sq->iothread);
> +    //Question: Need a conf.blk for each cq/sq???
> +    //blk_set_aio_context(sq->conf->conf.blk, sq->ctx);
> +
> +    aio_context_acquire(sq->ctx);
> +    aio_set_event_notifier(sq->ctx, &sq->notifier, true,
> +                           nvme_sq_notifier);
> +    aio_context_release(sq->ctx);
>  }
>  
>  static uint16_t nvme_set_db_memory(NvmeCtrl *n, const NvmeCmd *cmd)
> diff --git a/hw/block/nvme.h b/hw/block/nvme.h
> index 608f202..171ee0b 100644
> --- a/hw/block/nvme.h
> +++ b/hw/block/nvme.h
> @@ -667,6 +667,10 @@ typedef struct NvmeSQueue {
>       * do not go over this value will not result in MMIO writes (but will
>       * still write the tail pointer to the "db_addr" location above). */
>      uint64_t    eventidx_addr;
> +
> +    IOThread *iothread;
> +    IOThread internal_iothread_obj;
> +    AioContext *ctx;
>      EventNotifier notifier;
>  } NvmeSQueue;
>  
> @@ -690,6 +694,10 @@ typedef struct NvmeCQueue {
>       * do not go over this value will not result in MMIO writes (but will
>       * still write the head pointer to the "db_addr" location above). */
>      uint64_t    eventidx_addr;
> +
> +    IOThread *iothread;
> +    IOThread internal_iothread_obj;
> +    AioContext *ctx;
>      EventNotifier notifier;
>  } NvmeCQueue;
>  
> 
>>
>> Paolo

