qemu-devel.nongnu.org archive mirror
From: Avi Kivity <avi@redhat.com>
To: Christoph Hellwig <hch@lst.de>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 2/2] raw-posix: add Linux native AIO support
Date: Fri, 21 Aug 2009 12:53:49 +0300
Message-ID: <4A8E6EAD.3030809@redhat.com>
In-Reply-To: <20090820145835.GB24183@lst.de>

On 08/20/2009 05:58 PM, Christoph Hellwig wrote:
> Now that we have a nicer interface to work against we can add Linux native
> AIO support.  It's an extremely thin layer just setting up an iocb for
> the io_submit system call in the submission path, and registering an
> eventfd with the qemu poll handler to complete the iocbs directly
> from there.
>
> This started out based on Anthony's earlier AIO patch, but after an
> estimated 42,000 rewrites and just as many build system changes
> there's not much left of it.
>
> To enable native kernel aio use the aio=native sub-command on the
> drive command line.  I have also added an option to qemu-io to
> test the aio support without needing a guest.
>
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
>
> Index: qemu/Makefile
> ===================================================================
> --- qemu.orig/Makefile	2009-08-19 22:49:08.789354196 -0300
> +++ qemu/Makefile	2009-08-19 22:51:25.293352541 -0300
> @@ -56,6 +56,7 @@ recurse-all: $(SUBDIR_RULES) $(ROMSUBDIR
>   block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o module.o
>   block-obj-y += nbd.o block.o aio.o aes.o
>   block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
> +block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
>
>   block-nested-y += cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
>   block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o
> Index: qemu/block/raw-posix.c
> ===================================================================
> --- qemu.orig/block/raw-posix.c	2009-08-19 22:49:08.793352540 -0300
> +++ qemu/block/raw-posix.c	2009-08-19 23:00:21.157402768 -0300
> @@ -115,6 +115,7 @@ typedef struct BDRVRawState {
>       int fd_got_error;
>       int fd_media_changed;
>   #endif
> +    int use_aio;
>       uint8_t* aligned_buf;
>   } BDRVRawState;
>
> @@ -159,6 +160,7 @@ static int raw_open_common(BlockDriverSt
>       }
>       s->fd = fd;
>       s->aligned_buf = NULL;
> +
>       if ((bdrv_flags & BDRV_O_NOCACHE)) {
>           s->aligned_buf = qemu_blockalign(bs, ALIGNED_BUFFER_SIZE);
>           if (s->aligned_buf == NULL) {
> @@ -166,9 +168,22 @@ static int raw_open_common(BlockDriverSt
>           }
>       }
>
> -    s->aio_ctx = paio_init();
> -    if (!s->aio_ctx) {
> -        goto out_free_buf;
> +#ifdef CONFIG_LINUX_AIO
> +    if ((bdrv_flags & (BDRV_O_NOCACHE|BDRV_O_NATIVE_AIO)) ==
> +                      (BDRV_O_NOCACHE|BDRV_O_NATIVE_AIO)) {
> +        s->aio_ctx = laio_init();
> +        if (!s->aio_ctx) {
> +            goto out_free_buf;
> +        }
> +        s->use_aio = 1;
> +    } else
> +#endif
> +    {
> +        s->aio_ctx = paio_init();
> +        if (!s->aio_ctx) {
> +            goto out_free_buf;
> +        }
> +        s->use_aio = 0;
>       }
>
>       return 0;
> @@ -524,8 +539,13 @@ static BlockDriverAIOCB *raw_aio_submit(
>        * boundary.  Check if this is the case or telll the low-level
>        * driver that it needs to copy the buffer.
>        */
> -    if (s->aligned_buf && !qiov_is_aligned(qiov)) {
> -        type |= QEMU_AIO_MISALIGNED;
> +    if (s->aligned_buf) {
> +        if (!qiov_is_aligned(qiov)) {
> +            type |= QEMU_AIO_MISALIGNED;
> +        } else if (s->use_aio) {
> +            return laio_submit(bs, s->aio_ctx, s->fd, sector_num, qiov,
> +	                       nb_sectors, cb, opaque, type);
> +        }
>       }
>
>       return paio_submit(bs, s->aio_ctx, s->fd, sector_num, qiov, nb_sectors,
> Index: qemu/configure
> ===================================================================
> --- qemu.orig/configure	2009-08-19 22:49:08.801352719 -0300
> +++ qemu/configure	2009-08-19 22:51:25.305393736 -0300
> @@ -197,6 +197,7 @@ build_docs="yes"
>   uname_release=""
>   curses="yes"
>   curl="yes"
> +linux_aio="yes"
>   io_thread="no"
>   nptl="yes"
>   mixemu="no"
> @@ -499,6 +500,8 @@ for opt do
>     ;;
>     --enable-mixemu) mixemu="yes"
>     ;;
> +  --disable-linux-aio) linux_aio="no"
> +  ;;
>     --enable-io-thread) io_thread="yes"
>     ;;
>     --disable-blobs) blobs="no"
> @@ -636,6 +639,7 @@ echo "  --oss-lib                path to
>   echo "  --enable-uname-release=R Return R for uname -r in usermode emulation"
>   echo "  --sparc_cpu=V            Build qemu for Sparc architecture v7, v8, v8plus, v8plusa, v9"
>   echo "  --disable-vde            disable support for vde network"
> +echo "  --disable-linux-aio      disable Linux AIO support"
>   echo "  --enable-io-thread       enable IO thread"
>   echo "  --disable-blobs          disable installing provided firmware blobs"
>   echo "  --kerneldir=PATH         look for kernel includes in PATH"
> @@ -1197,6 +1201,23 @@ if test "$pthread" = no; then
>   fi
>
>   ##########################################
> +# linux-aio probe
> +AIOLIBS=""
> +
> +if test "$linux_aio" = "yes" ; then
> +    linux_aio=no
> +    cat > $TMPC <<EOF
> +#include <libaio.h>
> +#include <sys/eventfd.h>
> +int main(void) { io_setup(0, NULL); io_set_eventfd(NULL, 0); eventfd(0, 0); return 0; }
> +EOF
> +    if compile_prog "" "-laio" ; then
> +        linux_aio=yes
> +        LIBS="$LIBS -laio"
> +    fi
> +fi
> +
> +##########################################
>   # iovec probe
>   cat > $TMPC <<EOF
>   #include <sys/types.h>
> @@ -1527,6 +1548,7 @@ echo "NPTL support      $nptl"
>   echo "GUEST_BASE        $guest_base"
>   echo "vde support       $vde"
>   echo "IO thread         $io_thread"
> +echo "Linux AIO support $linux_aio"
>   echo "Install blobs     $blobs"
>   echo -e "KVM support       $kvm"
>   echo "fdt support       $fdt"
> @@ -1700,6 +1722,9 @@ fi
>   if test "$io_thread" = "yes" ; then
>     echo "CONFIG_IOTHREAD=y" >> $config_host_mak
>   fi
> +if test "$linux_aio" = "yes" ; then
> +  echo "CONFIG_LINUX_AIO=y" >> $config_host_mak
> +fi
>   if test "$blobs" = "yes" ; then
>     echo "INSTALL_BLOBS=yes" >> $config_host_mak
>   fi
> Index: qemu/linux-aio.c
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ qemu/linux-aio.c	2009-08-20 10:54:10.924375300 -0300
> @@ -0,0 +1,204 @@
> +/*
> + * Linux native AIO support.
> + *
> + * Copyright (C) 2009 IBM, Corp.
> + * Copyright (C) 2009 Red Hat, Inc.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +#include "qemu-common.h"
> +#include "qemu-aio.h"
> +#include "block_int.h"
> +#include "block/raw-posix-aio.h"
> +
> +#include <sys/eventfd.h>
> +#include <libaio.h>
> +
> +/*
> + * Queue size (per-device).
> + *
> + * XXX: eventually we need to communicate this to the guest and/or make it
> + *      tunable by the guest.  If we get more outstanding requests at a time
> + *      than this we will get EAGAIN from io_submit which is communicated to
> + *      the guest as an I/O error.
> + */
> +#define MAX_EVENTS 128
>    

Or, we could queue any extra requests.
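
A minimal sketch of what such queueing could look like, assuming a
fixed-size FIFO that parks requests once MAX_EVENTS are in flight and
refills free slots on completion.  struct req_queue, queue_submit(),
queue_complete(), kernel_submit() and MAX_PENDING are hypothetical names
for illustration only; nothing here is taken from the patch, and
kernel_submit() merely stands in for the real io_submit() call:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EVENTS  128     /* same per-device limit as in the patch */
#define MAX_PENDING 1024    /* hypothetical cap on the overflow queue */

/* Hypothetical per-device state; not the patch's struct qemu_laio_state. */
struct req_queue {
    int in_flight;              /* requests currently handed to the kernel */
    int pending[MAX_PENDING];   /* ids of requests waiting for a free slot */
    size_t head, count;         /* ring-buffer bookkeeping for pending[] */
};

/* Stand-in for io_submit(): here it only accounts for the used slot. */
static void kernel_submit(struct req_queue *q, int id)
{
    (void)id;
    q->in_flight++;
}

/*
 * Submit right away if a slot is free and nothing is queued ahead of us,
 * otherwise park the request.  Returns 0 on success and -1 only when the
 * overflow queue itself is full.
 */
int queue_submit(struct req_queue *q, int id)
{
    if (q->in_flight < MAX_EVENTS && q->count == 0) {
        kernel_submit(q, id);
        return 0;
    }
    if (q->count == MAX_PENDING)
        return -1;
    q->pending[(q->head + q->count) % MAX_PENDING] = id;
    q->count++;
    return 0;
}

/*
 * Called once per completed request: release the slot, then refill free
 * slots from the overflow queue in FIFO order.
 */
void queue_complete(struct req_queue *q)
{
    assert(q->in_flight > 0);
    q->in_flight--;
    while (q->count > 0 && q->in_flight < MAX_EVENTS) {
        int id = q->pending[q->head];
        q->head = (q->head + 1) % MAX_PENDING;
        q->count--;
        kernel_submit(q, id);
    }
}
```

With something along these lines the guest would only see an error once
the overflow queue is exhausted as well, rather than on the 129th
outstanding request.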

> +
> +
> +void *laio_init(void)
> +{
> +    struct qemu_laio_state *s;
> +
> +    s = qemu_mallocz(sizeof(*s));
> +    s->efd = eventfd(0, 0);
> +    if (s->efd == -1)
> +        goto out_free_state;
> +    fcntl(s->efd, F_SETFL, O_NONBLOCK);
> +
> +    if (io_setup(MAX_EVENTS, &s->ctx) != 0)
> +        goto out_close_efd;
> +
>    

One day we may want a global io context so we can dequeue many events 
with one syscall.  Or we may not, if we thread these things.
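
To illustrate the batching side of this, here is a small sketch that
relies only on eventfd counter semantics: every completion signalled on
one shared eventfd adds to its counter, so a single read() reports how
many events are waiting.  With a shared io context that count could then
serve as the upper bound for one io_getevents() call.  The helper names
are hypothetical, completions are simulated with write() instead of real
iocbs, and EFD_NONBLOCK is used where the patch uses fcntl():

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/eventfd.h>

/* Create one non-blocking eventfd meant to be shared by all devices. */
int shared_efd_create(void)
{
    return eventfd(0, EFD_NONBLOCK);
}

/*
 * Stand-in for the kernel signalling one completed iocb, which is what
 * io_set_eventfd() would arrange for a real request.
 */
int signal_completion(int efd)
{
    uint64_t one = 1;

    assert(efd >= 0);
    return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * A single read() returns the number of completions signalled since the
 * previous read and resets the counter.  Returns 0 if nothing is pending
 * (the non-blocking read fails with EAGAIN in that case).
 */
uint64_t pending_completions(int efd)
{
    uint64_t n = 0;

    if (read(efd, &n, sizeof(n)) != (ssize_t)sizeof(n))
        return 0;
    return n;
}
```

Whether this is worth having depends on the threading question above:
with one context per thread there is less to gain from batching across
devices.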

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

