From: Avi Kivity <avi@redhat.com>
To: Christoph Hellwig <hch@lst.de>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 2/2] raw-posix: add Linux native AIO support
Date: Fri, 21 Aug 2009 12:53:49 +0300
Message-ID: <4A8E6EAD.3030809@redhat.com>
In-Reply-To: <20090820145835.GB24183@lst.de>

On 08/20/2009 05:58 PM, Christoph Hellwig wrote:
> Now that we have a nicer interface to work against we can add Linux native
> AIO support. It's an extremely thin layer just setting up an iocb for
> the io_submit system call in the submission path, and registering an
> eventfd with the qemu poll handler to complete the iocbs directly
> from there.
>
> This started out based on Anthony's earlier AIO patch, but after an
> estimated 42,000 rewrites and just as many build system changes
> there's not much left of it.
>
> To enable native kernel AIO use the aio=native sub-option on the
> -drive command line. I have also added an option to qemu-io to
> test the AIO support without needing a guest.
>
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
>
> Index: qemu/Makefile
> ===================================================================
> --- qemu.orig/Makefile 2009-08-19 22:49:08.789354196 -0300
> +++ qemu/Makefile 2009-08-19 22:51:25.293352541 -0300
> @@ -56,6 +56,7 @@ recurse-all: $(SUBDIR_RULES) $(ROMSUBDIR
> block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o module.o
> block-obj-y += nbd.o block.o aio.o aes.o
> block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
> +block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
>
> block-nested-y += cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
> block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o
> Index: qemu/block/raw-posix.c
> ===================================================================
> --- qemu.orig/block/raw-posix.c 2009-08-19 22:49:08.793352540 -0300
> +++ qemu/block/raw-posix.c 2009-08-19 23:00:21.157402768 -0300
> @@ -115,6 +115,7 @@ typedef struct BDRVRawState {
> int fd_got_error;
> int fd_media_changed;
> #endif
> + int use_aio;
> uint8_t* aligned_buf;
> } BDRVRawState;
>
> @@ -159,6 +160,7 @@ static int raw_open_common(BlockDriverSt
> }
> s->fd = fd;
> s->aligned_buf = NULL;
> +
> if ((bdrv_flags & BDRV_O_NOCACHE)) {
> s->aligned_buf = qemu_blockalign(bs, ALIGNED_BUFFER_SIZE);
> if (s->aligned_buf == NULL) {
> @@ -166,9 +168,22 @@ static int raw_open_common(BlockDriverSt
> }
> }
>
> - s->aio_ctx = paio_init();
> - if (!s->aio_ctx) {
> - goto out_free_buf;
> +#ifdef CONFIG_LINUX_AIO
> + if ((bdrv_flags & (BDRV_O_NOCACHE|BDRV_O_NATIVE_AIO)) ==
> + (BDRV_O_NOCACHE|BDRV_O_NATIVE_AIO)) {
> + s->aio_ctx = laio_init();
> + if (!s->aio_ctx) {
> + goto out_free_buf;
> + }
> + s->use_aio = 1;
> + } else
> +#endif
> + {
> + s->aio_ctx = paio_init();
> + if (!s->aio_ctx) {
> + goto out_free_buf;
> + }
> + s->use_aio = 0;
> }
>
> return 0;
> @@ -524,8 +539,13 @@ static BlockDriverAIOCB *raw_aio_submit(
> * boundary. Check if this is the case or telll the low-level
> * driver that it needs to copy the buffer.
> */
> - if (s->aligned_buf && !qiov_is_aligned(qiov)) {
> - type |= QEMU_AIO_MISALIGNED;
> + if (s->aligned_buf) {
> + if (!qiov_is_aligned(qiov)) {
> + type |= QEMU_AIO_MISALIGNED;
> + } else if (s->use_aio) {
> + return laio_submit(bs, s->aio_ctx, s->fd, sector_num, qiov,
> + nb_sectors, cb, opaque, type);
> + }
> }
>
> return paio_submit(bs, s->aio_ctx, s->fd, sector_num, qiov, nb_sectors,
> Index: qemu/configure
> ===================================================================
> --- qemu.orig/configure 2009-08-19 22:49:08.801352719 -0300
> +++ qemu/configure 2009-08-19 22:51:25.305393736 -0300
> @@ -197,6 +197,7 @@ build_docs="yes"
> uname_release=""
> curses="yes"
> curl="yes"
> +linux_aio="yes"
> io_thread="no"
> nptl="yes"
> mixemu="no"
> @@ -499,6 +500,8 @@ for opt do
> ;;
> --enable-mixemu) mixemu="yes"
> ;;
> + --disable-linux-aio) linux_aio="no"
> + ;;
> --enable-io-thread) io_thread="yes"
> ;;
> --disable-blobs) blobs="no"
> @@ -636,6 +639,7 @@ echo " --oss-lib path to
> echo " --enable-uname-release=R Return R for uname -r in usermode emulation"
> echo " --sparc_cpu=V Build qemu for Sparc architecture v7, v8, v8plus, v8plusa, v9"
> echo " --disable-vde disable support for vde network"
> +echo " --disable-linux-aio disable Linux AIO support"
> echo " --enable-io-thread enable IO thread"
> echo " --disable-blobs disable installing provided firmware blobs"
> echo " --kerneldir=PATH look for kernel includes in PATH"
> @@ -1197,6 +1201,23 @@ if test "$pthread" = no; then
> fi
>
> ##########################################
> +# linux-aio probe
> +AIOLIBS=""
> +
> +if test "$linux_aio" = "yes" ; then
> + linux_aio=no
> + cat > $TMPC <<EOF
> +#include <libaio.h>
> +#include <sys/eventfd.h>
> +int main(void) { io_setup(0, NULL); io_set_eventfd(NULL, 0); eventfd(0, 0); return 0; }
> +EOF
> + if compile_prog "" "-laio" ; then
> + linux_aio=yes
> + LIBS="$LIBS -laio"
> + fi
> +fi
> +
> +##########################################
> # iovec probe
> cat > $TMPC <<EOF
> #include <sys/types.h>
> @@ -1527,6 +1548,7 @@ echo "NPTL support $nptl"
> echo "GUEST_BASE $guest_base"
> echo "vde support $vde"
> echo "IO thread $io_thread"
> +echo "Linux AIO support $linux_aio"
> echo "Install blobs $blobs"
> echo -e "KVM support $kvm"
> echo "fdt support $fdt"
> @@ -1700,6 +1722,9 @@ fi
> if test "$io_thread" = "yes" ; then
> echo "CONFIG_IOTHREAD=y" >> $config_host_mak
> fi
> +if test "$linux_aio" = "yes" ; then
> + echo "CONFIG_LINUX_AIO=y" >> $config_host_mak
> +fi
> if test "$blobs" = "yes" ; then
> echo "INSTALL_BLOBS=yes" >> $config_host_mak
> fi
> Index: qemu/linux-aio.c
> ===================================================================
> --- /dev/null 1970-01-01 00:00:00.000000000 +0000
> +++ qemu/linux-aio.c 2009-08-20 10:54:10.924375300 -0300
> @@ -0,0 +1,204 @@
> +/*
> + * Linux native AIO support.
> + *
> + * Copyright (C) 2009 IBM, Corp.
> + * Copyright (C) 2009 Red Hat, Inc.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +#include "qemu-common.h"
> +#include "qemu-aio.h"
> +#include "block_int.h"
> +#include "block/raw-posix-aio.h"
> +
> +#include <sys/eventfd.h>
> +#include <libaio.h>
> +
> +/*
> + * Queue size (per-device).
> + *
> + * XXX: eventually we need to communicate this to the guest and/or make it
> + * tunable by the guest. If we get more outstanding requests at a time
> + * than this we will get EAGAIN from io_submit which is communicated to
> + * the guest as an I/O error.
> + */
> +#define MAX_EVENTS 128
>
Or, we could queue any extra requests.
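
To make the queueing idea concrete, here is a rough sketch, not code from the patch. It assumes a per-device FIFO and an in-flight counter; struct pending_req, struct laio_queue, queue_submit(), queue_complete() and stub_submit() are all hypothetical names, and stub_submit() only stands in for the real io_submit() call so the sketch builds without libaio.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_EVENTS 128              /* same per-device limit as in the patch */

/* Hypothetical request; stands in for the AIOCB that carries the iocb. */
struct pending_req {
    struct pending_req *next;
    int error;                      /* 0, or -errno if a deferred submit failed */
};

/* Hypothetical per-device queue state, not part of the patch. */
struct laio_queue {
    int in_flight;                          /* requests owned by the kernel */
    struct pending_req *head, *tail;        /* FIFO of requests awaiting a slot */
    int (*submit)(struct pending_req *req); /* wraps io_submit: 0 or -errno */
};

/* Stand-in for the real io_submit() wrapper (assumption, always succeeds). */
static int stub_submit(struct pending_req *req)
{
    (void)req;
    return 0;
}

/* Append a request to the tail of the waiting FIFO. */
static void defer(struct laio_queue *q, struct pending_req *req)
{
    req->next = NULL;
    if (q->tail)
        q->tail->next = req;
    else
        q->head = req;
    q->tail = req;
}

/* Submit now if a slot is free, otherwise park the request instead of failing. */
static int queue_submit(struct laio_queue *q, struct pending_req *req)
{
    int ret;

    /* Keep FIFO order: never overtake requests that are already waiting. */
    if (q->head || q->in_flight >= MAX_EVENTS) {
        defer(q, req);
        return 0;
    }
    ret = q->submit(req);
    /* Only defer on EAGAIN if a later completion can retry the request. */
    if (ret == -EAGAIN && q->in_flight > 0) {
        defer(q, req);
        return 0;
    }
    if (ret == 0)
        q->in_flight++;
    return ret;
}

/* Call once per completed request; refills the freed slot from the FIFO. */
static void queue_complete(struct laio_queue *q)
{
    assert(q->in_flight > 0);
    q->in_flight--;

    while (q->head && q->in_flight < MAX_EVENTS) {
        struct pending_req *req = q->head;
        int ret = q->submit(req);

        if (ret == -EAGAIN && q->in_flight > 0)
            break;                  /* still full; retry on the next completion */
        q->head = req->next;
        if (!q->head)
            q->tail = NULL;
        if (ret == 0)
            q->in_flight++;
        else
            req->error = ret;       /* caller would fail this request */
    }
}
```

With something like this the guest would only see an I/O error when a submit fails for a reason other than the queue being full. The cost is unbounded memory for parked requests unless the FIFO is capped as well.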
> +
> +
> +void *laio_init(void)
> +{
> + struct qemu_laio_state *s;
> +
> + s = qemu_mallocz(sizeof(*s));
> + s->efd = eventfd(0, 0);
> + if (s->efd == -1)
> + goto out_free_state;
> + fcntl(s->efd, F_SETFL, O_NONBLOCK);
> +
> + if (io_setup(MAX_EVENTS, &s->ctx) != 0)
> + goto out_close_efd;
> +
>
One day we may want a global io context so we can dequeue many events
with one syscall. Or we may not, if we thread these things.
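
As a rough illustration of the shared-context idea (again a sketch, not code from the patch): with one context for all devices, a single io_getevents() call can return a batch of completions, and each event has to be routed back to its device. The code below only models that routing step and does not call libaio. struct completion mirrors the data and res fields of libaio's struct io_event; struct dev_state, struct shared_req and dispatch_batch() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-device state; it only keeps counters in this sketch. */
struct dev_state {
    int completed;      /* number of requests finished, successful or not */
    long bytes;         /* bytes transferred by successful requests */
};

/* Hypothetical request; records its owner so a shared context can route it. */
struct shared_req {
    struct dev_state *dev;
};

/* Mirrors the two fields of libaio's struct io_event that matter here. */
struct completion {
    void *data;         /* request pointer stored at submit time */
    long res;           /* bytes transferred, or a negative errno */
};

/*
 * Route one batch of completions to the owning devices. In a real
 * implementation the batch would come from a single io_getevents() call
 * on the shared context. Returns the number of failed requests.
 */
static int dispatch_batch(const struct completion *ev, int n)
{
    int errors = 0;

    for (int i = 0; i < n; i++) {
        struct shared_req *req = ev[i].data;

        assert(req != NULL);
        req->dev->completed++;
        if (ev[i].res < 0)
            errors++;
        else
            req->dev->bytes += ev[i].res;
    }
    return errors;
}
```

The per-device contexts in the patch avoid this lookup, at the price of one io_getevents() call per device that has completions pending.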
--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.