* [Qemu-devel] [PATCH 0/2] improve Linux AIO support
From: Frediano Ziglio @ 2011-07-27 18:25 UTC
To: Kevin Wolf; +Cc: qemu-devel, Frediano Ziglio

These patches avoid many fallbacks to POSIX AIO and enable Linux AIO even if
nocache is not specified. Also add flush support with Linux AIO.

Frediano Ziglio (2):
  linux aio: support flush operation
  aio: use Linux AIO even if nocache is not specified

 block/raw-posix.c |   30 +++++++++++++++++-------------
 linux-aio.c       |    3 +++
 2 files changed, 20 insertions(+), 13 deletions(-)

* [Qemu-devel] [PATCH 1/2] linux aio: support flush operation
From: Frediano Ziglio @ 2011-07-27 18:25 UTC
To: Kevin Wolf; +Cc: qemu-devel, Frediano Ziglio

Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
---
 block/raw-posix.c |    7 +++++++
 linux-aio.c       |    3 +++
 2 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 3c6bd4b..27ae81e 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -628,6 +628,13 @@ static BlockDriverAIOCB *raw_aio_flush(BlockDriverState *bs,
     if (fd_open(bs) < 0)
         return NULL;
 
+#ifdef CONFIG_LINUX_AIO
+    if (s->use_aio) {
+        return laio_submit(bs, s->aio_ctx, s->fd, 0, NULL,
+                           0, cb, opaque, QEMU_AIO_FLUSH);
+    }
+#endif
+
     return paio_submit(bs, s->fd, 0, NULL, 0, cb, opaque, QEMU_AIO_FLUSH);
 }
 
diff --git a/linux-aio.c b/linux-aio.c
index 68f4b3d..d07c435 100644
--- a/linux-aio.c
+++ b/linux-aio.c
@@ -215,6 +215,9 @@ BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
     case QEMU_AIO_READ:
         io_prep_preadv(iocbs, fd, qiov->iov, qiov->niov, offset);
         break;
+    case QEMU_AIO_FLUSH:
+        io_prep_fdsync(iocbs, fd);
+        break;
     default:
         fprintf(stderr, "%s: invalid AIO request type 0x%x.\n",
                         __func__, type);
-- 
1.7.1

* Re: [Qemu-devel] [PATCH 1/2] linux aio: support flush operation
From: Christoph Hellwig @ 2011-07-27 18:31 UTC
To: Frediano Ziglio; +Cc: Kevin Wolf, qemu-devel

Did you test this at all?

On Wed, Jul 27, 2011 at 08:25:25PM +0200, Frediano Ziglio wrote:
> +    case QEMU_AIO_FLUSH:
> +        io_prep_fdsync(iocbs, fd);
> +        break;

Looks great, but doesn't work as expected.

Hint: grep for aio_fsync in the linux kernel.

* Re: [Qemu-devel] [PATCH 1/2] linux aio: support flush operation
From: Frediano Ziglio @ 2011-07-27 19:52 UTC
To: Christoph Hellwig; +Cc: Kevin Wolf, qemu-devel@nongnu.org

On 27 Jul 2011, at 20:31, Christoph Hellwig <hch@lst.de> wrote:

> Did you test this at all?
>

Yes! Not at kernel level :-)
Usually I trust documentation and man pages.

> On Wed, Jul 27, 2011 at 08:25:25PM +0200, Frediano Ziglio wrote:
>> +    case QEMU_AIO_FLUSH:
>> +        io_prep_fdsync(iocbs, fd);
>> +        break;
>
> Looks great, but doesn't work as expected.
>
> Hint: grep for aio_fsync in the linux kernel.
>

Thanks. I'll try to port misaligned access to Linux AIO. Also, I'll add
some comments to the code so that nobody else makes the same mistake I did.

Mainly, however, -k qemu-img and aio=native in blockdev options are
silently ignored if nocache is not enabled.

Also, I noticed that combining XFS, Linux AIO, O_DIRECT and O_DSYNC gives
impressive performance, but currently there is no way to specify all those
flags together, because nocache enables O_DIRECT while O_DSYNC is enabled
by writethrough.

Frediano

* Re: [Qemu-devel] [PATCH 1/2] linux aio: support flush operation
From: Christoph Hellwig @ 2011-07-27 19:57 UTC
To: Frediano Ziglio; +Cc: Kevin Wolf, Christoph Hellwig, qemu-devel@nongnu.org

On Wed, Jul 27, 2011 at 09:52:51PM +0200, Frediano Ziglio wrote:
> Yes! Not at kernel level :-)

In that case we have a bad error handling problem somewhere in qemu.
The IOCB_CMD_FDSYNC aio opcode will always return EINVAL currently, and
we really should have caught that somewhere in qemu.

> Thanks. I'll try to port misaligned access to Linux AIO. Also, I'll add
> some comments to the code so that nobody else makes the same mistake I did.

It's direct I/O code in general that doesn't handle misaligned access.
Given that we should never get misaligned I/O from guests I just didn't
bother duplicating the read-modify-write code for that code path as well.

> Mainly, however, -k qemu-img and aio=native in blockdev options are
> silently ignored if nocache is not enabled.

Maybe we should indeed error out instead. Care to prepare a patch for that?

> Also, I noticed that combining XFS, Linux AIO, O_DIRECT and O_DSYNC gives
> impressive performance, but currently there is no way to specify all those
> flags together, because nocache enables O_DIRECT while O_DSYNC is enabled
> by writethrough.

Indeed. This has come up a few times, and actually is a mostly trivial
task. Maybe we should give up waiting for -blockdev and separate cache
mode settings and allow a nocache-writethrough or similar mode now? It's
going to be around 10 lines of code + documentation.

* Re: [Qemu-devel] [PATCH 1/2] linux aio: support flush operation
From: Kevin Wolf @ 2011-07-28 7:47 UTC
To: Christoph Hellwig; +Cc: Frediano Ziglio, qemu-devel@nongnu.org

On 27.07.2011 21:57, Christoph Hellwig wrote:
> On Wed, Jul 27, 2011 at 09:52:51PM +0200, Frediano Ziglio wrote:
>> Also, I noticed that combining XFS, Linux AIO, O_DIRECT and O_DSYNC gives
>> impressive performance, but currently there is no way to specify all those
>> flags together, because nocache enables O_DIRECT while O_DSYNC is enabled
>> by writethrough.
>
> Indeed. This has come up a few times, and actually is a mostly trivial
> task. Maybe we should give up waiting for -blockdev and separate cache
> mode settings and allow a nocache-writethrough or similar mode now? It's
> going to be around 10 lines of code + documentation.

I understand that there may be reasons for using O_DIRECT | O_DSYNC, but
what is the explanation for O_DSYNC improving performance?

Christoph, on another note: Can we rely on Linux AIO never returning
short writes except on EOF? Currently we return -EINVAL in this case, so
I hope it's true or we wouldn't return the correct error code.

The reason I'm asking is that I want to allow reads across EOF for
growable images and pad with zeros (the synchronous code does this today
in order to allow bdrv_pread/pwrite to work, and when we start using
coroutines in the block layer, these cases will hit the AIO paths).

Kevin

* Re: [Qemu-devel] [PATCH 1/2] linux aio: support flush operation
From: Christoph Hellwig @ 2011-07-28 12:15 UTC
To: Kevin Wolf; +Cc: qemu-devel@nongnu.org, Christoph Hellwig, Frediano Ziglio

On Thu, Jul 28, 2011 at 09:47:05AM +0200, Kevin Wolf wrote:
> > Indeed. This has come up a few times, and actually is a mostly trivial
> > task. Maybe we should give up waiting for -blockdev and separate cache
> > mode settings and allow a nocache-writethrough or similar mode now? It's
> > going to be around 10 lines of code + documentation.
>
> I understand that there may be reasons for using O_DIRECT | O_DSYNC, but
> what is the explanation for O_DSYNC improving performance?

There isn't any, at least for modern Linux. O_DSYNC at this point is
equivalent to a range fdatasync for each write call, and given that we're
doing O_DIRECT the range flush doesn't matter. If you do have a modern
host and an old guest it might end up being faster because the barrier
implementation in Linux used to suck so badly, but that's not inherent
to the I/O model. If your guest however doesn't support cache flushes
at all, O_DIRECT | O_DSYNC is the only sane model to use for local
filesystems and block devices.

> Christoph, on another note: Can we rely on Linux AIO never returning
> short writes except on EOF? Currently we return -EINVAL in this case, so
> I hope it's true or we wouldn't return the correct error code.

More or less. There's one corner case for all Linux I/O, and that is
only writes up to INT_MAX are supported, and larger writes (and reads)
get truncated to it. It's pretty nasty, but Linux has been vocally
opposed to fixing this issue.

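The remark that O_DSYNC behaves like a data flush after each write can be illustrated with a small sketch. This is not QEMU code: the helper name `pwrite_then_datasync` is hypothetical, and it uses a whole-file `fdatasync()` rather than the range flush mentioned in the message, so it is only an approximation of the O_DSYNC semantics described there.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `len` bytes at offset `off`, then flush the
 * file's data, approximating what one write on an O_DSYNC descriptor
 * guarantees. Returns 0 on success or -errno on failure; a short write
 * is reported as -EIO in this sketch.
 */
int pwrite_then_datasync(int fd, const void *buf, size_t len, off_t off)
{
    assert(buf != NULL || len == 0);

    ssize_t n = pwrite(fd, buf, len, off);
    if (n < 0) {
        return -errno;
    }
    if ((size_t)n != len) {
        return -EIO;
    }
    if (fdatasync(fd) < 0) {
        return -errno;
    }
    return 0;
}
```

With O_DIRECT the data already bypasses the host page cache, so under this reading the extra flush mainly affects the device write cache and metadata needed to retrieve the data.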
* Re: [Qemu-devel] [PATCH 1/2] linux aio: support flush operation
From: Kevin Wolf @ 2011-07-28 12:41 UTC
To: Christoph Hellwig; +Cc: Frediano Ziglio, qemu-devel@nongnu.org

On 28.07.2011 14:15, Christoph Hellwig wrote:
>> Christoph, on another note: Can we rely on Linux AIO never returning
>> short writes except on EOF? Currently we return -EINVAL in this case, so

"short reads" I meant, of course.

>> I hope it's true or we wouldn't return the correct error code.
>
> More or less. There's one corner case for all Linux I/O, and that is
> only writes up to INT_MAX are supported, and larger writes (and reads)
> get truncated to it. It's pretty nasty, but Linux has been vocally
> opposed to fixing this issue.

I think we can safely ignore this. So just replacing the current
ret = -EINVAL; by a memset(buf + ret, 0, len - ret); ret = 0; should be
okay, right? (Of course using the qiov versions, but you get the idea)

Kevin

* Re: [Qemu-devel] [PATCH 1/2] linux aio: support flush operation
From: Christoph Hellwig @ 2011-07-29 14:24 UTC
To: Kevin Wolf; +Cc: qemu-devel@nongnu.org, Christoph Hellwig, Frediano Ziglio

On Thu, Jul 28, 2011 at 02:41:02PM +0200, Kevin Wolf wrote:
> > More or less. There's one corner case for all Linux I/O, and that is
> > only writes up to INT_MAX are supported, and larger writes (and reads)
> > get truncated to it. It's pretty nasty, but Linux has been vocally
> > opposed to fixing this issue.
>
> I think we can safely ignore this. So just replacing the current
> ret = -EINVAL; by a memset(buf + ret, 0, len - ret); ret = 0; should be
> okay, right? (Of course using the qiov versions, but you get the idea)

This should be safe.

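The zero-padding idea agreed on above (replace `ret = -EINVAL` with a `memset` of the unread tail and `ret = 0`) can be sketched for a scatter/gather list as follows. This is an illustration only, not the QEMU change: `iov_zero_tail` and `complete_read` are hypothetical names, and a plain `struct iovec` array stands in for the "qiov versions" mentioned in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: zero every byte of the scatter/gather list that
 * lies beyond the first `done` bytes, i.e. the part a short read at EOF
 * did not fill. Returns the number of bytes that were zeroed.
 */
size_t iov_zero_tail(const struct iovec *iov, int niov, size_t done)
{
    size_t zeroed = 0;

    assert(iov != NULL || niov == 0);
    for (int i = 0; i < niov; i++) {
        if (done >= iov[i].iov_len) {
            /* This segment was filled completely by the read. */
            done -= iov[i].iov_len;
            continue;
        }
        /* Partially filled or untouched: clear from `done` to the end. */
        memset((char *)iov[i].iov_base + done, 0, iov[i].iov_len - done);
        zeroed += iov[i].iov_len - done;
        done = 0;
    }
    return zeroed;
}

/*
 * Map a read completion to a return code: a negative result is passed
 * through as -errno, while a short read is padded with zeros and
 * reported as success.
 */
int complete_read(const struct iovec *iov, int niov, long res, size_t len)
{
    if (res < 0) {
        return (int)res;
    }
    if ((size_t)res < len) {
        iov_zero_tail(iov, niov, (size_t)res);
    }
    return 0;
}
```

For example, with two 4-byte segments and a completion result of 3, the last byte of the first segment and all of the second segment are cleared, and the caller sees success.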
* Re: [Qemu-devel] [PATCH 1/2] linux aio: support flush operation
From: Stefan Hajnoczi @ 2011-07-29 15:33 UTC
To: Christoph Hellwig; +Cc: Kevin Wolf, qemu-devel@nongnu.org, Frediano Ziglio

On Thu, Jul 28, 2011 at 1:15 PM, Christoph Hellwig <hch@lst.de> wrote:
> On Thu, Jul 28, 2011 at 09:47:05AM +0200, Kevin Wolf wrote:
>> > Indeed. This has come up a few times, and actually is a mostly trivial
>> > task. Maybe we should give up waiting for -blockdev and separate cache
>> > mode settings and allow a nocache-writethrough or similar mode now? It's
>> > going to be around 10 lines of code + documentation.
>>
>> I understand that there may be reasons for using O_DIRECT | O_DSYNC, but
>> what is the explanation for O_DSYNC improving performance?
>
> There isn't any, at least for modern Linux. O_DSYNC at this point is
> equivalent to a range fdatasync for each write call, and given that we're
> doing O_DIRECT the range flush doesn't matter. If you do have a modern
> host and an old guest it might end up being faster because the barrier
> implementation in Linux used to suck so badly, but that's not inherent
> to the I/O model. If your guest however doesn't support cache flushes
> at all, O_DIRECT | O_DSYNC is the only sane model to use for local
> filesystems and block devices.

I can rebase this cache=directsync patch and send it:
http://repo.or.cz/w/qemu/stefanha.git/commitdiff/6756719a46ac9876ac6d0460a33ad98e96a3a923

The other weird caching-related option I was playing with is -drive
...,readahead=on|off. It lets you disable the host kernel readahead on
buffered modes (cache=writeback|writethrough):
http://repo.or.cz/w/qemu/stefanha.git/commitdiff/f2fc2b297a2b2dd0cccd1dc2f7c519f3b0374e0d

Stefan

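The relationship between the cache modes discussed in this sub-thread and the open(2) flags can be sketched as a small lookup. This is a hedged illustration, not the cache=directsync patch linked above: the function name `cache_mode_to_open_flags` is hypothetical, and the mode name "none" is assumed here to correspond to what the thread calls nocache.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes O_DIRECT on glibc */
#endif
#include <assert.h>
#include <fcntl.h>
#include <string.h>

#ifndef O_DIRECT
#define O_DIRECT 0 /* fallback for platforms that lack O_DIRECT */
#endif

/*
 * Hypothetical mapping from a cache mode name to the extra open(2) flags
 * it implies. Returns -1 for an unknown mode.
 */
int cache_mode_to_open_flags(const char *mode)
{
    assert(mode != NULL);

    if (strcmp(mode, "writeback") == 0) {
        return 0;                   /* buffered, flushed only on request */
    }
    if (strcmp(mode, "writethrough") == 0) {
        return O_DSYNC;             /* buffered, every write is synced */
    }
    if (strcmp(mode, "none") == 0) {
        return O_DIRECT;            /* bypass the host page cache */
    }
    if (strcmp(mode, "directsync") == 0) {
        return O_DIRECT | O_DSYNC;  /* the combination discussed above */
    }
    return -1;
}
```

The point of the sketch is that the two flags are independent, so a fourth mode is simply the union of the flags of two existing ones.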
* [Qemu-devel] [PATCH 2/2] aio: use Linux AIO even if nocache is not specified
From: Frediano Ziglio @ 2011-07-27 18:25 UTC
To: Kevin Wolf; +Cc: qemu-devel, Frediano Ziglio

Currently Linux AIO is used only if nocache is specified.
Linux AIO works in all cases. The only problem is that currently Linux AIO
does not align data, so I add a test that uses POSIX AIO in this case.

Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
---
 block/raw-posix.c |   23 ++++++++++-------------
 1 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 27ae81e..078a256 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -236,21 +236,16 @@ static int raw_open_common(BlockDriverState *bs, const char *filename,
     }
 
 #ifdef CONFIG_LINUX_AIO
-    if ((bdrv_flags & (BDRV_O_NOCACHE|BDRV_O_NATIVE_AIO)) ==
-                      (BDRV_O_NOCACHE|BDRV_O_NATIVE_AIO)) {
+    s->use_aio = 0;
+    if ((bdrv_flags & BDRV_O_NATIVE_AIO)) {
 
         s->aio_ctx = laio_init();
         if (!s->aio_ctx) {
             goto out_free_buf;
         }
         s->use_aio = 1;
-    } else
-#endif
-    {
-#ifdef CONFIG_LINUX_AIO
-        s->use_aio = 0;
-#endif
     }
+#endif
 
 #ifdef CONFIG_XFS
     if (platform_test_xfs_fd(s->fd)) {
@@ -592,14 +587,16 @@ static BlockDriverAIOCB *raw_aio_submit(BlockDriverState *bs,
     if (s->aligned_buf) {
         if (!qiov_is_aligned(bs, qiov)) {
             type |= QEMU_AIO_MISALIGNED;
-#ifdef CONFIG_LINUX_AIO
-        } else if (s->use_aio) {
-            return laio_submit(bs, s->aio_ctx, s->fd, sector_num, qiov,
-                               nb_sectors, cb, opaque, type);
-#endif
         }
     }
 
+#ifdef CONFIG_LINUX_AIO
+    if (s->use_aio && !(type & QEMU_AIO_MISALIGNED)) {
+        return laio_submit(bs, s->aio_ctx, s->fd, sector_num, qiov,
+                           nb_sectors, cb, opaque, type);
+    }
+#endif
+
     return paio_submit(bs, s->fd, sector_num, qiov, nb_sectors,
                        cb, opaque, type);
 }
-- 
1.7.1

* Re: [Qemu-devel] [PATCH 2/2] aio: use Linux AIO even if nocache is not specified
From: Christoph Hellwig @ 2011-07-27 18:32 UTC
To: Frediano Ziglio; +Cc: Kevin Wolf, qemu-devel

On Wed, Jul 27, 2011 at 08:25:26PM +0200, Frediano Ziglio wrote:
> Currently Linux AIO is used only if nocache is specified.
> Linux AIO works in all cases. The only problem is that currently Linux AIO
> does not align data, so I add a test that uses POSIX AIO in this case.

The kernel will accept buffered I/O requests, and even handle them 100%
correctly. The only thing it won't do is to handle them asynchronously,
so with your patch you're back to executing I/O synchronously in guest
context.

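Because buffered requests submitted through Linux AIO complete synchronously, native AIO only pays off together with O_DIRECT, and the 1/2 sub-thread suggests erroring out rather than silently ignoring aio=native without nocache. A minimal sketch of such a check follows. The flag values and the name `select_native_aio` are assumptions made for illustration; they are not QEMU's definitions and this is not the patch that was requested in the thread.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative flag bits; the values are assumptions, not QEMU's. */
#define OPEN_NOCACHE    0x1 /* stands in for BDRV_O_NOCACHE (O_DIRECT) */
#define OPEN_NATIVE_AIO 0x2 /* stands in for BDRV_O_NATIVE_AIO */

/*
 * Hypothetical policy check: returns 1 if Linux AIO should be used, 0 if
 * the POSIX AIO path should be used, and -EINVAL if native AIO was
 * requested without O_DIRECT, where it would complete synchronously.
 */
int select_native_aio(int open_flags)
{
    assert(open_flags >= 0); /* flags are treated as a non-negative mask */

    int native = (open_flags & OPEN_NATIVE_AIO) != 0;
    int direct = (open_flags & OPEN_NOCACHE) != 0;

    if (native && !direct) {
        return -EINVAL; /* reject instead of silently ignoring */
    }
    return native;      /* 1 only when both flags are set */
}
```

Returning an error for the mismatched combination keeps the pre-patch behaviour for valid flag sets while making the ignored option visible to the user.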
Thread overview: 12+ messages (end of thread: 2011-07-29 15:33 UTC)

2011-07-27 18:25 [Qemu-devel] [PATCH 0/2] improve Linux AIO support Frediano Ziglio
2011-07-27 18:25 ` [Qemu-devel] [PATCH 1/2] linux aio: support flush operation Frediano Ziglio
2011-07-27 18:31   ` Christoph Hellwig
2011-07-27 19:52     ` Frediano Ziglio
2011-07-27 19:57       ` Christoph Hellwig
2011-07-28  7:47         ` Kevin Wolf
2011-07-28 12:15           ` Christoph Hellwig
2011-07-28 12:41             ` Kevin Wolf
2011-07-29 14:24               ` Christoph Hellwig
2011-07-29 15:33             ` Stefan Hajnoczi
2011-07-27 18:25 ` [Qemu-devel] [PATCH 2/2] aio: use Linux AIO even if nocache is not specified Frediano Ziglio
2011-07-27 18:32   ` Christoph Hellwig