qemu-devel.nongnu.org archive mirror
From: Gerd Hoffmann <kraxel@redhat.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: kvm-devel <kvm@vger.kernel.org>, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [RFC] Replace posix-aio with custom thread pool
Date: Thu, 11 Dec 2008 17:11:08 +0100
Message-ID: <49413B9C.3030703@redhat.com>
In-Reply-To: <20081211155335.GE14908@random.random>

[-- Attachment #1: Type: text/plain, Size: 1260 bytes --]

Andrea Arcangeli wrote:
>>   * It can't handle block allocation.  Kernel handles that by doing
>>     such writes synchronously via VFS layer (instead of the separate
>>     aio code paths).  Leads to horrible performance and bug reports
>>     such as "installs on sparse files are very slow".
> 
> I think here you mean O_DIRECT regardless of aio/sync API,

Yes.  But kernel aio requires O_DIRECT, so aio users are affected
nevertheless.

> So in kernels that don't support IOCB_CMD_READV/WRITEV, we simply have
> to submit an array of iocb through io_submit (i.e. to convert the iov
> into a vector of iocb, instead of a single iocb pointing to the
> iov). Internally to io_submit a single dma command should be generated
> and the same sg list should be built the same as if we used
> READV/WRITEV. In theory READV/WRITEV should be just a cpu saving
> feature, it shouldn't influence disk bandwidth. If it does, it means
> the bio layer is broken and needs fixing.

Haven't tested that.  It may not be a big problem, apart from the extra
code size for the two modes.
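
A minimal sketch of the iov-to-iocb conversion described in the quoted
text, for kernels without the vectored aio opcodes.  The helper name
iov_to_iocbs and its parameters are hypothetical, not taken from any
posted patch; it only fills in the iocb array and assumes the caller
sets up the aio context and calls io_submit itself.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <linux/aio_abi.h>

/*
 * Hypothetical helper: fill one iocb per iovec element, each covering
 * the next contiguous range of the file starting at 'offset'.
 * 'iocbs' must have room for 'iovcnt' entries.  Returns the number of
 * iocbs prepared.
 */
static int iov_to_iocbs(int fd, const struct iovec *iov, int iovcnt,
                        off_t offset, int is_write, struct iocb *iocbs)
{
    int i;

    for (i = 0; i < iovcnt; i++) {
        memset(&iocbs[i], 0, sizeof(iocbs[i]));
        iocbs[i].aio_fildes = fd;
        iocbs[i].aio_lio_opcode = is_write ? IOCB_CMD_PWRITE
                                           : IOCB_CMD_PREAD;
        iocbs[i].aio_buf = (uint64_t)(uintptr_t)iov[i].iov_base;
        iocbs[i].aio_nbytes = iov[i].iov_len;
        iocbs[i].aio_offset = offset;
        /* The next element continues where this one ends. */
        offset += iov[i].iov_len;
    }
    return iovcnt;
}
```

io_submit takes an array of pointers to iocbs, so the caller would still
need to build that pointer array before submitting.  Whether the block
layer merges the resulting requests into a single command is exactly the
open question above; this sketch does not answer it.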

>> ahem: http://www.daemon-systems.org/man/preadv.2.html
> 
> Too bad nobody implemented it yet...

The kernel side looks easy: the attached patch plus syscall table
wire-up in all archs ...
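
Until such a syscall exists, userspace could fall back to something like
the sketch below, which emulates a positioned vectored read with one
pread per iovec element.  The name emulated_preadv is hypothetical, and
unlike a real syscall the loop is not atomic with respect to concurrent
writers.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical fallback: read into each iovec element in turn,
 * starting at file offset 'pos'.  Returns the total number of bytes
 * read, or -1 with errno set if nothing could be read.
 */
static ssize_t emulated_preadv(int fd, const struct iovec *iov,
                               int iovcnt, off_t pos)
{
    ssize_t total = 0;
    int i;

    if (pos < 0) {
        errno = EINVAL;
        return -1;
    }

    for (i = 0; i < iovcnt; i++) {
        ssize_t n = pread(fd, iov[i].iov_base, iov[i].iov_len,
                          pos + total);
        if (n < 0)
            return total > 0 ? total : -1;
        total += n;
        /* Short read (e.g. end of file): stop here. */
        if ((size_t)n < iov[i].iov_len)
            break;
    }
    return total;
}
```

This costs one syscall per element, which is the overhead a real preadv
would avoid.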

cheers,
  Gerd

[-- Attachment #2: preadv.diff --]
[-- Type: text/plain, Size: 1390 bytes --]

diff --git a/fs/read_write.c b/fs/read_write.c
index 969a6d9..d1ea2fd 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -701,6 +701,54 @@ sys_writev(unsigned long fd, const struct iovec __user *vec, unsigned long vlen)
 	return ret;
 }
 
+asmlinkage ssize_t sys_preadv(unsigned int fd, const struct iovec __user *vec,
+                              unsigned long vlen, loff_t pos)
+{
+	struct file *file;
+	ssize_t ret = -EBADF;
+	int fput_needed;
+
+	if (pos < 0)
+		return -EINVAL;
+
+	file = fget_light(fd, &fput_needed);
+	if (file) {
+		ret = -ESPIPE;
+		if (file->f_mode & FMODE_PREAD)
+			ret = vfs_readv(file, vec, vlen, &pos);
+		fput_light(file, fput_needed);
+	}
+
+	if (ret > 0)
+		add_rchar(current, ret);
+	inc_syscr(current);
+	return ret;
+}
+
+asmlinkage ssize_t sys_pwritev(unsigned int fd, const struct iovec __user *vec,
+                              unsigned long vlen, loff_t pos)
+{
+	struct file *file;
+	ssize_t ret = -EBADF;
+	int fput_needed;
+
+	if (pos < 0)
+		return -EINVAL;
+
+	file = fget_light(fd, &fput_needed);
+	if (file) {
+		ret = -ESPIPE;
+		if (file->f_mode & FMODE_PWRITE)
+			ret = vfs_writev(file, vec, vlen, &pos);
+		fput_light(file, fput_needed);
+	}
+
+	if (ret > 0)
+		add_wchar(current, ret);
+	inc_syscw(current);
+	return ret;
+}
+
 static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
 			   size_t count, loff_t max)
 {

