From: Paolo Bonzini <pbonzini@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>,
Dave Chinner <david@fromorbit.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>,
linux-aio@kvack.org,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Linux API <linux-api@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 07/13] aio: enabled thread based async fsync
Date: Thu, 14 Jan 2016 10:19:56 +0100 [thread overview]
Message-ID: <5697683C.5070402@redhat.com> (raw)
In-Reply-To: <CA+55aFxtvMqHgHmHCcszV_QKQ2BY0wzenmrvc6BYN+tLFxesMA@mail.gmail.com>
On 12/01/2016 02:20, Linus Torvalds wrote:
> On Mon, Jan 11, 2016 at 5:11 PM, Dave Chinner <david@fromorbit.com> wrote:
>>
>> Insufficient. Needs the range to be passed through and call
>> vfs_fsync_range(), as I implemented here:
>
> And I think that's insufficient *also*.
>
> What you actually want is "sync_file_range()", with the full set of arguments.
>
> Yes, really. Sometimes you want to start the writeback, sometimes you
> want to wait for it. Sometimes you want both.
>
> For example, if you are doing your own manual write-behind logic, it
> is not sufficient for "wait for data". What you want is "start IO on
> new data" followed by "wait for old data to have been written out".
>
> I think this only strengthens my "stop with the idiotic
> special-case-AIO magic already" argument. If we want something more
> generic than the usual aio, then we should go all in. Not "let's make
> more limited special cases".
The question is, do we really want something more generic than the usual
AIO?
Virt is one of the 10 (that's a binary number) users of AIO, and we
don't even use it by default because in most cases it's really a wash.
Let's compare AIO with a simple userspace thread pool.
AIO has the ability to submit and retrieve the results of multiple
operations at once. Thread pools do not have the ability to submit
multiple operations at a time (you could play games with FUTEX_WAKE, but
then all the threads in the pool would have cacheline bounces on the futex).
The syscall overhead on the critical path is comparable. For AIO it's
io_submit+io_getevents, for a thread pool it's FUTEX_WAKE plus invoking
the actual syscall. Again, the only difference for AIO is batching.
Unless userspace is submitting tens of thousands of operations per
second, which is pretty much the case only for read/write, there's no
real benefit in asynchronous system calls over a userspace thread pool.
That applies to openat, unlinkat, fadvise (for readahead). It also
applies to msync and fsync, etc. because if your workload is doing tons
of those you'd better buy yourself a disk with a battery-backed cache,
or an UPS, and remove the msync/fsync altogether.
So I'm really happy if we can move the thread creation overhead for such
a thread pool to the kernel. It keeps the benefits of batching, it uses
the optimized kernel workqueues, it doesn't incur the cost of pthreads,
it makes it easy to remove the cases where AIO is blocking, it makes it
easy to add support for !O_DIRECT. But everything else seems overkill.
Paolo
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2016-01-14 9:19 UTC|newest]
Thread overview: 51+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-01-11 22:06 [PATCH 00/13] aio: thread (work queue) based aio and new aio functionality Benjamin LaHaise
2016-01-11 22:06 ` [PATCH 01/13] signals: distinguish signals sent due to i/o via io_send_sig() Benjamin LaHaise
2016-01-11 22:06 ` [PATCH 02/13] aio: add aio_get_mm() helper Benjamin LaHaise
2016-01-11 22:06 ` [PATCH 03/13] aio: for async operations, make the iter argument persistent Benjamin LaHaise
2016-01-11 22:07 ` [PATCH 04/13] signals: add and use aio_get_task() to direct signals sent via io_send_sig() Benjamin LaHaise
2016-01-11 22:07 ` [PATCH 05/13] fs: make do_loop_readv_writev() non-static Benjamin LaHaise
2016-01-11 22:07 ` [PATCH 06/13] aio: add queue_work() based threaded aio support Benjamin LaHaise
2016-01-11 22:07 ` [PATCH 07/13] aio: enabled thread based async fsync Benjamin LaHaise
2016-01-12 1:11 ` Dave Chinner
2016-01-12 1:20 ` Linus Torvalds
2016-01-12 2:25 ` Dave Chinner
2016-01-12 2:38 ` Linus Torvalds
2016-01-12 3:37 ` Dave Chinner
2016-01-12 4:03 ` Linus Torvalds
2016-01-12 4:48 ` Linus Torvalds
2016-01-12 22:50 ` Benjamin LaHaise
2016-01-15 20:21 ` Benjamin LaHaise
2016-01-20 3:59 ` Linus Torvalds
2016-01-20 5:02 ` Theodore Ts'o
2016-01-20 19:59 ` Dave Chinner
2016-01-20 20:29 ` Linus Torvalds
2016-01-20 20:44 ` Benjamin LaHaise
2016-01-20 21:45 ` Dave Chinner
2016-01-20 21:56 ` Benjamin LaHaise
2016-01-23 4:24 ` Dave Chinner
2016-01-23 4:50 ` Benjamin LaHaise
2016-01-23 22:22 ` Dave Chinner
2016-01-20 23:07 ` Linus Torvalds
2016-01-23 4:39 ` Dave Chinner
2016-03-14 17:17 ` aio openat " Benjamin LaHaise
2016-03-20 1:20 ` Linus Torvalds
2016-03-20 1:26 ` Al Viro
2016-03-20 1:45 ` Linus Torvalds
2016-03-20 1:55 ` Al Viro
2016-03-20 2:03 ` Linus Torvalds
2016-01-20 21:57 ` Dave Chinner
2016-01-22 15:41 ` Andres Freund
2016-01-12 22:59 ` Andy Lutomirski
2016-01-14 9:19 ` Paolo Bonzini [this message]
2016-01-12 1:30 ` Benjamin LaHaise
2016-01-22 15:31 ` Andres Freund
2016-01-11 22:07 ` [PATCH 08/13] aio: add support for aio poll via aio thread helper Benjamin LaHaise
2016-01-11 22:07 ` [PATCH 09/13] aio: add support for async openat() Benjamin LaHaise
2016-01-12 0:22 ` Linus Torvalds
2016-01-12 1:17 ` Benjamin LaHaise
2016-01-12 1:45 ` Chris Mason
2016-01-12 9:53 ` Ingo Molnar
2016-01-11 22:07 ` [PATCH 10/13] aio: add async unlinkat functionality Benjamin LaHaise
2016-01-11 22:07 ` [PATCH 11/13] mm: enable __do_page_cache_readahead() to include present pages Benjamin LaHaise
2016-01-11 22:07 ` [PATCH 12/13] aio: add support for aio readahead Benjamin LaHaise
2016-01-11 22:08 ` [PATCH 13/13] aio: add support for aio renameat operation Benjamin LaHaise
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5697683C.5070402@redhat.com \
--to=pbonzini@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=bcrl@kvack.org \
--cc=david@fromorbit.com \
--cc=linux-aio@kvack.org \
--cc=linux-api@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=torvalds@linux-foundation.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).