From: Jens Axboe <axboe@kernel.dk>
To: David Laight <David.Laight@ACULAB.COM>,
'Lennert Buytenhek' <buytenh@wantstofly.org>,
Al Viro <viro@zeniv.linux.org.uk>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
"io-uring@vger.kernel.org" <io-uring@vger.kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Subject: Re: [PATCH v3 0/2] io_uring: add support for IORING_OP_GETDENTS
Date: Sat, 20 Feb 2021 11:29:22 -0700 [thread overview]
Message-ID: <28a71bb1-0aac-c166-ade7-93665811d441@kernel.dk> (raw)
In-Reply-To: <247d154f2ba549b88a77daf29ec1791f@AcuMS.aculab.com>
On 2/20/21 10:44 AM, David Laight wrote:
> From: Lennert Buytenhek
>> Sent: 18 February 2021 12:27
>>
>> These patches add support for IORING_OP_GETDENTS, which is a new io_uring
>> opcode that more or less does an lseek(sqe->fd, sqe->off, SEEK_SET)
>> followed by a getdents64(sqe->fd, (void *)sqe->addr, sqe->len).
>>
>> A dumb test program for IORING_OP_GETDENTS is available here:
>>
>> https://krautbox.wantstofly.org/~buytenh/uringfind-v2.c
>>
>> This test program does something along the lines of what find(1) does:
>> it scans recursively through a directory tree and prints the names of
>> all directories and files it encounters along the way -- but then using
>> io_uring. (The io_uring version prints the names of encountered files and
>> directories in an order that's determined by SQE completion order, which
>> is somewhat nondeterministic and likely to differ between runs.)
>>
>> On a directory tree with 14-odd million files in it that's on a
>> six-drive (spinning disk) btrfs raid, find(1) takes:
>>
>> # echo 3 > /proc/sys/vm/drop_caches
>> # time find /mnt/repo > /dev/null
>>
>> real 24m7.815s
>> user 0m15.015s
>> sys 0m48.340s
>> #
>>
>> And the io_uring version takes:
>>
>> # echo 3 > /proc/sys/vm/drop_caches
>> # time ./uringfind /mnt/repo > /dev/null
>>
>> real 10m29.064s
>> user 0m4.347s
>> sys 0m1.677s
>> #
>
> While there may be uses for IORING_OP_GETDENTS are you sure your
> test is comparing like with like?
> The underlying work has to be done in either case, so you are
> swapping system calls for code complexity.
What complexity?
> I suspect that find is actually doing a stat() call on every
> directory entry and that your io_uring example is just believing
> the 'directory' flag returned in the directory entry for most
> modern filesystems.
While that may be true (find doing stat as well), the runtime is
clearly dominated by IO. Adding a stat on top would be an extra
copy, but no extra IO.
> If you write a program that does openat(), readdir(), close()
> for each directory and with a long enough buffer (mostly) do
> one readdir() per directory you'll get a much better comparison.
>
> You could even write a program with 2 threads, one does all the
> open/readdir/close system calls and the other does the printing
> and generating the list of directories to process.
> That should get the equivalent overlapping that io_uring gives
> without much of the complexity.
But this is what take the most offense to - it's _trivial_ to
write that program with io_uring, especially compared to managing
threads. Threads are certainly a more known paradigm at this point,
but an io_uring submit + reap loop is definitely not "much of the
complexity". If you're referring to the kernel change itself, that's
trivial, as the diffstat shows.
--
Jens Axboe
next prev parent reply other threads:[~2021-02-20 18:30 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-02-18 12:26 [PATCH v3 0/2] io_uring: add support for IORING_OP_GETDENTS Lennert Buytenhek
2021-02-18 12:27 ` [PATCH v3 1/2] readdir: split the core of getdents64(2) out into vfs_getdents() Lennert Buytenhek
2021-02-18 12:27 ` [PATCH v3 2/2] io_uring: add support for IORING_OP_GETDENTS Lennert Buytenhek
2021-02-19 12:05 ` Pavel Begunkov
2021-02-19 12:10 ` Pavel Begunkov
2021-02-19 18:06 ` Lennert Buytenhek
2021-02-19 12:34 ` Matthew Wilcox
2021-02-19 18:07 ` Lennert Buytenhek
2021-02-19 18:59 ` Lennert Buytenhek
2021-02-20 17:44 ` [PATCH v3 0/2] " David Laight
2021-02-20 18:29 ` Jens Axboe [this message]
2021-02-21 19:38 ` David Laight
2021-02-21 21:12 ` Jens Axboe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=28a71bb1-0aac-c166-ade7-93665811d441@kernel.dk \
--to=axboe@kernel.dk \
--cc=David.Laight@ACULAB.COM \
--cc=buytenh@wantstofly.org \
--cc=io-uring@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=viro@zeniv.linux.org.uk \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.