From: Pavel Begunkov <asml.silence@gmail.com>
To: Hao Xu <haoxu@linux.alibaba.com>, Jens Axboe <axboe@kernel.dk>
Cc: io-uring@vger.kernel.org, Joseph Qi <joseph.qi@linux.alibaba.com>
Subject: Re: [POC RFC 0/3] support graph like dependent sqes
Date: Fri, 17 Dec 2021 19:33:43 +0000
Message-ID: <aebc5433-258d-2d36-9e38-36860b99a669@gmail.com>
In-Reply-To: <c6e18c00-7c1b-d1e9-a152-91b86f426289@linux.alibaba.com>
On 12/16/21 16:55, Hao Xu wrote:
> On 2021/12/15 2:16 AM, Pavel Begunkov wrote:
>> On 12/14/21 16:53, Hao Xu wrote:
>>> On 2021/12/14 11:21 PM, Pavel Begunkov wrote:
>>>> On 12/14/21 05:57, Hao Xu wrote:
>>>>> This is just an incomplete proof of concept; I'm sending it early for
>>>>> thoughts and suggestions.
>>>>>
>>>>> We already have IOSQE_IO_LINK to describe a linear dependency
>>>>> relationship between sqes. This patchset provides a new feature to
>>>>> support DAG dependencies. For instance, 4 sqes have a relationship
>>>>> as below:
>>>>>       --> 2 --
>>>>>      /        \
>>>>> 1 ---          ---> 4
>>>>>      \        /
>>>>>       --> 3 --
>>>>> IOSQE_IO_LINK serializes them to 1-->2-->3-->4, which unnecessarily
>>>>> serializes 2 and 3. But a DAG can fully describe it.
>>>>>
>>>>> For detailed usage, see the following patches' messages.
>>>>>
>>>>> Tested it with 100 direct read sqes, each reading one 4k block (BS=4k)
>>>>> from the same file; the blocks do not overlap. These sqes form a graph:
>>>>>       2
>>>>>       3
>>>>> 1 --> 4 --> 100
>>>>>      ...
>>>>>       99
>>>>>
>>>>> This is an extreme case, just to show the idea.
>>>>>
>>>>> results below:
>>>>> io_link:
>>>>> IOPS: 15898251
>>>>> graph_link:
>>>>> IOPS: 29325513
>>>>> io_link:
>>>>> IOPS: 16420361
>>>>> graph_link:
>>>>> IOPS: 29585798
>>>>> io_link:
>>>>> IOPS: 18148820
>>>>> graph_link:
>>>>> IOPS: 27932960
>>>>
>>>> Hmm, what do we compare here? IIUC,
>>>> "io_link" is a huge link of 100 requests. Around 15898251 IOPS
>>>> "graph_link" is a graph of diameter 3. Around 29585798 IOPS
>>
>> Diam 2 graph, my bad
>>
>>
>>>> Is that right? If so, it'd be more fair to compare with
>>>> similar graph-like scheduling on the userspace side.
>>>
>>> The above test is more to show the disadvantage of LINK
>>
>> Oh yeah, links can be slow, especially when they kill potential
>> parallelism or need extra allocations for keeping state, like
>> READV and WRITEV.
>>
>>
>>> But yes, it's better to test similar userspace scheduling, since
>>> LINK is definitely not a good choice, so I have to prove that the graph
>>> stuff beats userspace scheduling. Will test that soon. Thanks.
>>
>> It would also be great if you could post the benchmark once
>> it's done
>
> Wrote a new test with nop sqes forming a full binary tree with (2^10)-1 nodes,
> which I think is a more general case. It turns out the result is still not stable
> and the kernel-side graph link is much slower. I'll try to optimize it.

That's expected, unfortunately. And without the ability to react to the
results of previous requests, it's hard to imagine it being useful. BPF
might have helped, e.g. not keeping an explicit graph but just generating
new requests from the kernel... But apparently even with this it's
hard to compete with just leaving it in userspace.

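
To make "leaving it in userspace" concrete, here is a minimal sketch of
dependency counting done by the application itself. It is only a simulation:
nothing is submitted to io_uring, and the names (struct edge, schedule_waves,
MAX_NODES) are made up for illustration and come from neither the patchset nor
liburing. It also assumes the simplest policy, submitting every ready request
as one batch and reaping the whole batch before the next one; a real scheduler
would more likely react to each completion individually.

```c
#include <assert.h>

#define MAX_NODES 1024  /* arbitrary limit for this sketch */

/* One dependency edge: request `to` may start only after `from` completes. */
struct edge {
	int from;
	int to;
};

/*
 * Simulate userspace scheduling of a dependency graph in "waves": every
 * request whose predecessors have all completed goes out in one batch,
 * and the whole batch is reaped before the next wave is built.
 *
 * Returns the number of waves, or -1 if a cycle keeps some request from
 * ever becoming ready. The edge scan is O(nodes * edges); good enough
 * for a sketch, not for a benchmark.
 */
int schedule_waves(int nr_nodes, const struct edge *edges, int nr_edges)
{
	int pending[MAX_NODES] = {0};  /* unfinished predecessors per request */
	int ready[MAX_NODES];
	int next[MAX_NODES];
	int nr_ready = 0, nr_done = 0, waves = 0;

	assert(nr_nodes > 0 && nr_nodes <= MAX_NODES);

	for (int i = 0; i < nr_edges; i++)
		pending[edges[i].to]++;

	/* Requests without predecessors form the first wave. */
	for (int i = 0; i < nr_nodes; i++)
		if (pending[i] == 0)
			ready[nr_ready++] = i;

	while (nr_ready > 0) {
		int nr_next = 0;

		/*
		 * A real program would queue one SQE per entry of ready[],
		 * submit the batch and wait for the matching CQEs here.
		 */
		waves++;
		nr_done += nr_ready;

		/* Completions release successors whose counter drops to zero. */
		for (int r = 0; r < nr_ready; r++)
			for (int i = 0; i < nr_edges; i++)
				if (edges[i].from == ready[r] &&
				    --pending[edges[i].to] == 0)
					next[nr_next++] = edges[i].to;

		for (int i = 0; i < nr_next; i++)
			ready[i] = next[i];
		nr_ready = nr_next;
	}

	return nr_done == nr_nodes ? waves : -1;
}
```

With the 4-request diamond from the cover letter, renumbered 0..3 (edges 0->1,
0->2, 1->3, 2->3), this takes 3 waves, while the fully linked chain
0->1->2->3 takes 4.
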
> Btw, is there any comparison data between the current io link feature and
> userspace scheduling?

Don't remember. I'd try to look up the cover letter for the patches
implementing it; I believe there should've been some numbers and
hopefully a test description.

FWIW, before the io_uring mailing list was established, patches etc.
mostly went through the linux-block mailing list. Links are old, so
the patches might be there.

--
Pavel Begunkov