All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: Hao Xu <haoxu.linux@gmail.com>, io-uring@vger.kernel.org
Cc: Pavel Begunkov <asml.silence@gmail.com>, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 0/5] fast poll multishot mode
Date: Fri, 6 May 2022 17:26:44 -0600	[thread overview]
Message-ID: <08ff00da-b871-2f2a-7b23-c8b2621df9dd@kernel.dk> (raw)
In-Reply-To: <9157fe69-b5d4-2478-7a0d-e037b5550168@kernel.dk>

On 5/6/22 4:23 PM, Jens Axboe wrote:
> On 5/6/22 1:00 AM, Hao Xu wrote:
>> Let multishot support multishot mode, currently only add accept as its
>> first comsumer.
>> theoretical analysis:
>>   1) when connections come in fast
>>     - singleshot:
>>               add accept sqe(userpsace) --> accept inline
>>                               ^                 |
>>                               |-----------------|
>>     - multishot:
>>              add accept sqe(userspace) --> accept inline
>>                                               ^     |
>>                                               |--*--|
>>
>>     we do accept repeatedly in * place until get EAGAIN
>>
>>   2) when connections come in at a low pressure
>>     similar thing like 1), we reduce a lot of userspace-kernel context
>>     switch and useless vfs_poll()
>>
>>
>> tests:
>> Did some tests, which goes in this way:
>>
>>   server    client(multiple)
>>   accept    connect
>>   read      write
>>   write     read
>>   close     close
>>
>> Basically, raise up a number of clients(on same machine with server) to
>> connect to the server, and then write some data to it, the server will
>> write those data back to the client after it receives them, and then
>> close the connection after write return. Then the client will read the
>> data and then close the connection. Here I test 10000 clients connect
>> one server, data size 128 bytes. And each client has a go routine for
>> it, so they come to the server in short time.
>> test 20 times before/after this patchset, time spent:(unit cycle, which
>> is the return value of clock())
>> before:
>>   1930136+1940725+1907981+1947601+1923812+1928226+1911087+1905897+1941075
>>   +1934374+1906614+1912504+1949110+1908790+1909951+1941672+1969525+1934984
>>   +1934226+1914385)/20.0 = 1927633.75
>> after:
>>   1858905+1917104+1895455+1963963+1892706+1889208+1874175+1904753+1874112
>>   +1874985+1882706+1884642+1864694+1906508+1916150+1924250+1869060+1889506
>>   +1871324+1940803)/20.0 = 1894750.45
>>
>> (1927633.75 - 1894750.45) / 1927633.75 = 1.65%
>>
>>
>> A liburing test is here:
>> https://github.com/HowHsu/liburing/blob/multishot_accept/test/accept.c
> 
> Wish I had seen that, I wrote my own! But maybe that's good, you tend to
> find other issues through that.
> 
> Anyway, works for me in testing, and I can see this being a nice win for
> accept intensive workloads. I pushed a bunch of cleanup patches that
> should just get folded in. Can you fold them into your patches and
> address the other feedback, and post a v3? I pushed the test branch
> here:
> 
> https://git.kernel.dk/cgit/linux-block/log/?h=fastpoll-mshot

Quick benchmark here, accepting 10k connections:

Stock kernel
real	0m0.728s
user	0m0.009s
sys	0m0.192s

Patched
real	0m0.684s
user	0m0.018s
sys	0m0.102s

Looks like a nice win for a highly synthetic benchmark. Nothing
scientific, was just curious.

-- 
Jens Axboe


  reply	other threads:[~2022-05-06 23:26 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-06  7:00 [PATCH v2 0/5] fast poll multishot mode Hao Xu
2022-05-06  7:00 ` [PATCH 1/5] io_uring: add IORING_ACCEPT_MULTISHOT for accept Hao Xu
2022-05-06 14:32   ` Jens Axboe
2022-05-07  4:05     ` Hao Xu
2022-05-06  7:00 ` [PATCH 2/5] io_uring: add REQ_F_APOLL_MULTISHOT for requests Hao Xu
2022-05-06  7:01 ` [PATCH 3/5] io_uring: let fast poll support multishot Hao Xu
2022-05-06 17:19   ` Pavel Begunkov
2022-05-06 22:02     ` Jens Axboe
2022-05-07  6:32       ` Hao Xu
2022-05-07  9:26       ` Pavel Begunkov
2022-05-07  7:08     ` Hao Xu
2022-05-07  9:47       ` Pavel Begunkov
2022-05-07 11:06         ` Hao Xu
2022-05-06 18:02   ` kernel test robot
2022-05-06  7:01 ` [PATCH 4/5] io_uring: add a helper for poll clean Hao Xu
2022-05-06 11:04   ` kernel test robot
2022-05-06 12:47   ` kernel test robot
2022-05-06 14:36   ` Jens Axboe
2022-05-07  6:37     ` Hao Xu
2022-05-06 16:22   ` Pavel Begunkov
2022-05-07  6:43     ` Hao Xu
2022-05-07  9:29       ` Pavel Begunkov
2022-05-06  7:01 ` [PATCH 5/5] io_uring: implement multishot mode for accept Hao Xu
2022-05-06 14:42   ` Jens Axboe
2022-05-07  9:13     ` Hao Xu
2022-05-06 20:50   ` Jens Axboe
2022-05-06 21:29     ` Jens Axboe
2022-05-06  7:36 ` [PATCH v2 0/5] fast poll multishot mode Hao Xu
2022-05-06 14:18   ` Jens Axboe
2022-05-06 16:01     ` Pavel Begunkov
2022-05-06 16:03       ` Jens Axboe
2022-05-06 22:23 ` Jens Axboe
2022-05-06 23:26   ` Jens Axboe [this message]
2022-05-07  2:33     ` Jens Axboe
2022-05-07  3:08       ` Jens Axboe
2022-05-07 16:01         ` Hao Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=08ff00da-b871-2f2a-7b23-c8b2621df9dd@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=asml.silence@gmail.com \
    --cc=haoxu.linux@gmail.com \
    --cc=io-uring@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.