All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: Pavel Begunkov <asml.silence@gmail.com>, io-uring@vger.kernel.org
Subject: Re: [PATCH 4/4] io_uring: optimise compl locking for non-shared rings
Date: Fri, 18 Mar 2022 09:21:08 -0600	[thread overview]
Message-ID: <ce480edd-c82b-c094-39cd-d45d6b76e5a3@kernel.dk> (raw)
In-Reply-To: <7ef3335a-8e7c-d559-5a78-f48bf506f53c@gmail.com>

On 3/18/22 9:13 AM, Pavel Begunkov wrote:
> On 3/18/22 14:54, Jens Axboe wrote:
>> On 3/18/22 7:52 AM, Pavel Begunkov wrote:
>>> When only one task submits requests, most of CQEs are expected to be
>>> filled from that task context so we have natural serialisation. That
>>> would mean that in those cases we don't need spinlocking around CQE
>>> posting. One downside is that it also mean that io-wq workers can't emit
>>> CQEs directly but should do it through the original task context using
>>> task_works. That may hurt latency and performance and might matter much
>>> to some workloads, but it's not a huge deal in general as io-wq is a
>>> slow path and there is some additional merit from tw completion
>>> batching.
>>
>> Not too worried about io-wq task_work for cq filling, it is the slower
>> path after all. And I think we can get away with doing notifications as
>> it's just for CQ filling. If the task is currently waiting in
>> cqring_wait, then it'll get woken anyway and it will process task work.
>> If it's in userspace, it doesn't need a notification. That should make
>> it somewhat lighter than requiring using TIF_NOTIFY_SIGNAL for that.
>>
>>> The feature should be opted-in by the userspace by setting a new
>>> IORING_SETUP_PRIVATE_CQ flag. It doesn't work with IOPOLL, and also for
>>> now only the task that created a ring can submit requests to it.
>>
>> I know this is a WIP, but why do we need CQ_PRIVATE? And this needs to
> 
> One reason is because of the io-wq -> tw punting, which is not optimal
> for e.g. active users of IOSQE_ASYNC. The second is because the
> fundamental requirement is that only one task should be submitting
> requests. Was thinking about automating it, e.g. when we register
> a second tctx we go through a slow path waiting for all current tw
> to complete and then removing an internal and not userspace visible
> CQ_PRIVATE flag.

Was thinking something along those lines too. The alternative is setting
up the ring with SETUP_SINGLE_ISSUER or something like that, having the
application tell us that it is a single issuer and no submits are
shared across threads. Serves the same kind of purpose as CQ_PRIVATE,
but enables us to simply fail things if the task violates those
constraints. Would also be a better name I believe as it might enable
further optimizations in the future, like for example the mutex
reduction for submits.

> Also, as SQPOLL task is by definition the only one submitting SQEs,
> was thinking about enabling it by default for them, but didn't do
> because of the io-wq / IOSQE_ASYNC.

Gotcha.

>> work with registered files (and ring fd) as that is probably a bigger
>> win than skipping the completion_lock if you're not shared anyway.
> 
> It does work with fixed/registered files and registered io_uring fds.

t/io_uring fails for me with registered files or rings, getting EINVAL.
Might be user error, but that's simply just setting CQ_PRIVATE for
setup.

> In regards of "a bigger win", probably in many cases, but if you submit
> a good batch at once, and completion tw batching doesn't kick in (e.g.
> direct bdev read of not too high intensity), it might save
> N spinlock/unlock when registered ring fd would kill only one pair of
> fdget/fdput.

Definitely, various cases where one would be a bigger win than the
other, agree on that. But let's just ensure that both work together :-)

-- 
Jens Axboe


  reply	other threads:[~2022-03-18 15:21 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-18 13:52 [RFC 0/4] completion locking optimisation feature Pavel Begunkov
2022-03-18 13:52 ` [PATCH 1/4] io_uring: get rid of raw fill cqe in kill_timeout Pavel Begunkov
2022-03-18 13:52 ` [PATCH 2/4] io_uring: get rid of raw fill_cqe in io_fail_links Pavel Begunkov
2022-03-18 13:52 ` [PATCH 3/4] io_uring: remove raw fill_cqe from linked timeout Pavel Begunkov
2022-03-18 13:52 ` [PATCH 4/4] io_uring: optimise compl locking for non-shared rings Pavel Begunkov
2022-03-18 14:54   ` Jens Axboe
2022-03-18 15:13     ` Pavel Begunkov
2022-03-18 15:21       ` Jens Axboe [this message]
2022-03-18 15:32         ` Pavel Begunkov
2022-03-18 16:06           ` Jens Axboe
2022-03-18 14:42 ` [RFC 0/4] completion locking optimisation feature Pavel Begunkov
2022-03-18 14:52   ` Jens Axboe
2022-03-18 15:00     ` Pavel Begunkov
2022-03-18 15:22       ` Jens Axboe
2022-03-18 15:34         ` Pavel Begunkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ce480edd-c82b-c094-39cd-d45d6b76e5a3@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=asml.silence@gmail.com \
    --cc=io-uring@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.