From: Pavel Begunkov <asml.silence@gmail.com>
To: Jens Axboe <axboe@kernel.dk>,
	io-uring@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>, Dennis Zhou <dennis@kernel.org>,
	Christoph Lameter <cl@linux.com>
Subject: Re: [PATCH RFC v2 3/3] io_uring: batch get(ctx->ref) across submits
Date: Sat, 21 Dec 2019 20:26:22 +0300
Message-ID: <b5c0e2ab-ded3-d06b-afda-e7a72f1368e4@gmail.com>
In-Reply-To: <fef3a245-d2a2-23b3-ff03-3e05af19b752@kernel.dk>



On 21/12/2019 20:01, Jens Axboe wrote:
> On 12/21/19 9:48 AM, Pavel Begunkov wrote:
>> On 21/12/2019 19:38, Jens Axboe wrote:
>>> On 12/21/19 9:20 AM, Pavel Begunkov wrote:
>>>> On 21/12/2019 19:15, Pavel Begunkov wrote:
>>>>> Double-account ctx->refs by keeping the number of taken refs in ctx.
>>>>> As io_uring gets per-request ctx->refs during submission while
>>>>> holding ctx->uring_lock, this allows bypassing percpu_ref_get*()
>>>>> and its overhead most of the time.
>>>>
>>>> Jens, could you please benchmark with this one? Especially for the
>>>> offloaded QD1 case. I haven't seen any difference in the nops test and
>>>> don't have a decent SSD at hand to test it myself. We could drop it if
>>>> there is no benefit.
>>>>
>>>> This rewrites @extra_refs from the second patch, so I left it for now.
>>>
>>> Sure, let me run a peak test, qd1 test, qd1+sqpoll test on
>>> for-5.6/io_uring, same branch with 1-2, and same branch with 1-3. That
>>> should give us a good comparison. One core used for all, and we're going
>>> to be core speed bound for the performance in all cases on this setup.
>>> So it'll be a good comparison.
>>>
>> Great, thanks!
> 
> For some reason, not seeing much of a change between for-5.6/io_uring
> and 1+2 and 1+2+3, it's about the same and results seem very stable.
> For reference, top of profile with 1-3 applied looks like this:

I see. I'll probably drop the last one, as it only complicates things.
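
For anyone following along, the idea behind 3/3 is roughly the following.
This is only a userspace sketch under stated assumptions, not the patch:
toy_ctx, cached_refs, slow_gets, REF_BATCH and the toy_* helpers are all
made-up names, the batch size is arbitrary, and a plain C11 atomic counter
stands in for the percpu_ref behind ctx->refs. The cached count is assumed
to be touched only by the submitter (in the kernel, under ctx->uring_lock).

```c
#include <assert.h>
#include <stdatomic.h>

#define REF_BATCH 64	/* hypothetical batch size, not taken from the patch */

struct toy_ctx {
	atomic_long refs;		/* stand-in for the percpu_ref in ctx->refs */
	long cached_refs;		/* refs already taken but not yet handed out */
	unsigned long slow_gets;	/* times the shared counter was hit on get */
};

/* Hand out one ref per request; refill from the shared counter in batches. */
static void toy_get_ref(struct toy_ctx *ctx)
{
	if (ctx->cached_refs == 0) {
		atomic_fetch_add(&ctx->refs, REF_BATCH);
		ctx->cached_refs = REF_BATCH;
		ctx->slow_gets++;
	}
	ctx->cached_refs--;
}

/* Completion still drops one ref on the shared counter per request. */
static void toy_put_ref(struct toy_ctx *ctx)
{
	atomic_fetch_sub(&ctx->refs, 1);
}

/* On teardown, give back refs that were taken but never handed out. */
static void toy_drop_cached(struct toy_ctx *ctx)
{
	atomic_fetch_sub(&ctx->refs, ctx->cached_refs);
	ctx->cached_refs = 0;
}

/* Submit and complete nr_reqs requests, then tear down. */
static void toy_run(struct toy_ctx *ctx, int nr_reqs)
{
	atomic_init(&ctx->refs, 0);
	ctx->cached_refs = 0;
	ctx->slow_gets = 0;

	for (int i = 0; i < nr_reqs; i++)
		toy_get_ref(ctx);
	for (int i = 0; i < nr_reqs; i++)
		toy_put_ref(ctx);
	toy_drop_cached(ctx);

	/* Every ref taken was either put by a request or returned here. */
	assert(atomic_load(&ctx->refs) == 0);
}
```

With REF_BATCH at 64, toy_run(&ctx, 100) hits the shared counter twice on
the get side instead of 100 times. The cost is the extra bookkeeping for
the cached count, including returning leftovers on teardown.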

My apologies for the misleading terminology. Read-only QD1 (submit one
request and wait until userspace completes it) obviously won't saturate
a CPU. Writes probably wouldn't either (though that depends on the HW).
It would have been better to say: submit one at a time, complete in
batches.
Just curious, what did you use for testing? Is it fio?

> 
> +    3.92%  io_uring  [kernel.vmlinux]  [k] blkdev_direct_IO
> +    3.87%  io_uring  [kernel.vmlinux]  [k] blk_mq_get_request
> +    3.43%  io_uring  [kernel.vmlinux]  [k] io_iopoll_getevents
> +    3.03%  io_uring  [kernel.vmlinux]  [k] __slab_free
> +    2.87%  io_uring  io_uring          [.] submitter_fn
> +    2.79%  io_uring  [kernel.vmlinux]  [k] io_submit_sqes
> +    2.75%  io_uring  [kernel.vmlinux]  [k] bio_alloc_bioset
> +    2.70%  io_uring  [nvme_core]       [k] nvme_setup_cmd
> +    2.59%  io_uring  [kernel.vmlinux]  [k] blk_mq_make_request
> +    2.46%  io_uring  [kernel.vmlinux]  [k] io_prep_rw
> +    2.32%  io_uring  [kernel.vmlinux]  [k] io_read
> +    2.25%  io_uring  [kernel.vmlinux]  [k] blk_mq_free_request
> +    2.19%  io_uring  [kernel.vmlinux]  [k] io_put_req
> +    2.06%  io_uring  [kernel.vmlinux]  [k] kmem_cache_alloc
> +    2.01%  io_uring  [kernel.vmlinux]  [k] generic_make_request_checks
> +    1.90%  io_uring  [kernel.vmlinux]  [k] __sbitmap_get_word
> +    1.86%  io_uring  [kernel.vmlinux]  [k] sbitmap_queue_clear
> +    1.85%  io_uring  [kernel.vmlinux]  [k] io_issue_sqe
> 
> 

-- 
Pavel Begunkov



Thread overview: 34+ messages
2019-12-17 22:28 [PATCH 0/2] optimise ctx's refs grabbing in io_uring Pavel Begunkov
2019-12-17 22:28 ` [PATCH 1/2] pcpu_ref: add percpu_ref_tryget_many() Pavel Begunkov
2019-12-17 23:42   ` Jens Axboe
2019-12-18 16:26     ` Tejun Heo
2019-12-18 17:49       ` Dennis Zhou
2019-12-21 15:36         ` Pavel Begunkov
2019-12-17 22:28 ` [PATCH 2/2] io_uring: batch getting pcpu references Pavel Begunkov
2019-12-17 23:21   ` Jens Axboe
2019-12-17 23:31     ` Jens Axboe
2019-12-18  9:25       ` Pavel Begunkov
2019-12-18  9:23     ` Pavel Begunkov
2019-12-18  0:02   ` Jens Axboe
2019-12-18 10:41     ` Pavel Begunkov
2019-12-21 16:15   ` [PATCH v2 0/3] optimise ctx's refs grabbing in io_uring Pavel Begunkov
2019-12-21 16:15     ` [PATCH v2 1/3] pcpu_ref: add percpu_ref_tryget_many() Pavel Begunkov
2019-12-21 16:15     ` [PATCH v2 2/3] io_uring: batch getting pcpu references Pavel Begunkov
2019-12-21 16:15     ` [PATCH RFC v2 3/3] io_uring: batch get(ctx->ref) across submits Pavel Begunkov
2019-12-21 16:20       ` Pavel Begunkov
2019-12-21 16:38         ` Jens Axboe
2019-12-21 16:48           ` Pavel Begunkov
2019-12-21 17:01             ` Jens Axboe
2019-12-21 17:26               ` Pavel Begunkov [this message]
2019-12-21 20:12       ` [PATCH v3 0/2] optimise ctx's refs grabbing in io_uring Pavel Begunkov
2019-12-21 20:12         ` [PATCH v3 1/2] pcpu_ref: add percpu_ref_tryget_many() Pavel Begunkov
2019-12-21 20:12         ` [PATCH v3 2/2] io_uring: batch getting pcpu references Pavel Begunkov
2019-12-21 21:56           ` Pavel Begunkov
2019-12-28 11:13         ` [PATCH v4 0/2] optimise ctx's refs grabbing in io_uring Pavel Begunkov
2019-12-28 11:13           ` [PATCH v4 1/2] pcpu_ref: add percpu_ref_tryget_many() Pavel Begunkov
2019-12-28 11:13           ` [PATCH v4 2/2] io_uring: batch getting pcpu references Pavel Begunkov
2019-12-28 11:15             ` Pavel Begunkov
2019-12-28 17:03               ` Jens Axboe
2019-12-28 18:37                 ` Pavel Begunkov
2019-12-30  3:33                   ` Brian Gianforcaro
2019-12-30 18:45                     ` Pavel Begunkov
