From: Jens Axboe <axboe@kernel.dk>
To: Ming Lei <ming.lei@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>,
	hch@lst.de, dm-devel@redhat.com, linux-block@vger.kernel.org
Subject: Re: [PATCH v6 2/2] dm: support bio polling
Date: Wed, 9 Mar 2022 09:11:26 -0700
Message-ID: <d4657e24-4cc7-7372-bafe-d6c9c8005c6b@kernel.dk>
In-Reply-To: <Yif/Or0s1rV87a5R@T590>

On 3/8/22 6:13 PM, Ming Lei wrote:
> On Tue, Mar 08, 2022 at 06:02:50PM -0700, Jens Axboe wrote:
>> On 3/7/22 11:53 AM, Mike Snitzer wrote:
>>> From: Ming Lei <ming.lei@redhat.com>
>>>
>>> Support bio(REQ_POLLED) polling in the following approach:
>>>
>>> 1) only support io polling on normal READ/WRITE, and other abnormal IOs
>>> still fallback to IRQ mode, so the target io is exactly inside the dm
>>> io.
>>>
>>> 2) hold one refcnt on io->io_count after submitting this dm bio with
>>> REQ_POLLED
>>>
>>> 3) support dm native bio splitting, any dm io instance associated with
>>> current bio will be added into one list which head is bio->bi_private
>>> which will be recovered before ending this bio
>>>
>>> 4) implement .poll_bio() callback, call bio_poll() on the single target
>>> bio inside the dm io which is retrieved via bio->bi_bio_drv_data; call
>>> dm_io_dec_pending() after the target io is done in .poll_bio()
>>>
>>> 5) enable QUEUE_FLAG_POLL if all underlying queues enable QUEUE_FLAG_POLL,
>>> which is based on Jeffle's previous patch.
>>
>> It's not the prettiest thing in the world with the overlay on bi_private,
>> but at least it's nicely documented now.
>>
>> I would encourage you to actually test this on fast storage, should make
>> a nice difference. I can run this on a gen2 optane, it's 10x the IOPS
>> of what it was tested on and should help better highlight where it
>> makes a difference.
>>
>> If either of you would like that, then send me a fool proof recipe for
>> what should be setup so I have a poll capable dm device.
> 
> Follows steps for setup dm stripe over two nvmes, then run io_uring on
> the dm stripe dev.

Thanks! Much easier when I don't have to figure it out... Setup:

CPU: 12900K
Drives: 2x P5800X gen2 optane (~5M IOPS each at 512b)

Baseline kernel:

sudo taskset -c 10 t/io_uring -d128 -b512 -s31 -c16 -p1 -F1 -B1 -n1 -R1 -X1 /dev/dm-0
Added file /dev/dm-0 (submitter 0)
polled=1, fixedbufs=1/0, register_files=1, buffered=0, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
submitter=0, tid=1004
IOPS=2794K, BW=1364MiB/s, IOS/call=31/30, inflight=(124)
IOPS=2793K, BW=1363MiB/s, IOS/call=31/31, inflight=(62)
IOPS=2789K, BW=1362MiB/s, IOS/call=31/30, inflight=(124)
IOPS=2779K, BW=1357MiB/s, IOS/call=31/31, inflight=(124)
IOPS=2780K, BW=1357MiB/s, IOS/call=31/31, inflight=(62)
IOPS=2779K, BW=1357MiB/s, IOS/call=31/31, inflight=(62)
^CExiting on signal
Maximum IOPS=2794K

generating about 500K interrupts/sec. Using 4k blocks:

sudo taskset -c 10 t/io_uring -d128 -b4096 -s31 -c16 -p1 -F1 -B1 -n1 -R1 -X1 /dev/dm-0
Added file /dev/dm-0 (submitter 0)
polled=1, fixedbufs=1/0, register_files=1, buffered=0, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
submitter=0, tid=967
IOPS=1683K, BW=6575MiB/s, IOS/call=24/24, inflight=(93)
IOPS=1685K, BW=6584MiB/s, IOS/call=24/24, inflight=(124)
IOPS=1686K, BW=6588MiB/s, IOS/call=24/24, inflight=(124)
IOPS=1684K, BW=6581MiB/s, IOS/call=24/24, inflight=(93)
IOPS=1686K, BW=6589MiB/s, IOS/call=24/24, inflight=(124)
IOPS=1687K, BW=6593MiB/s, IOS/call=24/24, inflight=(128)
IOPS=1687K, BW=6590MiB/s, IOS/call=24/24, inflight=(93)
^CExiting on signal
Maximum IOPS=1687K

which ends up being bandwidth limited for me, because the devices
aren't linked at PCIe gen4. That's about 1.4M interrupts/sec.
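
As a quick cross-check of the bandwidth figures in the logs above, here
is a minimal sketch that recomputes BW from IOPS and block size. It
assumes the "K" in the IOPS column means 1000 and that MiB is 2^20
bytes; the helper name bw_mib_s is made up for illustration and is not
part of t/io_uring.

```python
def bw_mib_s(kiops, block_bytes):
    """Bandwidth in MiB/s for a rate given in thousands of IOs per second.

    Assumption: 1K IOPS = 1000 IOs/sec, 1 MiB = 1024 * 1024 bytes.
    """
    return kiops * 1000 * block_bytes / (1024 * 1024)


# Peak baseline figures copied from the logs above.
bw_512 = bw_mib_s(2794, 512)    # 512b run
bw_4k = bw_mib_s(1687, 4096)    # 4k run

print(f"512b: {bw_512:.0f} MiB/s")
print(f"4k:   {bw_4k:.0f} MiB/s")
```

The results land within a few MiB/s of the logged values; the small gap
is expected since the logged IOPS are rounded to the nearest thousand.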

With the patched kernel, same test:

sudo taskset -c 10 t/io_uring -d128 -b512 -s31 -c16 -p1 -F1 -B1 -n1 -R1 -X1 /dev/dm-0
Added file /dev/dm-0 (submitter 0)
polled=1, fixedbufs=1/0, register_files=1, buffered=0, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
submitter=0, tid=989
IOPS=4151K, BW=2026MiB/s, IOS/call=16/15, inflight=(128)
IOPS=4159K, BW=2031MiB/s, IOS/call=15/15, inflight=(128)
IOPS=4193K, BW=2047MiB/s, IOS/call=15/15, inflight=(128)
IOPS=4191K, BW=2046MiB/s, IOS/call=15/15, inflight=(128)
IOPS=4202K, BW=2052MiB/s, IOS/call=15/15, inflight=(128)
^CExiting on signal
Maximum IOPS=4202K

with basically zero interrupts, and 4k:

sudo taskset -c 10 t/io_uring -d128 -b4096 -s31 -c16 -p1 -F1 -B1 -n1 -R1 -X1 /dev/dm-0
Added file /dev/dm-0 (submitter 0)
polled=1, fixedbufs=1/0, register_files=1, buffered=0, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
submitter=0, tid=1015
IOPS=1706K, BW=6666MiB/s, IOS/call=15/15, inflight=(128)
IOPS=1704K, BW=6658MiB/s, IOS/call=15/15, inflight=(128)
IOPS=1704K, BW=6658MiB/s, IOS/call=15/15, inflight=(128)
IOPS=1704K, BW=6658MiB/s, IOS/call=15/15, inflight=(128)
IOPS=1704K, BW=6658MiB/s, IOS/call=15/15, inflight=(128)
^CExiting on signal
Maximum IOPS=1706K

again with basically zero interrupts.

That's about a 50% improvement for polled IO at 512b (2794K -> 4202K
IOPS); the 4k case is bandwidth limited either way. This is using 2 gen2
optanes, which are good for ~5M IOPS each. Using two threads on a single
core, baseline kernel:

sudo taskset -c 10,11 t/io_uring -d128 -b512 -s31 -c16 -p1 -F1 -B1 -n2 -R1 -X1 /dev/dm-0
Added file /dev/dm-0 (submitter 0)
Added file /dev/dm-0 (submitter 1)
polled=1, fixedbufs=1/0, register_files=1, buffered=0, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
submitter=0, tid=1081
submitter=1, tid=1082
IOPS=3515K, BW=1716MiB/s, IOS/call=31/30, inflight=(124 62)
IOPS=3515K, BW=1716MiB/s, IOS/call=31/31, inflight=(62 124)
IOPS=3517K, BW=1717MiB/s, IOS/call=30/30, inflight=(113 124)
IOPS=3517K, BW=1717MiB/s, IOS/call=31/31, inflight=(62 62)
^CExiting on signal
Maximum IOPS=3517K

and patched:

sudo taskset -c 10,11 t/io_uring -d128 -b512 -s31 -c16 -p1 -F1 -B1 -n2 -R1 -X1 /dev/dm-0
Added file /dev/dm-0 (submitter 0)
Added file /dev/dm-0 (submitter 1)
polled=1, fixedbufs=1/0, register_files=1, buffered=0, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
submitter=0, tid=949
submitter=1, tid=950
IOPS=4988K, BW=2435MiB/s, IOS/call=15/15, inflight=(128 128)
IOPS=4985K, BW=2434MiB/s, IOS/call=15/15, inflight=(128 128)
IOPS=4970K, BW=2426MiB/s, IOS/call=15/15, inflight=(128 128)
IOPS=4985K, BW=2434MiB/s, IOS/call=15/15, inflight=(128 128)
^CExiting on signal
Maximum IOPS=4988K

which is about a 42% improvement in IOPS.
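
For reference, the improvement percentages quoted in this mail can be
recomputed from the "Maximum IOPS" lines. This is only a sketch of the
arithmetic; the dictionary labels and the helper name pct_gain are
illustrative, and the inputs are the peak values copied from the logs.

```python
def pct_gain(baseline_kiops, patched_kiops):
    """Relative IOPS improvement of patched over baseline, in percent."""
    return (patched_kiops - baseline_kiops) / baseline_kiops * 100.0


# (baseline, patched) "Maximum IOPS" values in K, taken from the runs above.
results = {
    "512b, 1 thread": (2794, 4202),
    "4k, 1 thread": (1687, 1706),
    "512b, 2 threads": (3517, 4988),
}

gains = {name: pct_gain(b, p) for name, (b, p) in results.items()}
for name, gain in gains.items():
    print(f"{name}: {gain:+.1f}%")
```

The 4k case barely moves, which is consistent with that run being
limited by link bandwidth rather than by interrupt overhead.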

-- 
Jens Axboe

