Re: [LSF/MM/BPF TOPIC] Improving Zoned Storage Support

From: Jens Axboe <axboe@kernel.dk>
To: Bart Van Assche <bvanassche@acm.org>,
	Damien Le Moal <dlemoal@kernel.org>,
	"lsf-pc@lists.linux-foundation.org"
	<lsf-pc@lists.linux-foundation.org>
Cc: "linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [LSF/MM/BPF TOPIC] Improving Zoned Storage Support
Date: Wed, 17 Jan 2024 13:06:19 -0700
Message-ID: <276eedc2-e3d0-40c7-b355-46232ea65662@kernel.dk>
In-Reply-To: <9af03351-a04a-4e61-a6d8-b58236b041a3@kernel.dk>

On 1/17/24 11:43 AM, Jens Axboe wrote:
> Certainly slower. Now let's try and have the scheduler place the same 4
> threads where it sees fit:
> 
> IOPS=1.56M, BW=759MiB/s, IOS/call=32/31
> 
> Yikes! That's still substantially more than 200K IOPS even with heavy
> contention, let's take a look at the profile:
> 
> -   70.63%  io_uring  [kernel.kallsyms]  [k] queued_spin_lock_slowpath
>    - submitter_uring_fn
>       - entry_SYSCALL_64
>       - do_syscall_64
>          - __se_sys_io_uring_enter
>             - 70.62% io_submit_sqes
>                  blk_finish_plug
>                  __blk_flush_plug
>                - blk_mq_flush_plug_list
>                   - 69.65% blk_mq_run_hw_queue
>                        blk_mq_sched_dispatch_requests
>                      - __blk_mq_sched_dispatch_requests
>                         + 60.61% dd_dispatch_request
>                         + 8.98% blk_mq_dispatch_rq_list
>                   + 0.98% dd_insert_requests
> 
> which is exactly as expected, we're spending 70% of the CPU cycles
> banging on dd->lock.

Case in point, I spent 10 min hacking up some smarts on the insertion
and dispatch side, and then we get:

IOPS=2.54M, BW=1240MiB/s, IOS/call=32/32

or about a 63% improvement when running the _exact same thing_. Looking
at profiles:

-   13.71%  io_uring  [kernel.kallsyms]  [k] queued_spin_lock_slowpath

cutting the > 70% of cycles spent on lock contention down to ~14%. No
change in data structures, just an ugly hack that:

- Serializes dispatch: there's no point in having someone hammer on
  dd->lock for dispatch when a dispatch is already running.
- Serializes insertions: punt to one of N buckets if an insertion is
  already in progress. The current inserter will notice that someone
  else did that, prune the buckets, and re-run insertion.
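
To make the two bullets above concrete, here is a rough userspace sketch
of the idea. It is not the actual hack and not kernel code: the struct
layout, the sizes, and every function name (sched_insert(),
sched_dispatch(), sched_prune_and_unlock(), park()) are made up for
illustration, requests are plain ints, and a C11 atomic_bool stands in
for dd->lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_BUCKETS 4     /* hypothetical "N" */
#define BUCKET_CAP 32    /* hypothetical sizes, no real overflow handling */
#define QUEUE_CAP  1024

struct bucket {
	atomic_bool lock;              /* short-held, per bucket */
	size_t nr;
	int reqs[BUCKET_CAP];
};

struct sched {
	atomic_bool lock;              /* stands in for dd->lock */
	atomic_bool dispatching;       /* set while one context dispatches */
	atomic_size_t nr_parked;       /* requests sitting in buckets */
	struct bucket buckets[NR_BUCKETS];
	size_t queued;                 /* protected by lock */
	int queue[QUEUE_CAP];          /* protected by lock */
};

static bool try_lock(atomic_bool *f) { return !atomic_exchange(f, true); }
static void spin_lock(atomic_bool *f) { while (!try_lock(f)) ; }
static void unlock(atomic_bool *f) { atomic_store(f, false); }

/* Park a request in a side bucket; false if that bucket is full. */
static bool park(struct sched *s, int req, unsigned hint)
{
	struct bucket *b = &s->buckets[hint % NR_BUCKETS];
	bool ok;

	spin_lock(&b->lock);
	ok = b->nr < BUCKET_CAP;
	if (ok) {
		b->reqs[b->nr++] = req;
		atomic_fetch_add(&s->nr_parked, 1);
	}
	unlock(&b->lock);
	return ok;
}

/* Caller holds s->lock. Move parked requests to the queue, unlock, then
 * re-check so a request parked during the drain is not left behind. */
void sched_prune_and_unlock(struct sched *s)
{
	do {
		for (size_t i = 0; i < NR_BUCKETS; i++) {
			struct bucket *b = &s->buckets[i];

			spin_lock(&b->lock);
			assert(s->queued + b->nr <= QUEUE_CAP);
			for (size_t j = 0; j < b->nr; j++)
				s->queue[s->queued++] = b->reqs[j];
			atomic_fetch_sub(&s->nr_parked, b->nr);
			b->nr = 0;
			unlock(&b->lock);
		}
		unlock(&s->lock);
	} while (atomic_load(&s->nr_parked) && try_lock(&s->lock));
}

/* Returns true if the request was punted to a bucket. */
bool sched_insert(struct sched *s, int req, unsigned hint)
{
	if (!try_lock(&s->lock)) {
		if (park(s, req, hint)) {
			if (try_lock(&s->lock))  /* owner just left: prune ourselves */
				sched_prune_and_unlock(s);
			return true;
		}
		spin_lock(&s->lock);         /* bucket full: wait our turn */
	}
	assert(s->queued < QUEUE_CAP);
	s->queue[s->queued++] = req;
	sched_prune_and_unlock(s);
	return false;
}

/* Returns requests "issued", or -1 if a dispatch is already running. */
int sched_dispatch(struct sched *s)
{
	int n;

	if (!try_lock(&s->dispatching))
		return -1;               /* don't pile onto s->lock */
	spin_lock(&s->lock);
	n = (int)s->queued;          /* a real scheduler would issue these */
	s->queued = 0;
	sched_prune_and_unlock(s);
	unlock(&s->dispatching);
	return n;
}
```

This assumes a zero-initialized struct sched (e.g. "struct sched s = {0};").
The point is only that a contended inserter touches a short-held bucket
lock instead of spinning on the main lock, and that a second dispatcher
backs off instead of queueing up behind the first. A real version would
also have to make sure a backed-off dispatch gets re-run when needed.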

And while I seriously doubt that my quick hack is 100% foolproof, it
works as a proof of concept. If we can get that kind of reduction with
minimal effort, well...

-- 
Jens Axboe
