Re: [LSF/MM/BPF TOPIC] Improving Zoned Storage Support

From: Jens Axboe <axboe@kernel.dk>
To: Bart Van Assche <bvanassche@acm.org>,
	Damien Le Moal <dlemoal@kernel.org>,
	"lsf-pc@lists.linux-foundation.org"
	<lsf-pc@lists.linux-foundation.org>
Cc: "linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [LSF/MM/BPF TOPIC] Improving Zoned Storage Support
Date: Wed, 17 Jan 2024 13:06:19 -0700
Message-ID: <276eedc2-e3d0-40c7-b355-46232ea65662@kernel.dk>
In-Reply-To: <9af03351-a04a-4e61-a6d8-b58236b041a3@kernel.dk>

On 1/17/24 11:43 AM, Jens Axboe wrote:
> Certainly slower. Now let's try and have the scheduler place the same 4
> threads where it sees fit:
> 
> IOPS=1.56M, BW=759MiB/s, IOS/call=32/31
> 
> Yikes! That's still substantially more than 200K IOPS even with heavy
> contention, let's take a look at the profile:
> 
> -   70.63%  io_uring  [kernel.kallsyms]  [k] queued_spin_lock_slowpath
>    - submitter_uring_fn
>       - entry_SYSCALL_64
>       - do_syscall_64
>          - __se_sys_io_uring_enter
>             - 70.62% io_submit_sqes
>                  blk_finish_plug
>                  __blk_flush_plug
>                - blk_mq_flush_plug_list
>                   - 69.65% blk_mq_run_hw_queue
>                        blk_mq_sched_dispatch_requests
>                      - __blk_mq_sched_dispatch_requests
>                         + 60.61% dd_dispatch_request
>                         + 8.98% blk_mq_dispatch_rq_list
>                   + 0.98% dd_insert_requests
> 
> which is exactly as expected, we're spending 70% of the CPU cycles
> banging on dd->lock.

Case in point: I spent 10 minutes hacking up some smarts on the insertion
and dispatch side, and with that we get:

IOPS=2.54M, BW=1240MiB/s, IOS/call=32/32

or about a 63% improvement (2.54M vs 1.56M IOPS) when running the
_exact same thing_. Looking at the profile:

-   13.71%  io_uring  [kernel.kallsyms]  [k] queued_spin_lock_slowpath

which cuts the time spent contending on dd->lock from over 70% down to
~14%. No change in data structures, just an ugly hack that:

- Serializes dispatch: there is no point in having someone hammer on
  dd->lock for dispatch when dispatch is already running.
- Serializes insertions: punt to one of N buckets if insertion is already
  busy. The current inserter will notice that someone else did that,
  prune the buckets, and re-run insertion.
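
For illustration only, the two points above can be sketched in userspace
C roughly as follows. This is not the actual hack, which is not included
in this message, and it is not kernel code: struct sched, sched_insert(),
sched_dispatch(), NR_BUCKETS and the "hint" parameter are all
hypothetical names, a pthread mutex stands in for dd->lock, and request
ordering (sorting, deadlines) is not modelled at all.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_BUCKETS 4	/* hypothetical value for the "N" buckets */

struct req {
	struct req *next;
};

struct sched {
	pthread_mutex_t lock;		/* stands in for dd->lock */
	atomic_bool dispatch_busy;	/* a dispatcher is already running */
	atomic_bool insert_busy;	/* an inserter is already running */
	_Atomic(struct req *) bucket[NR_BUCKETS]; /* lock-free side lists */
	struct req *queue;		/* protected by lock */
	int queued;			/* protected by lock */
};

void sched_init(struct sched *s)
{
	pthread_mutex_init(&s->lock, NULL);
	atomic_init(&s->dispatch_busy, false);
	atomic_init(&s->insert_busy, false);
	for (int i = 0; i < NR_BUCKETS; i++)
		atomic_init(&s->bucket[i], (struct req *)NULL);
	s->queue = NULL;
	s->queued = 0;
}

/* Caller holds s->lock: move a whole list onto the main queue. */
static void queue_list_locked(struct sched *s, struct req *list)
{
	while (list) {
		struct req *next = list->next;

		list->next = s->queue;
		s->queue = list;
		s->queued++;
		list = next;
	}
}

static bool buckets_pending(struct sched *s)
{
	for (int i = 0; i < NR_BUCKETS; i++)
		if (atomic_load(&s->bucket[i]))
			return true;
	return false;
}

void sched_insert(struct sched *s, struct req *rq, unsigned int hint)
{
	assert(rq != NULL);
	rq->next = NULL;
	if (atomic_exchange(&s->insert_busy, true)) {
		/* Insertion is busy: punt to a bucket instead of waiting. */
		_Atomic(struct req *) *b = &s->bucket[hint % NR_BUCKETS];

		rq->next = atomic_load(b);
		while (!atomic_compare_exchange_weak(b, &rq->next, rq))
			;
		rq = NULL;
	}
	for (;;) {
		/* rq != NULL here means we already own insert_busy. */
		if (!rq && (!buckets_pending(s) ||
			    atomic_exchange(&s->insert_busy, true)))
			return;
		pthread_mutex_lock(&s->lock);
		queue_list_locked(s, rq);
		/* Prune whatever others punted while insertion was busy. */
		for (int i = 0; i < NR_BUCKETS; i++)
			queue_list_locked(s, atomic_exchange(&s->bucket[i],
							(struct req *)NULL));
		pthread_mutex_unlock(&s->lock);
		atomic_store(&s->insert_busy, false);
		rq = NULL;	/* loop again to catch late punts */
	}
}

struct req *sched_dispatch(struct sched *s)
{
	struct req *rq;

	if (atomic_exchange(&s->dispatch_busy, true))
		return NULL;	/* dispatch already running, don't pile on */
	pthread_mutex_lock(&s->lock);
	rq = s->queue;
	if (rq) {
		s->queue = rq->next;
		s->queued--;
	}
	pthread_mutex_unlock(&s->lock);
	atomic_store(&s->dispatch_busy, false);
	return rq;
}
```

In this sketch at most one inserter and one dispatcher contend for the
lock at any time; everyone else either pushes onto a bucket or backs
off. The re-check of the buckets after clearing insert_busy is what
keeps a late punt from being stranded. Note that a NULL return from
sched_dispatch() is ambiguous here (queue empty, or another dispatcher
is active); a real implementation would need to make sure the queue
gets re-run.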

And while I seriously doubt that my quick hack is 100% foolproof, it
works as a proof of concept. If we can get that kind of reduction with
minimal effort, well...

-- 
Jens Axboe