From: Jens Axboe <axboe@kernel.dk>
To: Bart Van Assche <bvanassche@acm.org>,
Damien Le Moal <dlemoal@kernel.org>,
"lsf-pc@lists.linux-foundation.org"
<lsf-pc@lists.linux-foundation.org>
Cc: "linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
Christoph Hellwig <hch@lst.de>
Subject: Re: [LSF/MM/BPF TOPIC] Improving Zoned Storage Support
Date: Wed, 17 Jan 2024 14:14:42 -0700
Message-ID: <e8c32676-114b-4aaf-8753-5a6d7b04fc4b@kernel.dk>
In-Reply-To: <207a985d-ad4e-4cad-ac07-961633967bfc@kernel.dk>
On 1/17/24 2:02 PM, Jens Axboe wrote:
> On 1/17/24 1:20 PM, Jens Axboe wrote:
>> On 1/17/24 1:18 PM, Bart Van Assche wrote:
>>> On 1/17/24 12:06, Jens Axboe wrote:
>>>> Case in point, I spent 10 min hacking up some smarts on the insertion
>>>> and dispatch side, and then we get:
>>>>
>>>> IOPS=2.54M, BW=1240MiB/s, IOS/call=32/32
>>>>
>>>> or about a 63% improvement when running the _exact same thing_. Looking
>>>> at profiles:
>>>>
>>>> - 13.71% io_uring [kernel.kallsyms] [k] queued_spin_lock_slowpath
>>>>
>>>> reducing the > 70% of locking contention down to ~14%. No change in data
>>>> structures, just an ugly hack that:
>>>>
>>>> - Serializes dispatch, no point having someone hammer on dd->lock for
>>>> dispatch when already running
>>>> - Serialize insertions, punt to one of N buckets if insertion is already
>>>> busy. Current insertion will notice someone else did that, and will
>>>> prune the buckets and re-run insertion.
>>>>
>>>> And while I seriously doubt that my quick hack is 100% fool proof, it
>>>> works as a proof of concept. If we can get that kind of reduction with
>>>> minimal effort, well...
>>>
>>> If nobody else beats me to it then I will look into using separate
>>> locks in the mq-deadline scheduler for insertion and dispatch.
>>
>> That's not going to help by itself, as most of the contention (as I
>> showed in the profile trace in the email) is from dispatch competing
>> with itself, and not necessarily dispatch competing with insertion. And
>> not sure how that would even work, as insert and dispatch are working on
>> the same structures.
>>
>> Do some proper analysis first, then that will show you where the problem
>> is.
>
> Here's a quick'n dirty that brings it from 1.56M to:
>
> IOPS=3.50M, BW=1711MiB/s, IOS/call=32/32
>
> by just doing something stupid - if someone is already dispatching, then
> don't dispatch anything. Clearly shows that this is just dispatch
> contention. But a 160% improvement from looking at the initial profile I

That's 224% of the baseline (3.50M vs 1.56M IOPS), i.e. roughly a 124%
improvement, not sure where the 160% math came from...

Anyway, just replying as I sent out the wrong patch. Here's the one I
tested.

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index f958e79277b8..133ab4a2673b 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -80,6 +80,13 @@ struct dd_per_prio {
 };
 
 struct deadline_data {
+	struct {
+		spinlock_t lock;
+		spinlock_t zone_lock;
+	} ____cacheline_aligned_in_smp;
+
+	unsigned long dispatch_state;
+
 	/*
 	 * run time data
 	 */
@@ -100,9 +107,6 @@ struct deadline_data {
 	int front_merges;
 	u32 async_depth;
 	int prio_aging_expire;
-
-	spinlock_t lock;
-	spinlock_t zone_lock;
 };
 
 /* Maps an I/O priority class to a deadline scheduler priority. */
@@ -600,6 +604,10 @@ static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 	struct request *rq;
 	enum dd_prio prio;
 
+	if (test_bit(0, &dd->dispatch_state) ||
+	    test_and_set_bit(0, &dd->dispatch_state))
+		return NULL;
+
 	spin_lock(&dd->lock);
 	rq = dd_dispatch_prio_aged_requests(dd, now);
 	if (rq)
@@ -616,6 +624,7 @@ static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 	}
 
 unlock:
+	clear_bit(0, &dd->dispatch_state);
 	spin_unlock(&dd->lock);
 
 	return rq;
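
For anyone who wants to poke at the "already dispatching, so back off"
idea outside the kernel, here is a minimal userspace sketch of the same
guard using C11 atomics. It is not the kernel code: struct toy_sched,
toy_dispatch_begin(), toy_dispatch_end() and toy_dispatch_one() are
hypothetical names made up for illustration, and the atomic load plus
fetch-or pair only approximates the test_bit()/test_and_set_bit()
combination used in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct deadline_data. */
struct toy_sched {
	atomic_ulong dispatch_state;	/* bit 0 set: a dispatcher is active */
	int queued;			/* pending "requests" */
};

/* Try to become the single active dispatcher; returns true on success. */
static bool toy_dispatch_begin(struct toy_sched *s)
{
	/* Plain load first, so callers that lose back off without an atomic RMW. */
	if (atomic_load_explicit(&s->dispatch_state, memory_order_relaxed) & 1UL)
		return false;

	/* fetch_or returns the old value; bit 0 already set means we lost the race. */
	if (atomic_fetch_or_explicit(&s->dispatch_state, 1UL,
				     memory_order_acquire) & 1UL)
		return false;

	return true;
}

/* Drop the dispatcher role taken by toy_dispatch_begin(). */
static void toy_dispatch_end(struct toy_sched *s)
{
	unsigned long old = atomic_fetch_and_explicit(&s->dispatch_state, ~1UL,
						      memory_order_release);

	/* Only the current dispatcher may clear the bit. */
	assert(old & 1UL);
	(void)old;
}

/* Returns 1 if a request was dispatched, 0 if none was or someone else is busy. */
static int toy_dispatch_one(struct toy_sched *s)
{
	int got = 0;

	if (!toy_dispatch_begin(s))
		return 0;

	/*
	 * The patch takes dd->lock here, since insertion touches the same
	 * structures. This toy has no insertion path, so the lock is omitted.
	 */
	if (s->queued > 0) {
		s->queued--;
		got = 1;
	}

	toy_dispatch_end(s);
	return got;
}
```

As in the patch, a caller that gets nothing back cannot tell "nothing
queued" apart from "someone else is already dispatching", which is part
of why this is a proof of concept rather than a proper fix.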
--
Jens Axboe