From: Tejun Heo <tj@kernel.org>
To: Shaohua Li <shli@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>,
linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH 1/2] blk-throtl: make latency= absolute
Date: Mon, 13 Nov 2017 03:27:10 -0800 [thread overview]
Message-ID: <20171113112710.GG983427@devbig577.frc2.facebook.com> (raw)
In-Reply-To: <20171113042940.d6lcarmnakuctinn@kernel.org>
Hello, Shoahua.
On Sun, Nov 12, 2017 at 08:29:40PM -0800, Shaohua Li wrote:
> Didn't get this. What did you mean 'queueing time on the host side'? You mean
> the application think time delay?
>
> My point is absolute latency doen't protect as we expected. Let me have an
> example. Say 4k latency is 60us, BW is 100MB/s. When 4k BW is 50MB/s, the
> latency is 200us. 1M latency is 500us. If you set the absolute latency to
> 600us, you can't protect the 4k BW to above 50MB/s. To do the protection, you
> really want to set the absolute latency below 500us, which doesn't work for the
> 1M IO.
What I'm trying to say is that the latency is defined as "from bio
issue to completion", not "in-flight time on device". Whether the
on-device latency is 50us or 500us, the host side queueing latency can
be in orders of magnitude higher.
For things like starvation protection for managerial workloads which
work fine on rotating disks, the only thing we need to protect against
is excessive host side queue overflowing leading to starvation of such
workloads. IOW, we're talking about latency target in tens or lower
hundreds of millisecs. Whether the on-device time is 50 or 500us
doesn't matter that much.
> We don't overload the meaning of "N". Untill your next patch, the "N" actually
> means "+N".
>
> Ponder a little bit, I think 4ms base latency for HD actually is reasonable. We
> have LATENCY_FILTERED_HD to filter out small latency bios, which come from
> sequential IO. So remaining IO is random IO. 4k base latency for HD random IO
> should be ok. Probably something else is wrong. I think we need understand
> what's wrong for HD throttling first before we make any change.
So, even purely from user-interface perspective, I think it can be
very confusing to use "N" to mean "base + N". Explicitly saying
what's going on through "+N" or "N%" is a lot more straight-forward.
I mean, we can decide to change the config syntax but not support abs
targets but I think abs targets are useful and it's not like this adds
significant overhead / complexity.
As for why it isn't working well for disks, I think some of it is
coming from buffering behaviors and not handling merges properly.
Write latencies aren't evenly spread across commands. Most of them
really fast and then one of them or the flush take a really long time.
Filtering based on LATENCY_FILTERED_HD simply ignores those fast
completions which means that the eventual latency spike is attributed
arbitrarily, which doesn't really work.
The other part is that blk-throtl was seeing wildly different IO
numbers than the underlying device does when there are a lot of
merges, which still happens. This means that iops limits were just
badly broken.
Thanks.
--
tejun
next prev parent reply other threads:[~2017-11-13 11:27 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-11-09 22:19 [PATCH 1/2] blk-throtl: make latency= absolute Tejun Heo
2017-11-09 22:20 ` [PATCH 2/2] blk-throtl: add relative percentage support to latency= Tejun Heo
2017-11-14 22:06 ` Shaohua Li
2017-11-09 23:12 ` [PATCH 1/2] blk-throtl: make latency= absolute Shaohua Li
2017-11-09 23:42 ` Tejun Heo
2017-11-10 4:27 ` Shaohua Li
2017-11-10 15:43 ` Tejun Heo
2017-11-13 4:29 ` Shaohua Li
2017-11-13 11:27 ` Tejun Heo [this message]
2017-11-13 14:18 ` Tejun Heo
2017-11-13 22:08 ` Shaohua Li
2017-11-14 14:52 ` Tejun Heo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171113112710.GG983427@devbig577.frc2.facebook.com \
--to=tj@kernel.org \
--cc=axboe@kernel.dk \
--cc=kernel-team@fb.com \
--cc=linux-kernel@vger.kernel.org \
--cc=shli@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox