Re: [PATCH] blk-mq: modify hybrid sleep time to aggressive

From: Pavel Begunkov <asml.silence@gmail.com>
To: dongjoo seo <commisori28@gmail.com>
Cc: Damien Le Moal <Damien.LeMoal@wdc.com>,
	"hch@infradead.org" <hch@infradead.org>,
	"axboe@kernel.dk" <axboe@kernel.dk>,
	"ming.lei@redhat.com" <ming.lei@redhat.com>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"sbates@raithlin.com" <sbates@raithlin.com>
Subject: Re: [PATCH] blk-mq: modify hybrid sleep time to aggressive
Date: Wed, 18 Nov 2020 14:17:13 +0000
Message-ID: <dd1fa6ef-0730-37d6-d3a8-50a3b98e2e6a@gmail.com>
In-Reply-To: <7F6FFFCB-3FD1-4A7B-8D30-FF4BBAD4AEA4@gmail.com>

On 18/11/2020 10:35, dongjoo seo wrote:
> I agree with your opinion, and your patch is also a good approach.
> How about combining them? An adaptive solution with 3/4.

I couldn't disclose numbers back then, but thanks to the steeply skewed
latency distribution of NAND SSDs, it was actually adjusting the sleep
time automatically to ~3/4 for QD1 and long enough requests (~75+ us).
Also, if "max(sleep_ns, half_mean)" is removed, it was keeping the
time below 1/2 for fast requests (less than ~30us), and that is a
good thing because the half-mean sleep was constantly oversleeping them.
New ultra-low-latency SSDs have come out since then, though.
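
To make the fractions above concrete, here is a minimal sketch of the
arithmetic. It is not code from either patch: hybrid_sleep_ns() and
oversleep_ns() are hypothetical helper names, and only the rounding
"(mean + 1) * num / den" is taken from the quoted diff.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): sleep for num/den of the
 * mean completion time, rounded the same way as the quoted
 * "(mean + 1) / 2" and "(mean + 1) * 3 / 4" expressions.
 */
static uint64_t hybrid_sleep_ns(uint64_t mean_ns, unsigned int num,
				unsigned int den)
{
	assert(den != 0 && num <= den);
	return (mean_ns + 1) * num / den;
}

/*
 * Hypothetical helper: how long the poller slept past the point where
 * the request had already completed (0 if it woke up in time).
 */
static uint64_t oversleep_ns(uint64_t sleep_ns, uint64_t actual_ns)
{
	return sleep_ns > actual_ns ? sleep_ns - actual_ns : 0;
}
```

With a 20000 ns mean, hybrid_sleep_ns(20000, 1, 2) gives 10000 and
hybrid_sleep_ns(20000, 3, 4) gives 15000. If a particular request finishes
in 12000 ns, the first sleep wakes up in time, while the second oversleeps
by oversleep_ns(15000, 12000) = 3000 ns.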

The real problem is to find anyone who actually uses it, otherwise
it's just a chunk of dead code. Do you? Anyone? I remember once it
was completely broken for months, but that was barely noticed.


> Because if we get intensive workloads, then we need to
> decrease the overall cpu utilization even with [1].
> 
> [1] https://lkml.org/lkml/2019/4/30/117
> 
>> On Nov 18, 2020, at 6:26 PM, Pavel Begunkov <asml.silence@gmail.com> wrote:
>>
>> On 18/11/2020 07:16, Damien Le Moal wrote:
>>> On 2020/11/18 16:07, Christoph Hellwig wrote:
>>>> Adding Damien who wrote this code.
>>>
>>> Nope. It wasn't me. I think it was Stephen Bates:
>>>
>>> commit 720b8ccc4500 ("blk-mq: Add a polling specific stats function")
>>>
>>> So +Stephen.
>>>>
>>>> On Wed, Nov 18, 2020 at 09:47:46AM +0900, Dongjoo Seo wrote:
>>>>> The current sleep time for hybrid polling is half of the mean time.
>>>>> The 'half' sleep time is good for reducing the cpu utilization,
>>>>> but the problem is that its cpu utilization is still high.
>>>>> This patch can help to reduce the cpu utilization further.
>>
>> This won't work well. When I was experimenting I saw that half the mean
>> is actually too much for fast enough requests, like <20us 4K writes;
>> it's oversleeping them. Even more, I'm afraid of getting into a vicious
>> cycle, where oversleeping increases the statistical mean, which increases
>> the sleep time, which again increases the stat mean, and so on. That's what
>> happened for me when the scheme was too aggressive.
>>
>> I actually once sent patches [1] for automatic dynamic sleep time
>> adjustment, but nobody cared.
>>
>> [1] https://lkml.org/lkml/2019/4/30/117
>>
>>>>>
>>>>> Below are my two test hardware setups.
>>>>>
>>>>> 1. Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz + Samsung 970 pro 1Tb
>>>>> 2. Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz + INTEL SSDPED1D480GA 480G
>>>>>
>>>>>    | Classic Polling    | Hybrid Polling     | This Patch
>>>>>    | cpu util | IOPS(k) | cpu util | IOPS(k) | cpu util | IOPS(k)
>>>>> ---+----------+---------+----------+---------+----------+--------
>>>>> 1. |  99.96   |   491   |  56.98   |   467   |  35.98   |   442
>>>>> 2. |  99.94   |   582   |  56.3    |   582   |  35.28   |   582
>>>>>
>>>>> cpu util means the sum of sys and user utilization.
>>>>>
>>>>> I used 4k random reads for this test,
>>>>> because that is the worst case for I/O performance.
>>>>> Below is my fio setup.
>>>>>
>>>>> name=pollTest
>>>>> ioengine=pvsync2
>>>>> hipri
>>>>> direct=1
>>>>> size=100%
>>>>> randrepeat=0
>>>>> time_based
>>>>> ramp_time=0
>>>>> norandommap
>>>>> refill_buffers
>>>>> log_avg_msec=1000
>>>>> log_max_value=1
>>>>> group_reporting
>>>>> filename=/dev/nvme0n1
>>>>> [rd_rnd_qd_1_4k_1w]
>>>>> bs=4k
>>>>> iodepth=32
>>>>> numjobs=[num of cpus]
>>>>> rw=randread
>>>>> runtime=60
>>>>> write_bw_log=bw_rd_rnd_qd_1_4k_1w
>>>>> write_iops_log=iops_rd_rnd_qd_1_4k_1w
>>>>> write_lat_log=lat_rd_rnd_qd_1_4k_1w
>>>>>
>>>>> Thanks
>>>>>
>>>>> Signed-off-by: Dongjoo Seo <commisori28@gmail.com>
>>>>> ---
>>>>> block/blk-mq.c | 3 +--
>>>>> 1 file changed, 1 insertion(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>>>>> index 1b25ec2fe9be..c3d578416899 100644
>>>>> --- a/block/blk-mq.c
>>>>> +++ b/block/blk-mq.c
>>>>> @@ -3749,8 +3749,7 @@ static unsigned long blk_mq_poll_nsecs(struct request_queue *q,
>>>>> 		return ret;
>>>>>
>>>>> 	if (q->poll_stat[bucket].nr_samples)
>>>>> -		ret = (q->poll_stat[bucket].mean + 1) / 2;
>>>>> -
>>>>> +		ret = (q->poll_stat[bucket].mean + 1) * 3 / 4;
>>>>> 	return ret;
>>>>> }
>>>>>
>>>>> -- 
>>>>> 2.17.1
>>>>>
>>>> ---end quoted text---
>>>>
>>>
>>>
>>
>> -- 
>> Pavel Begunkov
> 
> 

-- 
Pavel Begunkov
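
The feedback loop described in the quoted 9:26 reply (oversleeping raises
the measured mean, which raises the next sleep time) can be illustrated
with a toy model. This is only a sketch under loud assumptions, not kernel
code: simulate_mean_ns() is a hypothetical name, every request is assumed
to take exactly device_ns on the device, the sleeper is assumed to notice
a completion no earlier than wake_ns after its sleep ends, and the "mean"
simply tracks the latest observed completion time.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model (assumption, not the blk-mq implementation): each round the
 * poller sleeps num/den of the current mean, then observes the request
 * as complete at max(device_ns, sleep + wake_ns). That observation
 * becomes the new mean that feeds the next round.
 */
static uint64_t simulate_mean_ns(uint64_t device_ns, uint64_t wake_ns,
				 unsigned int num, unsigned int den,
				 unsigned int rounds)
{
	uint64_t mean = device_ns;
	unsigned int i;

	assert(den != 0 && num < den);

	for (i = 0; i < rounds; i++) {
		uint64_t sleep = mean * num / den;
		uint64_t seen = sleep + wake_ns;

		/* A request can never be seen before the device finishes. */
		if (seen < device_ns)
			seen = device_ns;
		mean = seen;
	}
	return mean;
}
```

In this model, with device_ns = 15000 and wake_ns = 5000, a 1/2 fraction
leaves the mean at 15000, because the sleeper always wakes before the
device is done. A 3/4 fraction oversleeps on the first round and the mean
then climbs toward wake_ns / (1 - 3/4) = 20000. The inflation is bounded
here only because the fraction is below 1 and the latencies are constant;
real devices and real wakeup costs will behave differently.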
