From: Chaitanya Kulkarni <chaitanyak@nvidia.com>
To: Gabriel Krisman Bertazi <krisman@suse.de>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	Hugh Dickins <hughd@google.com>, Keith Busch <kbusch@kernel.org>,
	"axboe@kernel.dk" <axboe@kernel.dk>,
	Liu Song <liusong@linux.alibaba.com>, Jan Kara <jack@suse.cz>
Subject: Re: [PATCH] sbitmap: Use single per-bitmap counting to wake up queued tags
Date: Wed, 9 Nov 2022 03:35:08 +0000	[thread overview]
Message-ID: <1e9b8ff5-76ba-a8de-e8b9-bbdd07ebede8@nvidia.com>
In-Reply-To: <871qqcg77l.fsf@suse.de>

On 11/8/22 19:03, Gabriel Krisman Bertazi wrote:
> Chaitanya Kulkarni <chaitanyak@nvidia.com> writes:
> 
>>> For more interesting cases, where there is queueing, we need to take
>>> into account the cross-communication of the atomic operations.  I've
>>> been benchmarking by running parallel fio jobs against a single hctx
>>> nullb in different hardware queue depth scenarios, and verifying both
>>> IOPS and queueing.
>>>
>>> Each experiment was repeated 5 times on a 20-CPU box, with 20 parallel
>>> jobs. fio was issuing fixed-size randwrites with qd=64 against nullb,
>>> varying only the hardware queue length per test.
>>>
>>> IOPS, mean (std dev), per hardware queue size:
>>>
>>> queue size   2                 4                 8                 16                 32                 64
>>> 6.1-rc2      1681.1K (1.6K)    2633.0K (12.7K)   6940.8K (16.3K)   8172.3K (617.5K)   8391.7K (367.1K)   8606.1K (351.2K)
>>> patched      1721.8K (15.1K)   3016.7K (3.8K)    7543.0K (89.4K)   8132.5K (303.4K)   8324.2K (230.6K)   8401.8K (284.7K)
>>
>>>
> 
> Hi Chaitanya,
> 
> Thanks for the feedback.
> 
>> So if I understand correctly,
>> QD 2, 4, and 8 show a clear performance benefit from this patch,
>> whereas QD 16, 32, and 64 show a drop in performance; is that correct?
>>
>> If my observation is correct, will applications with a high QD
>> observe a drop in performance?
> 
> To be honest, I'm not sure.  Given how the standard deviations (in
> parentheses) compare to the differences between the means, I'm not sure
> the observed drop is statistically significant: at QD=64, for instance,
> the means differ by ~204K while the standard deviations are 351.2K and
> 284.7K.  In my prior analysis, I thought it wasn't.
> 
> I don't see where a significant difference would come from, because the
> higher the QD, the more likely it is to go through the uncontended
> path, where sbq->ws_active == 0.  This hot path is identical to the
> existing implementation.
> 
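
Right, and assuming I'm reading lib/sbitmap.c correctly, that path
short-circuits before touching any wait queue. Roughly (a paraphrased
sketch of my reading, not the exact patched code):

	void sbitmap_queue_wake_up(struct sbitmap_queue *sbq, int nr)
	{
		/* Hot path: no waiters queued, nothing to wake. */
		if (!atomic_read(&sbq->ws_active))
			return;

		/*
		 * Slow path: account the freed tags and wake up the
		 * next wait queue once a full wake_batch has completed.
		 */
		...
	}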

The numbers were taken on null_blk; the drop I see here may end up
being different on real H/W, and I cannot comment on that since we
don't have that data ...

Did you repeat the experiment on real H/W, e.g. an NVMe SSD?
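
For reference, something along these lines should be close to the
setup described above (the null_blk parameters and fio options here
are my guesses, not the exact job used), sweeping hw_queue_depth over
2/4/8/16/32/64 and swapping /dev/nullb0 for an NVMe namespace on real
H/W:

	modprobe null_blk queue_mode=2 submit_queues=1 hw_queue_depth=2

	fio --name=tagtest --filename=/dev/nullb0 --ioengine=libaio \
	    --direct=1 --rw=randwrite --bs=512 --iodepth=64 \
	    --numjobs=20 --group_reporting --time_based --runtime=30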

>> Also, please share a table with block size/IOPS/BW/CPU (system/user)
>> /LAT/SLAT with % increase/decrease, and document the raw numbers at
>> the end of the cover letter for completeness, along with the fio job,
>> so that others can repeat the experiment...
> 
> This was issued against nullb with a fixed IO size matching the
> device's block size (512b), which is why I am tracking only IOPS and
> not BW.  I'm not sure BW is still relevant in this scenario.
> 
> I'll definitely follow up with CPU time and latencies, and share the
> fio job.  I'll also take another look on the significance of the
> measured values for high QD.
> 

Yes, please. If the CPU usage is way higher, then we need to know that
the above numbers come at the cost of higher CPU; in that case,
IOPS-per-core and BW-per-core metrics can be very useful.
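
(For example, from fio's "cpu : usr=..., sys=..." output line,
something like IOPS / (nr_cpus * (usr + sys) / 100) gives a rough
IOPS-per-busy-core figure that would make the two kernels easy to
compare.)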

-ck


Thread overview: 18+ messages
2022-11-05 23:10 [PATCH] sbitmap: Use single per-bitmap counting to wake up queued tags Gabriel Krisman Bertazi
2022-11-08 23:28 ` Chaitanya Kulkarni
2022-11-09  3:03   ` Gabriel Krisman Bertazi
2022-11-09  3:35     ` Chaitanya Kulkarni [this message]
2022-11-09 22:06 ` Jens Axboe
2022-11-09 22:48   ` Gabriel Krisman Bertazi
2022-11-10  3:25     ` Jens Axboe
2022-11-10  9:42 ` Yu Kuai
2022-11-10 11:16   ` Jan Kara
2022-11-10 13:18     ` Yu Kuai
2022-11-10 15:35       ` Jan Kara
2022-11-11  0:59         ` Yu Kuai
2022-11-11 15:38 ` Jens Axboe
2022-11-14 13:23 ` Jan Kara
2022-11-14 14:20   ` [PATCH] sbitmap: Advance the queue index before waking up the queue Gabriel Krisman Bertazi
2022-11-14 14:34     ` Jan Kara
2022-11-15  3:52   ` [PATCH] sbitmap: Use single per-bitmap counting to wake up queued tags Gabriel Krisman Bertazi
2022-11-15 10:24     ` Jan Kara
