From: Gabriel Krisman Bertazi <krisman@suse.de>
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
	Hugh Dickins <hughd@google.com>, Keith Busch <kbusch@kernel.org>,
	Liu Song <liusong@linux.alibaba.com>, Jan Kara <jack@suse.cz>
Subject: Re: [PATCH] sbitmap: Use single per-bitmap counting to wake up queued tags
Date: Wed, 09 Nov 2022 17:48:08 -0500
Message-ID: <87wn83eod3.fsf@suse.de>
In-Reply-To: <cd88f306-1da4-a243-ec23-fea033142fbb@kernel.dk> (Jens Axboe's message of "Wed, 9 Nov 2022 15:06:52 -0700")

Jens Axboe <axboe@kernel.dk> writes:

> On 11/5/22 5:10 PM, Gabriel Krisman Bertazi wrote:
>> Performance-wise, one should expect very similar performance to the
>> original algorithm for the case where there is no queueing.  In both the
>> old algorithm and this implementation, the first thing is to check
>> ws_active, which bails out if there is no queueing to be managed. In the
>> new code, we took care to avoid accounting completions and wakeups when
>> there is no queueing, to not pay the cost of atomic operations
>> unnecessarily, since it doesn't skew the numbers.
>> 
>> For more interesting cases, where there is queueing, we need to take
>> into account the cross-communication of the atomic operations.  I've
>> been benchmarking by running parallel fio jobs against a single hctx
>> nullb in different hardware queue depth scenarios, and verifying both
>> IOPS and queueing.
>> 
>> Each experiment was repeated 5 times on a 20-CPU box, with 20 parallel
>> jobs. fio was issuing fixed-size randwrites with qd=64 against nullb,
>> varying only the hardware queue length per test.
>> 
>> queue size 2                 4                 8                 16                 32                 64
>> 6.1-rc2    1681.1K (1.6K)    2633.0K (12.7K)   6940.8K (16.3K)   8172.3K (617.5K)   8391.7K (367.1K)   8606.1K (351.2K)
>> patched    1721.8K (15.1K)   3016.7K (3.8K)    7543.0K (89.4K)   8132.5K (303.4K)   8324.2K (230.6K)   8401.8K (284.7K)
>> 
>> The following is a similar experiment, ran against a nullb with a single
>> bitmap shared by 20 hctx spread across 2 NUMA nodes. This has 40
>> parallel fio jobs operating on the same device
>> 
>> queue size 2                 4                 8                 16                 32                 64
>> 6.1-rc2    1081.0K (2.3K)    957.2K (1.5K)     1699.1K (5.7K)    6178.2K (124.6K)   12227.9K (37.7K)   13286.6K (92.9K)
>> patched    1081.8K (2.8K)    1316.5K (5.4K)    2364.4K (1.8K)    6151.4K (20.0K)    11893.6K (17.5K)   12385.6K (18.4K)
>
> What's the queue depth of these devices? That's the interesting question
> here, as it'll tell us if any of these are actually hitting the slower
> path where you made changes. 
>

Hi Jens,

The hardware queue depth is a parameter being varied in this experiment.
Each column of the tables has a different queue depth.  Its value is the
first line (queue size) of both tables.  For instance, looking at the
first table, for a device with hardware queue depth=2, 6.1-rc2 gave
1681K IOPS and the patched version gave 1721.8K IOPS.
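
To make the per-column comparison explicit, here is a minimal sketch in
user-space C that computes the relative change of the patched kernel
against 6.1-rc2 for each hardware queue depth.  The numbers are copied
from the first table; the array and function names are made up for
illustration and are not part of the patch.  The values in parentheses
in the table should still be read alongside these deltas, since the
sketch ignores them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hardware queue depths: the "queue size" header row of the first table. */
const int queue_depth[] = { 2, 4, 8, 16, 32, 64 };

/* IOPS in thousands, copied from the first table (single hctx nullb). */
const double base_kiops[]    = { 1681.1, 2633.0, 6940.8, 8172.3, 8391.7, 8606.1 };
const double patched_kiops[] = { 1721.8, 3016.7, 7543.0, 8132.5, 8324.2, 8401.8 };

#define NR_COLUMNS (sizeof(queue_depth) / sizeof(queue_depth[0]))

/* Relative change of the patched result over the baseline, in percent. */
double relative_change_pct(double baseline, double patched)
{
	assert(baseline > 0.0);
	return (patched - baseline) / baseline * 100.0;
}

/* Return the table column for a given queue depth, or -1 if not measured. */
int column_for_depth(int depth)
{
	for (size_t i = 0; i < NR_COLUMNS; i++)
		if (queue_depth[i] == depth)
			return (int)i;
	return -1;
}

/* Print one line per queue depth with both results and the delta. */
void print_table_deltas(void)
{
	for (size_t i = 0; i < NR_COLUMNS; i++)
		printf("QD=%-2d  6.1-rc2=%8.1fK  patched=%8.1fK  delta=%+.2f%%\n",
		       queue_depth[i], base_kiops[i], patched_kiops[i],
		       relative_change_pct(base_kiops[i], patched_kiops[i]));
}
```

Calling print_table_deltas() from a small main() lists every column; for
queue depth 2 the delta comes out at roughly +2.4%.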

As mentioned, I monitored the size of the sbitmap wqs during the
benchmark execution to confirm it was indeed hitting the slow path and
queueing.  Indeed, I observed less queueing on higher QDs (16,32) and
even less for QD=64.  For QD<=8, there was extensive queueing present
throughout the execution.

I should provide the queue size over time alongside the latency numbers.
I have to rerun the benchmarks anyway to collect the information
Chaitanya requested.

> I suspect you are for the second set of numbers, but not for the first
> one?

No, both tables show some level of queueing.  The shared bitmap in
table 2 certainly sees far more intensive queueing, though.

> Anything that isn't hitting the wait path for tags isn't a very useful
> test, as I would not expect any changes there.

Even when there is little to no queueing (QD=64 in this data), we still
enter sbitmap_queue_wake_up and bail out on the first check,
!ws_active.  This is why I think it is important to include QD=64
here.  It is less interesting data, as I mentioned, but it shows no
regressions of the fast path.
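
As an illustration of that early bail-out, here is a minimal user-space
sketch using C11 atomics.  It is an analogy only, not the kernel code:
struct demo_queue, its fields and demo_queue_wake_up() are hypothetical
names, and the real function does considerably more once it gets past
the check.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the queue state; not the kernel structure. */
struct demo_queue {
	atomic_int ws_active;   /* nonzero while some waiter is queued */
	atomic_int completions; /* accounted only while someone is waiting */
};

/*
 * Returns false when it bailed out on the fast path, true when the
 * completion was accounted on the slow path.
 */
bool demo_queue_wake_up(struct demo_queue *q)
{
	/* Fast path: nobody is waiting, so skip the atomic read-modify-write. */
	if (!atomic_load_explicit(&q->ws_active, memory_order_relaxed))
		return false;

	/* Slow path: account the completion.  A real implementation would
	 * decide here whether enough completions accumulated to wake waiters. */
	atomic_fetch_add(&q->completions, 1);
	return true;
}
```

With ws_active at zero the call costs a single load and leaves the
counter untouched, which is the behaviour the QD=64 column is meant to
exercise.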

Thanks,

-- 
Gabriel Krisman Bertazi
