Re: [BUG?] bcachefs performance: read is way too slow when a file has no overwrite.

From: David Wang <00107082@163.com>
To: kent.overstreet@linux.dev
Cc: 00107082@163.com, linux-bcachefs@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [BUG?] bcachefs performance: read is way too slow when a file has no overwrite.
Date: Sat,  7 Sep 2024 18:34:37 +0800
Message-ID: <20240907103437.71139-1-00107082@163.com>
In-Reply-To: <ka3sjrka6dugdaab2bvewfbonc3ksixumue3hs2juhajhjm37w@bnxvz5mozpgr>

At 2024-09-07 01:38:11, "Kent Overstreet" <kent.overstreet@linux.dev> wrote:
>On Fri, Sep 06, 2024 at 11:43:54PM GMT, David Wang wrote:
>> 
>> Hi,
>> 
>> I notice a very strange performance issue:
>> When run `fio direct randread` test on a fresh new bcachefs, the performance is very bad:
>> 	fio --randrepeat=1 --ioengine=libaio --direct=1 --name=test  --bs=4k --iodepth=64 --size=1G --readwrite=randread  --runtime=600 --numjobs=8 --time_based=1
>> 	...
>> 	Run status group 0 (all jobs):
>> 	   READ: bw=87.0MiB/s (91.2MB/s), 239B/s-14.2MiB/s (239B/s-14.9MB/s), io=1485MiB (1557MB), run=15593-17073msec
>> 
>> But if the files already exist and have already been thoroughly overwritten, the read performance is about 850MB+/s,
>> almost 10-times better!
>> 
>> This means, if I copy some file from somewhere else, and make read access only afterwards, I would get really bad performance.
>> (I copy files from other filesystem, and run fio read test on those files, the performance is indeed bad.)
>> Copy some prepared files, and make readonly usage afterwards, this usage scenario is quite normal for lots of apps, I think.
>
>That's because checksums are at extent granularity, not block: if you're
>doing O_DIRECT reads that are smaller than the writes the data was
>written with, performance will be bad because we have to read the entire
>extent to verify the checksum.


>
>block granular checksums will come at some point, as an optional feature
>(most of the time you don't want them, and you'd prefer more compact
>metadata)

Hi, I ran further tests combining different write and read sizes; the results
do not confirm the explanation given for O_DIRECT.

Without O_DIRECT (fio  --direct=0....), the average read bandwidth
improves, but with a very large standard deviation:
+--------------------+----------+----------+----------+----------+
| prepare-write\read |    1K    |    4K    |    8K    |   16K    |
+--------------------+----------+----------+----------+----------+
|         1K         | 328MiB/s | 395MiB/s | 465MiB/s |          |
|         4K         | 193MiB/s | 219MiB/s | 274MiB/s | 392MiB/s |
|         8K         | 251MiB/s | 280MiB/s | 368MiB/s | 435MiB/s |
|        16K         | 302MiB/s | 380MiB/s | 464MiB/s | 577MiB/s |
+--------------------+----------+----------+----------+----------+
(Rows are the write size used when preparing the test files; columns are the read size used in the fio test.)

And with O_DIRECT, the result is:
+--------------------+-----------+-----------+----------+----------+
| prepare-write\read |     1K    |     4K    |    8K    |   16K    |
+--------------------+-----------+-----------+----------+----------+
|         1K         | 24.1MiB/s | 96.5MiB/s | 193MiB/s |          |
|         4K         | 14.4MiB/s | 57.6MiB/s | 116MiB/s | 230MiB/s |
|         8K         | 24.6MiB/s | 97.6MiB/s | 192MiB/s | 309MiB/s |
|        16K         | 26.4MiB/s |  104MiB/s | 206MiB/s | 402MiB/s |
+--------------------+-----------+-----------+----------+----------+

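code to prepare the test files:
	#define _GNU_SOURCE	/* for O_DIRECT */
	#include <fcntl.h>
	#include <stdio.h>
	#include <unistd.h>

	#define KN 8 //<- adjust this for each row (write size in KiB)
	char name[32];
	/* O_DIRECT needs an aligned buffer */
	char buf[1024*KN] __attribute__((aligned(4096)));
	int main() {
		/* m writes of KN KiB each: 1GiB per file */
		int i, m = 1024*1024/KN, k, fd;
		for (i=0; i<8; i++) {
			sprintf(name, "test.%d.0", i);
			/* O_CREAT requires a mode argument */
			fd = open(name, O_CREAT|O_DIRECT|O_SYNC|O_TRUNC|O_WRONLY, 0644);
			if (fd < 0) { perror(name); return 1; }
			for (k=0; k<m; k++)
				if (write(fd, buf, sizeof(buf)) < 0) { perror("write"); return 1; }
			close(fd);
		}
		return 0;
	}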

Based on the results:
1. The row with prepare-write size 4K stands out.
When files were prepared with write size 4K, the subsequent
read performance is worse than in the other rows. (I did double-check
the result, but it is possible that I missed some factor that affects it.)
2. Without O_DIRECT, read performance seems correlated with the difference
between read size and prepare-write size, but with O_DIRECT the correlation is not obvious.

And, to mention it again, if I overwrite the files **thoroughly** with a fio write test
(using the same block size), the read performance afterwards is very good:

	# overwrite the files with randwrite, block size 8k
	$ fio --randrepeat=1 --ioengine=libaio --direct=1 --name=test  --bs=8k --iodepth=64 --size=1G --readwrite=randwrite  --runtime=300 --numjobs=8 --time_based=1
	# test the read performance with randread, block size 8k
	$ fio --randrepeat=1 --ioengine=libaio --direct=1 --name=test  --bs=8k --iodepth=64 --size=1G --readwrite=randread  --runtime=300 --numjobs=8 --time_based=1
	...
	Run status group 0 (all jobs):
	   READ: bw=964MiB/s (1011MB/s), 116MiB/s-123MiB/s (121MB/s-129MB/s), io=283GiB (303GB), run=300004-300005msec



FYI
David
