From: David Wang <00107082@163.com>
To: kent.overstreet@linux.dev
Cc: linux-bcachefs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [BUG?] bcachefs performance: read is way too slow when a file has no overwrite.
Date: Thu, 12 Sep 2024 15:52:46 +0800
Message-ID: <20240912075246.5810-1-00107082@163.com>
In-Reply-To: <f69544.2e70.191e419e656.Coremail.00107082@163.com>
Hi,
> I did some debugging. When performance is bad, the conditions
> bvec_iter_sectors(iter) != pick.crc.uncompressed_size and
> bvec_iter_sectors(iter) != pick.crc.live_size are "almost" always both true,
> while when performance is good (after a "thorough" write), they are true only
> for a small fraction of reads (~350 out of 1000000).
>
> And if those conditions are true, "bounce" is set and the code seems to take
> a time-consuming path.
>
> I suspect that merely reading can never change those conditions, but a write can?
>
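The quoted condition can be restated as a small model. This is only a Python sketch of the observation, not the kernel code: `needs_bounce` and its parameters are hypothetical names, and treating either mismatch as sufficient to set "bounce" is an assumption.

```python
SECTOR_BYTES = 512  # 128 sectors = 64K in this message, so one sector is 512 bytes


def needs_bounce(read_bytes, crc_uncompressed_bytes, crc_live_bytes):
    """Model of the quoted observation: a read whose size differs from the
    checksummed extent takes the slow "bounce" path.

    Assumption: either mismatch is enough; the real condition may differ.
    """
    read_sectors = read_bytes // SECTOR_BYTES
    return (read_sectors != crc_uncompressed_bytes // SECTOR_BYTES
            or read_sectors != crc_live_bytes // SECTOR_BYTES)


# A 4K read against a 64K crc extent versus against a 4K crc extent.
slow_case = needs_bounce(4 * 1024, 64 * 1024, 64 * 1024)
fast_case = needs_bounce(4 * 1024, 4 * 1024, 4 * 1024)
```

Under this model a 4K read bounces whenever the crc extent is larger than 4K, which matches the "almost always true" case above.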
More updates:
1. Without a "thorough" write, no matter what the preparing write size is,
crc.compressed_size is always 128 sectors = 64K?
2. After a "thorough" write with 4K block size, crc.compressed_size mostly decreases to 4K,
with only a few crc.compressed_size values left at 8/12/16/20K...
3. If a 4K thorough write is followed by a 40K thorough write, crc.compressed_size then
increases to 40K, and 4K direct read suffers again....
4. After a 40K thorough write followed by a 256K thorough write, crc.compressed_size only
increases to 64K; I guess 64K is the maximum crc.compressed_size.
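The four observations above fit a simple rule of thumb, sketched here in Python. This is a model of the measurements, not of the bcachefs write path: `observed_crc_bytes` is a hypothetical helper, the 64K cap is the guess from item 4, and the few leftover 8/12/16/20K values from item 2 are ignored.

```python
MAX_CRC_BYTES = 64 * 1024  # guessed maximum crc.compressed_size (item 4)


def observed_crc_bytes(thorough_write_sizes):
    """Return the dominant crc.compressed_size after a sequence of
    "thorough" writes, according to the observations in this message.

    A freshly prepared file starts at the cap (item 1); each thorough
    write then resets the size to min(write size, cap).
    """
    size = MAX_CRC_BYTES
    for write_bytes in thorough_write_sizes:
        size = min(write_bytes, MAX_CRC_BYTES)
    return size


fresh = observed_crc_bytes([])                        # item 1
after_4k = observed_crc_bytes([4 * 1024])             # item 2
after_40k = observed_crc_bytes([4 * 1024, 40 * 1024])  # item 3
after_256k = observed_crc_bytes([40 * 1024, 256 * 1024])  # item 4
```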
So I think the current conclusion is:
1. The initial crc.compressed_size is always 64K when the file is created/prepared.
2. Subsequent writes can change the crc size based on the write size. (Optimized for write?)
3. Direct read performance is sensitive to this crc size. More test results:
+-----------+--------+----------+
| rand read | IOPS   | BW       |
+-----------+--------+----------+
| 4K !E     | 24.7K  | 101MB/s  |
| 16K !E    | 24.7K  | 404MB/s  |
| 64K !E    | 24.7K  | 1617MB/s |
| 4K E      | ~220K  | ~900MB/s |
| 16K E     | ~55K   | ~900MB/s |
| 64K E     | ~13.8K | ~900MB/s |
+-----------+--------+----------+
E stands for the event that a "thorough" 4K write happened before the test.
Or, to be more specific:
E: lots of random 4K writes, crc.compressed_size = 4K
!E: file was just created, crc.compressed_size = 64K
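The table numbers can be cross-checked with a little arithmetic. The Python sketch below recomputes BW as IOPS times request size, and estimates the bytes actually fetched from disk under the assumption (not verified here) that each request reads at least one whole crc extent. The helper names are hypothetical, and MB is taken as 10^6 bytes.

```python
KIB = 1024


def bandwidth_mb_s(iops, request_bytes):
    """Application-visible bandwidth: IOPS times request size, in MB/s."""
    return iops * request_bytes / 1e6


def device_read_mb_s(iops, request_bytes, crc_bytes):
    """Estimated bytes read from disk per second, in MB/s.

    Assumption: a request smaller than the crc extent still reads the
    whole extent so that the checksum can be verified.
    """
    return iops * max(request_bytes, crc_bytes) / 1e6


# !E rows: 24.7K IOPS at every request size, crc.compressed_size = 64K.
visible_4k = bandwidth_mb_s(24_700, 4 * KIB)
visible_16k = bandwidth_mb_s(24_700, 16 * KIB)
device_4k = device_read_mb_s(24_700, 4 * KIB, 64 * KIB)
# E row: ~220K IOPS at 4K, crc.compressed_size = 4K.
device_4k_e = device_read_mb_s(220_000, 4 * KIB, 4 * KIB)
```

Under that assumption every !E row corresponds to roughly 1.6 GB/s read from the device regardless of request size, which would explain why IOPS stays flat at 24.7K.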
The behavior seems reasonable from the write's point of view, but for reads it
does not sound good.... If a read-only mmapped file pages in fewer than
16 pages, the extra data read would waste a lot of disk bandwidth.
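To put a number on the mmap concern, here is a small Python sketch. It assumes a 4K page size, a 64K crc.compressed_size, and that a page-in which falls inside one crc extent reads the whole extent; `wasted_fraction` is a hypothetical helper, not anything from bcachefs.

```python
PAGE_BYTES = 4 * 1024  # assumed page size
CRC_BYTES = 64 * 1024  # crc.compressed_size of a freshly created file


def wasted_fraction(pages, crc_bytes=CRC_BYTES):
    """Fraction of the bytes read from disk that the page-in did not ask for."""
    wanted = pages * PAGE_BYTES
    read = max(wanted, crc_bytes)
    return 1 - wanted / read


single_page = wasted_fraction(1)
four_pages = wasted_fraction(4)
full_extent = wasted_fraction(16)
```

With these assumptions the waste only disappears once a page-in covers all 16 pages of the extent.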
David