From: "David Wang" <00107082@163.com>
To: "Kent Overstreet" <kent.overstreet@linux.dev>
Cc: linux-bcachefs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [BUG?] bcachefs performance: read is way too slow when a file has no overwrite.
Date: Thu, 12 Sep 2024 10:39:48 +0800 (CST) [thread overview]
Message-ID: <f69544.2e70.191e419e656.Coremail.00107082@163.com> (raw)
In-Reply-To: <ebqvaqme76nrgr2dh7avy7yjwxsgnnybxuybgxejahupgbrqw5@a6d244ghjqis>
Hi,
At 2024-09-09 21:37:35, "Kent Overstreet" <kent.overstreet@linux.dev> wrote:
>On Sat, Sep 07, 2024 at 06:34:37PM GMT, David Wang wrote:
>>
>> Based on the result:
>> 1. The row with prepare-write size 4K stands out, here.
>> When files were prepaired with write size 4K, the afterwards
>> read performance is worse. (I did double check the result,
>> but it is possible that I miss some affecting factors.);
>
>On small blocksize tests you should be looking at IOPS, not MB/s.
>
>Prepare-write size is the column?
Each row is for a specific prepare-write size indicated by first column.
>
>Another factor is that we do merge extents (including checksums); so if
>the preparet-write is done sequentially we won't actually be ending up
>with extents of the same size as what we wrote.
>
>I believe there's a knob somewhere to turn off extent merging (module
>parameter? it's intended for debugging).
I made some debug, when performance is bad, the conditions
bvec_iter_sectors(iter) != pick.crc.uncompressed_size and
bvec_iter_sectors(iter) != pick.crc.live_size are "almost" always both "true",
while when performance is good (after "thorough" write), they are only little
percent (~350 out of 1000000) to be true.
And if those conditions are "true", "bounce" would be set and code seems to run
on a time consuming path.
I suspect "merely read" could never change those conditions, but "write" can?
>
>> 2. Without O_DIRECT, read performance seems correlated with the difference
>> between read size and prepare write size, but with O_DIRECT, correlation is not obvious.
>
>So the O_DIRECT and buffered IO paths are very different (in every
>filesystem) - you're looking at very different things. They are both
>subject to the checksum granularity issue, but in buffered mode we round
>up reads to extent size, when filling into the page cache.
>
>Big standard deviation (high tail latency?) is something we'd want to
>track down. There's a bunch of time_stats in sysfs, but they're mostly
>for the write paths. If you're trying to identify where the latencies
>are coming from, we can look at adding some new time stats to isolate.
next prev parent reply other threads:[~2024-09-12 2:40 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-06 15:43 [BUG?] bcachefs performance: read is way too slow when a file has no overwrite David Wang
2024-09-06 17:38 ` Kent Overstreet
2024-09-07 10:34 ` David Wang
2024-09-09 13:37 ` Kent Overstreet
2024-09-12 2:39 ` David Wang [this message]
2024-09-12 7:52 ` David Wang
2024-09-21 16:02 ` David Wang
2024-09-21 16:12 ` Kent Overstreet
2024-09-22 1:39 ` David Wang
2024-09-22 8:31 ` David Wang
2024-09-22 8:47 ` David Wang
2024-09-24 11:08 ` David Wang
2024-09-24 11:30 ` Kent Overstreet
2024-09-24 12:38 ` David Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f69544.2e70.191e419e656.Coremail.00107082@163.com \
--to=00107082@163.com \
--cc=kent.overstreet@linux.dev \
--cc=linux-bcachefs@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox