public inbox for linux-btrfs@vger.kernel.org
From: Qu Wenruo <wqu@suse.com>
To: Filipe Manana <fdmanana@kernel.org>, Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH RFC 0/2] btrfs: remove the metadata readahead mechanism
Date: Tue, 7 Dec 2021 20:01:04 +0800	[thread overview]
Message-ID: <a91e60a4-7f5a-43eb-3c10-af2416aade9f@suse.com> (raw)
In-Reply-To: <Ya9L2qSe+XKgtesq@debian9.Home>



On 2021/12/7 19:56, Filipe Manana wrote:
> On Tue, Dec 07, 2021 at 07:43:49PM +0800, Qu Wenruo wrote:
>>
>>
>> On 2021/12/7 19:02, Filipe Manana wrote:
>>> On Tue, Dec 07, 2021 at 03:43:58PM +0800, Qu Wenruo wrote:
>>>> This was originally just my preparation for the scrub refactors, but
>>>> once readahead got involved, it was no longer just a small cleanup.
>>>>
>>>> The metadata readahead code was introduced in 2011 (surprisingly, the
>>>> commit message even contains a changelog), but by now it has only one
>>>> user, and even for that single user the readahead mechanism doesn't
>>>> provide much help in practice.
>>>>
>>>> Scrub needs readahead for the commit root, but the existing code can
>>>> only read ahead the current root.
>>>
>>> If support for the commit root is added, is there a noticeable speedup?
>>> Have you tested that?
>>>
>>
>> Will craft a benchmark for that.
>>
>> However, I don't have any HDD available for benchmarking, so the
>> results would only be from a SATA SSD.
>>
>>>>
>>>> And the code sits at a very bad layer inside btrfs: all metadata live
>>>> in the btrfs logical address space, but the readahead more or less
>>>> works at the device layer (to manage the in-flight readahead).
>>>>
>>>> Personally speaking, I don't think such "optimization" is even really
>>>> needed, since we have better ways, like setting IO priority.
>>>
>>> Have you done any benchmarks?
>>> How? On physical machines or VMs?
>>>
>>> Please include such details in the changelogs.
>>>
>>>>
>>>> I really prefer to let the professional block layer guys do whatever
>>>> they are good at (and in fact, those block layer guys rock!).
>>>> Premature optimization is a source of bugs, and it has already caused
>>>> several bugs recently.
>>>>
>>>> Nowadays we have btrfs_path::reada to do the readahead, so I doubt we
>>>> really need such a facility.
>>>
>>> btrfs_path::reada is important and it makes a difference.
>>> I recently changed send to use it, and benchmarks can be found in the
>>> changelogs.
>>
>> By "such facility" I meant the btrfs_reada_add() facility, not the
>> btrfs_path::reada one.
>>
>>>
>>> There are also other places where it makes a difference, such as when
>>> reading a large chunk tree during mount or when reading a large directory.
>>>
>>> It's all about reading other leaves/nodes in the background that will be
>>> needed in the near future while the task is doing something else. Even if
>>> the nodes/leaves are not physically contiguous on disk (that's the main
>>> reason why the mechanism exists).
>>
>> Unfortunately, not really in the background.
>>
>> For the scrub usage, it kicks off readahead and then waits for it, so
>> it's not really done in the background.
>>
>> (Neither is btrfs_path::reada, though; the path reada also happens at
>> tree search time, and it's synchronous.)
> 
> No, the btrfs_path::reada mechanism is not synchronous - it does not wait
> for the reads (IO) to complete.
> 
> btrfs_readahead_node_child() triggers a read for the extent buffer's
> pages but does not wait for the reads to complete. I.e. we end up calling:
> 
>      read_extent_buffer_pages(eb, WAIT_NONE, 0);
> 
> So it does not wait on the read bios to complete.
> Otherwise that would be pointless.

Oh, forgot the WAIT_NONE part...

Then, to be honest, there is even less reason to keep the
btrfs_reada_add() facility; the path reada is better than it.

> 
>>
>> Another reason the existing btrfs_reada_add() facility is not suitable
>> for scrub is that our default tree block size is way larger than the
>> scrub data length.
>>
>> The current scrub data length is 16 pages (64K), while even a single
>> 16K leaf can hold checksums covering at least 8M of data (CRC32) or
>> 1M (SHA256).
>> This means most of the readahead doesn't make much sense, as scrub
>> won't cross leaf boundaries that frequently.
>>
>> (BTW, in this particular case, btrfs_path::reada may perform better
>> than the start/end based reada, as it would really do some readahead.)
>>
>> Anyway, only a benchmark can prove whether I'm right or wrong.
> 
> Yep, and preferably on a spinning disk and bare metal (no VM).

Unfortunately, that's exactly what I don't have...

My physical machines are either the main ITX desktop I use for
development and VM testing, or laptops without a SATA interface for an
HDD.

Even worse, I don't have any HDD available for testing yet. (The only
HDDs I have are all in use by my NAS.)

I'll go with a VM and a SATA SSD (passed directly through to the VM)
first, then try to get a physical machine (maybe a NUC with a 2.5in
HDD).

Thanks,
Qu

> 
> Thanks.
> 
>>
>> Thanks,
>> Qu
>>>
>>>>
>>>> So here I propose to completely remove the old and under-utilized
>>>> metadata readahead system.
>>>>
>>>> Qu Wenruo (2):
>>>>     btrfs: remove the unnecessary path parameter for scrub_raid56_parity()
>>>>     btrfs: remove reada mechanism
>>>>
>>>>    fs/btrfs/Makefile      |    2 +-
>>>>    fs/btrfs/ctree.h       |   25 -
>>>>    fs/btrfs/dev-replace.c |    5 -
>>>>    fs/btrfs/disk-io.c     |   20 +-
>>>>    fs/btrfs/extent_io.c   |    3 -
>>>>    fs/btrfs/reada.c       | 1086 ----------------------------------------
>>>>    fs/btrfs/scrub.c       |   64 +--
>>>>    fs/btrfs/super.c       |    1 -
>>>>    fs/btrfs/volumes.c     |    7 -
>>>>    fs/btrfs/volumes.h     |    7 -
>>>>    10 files changed, 17 insertions(+), 1203 deletions(-)
>>>>    delete mode 100644 fs/btrfs/reada.c
>>>>
>>>> --
>>>> 2.34.1
>>>>
> 

