From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: dsterba@suse.cz
Cc: Qu Wenruo <wqu@suse.com>, David Sterba <dsterba@suse.com>,
linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 09/12] btrfs: clear defragmented inodes using postorder in btrfs_cleanup_defrag_inodes()
Date: Wed, 28 Aug 2024 09:43:17 +0930
Message-ID: <8cc59c80-444b-4e2a-8d70-40c7b7fd3ff6@gmx.com>
In-Reply-To: <20240828001147.GB25962@twin.jikos.cz>
On 2024/8/28 09:41, David Sterba wrote:
> On Wed, Aug 28, 2024 at 09:18:11AM +0930, Qu Wenruo wrote:
>>
>>
>> On 2024/8/28 08:48, David Sterba wrote:
>>> On Wed, Aug 28, 2024 at 08:29:23AM +0930, Qu Wenruo wrote:
>>>>
>>>>
>>>> On 2024/8/28 07:25, David Sterba wrote:
>>>>> btrfs_cleanup_defrag_inodes() is not called frequently, only in remount
>>>>> or unmount, but the way it frees the inodes in fs_info->defrag_inodes
>>>>> is inefficient. Each time it needs to locate the first node, remove it,
>>>>> and potentially rebalance the tree, until it's done. This allows a
>>>>> conditional reschedule.
>>>>>
>>>>> For cleanups the rbtree_postorder_for_each_entry_safe() iterator is
>>>>> convenient but if the reschedule happens and unlocks fs_info->defrag_inodes_lock
>>>>> we can't be sure that the tree is in the same state. If that happens,
>>>>> restart the iteration from the beginning.
>>>>
>>>> In that case, isn't the rbtree itself in an inconsistent state, and
>>>> restarting will only cause invalid memory access?
>>>>
>>>> So in this particular case, since we can be interrupted, the full tree
>>>> balance looks like the only safe way we can go?
>>>
>>> You're right, the nodes get freed so even if the iteration is restarted
>>> it would touch freed memory, IOW rbtree_postorder_for_each_entry_safe()
>>> can't be interrupted. I can drop the reschedule, with the same argument
>>> that it should be relatively fast even for thousands of entries, this
>>> should not hurt for remount/umount context.
>>>
>>
>> Considering the autodefrag is only triggered for certain writes, and at
>> remount (to RO) or unmount time, there should be no more writes, the
>> solution looks fine.
>
> Ok, thanks. I'll commit the following updated version:
>
> btrfs: clear defragmented inodes using postorder in btrfs_cleanup_defrag_inodes()
>
> btrfs_cleanup_defrag_inodes() is not called frequently, only in remount
> or unmount, but the way it frees the inodes in fs_info->defrag_inodes
> is inefficient. Each time it needs to locate the first node, remove it,
> and potentially rebalance the tree, until it's done. This allows a
> conditional reschedule.
>
> For cleanups the rbtree_postorder_for_each_entry_safe() iterator is
> convenient but we can't reschedule and restart iteration because some of
> the tree nodes would be already freed.
>
> The cleanup operation is kmem_cache_free() which will likely take the
> fast path for most objects so rescheduling should not be necessary.
>
> Signed-off-by: David Sterba <dsterba@suse.com>
>
> --- a/fs/btrfs/defrag.c
> +++ b/fs/btrfs/defrag.c
> @@ -212,19 +212,12 @@ static struct inode_defrag *btrfs_pick_defrag_inode(
>
> void btrfs_cleanup_defrag_inodes(struct btrfs_fs_info *fs_info)
> {
> - struct inode_defrag *defrag;
> - struct rb_node *node;
> + struct inode_defrag *defrag, *next;
>
> spin_lock(&fs_info->defrag_inodes_lock);
> - node = rb_first(&fs_info->defrag_inodes);
> - while (node) {
> - rb_erase(node, &fs_info->defrag_inodes);
> - defrag = rb_entry(node, struct inode_defrag, rb_node);
> + rbtree_postorder_for_each_entry_safe(defrag, next, &fs_info->defrag_inodes,
> + rb_node) {
> kmem_cache_free(btrfs_inode_defrag_cachep, defrag);
> -
> - cond_resched_lock(&fs_info->defrag_inodes_lock);
> -
> - node = rb_first(&fs_info->defrag_inodes);
> }
Since it's just a single kmem_cache_free() line, there is no need for
the braces.
Otherwise the whole series looks good to me.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Thanks,
Qu
> spin_unlock(&fs_info->defrag_inodes_lock);
> }
>
Thread overview: 19+ messages
2024-08-27 21:55 [PATCH 00/12] Renames and defrag cleanups David Sterba
2024-08-27 21:55 ` [PATCH 01/12] btrfs: rename btrfs_submit_bio() to btrfs_submit_bbio() David Sterba
2024-08-27 21:55 ` [PATCH 02/12] btrfs: rename __btrfs_submit_bio() and drop double underscores David Sterba
2024-08-27 21:55 ` [PATCH 03/12] btrfs: rename __extent_writepage() " David Sterba
2024-08-27 21:55 ` [PATCH 04/12] btrfs: rename __compare_inode_defrag() " David Sterba
2024-08-27 21:55 ` [PATCH 05/12] btrfs: constify arguments of compare_inode_defrag() David Sterba
2024-08-27 21:55 ` [PATCH 06/12] btrfs: rename __need_auto_defrag() and drop double underscores David Sterba
2024-08-27 21:55 ` [PATCH 07/12] btrfs: rename __btrfs_add_inode_defrag() " David Sterba
2024-08-27 21:55 ` [PATCH 08/12] btrfs: rename __btrfs_run_defrag_inode() " David Sterba
2024-08-27 21:55 ` [PATCH 09/12] btrfs: clear defragmented inodes using postorder in btrfs_cleanup_defrag_inodes() David Sterba
2024-08-27 22:59 ` Qu Wenruo
2024-08-27 23:18 ` David Sterba
2024-08-27 23:48 ` Qu Wenruo
2024-08-28 0:11 ` David Sterba
2024-08-28 0:13 ` Qu Wenruo [this message]
2024-08-27 21:55 ` [PATCH 10/12] btrfs: return void from btrfs_add_inode_defrag() David Sterba
2024-08-27 21:55 ` [PATCH 11/12] btrfs: drop transaction parameter " David Sterba
2024-08-27 21:55 ` [PATCH 12/12] btrfs: always pass readahead state to defrag David Sterba
2024-08-27 23:02 ` [PATCH 00/12] Renames and defrag cleanups Qu Wenruo