From: Gabriel Niebler <gniebler@suse.com>
To: Nikolay Borisov <nborisov@suse.com>, linux-btrfs@vger.kernel.org
Cc: dsterba@suse.com
Subject: Re: [PATCH] btrfs: Turn fs_roots_radix in btrfs_fs_info into an XArray
Date: Mon, 2 May 2022 10:59:01 +0200 [thread overview]
Message-ID: <e4ced932-1f39-86fb-c0a4-018c47cf10fa@suse.com> (raw)
In-Reply-To: <a2b7e2c4-440c-318c-ea1f-273be04591f0@suse.com>
Am 28.04.22 um 13:59 schrieb Nikolay Borisov:
> On 27.04.22 г. 0:45 ч., Gabriel Niebler wrote:
>> … rename it to simply fs_roots and adjust all usages of this object to
>> use
>> the XArray API, because it is notionally easier to use and unserstand, as
>> it provides array semantics, and also takes care of locking for us,
>> further simplifying the code.
>>
>> Also do some refactoring, esp. where the API change requires largely
>> rewriting some functions, anyway.
>>
>> Signed-off-by: Gabriel Niebler <gniebler@suse.com>
>> ---
>> fs/btrfs/ctree.h | 5 +-
>> fs/btrfs/disk-io.c | 176 ++++++++++++++++++++---------------------
>> fs/btrfs/inode.c | 13 +--
>> fs/btrfs/transaction.c | 67 +++++++---------
>> 4 files changed, 126 insertions(+), 135 deletions(-)
>>
>
> <snip>
>
>> @@ -2346,28 +2340,23 @@ void btrfs_put_root(struct btrfs_root *root)
>> void btrfs_free_fs_roots(struct btrfs_fs_info *fs_info)
>> {
>> - int ret;
>> - struct btrfs_root *gang[8];
>> - int i;
>> + struct btrfs_root *root;
>> + unsigned long index = 0;
>> while (!list_empty(&fs_info->dead_roots)) {
>> - gang[0] = list_entry(fs_info->dead_roots.next,
>> - struct btrfs_root, root_list);
>> - list_del(&gang[0]->root_list);
>> + root = list_entry(fs_info->dead_roots.next,
>> + struct btrfs_root, root_list);
>> + list_del(&root->root_list);
>> - if (test_bit(BTRFS_ROOT_IN_RADIX, &gang[0]->state))
>> - btrfs_drop_and_free_fs_root(fs_info, gang[0]);
>> - btrfs_put_root(gang[0]);
>> + if (test_bit(BTRFS_ROOT_IN_RADIX, &root->state))
>> + btrfs_drop_and_free_fs_root(fs_info, root);
>> + btrfs_put_root(root);
>> }
>> - while (1) {
>> - ret = radix_tree_gang_lookup(&fs_info->fs_roots_radix,
>> - (void **)gang, 0,
>> - ARRAY_SIZE(gang));
>> - if (!ret)
>> - break;
>> - for (i = 0; i < ret; i++)
>> - btrfs_drop_and_free_fs_root(fs_info, gang[i]);
>> + while (!xa_empty(&fs_info->fs_roots)) > +
>> xa_for_each(&fs_info->fs_roots, index, root) {
>
> Why do you need the nested loop structure? xa_for_each should be
> sufficient as btrfs_free_fs_roots happens when the fs is being unmounted
> and we can't get another loop added after having done xa_for_each. And
> even if it could the nested loop structure won't help because we don't
> hold any locks, so hypothetically speaking, even if there was a race
> using nested loops doesn't solve it.
I wasn't sure if the old code was implemented like this only because of
the way radix_tree_gang_lookup works, or if there was another reason,
like preventing a possible race condition (e.g. when another root gets
inserted), or perhaps as a retry mechanism in case of delays/faillures
in root removal.
> TLDR: Leavint this xa_for_each should be sufficient.
Acknowledged.
>> + btrfs_drop_and_free_fs_root(fs_info, root);
>> + }
>> }
>> }
>
> <snip>
>
>> @@ -4491,12 +4480,12 @@ void btrfs_drop_and_free_fs_root(struct
>> btrfs_fs_info *fs_info,
>> {
>> bool drop_ref = false;
>> - spin_lock(&fs_info->fs_roots_radix_lock);
>> - radix_tree_delete(&fs_info->fs_roots_radix,
>> - (unsigned long)root->root_key.objectid);
>> + spin_lock(&fs_info->fs_roots_lock);
>> + xa_erase(&fs_info->fs_roots,
>> + (unsigned long)root->root_key.objectid);
>
> nit: No need to have the 2nd argument on a new line, as the line is
> short enough.
Oh yeah, sorry... I had used `fs_roots_xarray` originally, then renamed
it before submission. I thought I had checked all the lines for possible
shortening, but it looks like I missed some.
Thanks for noticing!
>> if (test_and_clear_bit(BTRFS_ROOT_IN_RADIX, &root->state))
>> drop_ref = true;
>> - spin_unlock(&fs_info->fs_roots_radix_lock);
>> + spin_unlock(&fs_info->fs_roots_lock);
>> if (BTRFS_FS_ERROR(fs_info)) {
>> ASSERT(root->log_root == NULL);
>> @@ -4512,50 +4501,54 @@ void btrfs_drop_and_free_fs_root(struct
>> btrfs_fs_info *fs_info,
>> int btrfs_cleanup_fs_roots(struct btrfs_fs_info *fs_info)
>> {
>> - u64 root_objectid = 0;
>> - struct btrfs_root *gang[8];
>> - int i = 0;
>> + struct btrfs_root *roots[8];
>> + unsigned long index = 0;
>> + int i;
> nit: This can be defined into the 2 loops that use the variable.
Sure, why not.
>> int err = 0;
>> - unsigned int ret = 0;
>> + int grabbed;
>> while (1) {
>> - spin_lock(&fs_info->fs_roots_radix_lock);
>> - ret = radix_tree_gang_lookup(&fs_info->fs_roots_radix,
>> - (void **)gang, root_objectid,
>> - ARRAY_SIZE(gang));
>> - if (!ret) {
>> - spin_unlock(&fs_info->fs_roots_radix_lock);
>> + struct btrfs_root *root;
>> +
>> + spin_lock(&fs_info->fs_roots_lock);
>> + if (!xa_find(&fs_info->fs_roots, &index,
>> + ULONG_MAX, XA_PRESENT)) {
>> + spin_unlock(&fs_info->fs_roots_lock);
>> break;
>> }
>> - root_objectid = gang[ret - 1]->root_key.objectid + 1;
>> - for (i = 0; i < ret; i++) {
>> - /* Avoid to grab roots in dead_roots */
>> - if (btrfs_root_refs(&gang[i]->root_item) == 0) {
>> - gang[i] = NULL;
>> - continue;
>> + grabbed = 0;
>> + xa_for_each_start(&fs_info->fs_roots, index, root,
>> + index) {
>> + /* Avoid grabbing roots in dead_roots */
>> + if (btrfs_root_refs(&root->root_item) == 0) {
>> + roots[grabbed] = NULL;
>> + } else {
>> + /* Grab all the search results for later use */
>> + roots[grabbed] = btrfs_grab_root(root);
>> }
>> - /* grab all the search result for later use */
>> - gang[i] = btrfs_grab_root(gang[i]);
>> + grabbed++;
>> + if (grabbed >= ARRAY_SIZE(roots))
>> + break;
>> }
>> - spin_unlock(&fs_info->fs_roots_radix_lock);
>> + spin_unlock(&fs_info->fs_roots_lock);
>> - for (i = 0; i < ret; i++) {
>> - if (!gang[i])
>> + for (i = 0; i < grabbed; i++) {
>> + if (!roots[i])
>> continue;
>> - root_objectid = gang[i]->root_key.objectid;
>> - err = btrfs_orphan_cleanup(gang[i]);
>> + index = roots[i]->root_key.objectid;
>> + err = btrfs_orphan_cleanup(roots[i]);
>> if (err)
>> break;
>> - btrfs_put_root(gang[i]);
>> + btrfs_put_root(roots[i]);
>> }
>> - root_objectid++;
>> + index++;
>> }
>> - /* release the uncleaned roots due to error */
>> - for (; i < ret; i++) {
>> - if (gang[i])
>> - btrfs_put_root(gang[i]);
>> + /* Release the roots that remain uncleaned due to error */
>> + for (; i < grabbed; i++) {
>> + if (roots[i])
>> + btrfs_put_root(roots[i]);
>> }
>> return err;
>> }
>> @@ -4872,31 +4865,36 @@ static void btrfs_error_commit_super(struct
>> btrfs_fs_info *fs_info)
>> static void btrfs_drop_all_logs(struct btrfs_fs_info *fs_info)
>> {
>> - struct btrfs_root *gang[8];
>> - u64 root_objectid = 0;
>> - int ret;
>> + unsigned long index = 0;
>> - spin_lock(&fs_info->fs_roots_radix_lock);
>> - while ((ret = radix_tree_gang_lookup(&fs_info->fs_roots_radix,
>> - (void **)gang, root_objectid,
>> - ARRAY_SIZE(gang))) != 0) {
>> + spin_lock(&fs_info->fs_roots_lock);
>> + while (xa_find(&fs_info->fs_roots,
>> + &index, ULONG_MAX, XA_PRESENT)) {
>
> nit: No need to split across two lines as it's short enough.
ACK
>> + struct btrfs_root *root;
>> + struct btrfs_root *roots[8];
>> int i;
>
> nit: This can probably be defined inside the for loop
OK
>> + int grabbed = 0;
>> - for (i = 0; i < ret; i++)
>> - gang[i] = btrfs_grab_root(gang[i]);
>> - spin_unlock(&fs_info->fs_roots_radix_lock);
>> + xa_for_each_start(&fs_info->fs_roots, index, root,
>> + index) {
>> + roots[grabbed] = btrfs_grab_root(root);
>> + grabbed++;
>> + if (grabbed >= ARRAY_SIZE(roots))
>> + break;
>> + }
>> + spin_unlock(&fs_info->fs_roots_lock);
>> - for (i = 0; i < ret; i++) {
>> - if (!gang[i])
>> + for (i = 0; i < grabbed; i++) {
>> + if (!roots[i])
>> continue;
>> - root_objectid = gang[i]->root_key.objectid;
>> - btrfs_free_log(NULL, gang[i]);
>> - btrfs_put_root(gang[i]);
>> + index = roots[i]->root_key.objectid;
>> + btrfs_free_log(NULL, roots[i]);
>> + btrfs_put_root(roots[i]);
>> }
>> - root_objectid++;
>> - spin_lock(&fs_info->fs_roots_radix_lock);
>> + index++;
>> + spin_lock(&fs_info->fs_roots_lock);
>> }
>> - spin_unlock(&fs_info->fs_roots_radix_lock);
>> + spin_unlock(&fs_info->fs_roots_lock);
>> btrfs_free_log_root_tree(NULL, fs_info);
>> }
>> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
>> index 5082b9c70f8c..d0ef3a17ce11 100644
>> --- a/fs/btrfs/inode.c
>> +++ b/fs/btrfs/inode.c
>> @@ -3494,6 +3494,7 @@ int btrfs_orphan_cleanup(struct btrfs_root *root)
>> u64 last_objectid = 0;
>> int ret = 0, nr_unlink = 0;
>> + /* Bail out if the cleanup is already running. */
> nit: This seems like an irrelevant change.
I did stumble a bit over the following line when I read through the
function for the first time, so I added this in for possible future newbies.
I'd like to keep this in, if you don't mind. It's not like the comment
hurts anything.
>> if (test_and_set_bit(BTRFS_ROOT_ORPHAN_CLEANUP, &root->state))
>> return 0;
>
> <snip>
>
>> @@ -1404,9 +1402,8 @@ void btrfs_add_dead_root(struct btrfs_root *root)
>> static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
>> {
>> struct btrfs_fs_info *fs_info = trans->fs_info;
>> - struct btrfs_root *gang[8];
>> - int i;
>> - int ret;
>> + struct btrfs_root *root;
>> + unsigned long index;
>> /*
>> * At this point no one can be using this transaction to modify
>> any tree
>> @@ -1414,17 +1411,11 @@ static noinline int commit_fs_roots(struct
>> btrfs_trans_handle *trans)
>> */
>> ASSERT(trans->transaction->state == TRANS_STATE_COMMIT_DOING);
>> - spin_lock(&fs_info->fs_roots_radix_lock);
>> - while (1) {
>> - ret = radix_tree_gang_lookup_tag(&fs_info->fs_roots_radix,
>> - (void **)gang, 0,
>> - ARRAY_SIZE(gang),
>> - BTRFS_ROOT_TRANS_TAG);
>> - if (ret == 0)
>> - break;
>> - for (i = 0; i < ret; i++) {
>> - struct btrfs_root *root = gang[i];
>> - int ret2;
>> + spin_lock(&fs_info->fs_roots_lock);
>> + while (xa_marked(&fs_info->fs_roots, BTRFS_ROOT_TRANS_TAG)) {
>> + xa_for_each_marked(&fs_info->fs_roots, index, root,
>
> While the while/xa_for_each_marked and not straight xa_for_each_marked?
Like before, this just mimicks the old logic, because I wasn't sure
whether the double loop is only due to the bunched nature of gang lookup
or whether it has some additional safeguarding function.
I'm fine with simplifying it to xa_for_each_marked if you say it's safe.
Waiting for further feedback whether to resubmit or not.
next prev parent reply other threads:[~2022-05-02 8:59 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-04-26 21:45 [PATCH] btrfs: Turn fs_roots_radix in btrfs_fs_info into an XArray Gabriel Niebler
2022-04-27 14:51 ` Gabriel Niebler
2022-04-27 19:21 ` David Sterba
2022-04-28 11:37 ` Nikolay Borisov
2022-04-27 20:07 ` David Sterba
2022-04-28 11:59 ` Nikolay Borisov
2022-05-02 8:59 ` Gabriel Niebler [this message]
2022-05-02 18:39 ` David Sterba
2022-05-03 8:01 ` Gabriel Niebler
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e4ced932-1f39-86fb-c0a4-018c47cf10fa@suse.com \
--to=gniebler@suse.com \
--cc=dsterba@suse.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=nborisov@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox