public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Gabriel Niebler <gniebler@suse.com>
To: Nikolay Borisov <nborisov@suse.com>, linux-btrfs@vger.kernel.org
Cc: dsterba@suse.com
Subject: Re: [PATCH] btrfs: Turn fs_roots_radix in btrfs_fs_info into an XArray
Date: Mon, 2 May 2022 10:59:01 +0200	[thread overview]
Message-ID: <e4ced932-1f39-86fb-c0a4-018c47cf10fa@suse.com> (raw)
In-Reply-To: <a2b7e2c4-440c-318c-ea1f-273be04591f0@suse.com>

Am 28.04.22 um 13:59 schrieb Nikolay Borisov:
> On 27.04.22 г. 0:45 ч., Gabriel Niebler wrote:
>> … rename it to simply fs_roots and adjust all usages of this object to 
>> use
>> the XArray API, because it is notionally easier to use and unserstand, as
>> it provides array semantics, and also takes care of locking for us,
>> further simplifying the code.
>>
>> Also do some refactoring, esp. where the API change requires largely
>> rewriting some functions, anyway.
>>
>> Signed-off-by: Gabriel Niebler <gniebler@suse.com>
>> ---
>>   fs/btrfs/ctree.h       |   5 +-
>>   fs/btrfs/disk-io.c     | 176 ++++++++++++++++++++---------------------
>>   fs/btrfs/inode.c       |  13 +--
>>   fs/btrfs/transaction.c |  67 +++++++---------
>>   4 files changed, 126 insertions(+), 135 deletions(-)
>>
> 
> <snip>
> 
>> @@ -2346,28 +2340,23 @@ void btrfs_put_root(struct btrfs_root *root)
>>   void btrfs_free_fs_roots(struct btrfs_fs_info *fs_info)
>>   {
>> -    int ret;
>> -    struct btrfs_root *gang[8];
>> -    int i;
>> +    struct btrfs_root *root;
>> +    unsigned long index = 0;
>>       while (!list_empty(&fs_info->dead_roots)) {
>> -        gang[0] = list_entry(fs_info->dead_roots.next,
>> -                     struct btrfs_root, root_list);
>> -        list_del(&gang[0]->root_list);
>> +        root = list_entry(fs_info->dead_roots.next,
>> +                  struct btrfs_root, root_list);
>> +        list_del(&root->root_list);
>> -        if (test_bit(BTRFS_ROOT_IN_RADIX, &gang[0]->state))
>> -            btrfs_drop_and_free_fs_root(fs_info, gang[0]);
>> -        btrfs_put_root(gang[0]);
>> +        if (test_bit(BTRFS_ROOT_IN_RADIX, &root->state))
>> +            btrfs_drop_and_free_fs_root(fs_info, root);
>> +        btrfs_put_root(root);
>>       }
>> -    while (1) {
>> -        ret = radix_tree_gang_lookup(&fs_info->fs_roots_radix,
>> -                         (void **)gang, 0,
>> -                         ARRAY_SIZE(gang));
>> -        if (!ret)
>> -            break;
>> -        for (i = 0; i < ret; i++)
>> -            btrfs_drop_and_free_fs_root(fs_info, gang[i]);
>> +    while (!xa_empty(&fs_info->fs_roots))  > +        
>> xa_for_each(&fs_info->fs_roots, index, root) {
> 
> Why do you need the nested loop structure? xa_for_each should be 
> sufficient as btrfs_free_fs_roots happens when the fs is being unmounted 
> and we can't get another loop added after having done xa_for_each. And 
> even if it could the nested loop structure won't help because we don't 
> hold any locks, so hypothetically speaking, even if there was a race 
> using nested loops doesn't solve it.

I wasn't sure if the old code was implemented like this only because of 
the way radix_tree_gang_lookup works, or if there was another reason, 
like preventing a possible race condition (e.g. when another root gets 
inserted), or perhaps as a retry mechanism in case of delays/faillures 
in root removal.

> TLDR: Leavint this xa_for_each should be sufficient.

Acknowledged.

>> +            btrfs_drop_and_free_fs_root(fs_info, root);
>> +        }
>>       }
>>   }
> 
> <snip>
> 
>> @@ -4491,12 +4480,12 @@ void btrfs_drop_and_free_fs_root(struct 
>> btrfs_fs_info *fs_info,
>>   {
>>       bool drop_ref = false;
>> -    spin_lock(&fs_info->fs_roots_radix_lock);
>> -    radix_tree_delete(&fs_info->fs_roots_radix,
>> -              (unsigned long)root->root_key.objectid);
>> +    spin_lock(&fs_info->fs_roots_lock);
>> +    xa_erase(&fs_info->fs_roots,
>> +         (unsigned long)root->root_key.objectid);
> 
> nit: No need to have the 2nd argument on a new line, as the line is 
> short enough.

Oh yeah, sorry... I had used `fs_roots_xarray` originally, then renamed 
it before submission. I thought I had checked all the lines for possible 
shortening, but it looks like I missed some.

Thanks for noticing!

>>       if (test_and_clear_bit(BTRFS_ROOT_IN_RADIX, &root->state))
>>           drop_ref = true;
>> -    spin_unlock(&fs_info->fs_roots_radix_lock);
>> +    spin_unlock(&fs_info->fs_roots_lock);
>>       if (BTRFS_FS_ERROR(fs_info)) {
>>           ASSERT(root->log_root == NULL);
>> @@ -4512,50 +4501,54 @@ void btrfs_drop_and_free_fs_root(struct 
>> btrfs_fs_info *fs_info,
>>   int btrfs_cleanup_fs_roots(struct btrfs_fs_info *fs_info)
>>   {
>> -    u64 root_objectid = 0;
>> -    struct btrfs_root *gang[8];
>> -    int i = 0;
>> +    struct btrfs_root *roots[8];
>> +    unsigned long index = 0;
>> +    int i;
> nit: This can be defined into the 2 loops that use the variable.

Sure, why not.

>>       int err = 0;
>> -    unsigned int ret = 0;
>> +    int grabbed;
>>       while (1) {
>> -        spin_lock(&fs_info->fs_roots_radix_lock);
>> -        ret = radix_tree_gang_lookup(&fs_info->fs_roots_radix,
>> -                         (void **)gang, root_objectid,
>> -                         ARRAY_SIZE(gang));
>> -        if (!ret) {
>> -            spin_unlock(&fs_info->fs_roots_radix_lock);
>> +        struct btrfs_root *root;
>> +
>> +        spin_lock(&fs_info->fs_roots_lock);
>> +        if (!xa_find(&fs_info->fs_roots, &index,
>> +                 ULONG_MAX, XA_PRESENT)) {
>> +            spin_unlock(&fs_info->fs_roots_lock);
>>               break;
>>           }
>> -        root_objectid = gang[ret - 1]->root_key.objectid + 1;
>> -        for (i = 0; i < ret; i++) {
>> -            /* Avoid to grab roots in dead_roots */
>> -            if (btrfs_root_refs(&gang[i]->root_item) == 0) {
>> -                gang[i] = NULL;
>> -                continue;
>> +        grabbed = 0;
>> +        xa_for_each_start(&fs_info->fs_roots, index, root,
>> +                  index) {
>> +            /* Avoid grabbing roots in dead_roots */
>> +            if (btrfs_root_refs(&root->root_item) == 0) {
>> +                roots[grabbed] = NULL;
>> +            } else {
>> +                /* Grab all the search results for later use */
>> +                roots[grabbed] = btrfs_grab_root(root);
>>               }
>> -            /* grab all the search result for later use */
>> -            gang[i] = btrfs_grab_root(gang[i]);
>> +            grabbed++;
>> +            if (grabbed >= ARRAY_SIZE(roots))
>> +                break;
>>           }
>> -        spin_unlock(&fs_info->fs_roots_radix_lock);
>> +        spin_unlock(&fs_info->fs_roots_lock);
>> -        for (i = 0; i < ret; i++) {
>> -            if (!gang[i])
>> +        for (i = 0; i < grabbed; i++) {
>> +            if (!roots[i])
>>                   continue;
>> -            root_objectid = gang[i]->root_key.objectid;
>> -            err = btrfs_orphan_cleanup(gang[i]);
>> +            index = roots[i]->root_key.objectid;
>> +            err = btrfs_orphan_cleanup(roots[i]);
>>               if (err)
>>                   break;
>> -            btrfs_put_root(gang[i]);
>> +            btrfs_put_root(roots[i]);
>>           }
>> -        root_objectid++;
>> +        index++;
>>       }
>> -    /* release the uncleaned roots due to error */
>> -    for (; i < ret; i++) {
>> -        if (gang[i])
>> -            btrfs_put_root(gang[i]);
>> +    /* Release the roots that remain uncleaned due to error */
>> +    for (; i < grabbed; i++) {
>> +        if (roots[i])
>> +            btrfs_put_root(roots[i]);
>>       }
>>       return err;
>>   }
>> @@ -4872,31 +4865,36 @@ static void btrfs_error_commit_super(struct 
>> btrfs_fs_info *fs_info)
>>   static void btrfs_drop_all_logs(struct btrfs_fs_info *fs_info)
>>   {
>> -    struct btrfs_root *gang[8];
>> -    u64 root_objectid = 0;
>> -    int ret;
>> +    unsigned long index = 0;
>> -    spin_lock(&fs_info->fs_roots_radix_lock);
>> -    while ((ret = radix_tree_gang_lookup(&fs_info->fs_roots_radix,
>> -                         (void **)gang, root_objectid,
>> -                         ARRAY_SIZE(gang))) != 0) {
>> +    spin_lock(&fs_info->fs_roots_lock);
>> +    while (xa_find(&fs_info->fs_roots,
>> +               &index, ULONG_MAX, XA_PRESENT)) {
> 
> nit: No need to split across two lines as it's short enough.

ACK

>> +        struct btrfs_root *root;
>> +        struct btrfs_root *roots[8];
>>           int i;
> 
> nit: This can probably be defined inside the for loop

OK

>> +        int grabbed = 0;
>> -        for (i = 0; i < ret; i++)
>> -            gang[i] = btrfs_grab_root(gang[i]);
>> -        spin_unlock(&fs_info->fs_roots_radix_lock);
>> +        xa_for_each_start(&fs_info->fs_roots, index, root,
>> +                  index) {
>> +            roots[grabbed] = btrfs_grab_root(root);
>> +            grabbed++;
>> +            if (grabbed >= ARRAY_SIZE(roots))
>> +                break;
>> +        }
>> +        spin_unlock(&fs_info->fs_roots_lock);
>> -        for (i = 0; i < ret; i++) {
>> -            if (!gang[i])
>> +        for (i = 0; i < grabbed; i++) {
>> +            if (!roots[i])
>>                   continue;
>> -            root_objectid = gang[i]->root_key.objectid;
>> -            btrfs_free_log(NULL, gang[i]);
>> -            btrfs_put_root(gang[i]);
>> +            index = roots[i]->root_key.objectid;
>> +            btrfs_free_log(NULL, roots[i]);
>> +            btrfs_put_root(roots[i]);
>>           }
>> -        root_objectid++;
>> -        spin_lock(&fs_info->fs_roots_radix_lock);
>> +        index++;
>> +        spin_lock(&fs_info->fs_roots_lock);
>>       }
>> -    spin_unlock(&fs_info->fs_roots_radix_lock);
>> +    spin_unlock(&fs_info->fs_roots_lock);
>>       btrfs_free_log_root_tree(NULL, fs_info);
>>   }
>> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
>> index 5082b9c70f8c..d0ef3a17ce11 100644
>> --- a/fs/btrfs/inode.c
>> +++ b/fs/btrfs/inode.c
>> @@ -3494,6 +3494,7 @@ int btrfs_orphan_cleanup(struct btrfs_root *root)
>>       u64 last_objectid = 0;
>>       int ret = 0, nr_unlink = 0;
>> +    /* Bail out if the cleanup is already running. */
> nit: This seems like an irrelevant change.

I did stumble a bit over the following line when I read through the 
function for the first time, so I added this in for possible future newbies.

I'd like to keep this in, if you don't mind. It's not like the comment 
hurts anything.

>>       if (test_and_set_bit(BTRFS_ROOT_ORPHAN_CLEANUP, &root->state))
>>           return 0;
> 
> <snip>
> 
>> @@ -1404,9 +1402,8 @@ void btrfs_add_dead_root(struct btrfs_root *root)
>>   static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
>>   {
>>       struct btrfs_fs_info *fs_info = trans->fs_info;
>> -    struct btrfs_root *gang[8];
>> -    int i;
>> -    int ret;
>> +    struct btrfs_root *root;
>> +    unsigned long index;
>>       /*
>>        * At this point no one can be using this transaction to modify 
>> any tree
>> @@ -1414,17 +1411,11 @@ static noinline int commit_fs_roots(struct 
>> btrfs_trans_handle *trans)
>>        */
>>       ASSERT(trans->transaction->state == TRANS_STATE_COMMIT_DOING);
>> -    spin_lock(&fs_info->fs_roots_radix_lock);
>> -    while (1) {
>> -        ret = radix_tree_gang_lookup_tag(&fs_info->fs_roots_radix,
>> -                         (void **)gang, 0,
>> -                         ARRAY_SIZE(gang),
>> -                         BTRFS_ROOT_TRANS_TAG);
>> -        if (ret == 0)
>> -            break;
>> -        for (i = 0; i < ret; i++) {
>> -            struct btrfs_root *root = gang[i];
>> -            int ret2;
>> +    spin_lock(&fs_info->fs_roots_lock);
>> +    while (xa_marked(&fs_info->fs_roots, BTRFS_ROOT_TRANS_TAG)) {
>> +        xa_for_each_marked(&fs_info->fs_roots, index, root,
> 
> While the while/xa_for_each_marked and not straight xa_for_each_marked?

Like before, this just mimicks the old logic, because I wasn't sure 
whether the double loop is only due to the bunched nature of gang lookup 
or whether it has some additional safeguarding function.

I'm fine with simplifying it to xa_for_each_marked if you say it's safe.

Waiting for further feedback whether to resubmit or not.

  reply	other threads:[~2022-05-02  8:59 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-26 21:45 [PATCH] btrfs: Turn fs_roots_radix in btrfs_fs_info into an XArray Gabriel Niebler
2022-04-27 14:51 ` Gabriel Niebler
2022-04-27 19:21   ` David Sterba
2022-04-28 11:37   ` Nikolay Borisov
2022-04-27 20:07 ` David Sterba
2022-04-28 11:59 ` Nikolay Borisov
2022-05-02  8:59   ` Gabriel Niebler [this message]
2022-05-02 18:39     ` David Sterba
2022-05-03  8:01     ` Gabriel Niebler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e4ced932-1f39-86fb-c0a4-018c47cf10fa@suse.com \
    --to=gniebler@suse.com \
    --cc=dsterba@suse.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=nborisov@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox