From: Miao Xie <miaox@cn.fujitsu.com>
To: bo.li.liu@oracle.com
Cc: linux-btrfs@vger.kernel.org, alex.btrfs@zadarastorage.com
Subject: Re: [PATCH 06/17] Btrfs: introduce grab/put functions for the root of the fs/file tree
Date: Thu, 16 May 2013 13:34:11 +0800 [thread overview]
Message-ID: <51946FD3.5000008@cn.fujitsu.com> (raw)
In-Reply-To: <20130516051556.GH20202@liubo.jp.oracle.com>
On Thu, 16 May 2013 13:15:57 +0800, Liu Bo wrote:
> On Thu, May 16, 2013 at 12:31:11PM +0800, Miao Xie wrote:
>> On thu, 16 May 2013 11:36:46 +0800, Liu Bo wrote:
>>> On Wed, May 15, 2013 at 03:48:20PM +0800, Miao Xie wrote:
>>>> The grab/put funtions will be used in the next patch, which need grab
>>>> the root object and ensure it is not freed. We use reference counter
>>>> instead of the srcu lock is to aovid blocking the memory reclaim task,
>>>> which invokes synchronize_srcu().
>>>>
>>>
>>> I don't think this is necessary, we put 'kfree(root)' because we really
>>> need to free them at the very end time, when there should be no inodes
>>> linking on the root(we should have cleaned all inodes out from it).
>>>
>>> So when we flush delalloc inodes and wait for ordered extents to finish,
>>> the root should be valid, otherwise someone is doing wrong things.
>>>
>>> And even with this grab_fs_root to avoid freeing root, it's just the
>>> root that remains in memory, all its attributes, like root->node,
>>> commit_root, root->inode_tree, are already NULL or empty.
>>
>> Please consider the case:
>> Task1 Task2 Cleaner
>> get the root
>> flush all delalloc inodes
>> drop subvolume
>> iput the last inode
>> move the root into the dead list
>> drop subvolume
>> kfree(root)
>> If Task1 accesses the root now, oops will happen.
>
> Then it's task1's fault, why it is not protected by subvol_srcu section when
> it's possible that someone like task2 sets root's refs to 0?
>
> synchronize_srcu(subvol_srcu) before adding root into dead root list is
> just for this race case, why do we need another?
Please read my changelog.
Miao
>
> thanks,
> liubo
>
>>
>> I introduced there two functions just to protect the access of the root object, not its
>> attributes, so don't worry about the attributes. (Please see the first sentence of the
>> changelog.)
>>
>> Thanks
>> Miao
>>
>>>
>>> thanks,
>>> liubo
>>>
>>>> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
>>>> ---
>>>> fs/btrfs/ctree.h | 1 +
>>>> fs/btrfs/disk-io.c | 5 +++--
>>>> fs/btrfs/disk-io.h | 21 +++++++++++++++++++++
>>>> fs/btrfs/extent-tree.c | 2 +-
>>>> 4 files changed, 26 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
>>>> index 845b77f..958ce6c 100644
>>>> --- a/fs/btrfs/ctree.h
>>>> +++ b/fs/btrfs/ctree.h
>>>> @@ -1739,6 +1739,7 @@ struct btrfs_root {
>>>> int force_cow;
>>>>
>>>> spinlock_t root_item_lock;
>>>> + atomic_t refs;
>>>> };
>>>>
>>>> struct btrfs_ioctl_defrag_range_args {
>>>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>>>> index 42d6ba2..642c861 100644
>>>> --- a/fs/btrfs/disk-io.c
>>>> +++ b/fs/btrfs/disk-io.c
>>>> @@ -1216,6 +1216,7 @@ static void __setup_root(u32 nodesize, u32 leafsize, u32 sectorsize,
>>>> atomic_set(&root->log_writers, 0);
>>>> atomic_set(&root->log_batch, 0);
>>>> atomic_set(&root->orphan_inodes, 0);
>>>> + atomic_set(&root->refs, 1);
>>>> root->log_transid = 0;
>>>> root->last_log_commit = 0;
>>>> extent_io_tree_init(&root->dirty_log_pages,
>>>> @@ -2049,7 +2050,7 @@ static void del_fs_roots(struct btrfs_fs_info *fs_info)
>>>> } else {
>>>> free_extent_buffer(gang[0]->node);
>>>> free_extent_buffer(gang[0]->commit_root);
>>>> - kfree(gang[0]);
>>>> + btrfs_put_fs_root(gang[0]);
>>>> }
>>>> }
>>>>
>>>> @@ -3415,7 +3416,7 @@ static void free_fs_root(struct btrfs_root *root)
>>>> kfree(root->free_ino_ctl);
>>>> kfree(root->free_ino_pinned);
>>>> kfree(root->name);
>>>> - kfree(root);
>>>> + btrfs_put_fs_root(root);
>>>> }
>>>>
>>>> void btrfs_free_fs_root(struct btrfs_root *root)
>>>> diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
>>>> index 534d583..b71acd6e 100644
>>>> --- a/fs/btrfs/disk-io.h
>>>> +++ b/fs/btrfs/disk-io.h
>>>> @@ -76,6 +76,27 @@ void btrfs_btree_balance_dirty_nodelay(struct btrfs_root *root);
>>>> void btrfs_drop_and_free_fs_root(struct btrfs_fs_info *fs_info,
>>>> struct btrfs_root *root);
>>>> void btrfs_free_fs_root(struct btrfs_root *root);
>>>> +
>>>> +/*
>>>> + * This function is used to grab the root, and avoid it is freed when we
>>>> + * access it. But it doesn't ensure that the tree is not dropped.
>>>> + *
>>>> + * If you want to ensure the whole tree is safe, you should use
>>>> + * fs_info->subvol_srcu
>>>> + */
>>>> +static inline struct btrfs_root *btrfs_grab_fs_root(struct btrfs_root *root)
>>>> +{
>>>> + if (atomic_inc_not_zero(&root->refs))
>>>> + return root;
>>>> + return NULL;
>>>> +}
>>>> +
>>>> +static inline void btrfs_put_fs_root(struct btrfs_root *root)
>>>> +{
>>>> + if (atomic_dec_and_test(&root->refs))
>>>> + kfree(root);
>>>> +}
>>>> +
>>>> void btrfs_mark_buffer_dirty(struct extent_buffer *buf);
>>>> int btrfs_buffer_uptodate(struct extent_buffer *buf, u64 parent_transid,
>>>> int atomic);
>>>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>>>> index 08e42c8..08f9862 100644
>>>> --- a/fs/btrfs/extent-tree.c
>>>> +++ b/fs/btrfs/extent-tree.c
>>>> @@ -7463,7 +7463,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
>>>> } else {
>>>> free_extent_buffer(root->node);
>>>> free_extent_buffer(root->commit_root);
>>>> - kfree(root);
>>>> + btrfs_put_fs_root(root);
>>>> }
>>>> out_end_trans:
>>>> btrfs_end_transaction_throttle(trans, tree_root);
>>>> --
>>>> 1.8.1.4
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>
>>
>
next prev parent reply other threads:[~2013-05-16 5:33 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-05-15 7:48 [PATCH 00/17] improve the block time during the transaction commit Miao Xie
2013-05-15 7:48 ` [PATCH 01/17] Btrfs: fix accessing a freed tree root Miao Xie
2013-05-15 7:48 ` [PATCH 02/17] Btrfs: fix unprotected root node of the subvolume's inode rb-tree Miao Xie
2013-05-15 7:48 ` [PATCH 03/17] Btrfs: pause the space balance when remounting to R/O Miao Xie
2013-05-15 7:48 ` [PATCH 04/17] Btrfs: remove BUG_ON() in btrfs_read_fs_tree_no_radix() Miao Xie
2013-05-15 7:48 ` [PATCH 05/17] Btrfs: cleanup the similar code of the fs root read Miao Xie
2013-05-15 7:48 ` [PATCH 06/17] Btrfs: introduce grab/put functions for the root of the fs/file tree Miao Xie
2013-05-16 3:36 ` Liu Bo
2013-05-16 4:31 ` Miao Xie
2013-05-16 5:15 ` Liu Bo
2013-05-16 5:34 ` Miao Xie [this message]
2013-05-16 6:15 ` Liu Bo
2013-05-16 7:22 ` Miao Xie
2013-05-16 11:54 ` Chris Mason
2013-05-16 14:31 ` Liu Bo
2013-05-16 14:34 ` Chris Mason
2013-05-16 14:57 ` Liu Bo
2013-05-17 1:50 ` Miao Xie
2013-05-16 12:05 ` Liu Bo
2013-05-15 7:48 ` [PATCH 07/17] Btrfs: don't invoke btrfs_invalidate_inodes() in the spin lock context Miao Xie
2013-05-15 7:48 ` [PATCH 08/17] Btrfs: introduce per-subvolume delalloc inode list Miao Xie
2013-05-15 7:48 ` [PATCH 09/17] Btrfs: introduce per-subvolume ordered extent list Miao Xie
2013-05-15 7:48 ` [PATCH 10/17] Btrfs: just flush the delalloc inodes in the source tree before snapshot creation Miao Xie
2013-05-16 3:20 ` Liu Bo
2013-05-16 4:00 ` Miao Xie
2013-05-15 7:48 ` [PATCH 11/17] Btrfs: cleanup unnecessary assignment when cleaning up all the residual transaction Miao Xie
2013-05-15 7:48 ` [PATCH 12/17] Btrfs: remove the code for the impossible case in cleanup_transaction() Miao Xie
2013-05-15 7:48 ` [PATCH 13/17] Btrfs: don't wait for all the writers circularly during the transaction commit Miao Xie
2013-05-15 7:48 ` [PATCH 14/17] Btrfs: don't flush the delalloc inodes in the while loop if flushoncommit is set Miao Xie
2013-05-15 7:48 ` [PATCH 15/17] Btrfs: remove unnecessary varient ->num_joined in btrfs_transaction structure Miao Xie
2013-05-15 7:48 ` [PATCH 16/17] Btrfs: remove the time check in btrfs_commit_transaction() Miao Xie
2013-05-15 7:48 ` [PATCH 17/17] Btrfs: make the state of the transaction more readable Miao Xie
2013-05-17 3:53 ` [PATCH V2 " Miao Xie
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=51946FD3.5000008@cn.fujitsu.com \
--to=miaox@cn.fujitsu.com \
--cc=alex.btrfs@zadarastorage.com \
--cc=bo.li.liu@oracle.com \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).