From: Miao Xie <miaox@cn.fujitsu.com>
To: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 2/5] Btrfs: fix missing flush when committing a transaction
Date: Thu, 01 Nov 2012 16:16:43 +0800
Message-ID: <50922FEB.8030900@cn.fujitsu.com>
In-Reply-To: <20121101080420.GA2554@liubo.cn.oracle.com>
On Thu, 1 Nov 2012 16:04:27 +0800, Liu Bo wrote:
> (sorry, forgot to cc linux-btrfs.)
> On Thu, Nov 01, 2012 at 03:51:41PM +0800, Miao Xie wrote:
>> On Thu, 1 Nov 2012 15:44:43 +0800, Liu Bo wrote:
>>> On Thu, Nov 01, 2012 at 03:33:14PM +0800, Miao Xie wrote:
>>>> Consider the following case:
>>>> Task1                           Task2
>>>> start_transaction
>>>>                                 commit_transaction
>>>>                                   check pending snapshots list and the
>>>>                                   list is empty.
>>>> add pending snapshot into list
>>>>                                 skip the delalloc flush
>>>> end_transaction
>>>> ...
>>>>
>>>> As a result, the snapshot ends up different from the source
>>>> subvolume.
>>>>
>>>
>>> This is weird: create_snapshot() first adds the pending snapshot to the
>>> list and then commits the transaction itself, regardless of whether the
>>> snapshot differs from the others.
>>
>> But the transaction may be committed by another task, in which case the
>> snapshot creation task just waits until the commit ends.
>>
>
> It's possible that a commit transaction becomes an end transaction when it
> finds that the transaction is already in commit.
>
> So if snapshot creation starts the transaction, it will increment the
> transaction's num_writers; why doesn't the other task wait for its
> end_transaction?
>
> I doubt if this can really happen anyway...
>
> Can you elaborate on the situation?
Task1                           Task2
start_transaction
                                start_transaction
                                commit_transaction
                                  set in_commit to 1
                                  check pending snapshots list and the list is empty.
add pending snapshot into list
                                  skip the delalloc flush
commit_transaction
  find in_commit is 1
  end_transaction (num_writer--)
  wait_for_commit
                                  num_writer is 1
                                  continue committing the transaction
                                  ...
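
The interleaving above can be replayed step by step in userspace. What follows is only a minimal sketch, not kernel code: struct sim_trans, sim_flush_pending() and sim_replay() are hypothetical names invented for illustration, and the whole transaction is reduced to a writer count, the in_commit flag and a pending snapshot counter. It assumes the delalloc flush happens only when a snapshot is already pending at the moment of the check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for the transaction state. */
struct sim_trans {
	int num_writers;       /* handles currently attached */
	bool in_commit;        /* set by the task that performs the commit */
	int pending_snapshots; /* length of the pending snapshots list */
	bool delalloc_flushed; /* did any flush of delalloc data run? */
};

/* Models the flush decision: flush only if a snapshot is pending. */
static void sim_flush_pending(struct sim_trans *t)
{
	if (t->pending_snapshots > 0)
		t->delalloc_flushed = true;
}

/*
 * Replays the interleaving in a fixed order.  flush_after_wait selects
 * the behaviour proposed by the patch: one more flush after all other
 * writers have ended the transaction.
 */
static bool sim_replay(bool flush_after_wait)
{
	struct sim_trans t = {0};

	t.num_writers++;        /* Task1: start_transaction */
	t.num_writers++;        /* Task2: start_transaction */
	t.in_commit = true;     /* Task2: commit_transaction, set in_commit */
	sim_flush_pending(&t);  /* Task2: list is empty, flush is skipped */
	t.pending_snapshots++;  /* Task1: add pending snapshot into list */

	/* Task1: commit_transaction finds in_commit set, only ends its handle */
	if (t.in_commit)
		t.num_writers--;

	/* Task2: it is the only writer left, so the wait loop exits */
	assert(t.num_writers == 1);

	if (flush_after_wait)
		sim_flush_pending(&t); /* Task2: final flush sees the snapshot */

	return t.delalloc_flushed;
}
```

Under these assumptions sim_replay(false) returns false, because the only check ran before the snapshot was queued, while sim_replay(true) returns true, because the final flush runs after Task1 has ended its handle.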
Thanks
Miao
>
> thanks,
> liubo
>
>>>
>>> How do you find this?
>>
>> Just by reviewing the code. I think it can be triggered.
>>
>> Thanks
>> Miao
>>
>>>
>>> thanks,
>>> liubo
>>>
>>>> This patch fixes the above problem by flushing all pending stuff once all
>>>> the other tasks have ended the transaction.
>>>>
>>>> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
>>>> ---
>>>> fs/btrfs/transaction.c | 74 ++++++++++++++++++++++++++++++-----------------
>>>> 1 files changed, 47 insertions(+), 27 deletions(-)
>>>>
>>>> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
>>>> index 6d0d5a0..d9a9a70 100644
>>>> --- a/fs/btrfs/transaction.c
>>>> +++ b/fs/btrfs/transaction.c
>>>> @@ -1401,6 +1401,48 @@ static void cleanup_transaction(struct btrfs_trans_handle *trans,
>>>>  	kmem_cache_free(btrfs_trans_handle_cachep, trans);
>>>>  }
>>>>
>>>> +static int btrfs_flush_all_pending_stuffs(struct btrfs_trans_handle *trans,
>>>> +					  struct btrfs_root *root)
>>>> +{
>>>> +	int flush_on_commit = btrfs_test_opt(root, FLUSHONCOMMIT);
>>>> +	int snap_pending = 0;
>>>> +	int ret;
>>>> +
>>>> +	if (!flush_on_commit) {
>>>> +		spin_lock(&root->fs_info->trans_lock);
>>>> +		if (!list_empty(&trans->transaction->pending_snapshots))
>>>> +			snap_pending = 1;
>>>> +		spin_unlock(&root->fs_info->trans_lock);
>>>> +	}
>>>> +
>>>> +	if (flush_on_commit || snap_pending) {
>>>> +		btrfs_start_delalloc_inodes(root, 1);
>>>> +		btrfs_wait_ordered_extents(root, 1);
>>>> +	}
>>>> +
>>>> +	ret = btrfs_run_delayed_items(trans, root);
>>>> +	if (ret)
>>>> +		return ret;
>>>> +
>>>> +	/*
>>>> +	 * running the delayed items may have added new refs. account
>>>> +	 * them now so that they hinder processing of more delayed refs
>>>> +	 * as little as possible.
>>>> +	 */
>>>> +	btrfs_delayed_refs_qgroup_accounting(trans, root->fs_info);
>>>> +
>>>> +	/*
>>>> +	 * rename don't use btrfs_join_transaction, so, once we
>>>> +	 * set the transaction to blocked above, we aren't going
>>>> +	 * to get any new ordered operations. We can safely run
>>>> +	 * it here and no for sure that nothing new will be added
>>>> +	 * to the list
>>>> +	 */
>>>> +	btrfs_run_ordered_operations(root, 1);
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>>  /*
>>>>   * btrfs_transaction state sequence:
>>>>   * in_commit = 0, blocked = 0 (initial)
>>>> @@ -1418,7 +1460,6 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
>>>>  	int ret = -EIO;
>>>>  	int should_grow = 0;
>>>>  	unsigned long now = get_seconds();
>>>> -	int flush_on_commit = btrfs_test_opt(root, FLUSHONCOMMIT);
>>>>
>>>>  	btrfs_run_ordered_operations(root, 0);
>>>>
>>>> @@ -1491,39 +1532,14 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
>>>>  		should_grow = 1;
>>>>
>>>>  	do {
>>>> -		int snap_pending = 0;
>>>> -
>>>>  		joined = cur_trans->num_joined;
>>>> -		if (!list_empty(&trans->transaction->pending_snapshots))
>>>> -			snap_pending = 1;
>>>>
>>>>  		WARN_ON(cur_trans != trans->transaction);
>>>>
>>>> -		if (flush_on_commit || snap_pending) {
>>>> -			btrfs_start_delalloc_inodes(root, 1);
>>>> -			btrfs_wait_ordered_extents(root, 1);
>>>> -		}
>>>> -
>>>> -		ret = btrfs_run_delayed_items(trans, root);
>>>> +		ret = btrfs_flush_all_pending_stuffs(trans, root);
>>>>  		if (ret)
>>>>  			goto cleanup_transaction;
>>>>
>>>> -		/*
>>>> -		 * running the delayed items may have added new refs. account
>>>> -		 * them now so that they hinder processing of more delayed refs
>>>> -		 * as little as possible.
>>>> -		 */
>>>> -		btrfs_delayed_refs_qgroup_accounting(trans, root->fs_info);
>>>> -
>>>> -		/*
>>>> -		 * rename don't use btrfs_join_transaction, so, once we
>>>> -		 * set the transaction to blocked above, we aren't going
>>>> -		 * to get any new ordered operations. We can safely run
>>>> -		 * it here and no for sure that nothing new will be added
>>>> -		 * to the list
>>>> -		 */
>>>> -		btrfs_run_ordered_operations(root, 1);
>>>> -
>>>>  		prepare_to_wait(&cur_trans->writer_wait, &wait,
>>>>  				TASK_UNINTERRUPTIBLE);
>>>>
>>>> @@ -1536,6 +1552,10 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
>>>>  	} while (atomic_read(&cur_trans->num_writers) > 1 ||
>>>>  		 (should_grow && cur_trans->num_joined != joined));
>>>>
>>>> +	ret = btrfs_flush_all_pending_stuffs(trans, root);
>>>> +	if (ret)
>>>> +		goto cleanup_transaction;
>>>> +
>>>>  	/*
>>>>  	 * Ok now we need to make sure to block out any other joins while we
>>>>  	 * commit the transaction. We could have started a join before setting
>>>> --
>>>> 1.7.6.5
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>
>>
>>
>