From: Miao Xie <miaox@cn.fujitsu.com>
To: Alex Lyakas <alex.btrfs@zadarastorage.com>
Cc: Linux Btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH 2/3] Btrfs: fix the deadlock between the transaction start/attach and commit
Date: Mon, 25 Feb 2013 18:20:23 +0800 [thread overview]
Message-ID: <512B3AE7.9090208@cn.fujitsu.com> (raw)
In-Reply-To: <CAOcd+r35uFwUsmRuLRR3eGkVcceP2QEimhrN7xxSnBzB-LWVyQ@mail.gmail.com>
On sun, 24 Feb 2013 21:49:55 +0200, Alex Lyakas wrote:
> Hi Miao,
> can you please explain your solution a bit more.
>
> On Wed, Feb 20, 2013 at 11:16 AM, Miao Xie <miaox@cn.fujitsu.com> wrote:
>> Now btrfs_commit_transaction() does this
>>
>> ret = btrfs_run_ordered_operations(root, 0)
>>
>> which async flushes all inodes on the ordered operations list, it introduced
>> a deadlock that transaction-start task, transaction-commit task and the flush
>> workers waited for each other.
>> (See the following URL to get the detail
>> http://marc.info/?l=linux-btrfs&m=136070705732646&w=2)
>>
>> As we know, if ->in_commit is set, it means someone is committing the
>> current transaction, we should not try to join it if we are not JOIN
>> or JOIN_NOLOCK, wait is the best choice for it. In this way, we can avoid
>> the above problem. In this way, there is another benefit: there is no new
>> transaction handle to block the transaction which is on the way of commit,
>> once we set ->in_commit.
>>
>> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
>> ---
>> fs/btrfs/transaction.c | 17 ++++++++++++++++-
>> 1 files changed, 16 insertions(+), 1 deletions(-)
>>
>> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
>> index bc2f2d1..71b7e2e 100644
>> --- a/fs/btrfs/transaction.c
>> +++ b/fs/btrfs/transaction.c
>> @@ -51,6 +51,14 @@ static noinline void switch_commit_root(struct btrfs_root *root)
>> root->commit_root = btrfs_root_node(root);
>> }
>>
>> +static inline int can_join_transaction(struct btrfs_transaction *trans,
>> + int type)
>> +{
>> + return !(trans->in_commit &&
>> + type != TRANS_JOIN &&
>> + type != TRANS_JOIN_NOLOCK);
>> +}
>> +
>> /*
>> * either allocate a new transaction or hop into the existing one
>> */
>> @@ -86,6 +94,10 @@ loop:
>> spin_unlock(&fs_info->trans_lock);
>> return cur_trans->aborted;
>> }
>> + if (!can_join_transaction(cur_trans, type)) {
>> + spin_unlock(&fs_info->trans_lock);
>> + return -EBUSY;
>> + }
>> atomic_inc(&cur_trans->use_count);
>> atomic_inc(&cur_trans->num_writers);
>> cur_trans->num_joined++;
>> @@ -360,8 +372,11 @@ again:
>>
>> do {
>> ret = join_transaction(root, type);
>> - if (ret == -EBUSY)
>> + if (ret == -EBUSY) {
>> wait_current_trans(root);
>> + if (unlikely(type == TRANS_ATTACH))
>> + ret = -ENOENT;
>> + }
>
> So I understand that instead of incrementing num_writes and joining
> the current transaction, you do not join and wait for the current
> transaction to unblock.
More specifically,TRANS_START、TRANS_USERSPACE and TRANS_ATTACH can not
join and just wait for the current transaction to unblock if ->in_commit
is set.
> Which task in Josef's example
> http://marc.info/?l=linux-btrfs&m=136070705732646&w=2
> task 1, task 2 or task 3 is the one that will not join the
> transaction, but instead wait?
Task1 will not join the transaction, in this way, async inode flush
won't run, and then task3 won't do anything.
Before applying the patch:
Start/Attach_Trans_Task Commit_Task Flush_Worker
(Task1) (Task2) (Task3) -- the name in Josef's example
btrfs_start_transaction()
|->may_wait_transaction()
| (return 0)
| btrfs_commit_transaction()
| |->set ->in_commit and
| | blocked to 1
| |->wait writers to be 1
| | (writers is 1)
|->join_transaction() |
| (writers is 2) |
|->btrfs_commit_transaction() |
| |->set trans_no_join to 1
| | (close join transaction)
|->btrfs_run_ordered_operations |
(Those ordered operations |
are added when releasing |
file) |
|->async inode flush() |
|->wait_flush_comlete() |
| work_loop()
| |->run_work()
| |->btrfs_join_transaction()
| |->wait_current_trans()
|->wait writers to be 1
This three tasks waited for each other.
After applying this patch:
Start/Attach_Trans_Task Commit_Task Flush_Worker
(Task1) (Task2) (Task3)
btrfs_start_transaction()
|->may_wait_transaction()
| (return 0)
| btrfs_commit_transaction()
| |->set ->in_commit and
| | blocked to 1
| |->wait writers to be 1
| | (writers is 1)
|->join_transaction() fail |
| (return -EBUSY, writers is still 1) |
|->wait_current_trans() |
|->set trans_no_join to 1
| (close join transaction)
|->wait writers to be 1
|->continue committing
(Task3 does nothing)
> Also, I think I don't fully understand Josef's example. What is
> preventing from async flushing to complete?
> Is task 3 waiting because trans_no_join is set?
> Is task 3 the one that actually does the delalloc flush?
See above.
Thanks
Miao
>
> Thanks,
> Alex.
>
>
>
>
>
>
>> } while (ret == -EBUSY);
>>
>> if (ret < 0) {
>> --
>> 1.6.5.2
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
next prev parent reply other threads:[~2013-02-25 10:20 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-02-20 9:16 [PATCH 2/3] Btrfs: fix the deadlock between the transaction start/attach and commit Miao Xie
2013-02-24 19:49 ` Alex Lyakas
2013-02-25 10:20 ` Miao Xie [this message]
2013-03-02 21:15 ` Alex Lyakas
2013-03-24 11:13 ` Alex Lyakas
2013-03-25 1:51 ` Miao Xie
2013-03-25 9:11 ` Alex Lyakas
2013-04-10 18:45 ` Alex Lyakas
2013-04-11 2:19 ` Miao Xie
[not found] ` <518B56F1.40909@cn.fujitsu.com>
2013-06-12 20:11 ` Alex Lyakas
2013-06-13 3:08 ` Miao Xie
2013-06-16 10:38 ` Alex Lyakas
2013-06-17 1:51 ` Miao Xie
2013-06-26 17:53 ` Alex Lyakas
2013-07-04 2:28 ` Miao Xie
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=512B3AE7.9090208@cn.fujitsu.com \
--to=miaox@cn.fujitsu.com \
--cc=alex.btrfs@zadarastorage.com \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).