All of lore.kernel.org
 help / color / mirror / Atom feed
From: Miao Xie <miaox@cn.fujitsu.com>
To: Alex Lyakas <alex.btrfs@zadarastorage.com>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>,
	Shyam Kaushik <shyam@zadarastorage.com>
Subject: Re: [PATCH 2/3] Btrfs: fix the deadlock between the transaction start/attach and commit
Date: Mon, 17 Jun 2013 09:51:33 +0800	[thread overview]
Message-ID: <51BE6BA5.40809@cn.fujitsu.com> (raw)
In-Reply-To: <CAOcd+r0986mL2FEC2tVTgNd0n1Lbc8FD=N9qPeAnd4h1QaxdAQ@mail.gmail.com>

On 	sun, 16 Jun 2013 13:38:42 +0300, Alex Lyakas wrote:
> Hi Miao,
> 
> On Thu, Jun 13, 2013 at 6:08 AM, Miao Xie <miaox@cn.fujitsu.com> wrote:
>> On wed, 12 Jun 2013 23:11:02 +0300, Alex Lyakas wrote:
>>> I reviewed the code starting from:
>>> 69aef69a1bc154 Btrfs: don't wait for all the writers circularly during
>>> the transaction commit
>>> until
>>> 2ce7935bf4cdf3 Btrfs: remove the time check in btrfs_commit_transaction()
>>>
>>> It looks very good. Let me check if I understand the fix correctly:
>>> # When transaction starts to commit, we want to wait only for external
>>> writers (those that did ATTACH/START/USERSPACE)
>>> # We guarantee at this point that no new external writers will hop on
>>> the committing transaction, by setting ->blocked state, so we only
>>> wait for existing extwriters to detach from transaction
> 
> I have a doubt about this point with your new code. Example:
> Task1 - external writer
> Task2 - transaction kthread
> 
> Task1                                                                   Task2
> |start_transaction(TRANS_START)                           |
> |-wait_current_trans(blocked=0, so it doesn't wait)     |
> |-join_transaction()                                                  |
> |--lock(trans_lock)                                                   |
> |--can_join_transaction() YES                                  |
> |
>       |-btrfs_commit_transaction()
> |
>       |--blocked=1
> |
>       |--in_commit=1
> |
>       |--wait_event(extwriter== 0);
> |
>       |--btrfs_flush_all_pending_stuffs()
> |                                                                            |
> |--extwriter_counter_inc()                                         |
> |--unlock(trans_lock)                                               |
> |
>       | lock(trans_lock)
> |
>       | trans_no_join=1
> 
> Basically, the "blocked/in_commit" check is not synchronized with
> joining a transaction. After checking "blocked", the external writer
> may proceed and join the transaction. Right before joining, it calls
> can_join_transaction(). But this function checks in_commit flag under
> fs_info->trans_lock. But btrfs_commit_transaction() sets this flag not
> under trans_lock, but under commit_lock, so checking this flag is not
> synchronized.
> 
> Or maybe I am wrong, because btrfs_commit_transaction() locks and
> unlocks trans_lock to check for previous transaction, so by accident
> there is no problem, and above scenario cannot happen?

Your analysis at the last section is right, so the right process is:

Task1							Task2
|start_transaction(TRANS_START)				|
|-wait_current_trans(blocked=0, so it doesn't wait)	|
|-join_transaction()					|
|--lock(trans_lock)					|
|--can_join_transaction() YES				|
|							|-btrfs_commit_transaction()
|							|--blocked=1
|							|--in_commit=1
|--extwriter_counter_inc()				|
|--unlock(trans_lock)					|
|							|--lock(trans_lock)
|							|--...
|							|--unlock(trans_lock)
|							|--...
|							|--wait_event(extwriter== 0);
|							|--btrfs_flush_all_pending_stuffs()

The problem you worried can not happen.

Anyway, it is not good that the "blocked/in_commit" check is not synchronized with
joining a transaction. So I modified the relative code in this patchset.

Miao

  reply	other threads:[~2013-06-17  1:50 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-02-20  9:16 [PATCH 2/3] Btrfs: fix the deadlock between the transaction start/attach and commit Miao Xie
2013-02-24 19:49 ` Alex Lyakas
2013-02-25 10:20   ` Miao Xie
2013-03-02 21:15     ` Alex Lyakas
2013-03-24 11:13     ` Alex Lyakas
2013-03-25  1:51       ` Miao Xie
2013-03-25  9:11         ` Alex Lyakas
2013-04-10 18:45           ` Alex Lyakas
2013-04-11  2:19             ` Miao Xie
     [not found]             ` <518B56F1.40909@cn.fujitsu.com>
2013-06-12 20:11               ` Alex Lyakas
2013-06-13  3:08                 ` Miao Xie
2013-06-16 10:38                   ` Alex Lyakas
2013-06-17  1:51                     ` Miao Xie [this message]
2013-06-26 17:53                       ` Alex Lyakas
2013-07-04  2:28                         ` Miao Xie

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51BE6BA5.40809@cn.fujitsu.com \
    --to=miaox@cn.fujitsu.com \
    --cc=alex.btrfs@zadarastorage.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=shyam@zadarastorage.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.