From: Josef Bacik <jbacik@fb.com>
To: <fdmanana@gmail.com>, Miao Xie <miaox@cn.fujitsu.com>
Cc: "linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH] Btrfs: fix crash when starting transaction
Date: Tue, 24 Jun 2014 08:25:39 -0700
Message-ID: <53A99873.9080007@fb.com>
In-Reply-To: <CAL3q7H5vW0cdG+-J2avEJ37z0jKaDAJQ34PhygdZh8iyuPQMUQ@mail.gmail.com>
On 06/24/2014 03:29 AM, Filipe David Manana wrote:
> On Tue, Jun 24, 2014 at 11:22 AM, Miao Xie <miaox@cn.fujitsu.com> wrote:
>> CC Josef
>>
>> On Mon, 23 Jun 2014 12:58:59 +0100, Filipe David Borba Manana wrote:
>>> Often when starting a transaction we commit the currently running transaction,
>>> which can end up writing block group caches when the current process has its
>>> journal_info set to NULL (and not to a transaction). This makes our assertion
>>> at btrfs_check_data_free_space() (current->journal_info != NULL) fail, resulting
>>> in a crash/hang. Therefore fix it by setting journal_info.
>>>
>>> Two different traces of this issue follow below.
>>>
>>> 1)
>>>
>>> [51502.241936] BTRFS: assertion failed: current->journal_info, file: fs/btrfs/extent-tree.c, line: 3670
>>> [51502.242213] ------------[ cut here ]------------
>>> [51502.242493] kernel BUG at fs/btrfs/ctree.h:3964!
>>> [51502.242669] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
>>> (...)
>>> [51502.244010] Call Trace:
>>> [51502.244010] [<ffffffffa02bc025>] btrfs_check_data_free_space+0x395/0x3a0 [btrfs]
>>> [51502.244010] [<ffffffffa02c3bdc>] btrfs_write_dirty_block_groups+0x4ac/0x640 [btrfs]
>>> [51502.244010] [<ffffffffa0357a6a>] commit_cowonly_roots+0x164/0x226 [btrfs]
>>> [51502.244010] [<ffffffffa02d53cd>] btrfs_commit_transaction+0x4ed/0xab0 [btrfs]
>>> [51502.244010] [<ffffffff8168ec7b>] ? _raw_spin_unlock+0x2b/0x40
>>> [51502.244010] [<ffffffffa02d6259>] start_transaction+0x459/0x620 [btrfs]
>>> [51502.244010] [<ffffffffa02d67ab>] btrfs_start_transaction+0x1b/0x20 [btrfs]
>>> [51502.244010] [<ffffffffa02d73e1>] __unlink_start_trans+0x31/0xe0 [btrfs]
>>> [51502.244010] [<ffffffffa02dea67>] btrfs_unlink+0x37/0xc0 [btrfs]
>>> [51502.244010] [<ffffffff811bb054>] ? do_unlinkat+0x114/0x2a0
>>> [51502.244010] [<ffffffff811baebc>] vfs_unlink+0xcc/0x150
>>> [51502.244010] [<ffffffff811bb1a0>] do_unlinkat+0x260/0x2a0
>>> [51502.244010] [<ffffffff811a9ef4>] ? filp_close+0x64/0x90
>>> [51502.244010] [<ffffffff810aaea6>] ? trace_hardirqs_on_caller+0x16/0x1e0
>>> [51502.244010] [<ffffffff81349cab>] ? trace_hardirqs_on_thunk+0x3a/0x3f
>>> [51502.244010] [<ffffffff811be9eb>] SyS_unlinkat+0x1b/0x40
>>> [51502.244010] [<ffffffff81698452>] system_call_fastpath+0x16/0x1b
>>> [51502.244010] Code: 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 89 f1 48 c7 c2 71 13 36 a0 48 89 fe 31 c0 48 c7 c7 b8 43 36 a0 48 89 e5 e8 5d b0 32 e1 <0f> 0b 0f 1f 44 00 00 55 b9 11 00 00 00 48 89 e5 41 55 49 89 f5
>>> [51502.244010] RIP [<ffffffffa03575da>] assfail.constprop.88+0x1e/0x20 [btrfs]
>>>
>>> 2)
>>>
>>> [25405.097230] BTRFS: assertion failed: current->journal_info, file: fs/btrfs/extent-tree.c, line: 3670
>>> [25405.097488] ------------[ cut here ]------------
>>> [25405.097767] kernel BUG at fs/btrfs/ctree.h:3964!
>>> [25405.097940] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
>>> (...)
>>> [25405.100008] Call Trace:
>>> [25405.100008] [<ffffffffa02bc025>] btrfs_check_data_free_space+0x395/0x3a0 [btrfs]
>>> [25405.100008] [<ffffffffa02c3bdc>] btrfs_write_dirty_block_groups+0x4ac/0x640 [btrfs]
>>> [25405.100008] [<ffffffffa035755a>] commit_cowonly_roots+0x164/0x226 [btrfs]
>>> [25405.100008] [<ffffffffa02d53cd>] btrfs_commit_transaction+0x4ed/0xab0 [btrfs]
>>> [25405.100008] [<ffffffff8109c170>] ? bit_waitqueue+0xc0/0xc0
>>> [25405.100008] [<ffffffffa02d6259>] start_transaction+0x459/0x620 [btrfs]
>>> [25405.100008] [<ffffffffa02d67ab>] btrfs_start_transaction+0x1b/0x20 [btrfs]
>>> [25405.100008] [<ffffffffa02e3407>] btrfs_create+0x47/0x210 [btrfs]
>>> [25405.100008] [<ffffffffa02d74cc>] ? btrfs_permission+0x3c/0x80 [btrfs]
>>> [25405.100008] [<ffffffff811bc63b>] vfs_create+0x9b/0x130
>>> [25405.100008] [<ffffffff811bcf19>] do_last+0x849/0xe20
>>> [25405.100008] [<ffffffff811b9409>] ? link_path_walk+0x79/0x820
>>> [25405.100008] [<ffffffff811bd5b5>] path_openat+0xc5/0x690
>>> [25405.100008] [<ffffffff810ab07d>] ? trace_hardirqs_on+0xd/0x10
>>> [25405.100008] [<ffffffff811cdcd2>] ? __alloc_fd+0x32/0x1d0
>>> [25405.100008] [<ffffffff811be2a3>] do_filp_open+0x43/0xa0
>>> [25405.100008] [<ffffffff811cddf1>] ? __alloc_fd+0x151/0x1d0
>>> [25405.100008] [<ffffffff811abcfc>] do_sys_open+0x13c/0x230
>>> [25405.100008] [<ffffffff810aaea6>] ? trace_hardirqs_on_caller+0x16/0x1e0
>>> [25405.100008] [<ffffffff811abe12>] SyS_open+0x22/0x30
>>> [25405.100008] [<ffffffff81698452>] system_call_fastpath+0x16/0x1b
>>> [25405.100008] Code: 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 89 f1 48 c7 c2 51 13 36 a0 48 89 fe 31 c0 48 c7 c7 d0 43 36 a0 48 89 e5 e8 6d b5 32 e1 <0f> 0b 0f 1f 44 00 00 55 b9 11 00 00 00 48 89 e5 41 55 49 89 f5
>>> [25405.100008] RIP [<ffffffffa03570ca>] assfail.constprop.88+0x1e/0x20 [btrfs]
>>>
>>> Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
>>> ---
>>> fs/btrfs/transaction.c | 4 ++++
>>> 1 file changed, 4 insertions(+)
>>>
>>> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
>>> index ac984a3..fe4abe9 100644
>>> --- a/fs/btrfs/transaction.c
>>> +++ b/fs/btrfs/transaction.c
>>> @@ -491,7 +491,11 @@ again:
>>>  	smp_mb();
>>>  	if (cur_trans->state >= TRANS_STATE_BLOCKED &&
>>>  	    may_wait_transaction(root, type)) {
>>> +		void *journal_info = current->journal_info;
>>> +		if (!journal_info)
>>> +			current->journal_info = h;
>>>  		btrfs_commit_transaction(h, root);
>>> +		current->journal_info = journal_info;
>>>  		goto again;
>>
>> It seems impossible for a task that set BTRFS_SEND_TRANS_STUB to start a transaction.
>> That is, if current->journal_info is not NULL, we are sure it holds a transaction handle,
>> so we just join that handle and return. In other words, current->journal_info must be NULL
>> here, so the if statement is unnecessary.
>
> I think that is true too (current->journal_info == NULL here).
> The if test is done mostly because it's done some lines below too.
>
Yeah, journal_info should always be NULL here. In fact, I'd like to add an
ASSERT() to make sure we don't have BTRFS_SEND_TRANS_STUB set coming into
start_transaction(), but you can leave that for another day if you like. Good
find, btw.
Josef
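
For readers following the thread, the pattern under discussion can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel code: struct task, current_task, check_data_free_space(), commit_transaction() and commit_with_journal_info() are hypothetical stand-ins for current, btrfs_check_data_free_space(), btrfs_commit_transaction() and the patched branch of start_transaction(). It assumes, as the reviewers argue above, that journal_info is NULL on entry, so the "if" test from the patch is replaced by an assertion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real btrfs/kernel definitions. */
struct trans_handle {
	int id;
};

struct task {
	void *journal_info;	/* models current->journal_info */
};

static struct task current_task;	/* models `current` */
static int commits_done;		/* counts successful commits */

/* Models the assertion that fired in the traces: the caller must have
 * journal_info set when block group caches are written. */
static void check_data_free_space(void)
{
	assert(current_task.journal_info != NULL);
}

/* Models a commit that ends up writing block group caches. */
static void commit_transaction(struct trans_handle *h)
{
	(void)h;
	check_data_free_space();
	commits_done++;
}

/* Models the patched branch: point journal_info at the handle for the
 * duration of the commit, then restore the saved value. */
static void commit_with_journal_info(struct trans_handle *h)
{
	void *saved = current_task.journal_info;

	/* Assumption taken from the review comments: always NULL here. */
	assert(saved == NULL);

	current_task.journal_info = h;
	commit_transaction(h);
	current_task.journal_info = saved;
}
```

Calling commit_transaction() directly with journal_info still NULL trips the assertion, which corresponds to the crash in the two traces; going through commit_with_journal_info() does not, and journal_info is NULL again afterwards.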