From: Waiman Long <waiman.long@hp.com>
To: Chris Mason <clm@fb.com>
Cc: Marc Dionne <marc.c.dionne@gmail.com>,
Josef Bacik <jbacik@fb.com>,
linux-btrfs@vger.kernel.org, t-itoh@jp.fujitsu.com
Subject: Re: Lockups with btrfs on 3.16-rc1 - bisected
Date: Thu, 19 Jun 2014 19:21:50 -0400
Message-ID: <53A3708E.9060203@hp.com>
In-Reply-To: <53A35B31.4060203@fb.com>
On 06/19/2014 05:50 PM, Chris Mason wrote:
>>>
>>> I would like to take back my comments. I took out the read_lock, but the
>>> process still hang while doing file activities on btrfs filesystem. So
>>> the problem is trickier than I thought. Below are the stack backtraces
>>> of some of the relevant processes.
>>>
>> You weren't wrong, but it was also the tree trylock code. Our trylocks
>> only back off if the blocking lock is held. btrfs_next_leaf needs it to
>> be a true trylock. The confusing part is this hasn't really changed,
>> but one of the callers must be a spinner where we used to have a blocker.
> This is what I have queued up, it's working here.
>
> -chris
>
> commit ea4ebde02e08558b020c4b61bb9a4c0fcf63028e
> Author: Chris Mason <clm@fb.com>
> Date: Thu Jun 19 14:16:52 2014 -0700
>
> Btrfs: fix deadlocks with trylock on tree nodes
>
> The Btrfs tree trylock function is poorly named. It always takes
> the spinlock and backs off if the blocking lock is held. This
> can lead to surprising lockups because people expect it to really be a
> trylock.
>
> This commit makes it a pure trylock, both for the spinlock and the
> blocking lock. It also reworks the nested lock handling slightly to
> avoid taking the read lock while a spinning write lock might be held.
>
> Signed-off-by: Chris Mason <clm@fb.com>
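
For readers following the quoted commit message, the difference between
"take the spinlock, then back off if the blocking lock is held" and a
pure trylock can be sketched in userspace C. This is only an illustration
under stated assumptions, not the Btrfs code: struct node_lock, its
fields, and all three function names are hypothetical, a pthread mutex
stands in for the kernel spinning lock, and an atomic flag stands in for
the blocking-lock state.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a tree node lock (not the Btrfs layout). */
struct node_lock {
	pthread_mutex_t spin;      /* stands in for the spinning lock */
	atomic_bool blocking_held; /* true while a blocking holder owns the node */
};

/*
 * Behaviour described as the problem: the spinning lock is always taken,
 * so the caller may wait here even though it asked for a "trylock".
 * Only the blocking-lock check backs off.
 */
static bool try_lock_waits_on_spin(struct node_lock *l)
{
	pthread_mutex_lock(&l->spin); /* may wait for another holder */
	if (atomic_load(&l->blocking_held)) {
		pthread_mutex_unlock(&l->spin);
		return false;
	}
	return true; /* caller now holds l->spin */
}

/*
 * Pure trylock: never waits, neither for the spinning lock nor for a
 * blocking holder. Any contention makes it return false at once.
 */
static bool try_lock_pure(struct node_lock *l)
{
	if (atomic_load(&l->blocking_held))
		return false; /* early exit without touching the lock */
	if (pthread_mutex_trylock(&l->spin) != 0)
		return false; /* spinning lock busy: do not wait */
	if (atomic_load(&l->blocking_held)) {
		/* a blocking holder appeared in between: undo and back off */
		pthread_mutex_unlock(&l->spin);
		return false;
	}
	return true; /* caller now holds l->spin */
}

/* Release a lock obtained from either function above. */
static void node_unlock(struct node_lock *l)
{
	pthread_mutex_unlock(&l->spin);
}
```

With a lock initialised as { PTHREAD_MUTEX_INITIALIZER, false }, a second
try_lock_pure() call while the first holder still owns the lock returns
false immediately, whereas try_lock_waits_on_spin() in the same situation
would wait for the holder to release it. A caller that itself holds
something the other side needs can deadlock in the second case.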
I didn't realize that those non-blocking lock functions are really
trylocks. Yes, the patch did seem to fix the hang that I saw
when I just untarred the kernel source files onto a btrfs filesystem.
However, when I tried a kernel build on a 24-thread (-j 24) system,
the build process hung after a while. Stack traces like the following
were printed:
INFO: task btrfs-transacti:16576 blocked for more than 120 seconds.
Tainted: G E 3.16.0-rc1 #5
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
btrfs-transacti D 000000000000000f 0 16576 2 0x00000000
ffff88080eabbbf8 0000000000000046 ffff880803b98350 ffff88080eab8010
0000000000012b80 0000000000012b80 ffff880805ed8f10 ffff88080d162310
ffff88080eabbce8 ffff8807be170880 ffff8807be170888 7fffffffffffffff
Call Trace:
[<ffffffff81592de9>] schedule+0x29/0x70
[<ffffffff815920bd>] schedule_timeout+0x13d/0x1d0
[<ffffffff8106b474>] ? wake_up_worker+0x24/0x30
[<ffffffff8106d595>] ? insert_work+0x65/0xb0
[<ffffffff81593cc6>] wait_for_completion+0xc6/0x100
[<ffffffff810868d0>] ? try_to_wake_up+0x220/0x220
[<ffffffffa06bb9ba>] btrfs_wait_and_free_delalloc_work+0x1a/0x30 [btrfs]
[<ffffffffa06d458d>] btrfs_run_ordered_operations+0x1dd/0x2c0 [btrfs]
[<ffffffffa06b7fd5>] btrfs_flush_all_pending_stuffs+0x35/0x40 [btrfs]
[<ffffffffa06ba099>] btrfs_commit_transaction+0x229/0xa30 [btrfs]
[<ffffffff8105ef30>] ? lock_timer_base+0x70/0x70
[<ffffffffa06b51db>] transaction_kthread+0x1eb/0x270 [btrfs]
[<ffffffffa06b4ff0>] ? close_ctree+0x2d0/0x2d0 [btrfs]
[<ffffffff8107544e>] kthread+0xce/0xf0
[<ffffffff81075380>] ? kthread_freezable_should_stop+0x70/0x70
[<ffffffff8159636c>] ret_from_fork+0x7c/0xb0
[<ffffffff81075380>] ? kthread_freezable_should_stop+0x70/0x70
It looks like some more work may still be needed. Or it could be a
problem in my system configuration.
-Longman