From: Anand Jain <anand.jain@oracle.com>
To: Yauhen Kharuzhy <yauhen.kharuzhy@zavadatar.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v4 00/13] Introduce device state 'failed', spare device and auto replace
Date: Mon, 18 Apr 2016 16:54:28 +0800
Message-ID: <5714A0C4.2000305@oracle.com>
In-Reply-To: <20160414230932.GA9264@jeknote.loshitsa1.net>



On 04/15/2016 07:09 AM, Yauhen Kharuzhy wrote:
> On Tue, Apr 12, 2016 at 10:15:50PM +0800, Anand Jain wrote:
>> Thanks for various comments, tests and feedback.
>
> Hmm... Yet another lockdep warning, which appeared when I removed the
> target drive during replacement:

  Thanks for the report.

  This was not introduced in this patch set; it's in the original
  set. I have sent out

   btrfs: fix lock dep warning, move scratch dev out of device_list_mutex and uuid_mutex

  to fix this.

Thanks, Anand


> [ 5375.718844]
> [ 5375.718845] ======================================================
> [ 5375.718846] [ INFO: possible circular locking dependency detected ]
> [ 5375.718849] 4.4.5-scst31x-debug-11+ #40 Not tainted
> [ 5375.718849] -------------------------------------------------------
> [ 5375.718851] btrfs-health/4662 is trying to acquire lock:
> [ 5375.718861]  (sb_writers){.+.+.+}, at: [<ffffffff812214f7>] __sb_start_write+0xb7/0xf0
> [ 5375.718862]
> [ 5375.718862] but task is already holding lock:
> [ 5375.718907]  (&fs_devs->device_list_mutex){+.+.+.}, at: [<ffffffffa028263c>] btrfs_destroy_dev_replace_tgtdev+0x3c/0x150 [btrfs]
> [ 5375.718907]
> [ 5375.718907] which lock already depends on the new lock.
> [ 5375.718907]
> [ 5375.718908]
> [ 5375.718908] the existing dependency chain (in reverse order) is:
> [ 5375.718911]
> [ 5375.718911] -> #3 (&fs_devs->device_list_mutex){+.+.+.}:
> [ 5375.718917]        [<ffffffff810da4be>] lock_acquire+0xce/0x1e0
> [ 5375.718921]        [<ffffffff81633949>] mutex_lock_nested+0x69/0x3c0
> [ 5375.718940]        [<ffffffffa0219bf6>] btrfs_show_devname+0x36/0x210 [btrfs]
> [ 5375.718945]        [<ffffffff81267079>] show_vfsmnt+0x49/0x150
> [ 5375.718948]        [<ffffffff81240b07>] m_show+0x17/0x20
> [ 5375.718951]        [<ffffffff81246868>] seq_read+0x2d8/0x3b0
> [ 5375.718955]        [<ffffffff8121df28>] __vfs_read+0x28/0xd0
> [ 5375.718959]        [<ffffffff8121e806>] vfs_read+0x86/0x130
> [ 5375.718962]        [<ffffffff8121f4c9>] SyS_read+0x49/0xa0
> [ 5375.718966]        [<ffffffff81637976>] entry_SYSCALL_64_fastpath+0x16/0x7a
> [ 5375.718968]
> [ 5375.718968] -> #2 (namespace_sem){+++++.}:
> [ 5375.718971]        [<ffffffff810da4be>] lock_acquire+0xce/0x1e0
> [ 5375.718974]        [<ffffffff81635199>] down_write+0x49/0x80
> [ 5375.718977]        [<ffffffff81243593>] lock_mount+0x43/0x1c0
> [ 5375.718979]        [<ffffffff81243c13>] do_add_mount+0x23/0xd0
> [ 5375.718982]        [<ffffffff81244afb>] do_mount+0x27b/0xe30
> [ 5375.718985]        [<ffffffff812459dc>] SyS_mount+0x8c/0xd0
> [ 5375.718988]        [<ffffffff81637976>] entry_SYSCALL_64_fastpath+0x16/0x7a
> [ 5375.718991]
> [ 5375.718991] -> #1 (&sb->s_type->i_mutex_key#5){+.+.+.}:
> [ 5375.718994]        [<ffffffff810da4be>] lock_acquire+0xce/0x1e0
> [ 5375.718996]        [<ffffffff81633949>] mutex_lock_nested+0x69/0x3c0
> [ 5375.719001]        [<ffffffff8122d608>] path_openat+0x468/0x1360
> [ 5375.719004]        [<ffffffff8122f86e>] do_filp_open+0x7e/0xe0
> [ 5375.719007]        [<ffffffff8121da7b>] do_sys_open+0x12b/0x210
> [ 5375.719010]        [<ffffffff8121db7e>] SyS_open+0x1e/0x20
> [ 5375.719013]        [<ffffffff81637976>] entry_SYSCALL_64_fastpath+0x16/0x7a
> [ 5375.719015]
> [ 5375.719015] -> #0 (sb_writers){.+.+.+}:
> [ 5375.719018]        [<ffffffff810d97ca>] __lock_acquire+0x17ba/0x1ae0
> [ 5375.719021]        [<ffffffff810da4be>] lock_acquire+0xce/0x1e0
> [ 5375.719026]        [<ffffffff810d3bef>] percpu_down_read+0x4f/0xa0
> [ 5375.719028]        [<ffffffff812214f7>] __sb_start_write+0xb7/0xf0
> [ 5375.719031]        [<ffffffff81242eb4>] mnt_want_write+0x24/0x50
> [ 5375.719035]        [<ffffffff8122ded2>] path_openat+0xd32/0x1360
> [ 5375.719037]        [<ffffffff8122f86e>] do_filp_open+0x7e/0xe0
> [ 5375.719040]        [<ffffffff8121d8a4>] file_open_name+0xe4/0x130
> [ 5375.719043]        [<ffffffff8121d923>] filp_open+0x33/0x60
> [ 5375.719073]        [<ffffffffa02776a6>] update_dev_time+0x16/0x40 [btrfs]
> [ 5375.719099]        [<ffffffffa02825be>] btrfs_scratch_superblocks+0x4e/0x90 [btrfs]
> [ 5375.719123]        [<ffffffffa0282665>] btrfs_destroy_dev_replace_tgtdev+0x65/0x150 [btrfs]
> [ 5375.719150]        [<ffffffffa02c6c80>] btrfs_dev_replace_finishing+0x6b0/0x990 [btrfs]
> [ 5375.719175]        [<ffffffffa02c729e>] btrfs_dev_replace_start+0x33e/0x540 [btrfs]
> [ 5375.719199]        [<ffffffffa02c7f58>] btrfs_auto_replace_start+0xf8/0x140 [btrfs]
> [ 5375.719222]        [<ffffffffa02464e6>] health_kthread+0x246/0x490 [btrfs]
> [ 5375.719225]        [<ffffffff810a70df>] kthread+0xef/0x110
> [ 5375.719229]        [<ffffffff81637d2f>] ret_from_fork+0x3f/0x70
> [ 5375.719230]
> [ 5375.719230] other info that might help us debug this:
> [ 5375.719230]
> [ 5375.719233] Chain exists of:
> [ 5375.719233]   sb_writers --> namespace_sem --> &fs_devs->device_list_mutex
> [ 5375.719233]
> [ 5375.719234]  Possible unsafe locking scenario:
> [ 5375.719234]
> [ 5375.719234]        CPU0                    CPU1
> [ 5375.719235]        ----                    ----
> [ 5375.719236]   lock(&fs_devs->device_list_mutex);
> [ 5375.719238]                                lock(namespace_sem);
> [ 5375.719239]                                lock(&fs_devs->device_list_mutex);
> [ 5375.719241]   lock(sb_writers);
> [ 5375.719241]
> [ 5375.719241]  *** DEADLOCK ***
> [ 5375.719241]
> [ 5375.719243] 4 locks held by btrfs-health/4662:
> [ 5375.719266]  #0:  (&fs_info->health_mutex){+.+.+.}, at: [<ffffffffa0246303>] health_kthread+0x63/0x490 [btrfs]
> [ 5375.719293]  #1:  (&fs_info->dev_replace.lock_finishing_cancel_unmount){+.+.+.}, at: [<ffffffffa02c6611>] btrfs_dev_replace_finishing+0x41/0x990 [btrfs]
> [ 5375.719319]  #2:  (uuid_mutex){+.+.+.}, at: [<ffffffffa0282620>] btrfs_destroy_dev_replace_tgtdev+0x20/0x150 [btrfs]
> [ 5375.719343]  #3:  (&fs_devs->device_list_mutex){+.+.+.}, at: [<ffffffffa028263c>] btrfs_destroy_dev_replace_tgtdev+0x3c/0x150 [btrfs]
> [ 5375.719343]
> [ 5375.719343] stack backtrace:
> [ 5375.719347] CPU: 2 PID: 4662 Comm: btrfs-health Not tainted 4.4.5-scst31x-debug-11+ #40
> [ 5375.719348] Hardware name: Supermicro SYS-6018R-WTRT/X10DRW-iT, BIOS 1.0c 01/07/2015
> [ 5375.719352]  0000000000000000 ffff880856f73880 ffffffff813529e3 ffffffff826182a0
> [ 5375.719354]  ffffffff8260c090 ffff880856f738c0 ffffffff810d667c ffff880856f73930
> [ 5375.719357]  ffff880861f32b40 ffff880861f32b68 0000000000000003 0000000000000004
> [ 5375.719357] Call Trace:
> [ 5375.719363]  [<ffffffff813529e3>] dump_stack+0x85/0xc2
> [ 5375.719366]  [<ffffffff810d667c>] print_circular_bug+0x1ec/0x260
> [ 5375.719369]  [<ffffffff810d97ca>] __lock_acquire+0x17ba/0x1ae0
> [ 5375.719373]  [<ffffffff810f606d>] ? debug_lockdep_rcu_enabled+0x1d/0x20
> [ 5375.719376]  [<ffffffff810da4be>] lock_acquire+0xce/0x1e0
> [ 5375.719378]  [<ffffffff812214f7>] ? __sb_start_write+0xb7/0xf0
> [ 5375.719383]  [<ffffffff810d3bef>] percpu_down_read+0x4f/0xa0
> [ 5375.719385]  [<ffffffff812214f7>] ? __sb_start_write+0xb7/0xf0
> [ 5375.719387]  [<ffffffff812214f7>] __sb_start_write+0xb7/0xf0
> [ 5375.719389]  [<ffffffff81242eb4>] mnt_want_write+0x24/0x50
> [ 5375.719393]  [<ffffffff8122ded2>] path_openat+0xd32/0x1360
> [ 5375.719415]  [<ffffffffa02462a0>] ? btrfs_congested_fn+0x180/0x180 [btrfs]
> [ 5375.719418]  [<ffffffff810f606d>] ? debug_lockdep_rcu_enabled+0x1d/0x20
> [ 5375.719420]  [<ffffffff8122f86e>] do_filp_open+0x7e/0xe0
> [ 5375.719423]  [<ffffffff810f615d>] ? rcu_read_lock_sched_held+0x6d/0x80
> [ 5375.719426]  [<ffffffff81201a9b>] ? kmem_cache_alloc+0x26b/0x5d0
> [ 5375.719430]  [<ffffffff8122e7d4>] ? getname_kernel+0x34/0x120
> [ 5375.719433]  [<ffffffff8121d8a4>] file_open_name+0xe4/0x130
> [ 5375.719436]  [<ffffffff8121d923>] filp_open+0x33/0x60
> [ 5375.719462]  [<ffffffffa02776a6>] update_dev_time+0x16/0x40 [btrfs]
> [ 5375.719485]  [<ffffffffa02825be>] btrfs_scratch_superblocks+0x4e/0x90 [btrfs]
> [ 5375.719506]  [<ffffffffa0282665>] btrfs_destroy_dev_replace_tgtdev+0x65/0x150 [btrfs]
> [ 5375.719530]  [<ffffffffa02c6c80>] btrfs_dev_replace_finishing+0x6b0/0x990 [btrfs]
> [ 5375.719554]  [<ffffffffa02c6b23>] ? btrfs_dev_replace_finishing+0x553/0x990 [btrfs]
> [ 5375.719576]  [<ffffffffa02c729e>] btrfs_dev_replace_start+0x33e/0x540 [btrfs]
> [ 5375.719598]  [<ffffffffa02c7f58>] btrfs_auto_replace_start+0xf8/0x140 [btrfs]
> [ 5375.719621]  [<ffffffffa02464e6>] health_kthread+0x246/0x490 [btrfs]
> [ 5375.719641]  [<ffffffffa02463d8>] ? health_kthread+0x138/0x490 [btrfs]
> [ 5375.719661]  [<ffffffffa02462a0>] ? btrfs_congested_fn+0x180/0x180 [btrfs]
> [ 5375.719663]  [<ffffffff810a70df>] kthread+0xef/0x110
> [ 5375.719666]  [<ffffffff810a6ff0>] ? kthread_create_on_node+0x200/0x200
> [ 5375.719669]  [<ffffffff81637d2f>] ret_from_fork+0x3f/0x70
> [ 5375.719672]  [<ffffffff810a6ff0>] ? kthread_create_on_node+0x200/0x200
> [ 5375.719697] ------------[ cut here ]------------
>
>

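The trace above shows btrfs_scratch_superblocks() reaching filp_open() (and so sb_writers) while device_list_mutex is held, and the title of the fix suggests moving that scratch step outside the mutexes. As a rough illustration of that general pattern only, here is a minimal userspace sketch using pthreads. It is not the actual patch: every name in it (demo_dev, demo_list_mutex, demo_scratch, demo_unlink, demo_destroy_target) is hypothetical, and the "scratch" step is a stand-in for any work that may take unrelated locks.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from the btrfs sources. */
struct demo_dev {
	struct demo_dev *next;
	int id;
	bool scratched;
};

static pthread_mutex_t demo_list_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct demo_dev *demo_list;

/* Bookkeeping for the sanity check below; meaningful only when a single
 * thread drives the demo. */
static bool demo_list_mutex_held;

/* Stand-in for work that may acquire other locks (for example, opening a
 * file). It must not run while the list mutex is held. */
static void demo_scratch(struct demo_dev *dev)
{
	assert(!demo_list_mutex_held);
	dev->scratched = true;
}

/* Unlink dev from the list under the mutex. Returns true if it was found. */
static bool demo_unlink(struct demo_dev *dev)
{
	bool found = false;

	pthread_mutex_lock(&demo_list_mutex);
	demo_list_mutex_held = true;
	for (struct demo_dev **pp = &demo_list; *pp; pp = &(*pp)->next) {
		if (*pp == dev) {
			*pp = dev->next;
			dev->next = NULL;
			found = true;
			break;
		}
	}
	demo_list_mutex_held = false;
	pthread_mutex_unlock(&demo_list_mutex);

	return found;
}

/* Only the list manipulation happens under the mutex; the scratch step
 * runs after the mutex has been released. */
static bool demo_destroy_target(struct demo_dev *dev)
{
	if (!demo_unlink(dev))
		return false;
	demo_scratch(dev);
	return true;
}
```

Calling demo_destroy_target() on a listed device unlinks it first and scratches it afterwards, so the scratch step never nests inside the list mutex and cannot add a new ordering edge behind it.
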
Thread overview: 32+ messages
2016-04-12 14:15 [PATCH v4 00/13] Introduce device state 'failed', spare device and auto replace Anand Jain
2016-04-12 14:15 ` [PATCH 01/13] btrfs: Introduce a new function to check if all chunks a OK for degraded mount Anand Jain
2016-04-12 19:21   ` Yauhen Kharuzhy
2016-04-12 14:15 ` [PATCH 02/13] btrfs: Do per-chunk check for mount time check Anand Jain
2016-04-12 14:15 ` [PATCH 03/13] btrfs: Do per-chunk degraded check for remount Anand Jain
2016-04-12 14:15 ` [PATCH 04/13] btrfs: Allow barrier_all_devices to do per-chunk device check Anand Jain
2016-04-12 14:15 ` [PATCH 05/13] btrfs: Cleanup num_tolerated_disk_barrier_failures Anand Jain
2016-04-12 14:15 ` [PATCH 06/13] btrfs: introduce BTRFS_FEATURE_INCOMPAT_SPARE_DEV Anand Jain
2016-04-12 14:15 ` [PATCH 07/13] btrfs: add check not to mount a spare device Anand Jain
2016-04-12 14:15 ` [PATCH 08/13] btrfs: support btrfs dev scan for " Anand Jain
2016-04-12 14:15 ` [PATCH 09/13] btrfs: provide framework to get and put a " Anand Jain
2016-04-12 14:16 ` [PATCH 10/13] btrfs: introduce helper functions to perform hot replace Anand Jain
2016-04-12 14:40   ` kbuild test robot
2016-04-12 14:16 ` [PATCH 11/13] btrfs: introduce device dynamic state transition to offline or failed Anand Jain
2016-04-14  1:15   ` [PATCH] Btrfs: Set superblock s_bdev field properly at device closing Yauhen Kharuzhy
2016-04-14  6:59     ` Anand Jain
2016-04-14  9:10       ` Yauhen Kharuzhy
2016-04-14  9:48         ` Anand Jain
2016-04-14 10:51   ` [PATCH v5 11/13] btrfs: introduce device dynamic state transition to offline or failed Anand Jain
2016-04-14 16:56     ` Yauhen Kharuzhy
2016-04-18 10:50       ` Anand Jain
2016-04-12 14:16 ` [PATCH 12/13] btrfs: check device for critical errors and mark failed Anand Jain
2016-04-12 14:16 ` [PATCH 13/13] btrfs: check for failed device and hot replace Anand Jain
2016-04-12 20:02 ` [PATCH v4 00/13] Introduce device state 'failed', spare device and auto replace Yauhen Kharuzhy
2016-04-13 22:43   ` Anand Jain
2016-04-13 21:21 ` Yauhen Kharuzhy
2016-04-14  8:45   ` Anand Jain
2016-04-14  9:22     ` Yauhen Kharuzhy
2016-04-14  9:57       ` Anand Jain
2016-04-14 19:12 ` Yauhen Kharuzhy
2016-04-14 23:09 ` Yauhen Kharuzhy
2016-04-18  8:54   ` Anand Jain [this message]
