From: Anand Jain <anand.jain@oracle.com>
To: Yauhen Kharuzhy <yauhen.kharuzhy@zavadatar.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v4 00/13] Introduce device state 'failed', spare device and auto replace
Date: Mon, 18 Apr 2016 16:54:28 +0800
Message-ID: <5714A0C4.2000305@oracle.com>
In-Reply-To: <20160414230932.GA9264@jeknote.loshitsa1.net>



On 04/15/2016 07:09 AM, Yauhen Kharuzhy wrote:
> On Tue, Apr 12, 2016 at 10:15:50PM +0800, Anand Jain wrote:
>> Thanks for various comments, tests and feedback.
>
> Hmm... Yet another lockdep warning, which appeared when I removed the
> target drive during a replace:

  Thanks for the report.

  This was not introduced by this patch set; it is in the original
  set. I have sent out

   btrfs: fix lock dep warning, move scratch dev out of device_list_mutex and uuid_mutex

  to fix this.
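
  To make the cycle in the quoted report easier to follow, here is a
  small user-space model, not kernel code. It treats each "lock B taken
  while holding lock A" as an edge A -> B and looks for a cycle, which
  is roughly what lockdep reports. The edge lists are my reading of the
  report's dependency chain and of the fix's subject line; the names
  find_cycle, EXISTING, BEFORE_FIX and AFTER_FIX are made up for this
  illustration and do not come from the patch.

```python
from collections import defaultdict


def find_cycle(edges):
    """Return one lock cycle as a list of names, or None if there is none.

    Each edge (held, wanted) means: 'wanted' was acquired while 'held'
    was already held.
    """
    graph = defaultdict(list)
    for held, wanted in edges:
        graph[held].append(wanted)

    state = {}  # lock name -> "visiting" or "done"
    path = []   # current depth-first search path

    def visit(node):
        state[node] = "visiting"
        path.append(node)
        for nxt in graph[node]:
            if state.get(nxt) == "visiting":
                # Back edge: the path from nxt to node plus nxt is a cycle.
                return path[path.index(nxt):] + [nxt]
            if nxt not in state:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[node] = "done"
        return None

    for node in list(graph):
        if node not in state:
            found = visit(node)
            if found:
                return found
    return None


# Dependencies lockdep had already recorded (entries #1 to #3 in the report).
EXISTING = [
    ("sb_writers", "i_mutex_key"),            # path_openat
    ("i_mutex_key", "namespace_sem"),         # lock_mount
    ("namespace_sem", "device_list_mutex"),   # btrfs_show_devname
]

# Assumed model of btrfs_destroy_dev_replace_tgtdev before the fix:
# the superblock scratch (which opens the device file) runs under both mutexes.
BEFORE_FIX = [
    ("uuid_mutex", "device_list_mutex"),
    ("uuid_mutex", "sb_writers"),
    ("device_list_mutex", "sb_writers"),
]

# Assumed model after the fix: the scratch runs with neither mutex held,
# so no edge into sb_writers is created from this path.
AFTER_FIX = [
    ("uuid_mutex", "device_list_mutex"),
]

if __name__ == "__main__":
    print(find_cycle(EXISTING + BEFORE_FIX))
    print(find_cycle(EXISTING + AFTER_FIX))
```

  With the BEFORE_FIX edges the model finds the loop through
  sb_writers and device_list_mutex; with the AFTER_FIX edges it finds
  none. This only models lock ordering and says nothing about whether
  the scratch is otherwise safe to run unlocked.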

Thanks, Anand


> [ 5375.718844]
> [ 5375.718845] ======================================================
> [ 5375.718846] [ INFO: possible circular locking dependency detected ]
> [ 5375.718849] 4.4.5-scst31x-debug-11+ #40 Not tainted
> [ 5375.718849] -------------------------------------------------------
> [ 5375.718851] btrfs-health/4662 is trying to acquire lock:
> [ 5375.718861]  (sb_writers){.+.+.+}, at: [<ffffffff812214f7>] __sb_start_write+0xb7/0xf0
> [ 5375.718862]
> [ 5375.718862] but task is already holding lock:
> [ 5375.718907]  (&fs_devs->device_list_mutex){+.+.+.}, at: [<ffffffffa028263c>] btrfs_destroy_dev_replace_tgtdev+0x3c/0x150 [btrfs]
> [ 5375.718907]
> [ 5375.718907] which lock already depends on the new lock.
> [ 5375.718907]
> [ 5375.718908]
> [ 5375.718908] the existing dependency chain (in reverse order) is:
> [ 5375.718911]
> [ 5375.718911] -> #3 (&fs_devs->device_list_mutex){+.+.+.}:
> [ 5375.718917]        [<ffffffff810da4be>] lock_acquire+0xce/0x1e0
> [ 5375.718921]        [<ffffffff81633949>] mutex_lock_nested+0x69/0x3c0
> [ 5375.718940]        [<ffffffffa0219bf6>] btrfs_show_devname+0x36/0x210 [btrfs]
> [ 5375.718945]        [<ffffffff81267079>] show_vfsmnt+0x49/0x150
> [ 5375.718948]        [<ffffffff81240b07>] m_show+0x17/0x20
> [ 5375.718951]        [<ffffffff81246868>] seq_read+0x2d8/0x3b0
> [ 5375.718955]        [<ffffffff8121df28>] __vfs_read+0x28/0xd0
> [ 5375.718959]        [<ffffffff8121e806>] vfs_read+0x86/0x130
> [ 5375.718962]        [<ffffffff8121f4c9>] SyS_read+0x49/0xa0
> [ 5375.718966]        [<ffffffff81637976>] entry_SYSCALL_64_fastpath+0x16/0x7a
> [ 5375.718968]
> [ 5375.718968] -> #2 (namespace_sem){+++++.}:
> [ 5375.718971]        [<ffffffff810da4be>] lock_acquire+0xce/0x1e0
> [ 5375.718974]        [<ffffffff81635199>] down_write+0x49/0x80
> [ 5375.718977]        [<ffffffff81243593>] lock_mount+0x43/0x1c0
> [ 5375.718979]        [<ffffffff81243c13>] do_add_mount+0x23/0xd0
> [ 5375.718982]        [<ffffffff81244afb>] do_mount+0x27b/0xe30
> [ 5375.718985]        [<ffffffff812459dc>] SyS_mount+0x8c/0xd0
> [ 5375.718988]        [<ffffffff81637976>] entry_SYSCALL_64_fastpath+0x16/0x7a
> [ 5375.718991]
> [ 5375.718991] -> #1 (&sb->s_type->i_mutex_key#5){+.+.+.}:
> [ 5375.718994]        [<ffffffff810da4be>] lock_acquire+0xce/0x1e0
> [ 5375.718996]        [<ffffffff81633949>] mutex_lock_nested+0x69/0x3c0
> [ 5375.719001]        [<ffffffff8122d608>] path_openat+0x468/0x1360
> [ 5375.719004]        [<ffffffff8122f86e>] do_filp_open+0x7e/0xe0
> [ 5375.719007]        [<ffffffff8121da7b>] do_sys_open+0x12b/0x210
> [ 5375.719010]        [<ffffffff8121db7e>] SyS_open+0x1e/0x20
> [ 5375.719013]        [<ffffffff81637976>] entry_SYSCALL_64_fastpath+0x16/0x7a
> [ 5375.719015]
> [ 5375.719015] -> #0 (sb_writers){.+.+.+}:
> [ 5375.719018]        [<ffffffff810d97ca>] __lock_acquire+0x17ba/0x1ae0
> [ 5375.719021]        [<ffffffff810da4be>] lock_acquire+0xce/0x1e0
> [ 5375.719026]        [<ffffffff810d3bef>] percpu_down_read+0x4f/0xa0
> [ 5375.719028]        [<ffffffff812214f7>] __sb_start_write+0xb7/0xf0
> [ 5375.719031]        [<ffffffff81242eb4>] mnt_want_write+0x24/0x50
> [ 5375.719035]        [<ffffffff8122ded2>] path_openat+0xd32/0x1360
> [ 5375.719037]        [<ffffffff8122f86e>] do_filp_open+0x7e/0xe0
> [ 5375.719040]        [<ffffffff8121d8a4>] file_open_name+0xe4/0x130
> [ 5375.719043]        [<ffffffff8121d923>] filp_open+0x33/0x60
> [ 5375.719073]        [<ffffffffa02776a6>] update_dev_time+0x16/0x40 [btrfs]
> [ 5375.719099]        [<ffffffffa02825be>] btrfs_scratch_superblocks+0x4e/0x90 [btrfs]
> [ 5375.719123]        [<ffffffffa0282665>] btrfs_destroy_dev_replace_tgtdev+0x65/0x150 [btrfs]
> [ 5375.719150]        [<ffffffffa02c6c80>] btrfs_dev_replace_finishing+0x6b0/0x990 [btrfs]
> [ 5375.719175]        [<ffffffffa02c729e>] btrfs_dev_replace_start+0x33e/0x540 [btrfs]
> [ 5375.719199]        [<ffffffffa02c7f58>] btrfs_auto_replace_start+0xf8/0x140 [btrfs]
> [ 5375.719222]        [<ffffffffa02464e6>] health_kthread+0x246/0x490 [btrfs]
> [ 5375.719225]        [<ffffffff810a70df>] kthread+0xef/0x110
> [ 5375.719229]        [<ffffffff81637d2f>] ret_from_fork+0x3f/0x70
> [ 5375.719230]
> [ 5375.719230] other info that might help us debug this:
> [ 5375.719230]
> [ 5375.719233] Chain exists of:
> [ 5375.719233]   sb_writers --> namespace_sem --> &fs_devs->device_list_mutex
> [ 5375.719233]
> [ 5375.719234]  Possible unsafe locking scenario:
> [ 5375.719234]
> [ 5375.719234]        CPU0                    CPU1
> [ 5375.719235]        ----                    ----
> [ 5375.719236]   lock(&fs_devs->device_list_mutex);
> [ 5375.719238]                                lock(namespace_sem);
> [ 5375.719239]                                lock(&fs_devs->device_list_mutex);
> [ 5375.719241]   lock(sb_writers);
> [ 5375.719241]
> [ 5375.719241]  *** DEADLOCK ***
> [ 5375.719241]
> [ 5375.719243] 4 locks held by btrfs-health/4662:
> [ 5375.719266]  #0:  (&fs_info->health_mutex){+.+.+.}, at: [<ffffffffa0246303>] health_kthread+0x63/0x490 [btrfs]
> [ 5375.719293]  #1:  (&fs_info->dev_replace.lock_finishing_cancel_unmount){+.+.+.}, at: [<ffffffffa02c6611>] btrfs_dev_replace_finishing+0x41/0x990 [btrfs]
> [ 5375.719319]  #2:  (uuid_mutex){+.+.+.}, at: [<ffffffffa0282620>] btrfs_destroy_dev_replace_tgtdev+0x20/0x150 [btrfs]
> [ 5375.719343]  #3:  (&fs_devs->device_list_mutex){+.+.+.}, at: [<ffffffffa028263c>] btrfs_destroy_dev_replace_tgtdev+0x3c/0x150 [btrfs]
> [ 5375.719343]
> [ 5375.719343] stack backtrace:
> [ 5375.719347] CPU: 2 PID: 4662 Comm: btrfs-health Not tainted 4.4.5-scst31x-debug-11+ #40
> [ 5375.719348] Hardware name: Supermicro SYS-6018R-WTRT/X10DRW-iT, BIOS 1.0c 01/07/2015
> [ 5375.719352]  0000000000000000 ffff880856f73880 ffffffff813529e3 ffffffff826182a0
> [ 5375.719354]  ffffffff8260c090 ffff880856f738c0 ffffffff810d667c ffff880856f73930
> [ 5375.719357]  ffff880861f32b40 ffff880861f32b68 0000000000000003 0000000000000004
> [ 5375.719357] Call Trace:
> [ 5375.719363]  [<ffffffff813529e3>] dump_stack+0x85/0xc2
> [ 5375.719366]  [<ffffffff810d667c>] print_circular_bug+0x1ec/0x260
> [ 5375.719369]  [<ffffffff810d97ca>] __lock_acquire+0x17ba/0x1ae0
> [ 5375.719373]  [<ffffffff810f606d>] ? debug_lockdep_rcu_enabled+0x1d/0x20
> [ 5375.719376]  [<ffffffff810da4be>] lock_acquire+0xce/0x1e0
> [ 5375.719378]  [<ffffffff812214f7>] ? __sb_start_write+0xb7/0xf0
> [ 5375.719383]  [<ffffffff810d3bef>] percpu_down_read+0x4f/0xa0
> [ 5375.719385]  [<ffffffff812214f7>] ? __sb_start_write+0xb7/0xf0
> [ 5375.719387]  [<ffffffff812214f7>] __sb_start_write+0xb7/0xf0
> [ 5375.719389]  [<ffffffff81242eb4>] mnt_want_write+0x24/0x50
> [ 5375.719393]  [<ffffffff8122ded2>] path_openat+0xd32/0x1360
> [ 5375.719415]  [<ffffffffa02462a0>] ? btrfs_congested_fn+0x180/0x180 [btrfs]
> [ 5375.719418]  [<ffffffff810f606d>] ? debug_lockdep_rcu_enabled+0x1d/0x20
> [ 5375.719420]  [<ffffffff8122f86e>] do_filp_open+0x7e/0xe0
> [ 5375.719423]  [<ffffffff810f615d>] ? rcu_read_lock_sched_held+0x6d/0x80
> [ 5375.719426]  [<ffffffff81201a9b>] ? kmem_cache_alloc+0x26b/0x5d0
> [ 5375.719430]  [<ffffffff8122e7d4>] ? getname_kernel+0x34/0x120
> [ 5375.719433]  [<ffffffff8121d8a4>] file_open_name+0xe4/0x130
> [ 5375.719436]  [<ffffffff8121d923>] filp_open+0x33/0x60
> [ 5375.719462]  [<ffffffffa02776a6>] update_dev_time+0x16/0x40 [btrfs]
> [ 5375.719485]  [<ffffffffa02825be>] btrfs_scratch_superblocks+0x4e/0x90 [btrfs]
> [ 5375.719506]  [<ffffffffa0282665>] btrfs_destroy_dev_replace_tgtdev+0x65/0x150 [btrfs]
> [ 5375.719530]  [<ffffffffa02c6c80>] btrfs_dev_replace_finishing+0x6b0/0x990 [btrfs]
> [ 5375.719554]  [<ffffffffa02c6b23>] ? btrfs_dev_replace_finishing+0x553/0x990 [btrfs]
> [ 5375.719576]  [<ffffffffa02c729e>] btrfs_dev_replace_start+0x33e/0x540 [btrfs]
> [ 5375.719598]  [<ffffffffa02c7f58>] btrfs_auto_replace_start+0xf8/0x140 [btrfs]
> [ 5375.719621]  [<ffffffffa02464e6>] health_kthread+0x246/0x490 [btrfs]
> [ 5375.719641]  [<ffffffffa02463d8>] ? health_kthread+0x138/0x490 [btrfs]
> [ 5375.719661]  [<ffffffffa02462a0>] ? btrfs_congested_fn+0x180/0x180 [btrfs]
> [ 5375.719663]  [<ffffffff810a70df>] kthread+0xef/0x110
> [ 5375.719666]  [<ffffffff810a6ff0>] ? kthread_create_on_node+0x200/0x200
> [ 5375.719669]  [<ffffffff81637d2f>] ret_from_fork+0x3f/0x70
> [ 5375.719672]  [<ffffffff810a6ff0>] ? kthread_create_on_node+0x200/0x200
> [ 5375.719697] ------------[ cut here ]------------
>
>
