linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Qu Wenruo <quwenruo@cn.fujitsu.com>
To: Liu Bo <bo.li.liu@oracle.com>, <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH 2/2] Btrfs: fix lockdep deadlock warning due to dev_replace
Date: Thu, 4 Feb 2016 16:44:41 +0800	[thread overview]
Message-ID: <56B30F79.4060907@cn.fujitsu.com> (raw)
In-Reply-To: <1437122959-23376-2-git-send-email-bo.li.liu@oracle.com>

Hi

Will this patch get merged in next merge windows?
(Although it should be merged in 4.5 merge window).

I also encountered the same warning, but with fsstress + some sync calls.

Thanks,
Qu

Liu Bo wrote on 2015/07/17 16:49 +0800:
> Xfstests btrfs/011 complains about a deadlock warning,
>
> [ 1226.649039] =========================================================
> [ 1226.649039] [ INFO: possible irq lock inversion dependency detected ]
> [ 1226.649039] 4.1.0+ #270 Not tainted
> [ 1226.649039] ---------------------------------------------------------
> [ 1226.652955] kswapd0/46 just changed the state of lock:
> [ 1226.652955]  (&delayed_node->mutex){+.+.-.}, at: [<ffffffff81458735>] __btrfs_release_delayed_node+0x45/0x1d0
> [ 1226.652955] but this lock took another, RECLAIM_FS-unsafe lock in the past:
> [ 1226.652955]  (&fs_info->dev_replace.lock){+.+.+.}
>
> and interrupts could create inverse lock ordering between them.
>
> [ 1226.652955]
> other info that might help us debug this:
> [ 1226.652955] Chain exists of:
>    &delayed_node->mutex --> &found->groups_sem --> &fs_info->dev_replace.lock
>
> [ 1226.652955]  Possible interrupt unsafe locking scenario:
>
> [ 1226.652955]        CPU0                    CPU1
> [ 1226.652955]        ----                    ----
> [ 1226.652955]   lock(&fs_info->dev_replace.lock);
> [ 1226.652955]                                local_irq_disable();
> [ 1226.652955]                                lock(&delayed_node->mutex);
> [ 1226.652955]                                lock(&found->groups_sem);
> [ 1226.652955]   <Interrupt>
> [ 1226.652955]     lock(&delayed_node->mutex);
> [ 1226.652955]
>   *** DEADLOCK ***
>
> Commit 084b6e7c7607 ("btrfs: Fix a lockdep warning when running xfstest.") tried
> to fix a similar one that has the exactly same warning, but with that, we still
> run to this.
>
> The above lock chain comes from
> btrfs_commit_transaction
>    ->btrfs_run_delayed_items
>      ...
>      ->__btrfs_update_delayed_inode
>        ...
>        ->__btrfs_cow_block
>           ...
>           ->find_free_extent
>              ->cache_block_group
>                ->load_free_space_cache
>                  ->btrfs_readpages
>                    ->submit_one_bio
>                      ...
>                      ->__btrfs_map_block
>                        ->btrfs_dev_replace_lock
>
> However, with high memory pressure, tasks which hold dev_replace.lock can
> be interrupted by kswapd and then kswapd is intended to release memory occupied
> by superblock, inodes and dentries, where we may call evict_inode, and it comes
> to
>
> [ 1226.652955]  [<ffffffff81458735>] __btrfs_release_delayed_node+0x45/0x1d0
> [ 1226.652955]  [<ffffffff81459e74>] btrfs_remove_delayed_node+0x24/0x30
> [ 1226.652955]  [<ffffffff8140c5fe>] btrfs_evict_inode+0x34e/0x700
>
> delayed_node->mutex may be acquired in __btrfs_release_delayed_node(), and it leads
> to a ABBA deadlock.
>
> To fix this, we can use "blocking rwlock" used in the case of extent_buffer, but
> things are simpler here since we only needs read's spinlock to blocking lock.
>
> With this, btrfs/011 no more produces warnings in dmesg.
>
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> ---
>   fs/btrfs/ctree.h       |   6 ++-
>   fs/btrfs/dev-replace.c | 130 ++++++++++++++++++++++++++-----------------------
>   fs/btrfs/dev-replace.h |   7 ++-
>   fs/btrfs/disk-io.c     |   6 ++-
>   fs/btrfs/reada.c       |  10 ++--
>   fs/btrfs/scrub.c       |   6 +--
>   fs/btrfs/volumes.c     |  24 +++++----
>   7 files changed, 105 insertions(+), 84 deletions(-)
>
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index aac314e..c903652 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -965,8 +965,10 @@ struct btrfs_dev_replace {
>   	pid_t lock_owner;
>   	atomic_t nesting_level;
>   	struct mutex lock_finishing_cancel_unmount;
> -	struct mutex lock_management_lock;
> -	struct mutex lock;
> +	rwlock_t lock;
> +	atomic_t read_locks;
> +	atomic_t blocking_readers;
> +	wait_queue_head_t read_lock_wq;
>
>   	struct btrfs_scrub_progress scrub_progress;
>   };
> diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
> index 9eb1401..eb9e016 100644
> --- a/fs/btrfs/dev-replace.c
> +++ b/fs/btrfs/dev-replace.c
> @@ -203,13 +203,13 @@ int btrfs_run_dev_replace(struct btrfs_trans_handle *trans,
>   	struct btrfs_dev_replace_item *ptr;
>   	struct btrfs_dev_replace *dev_replace = &fs_info->dev_replace;
>
> -	btrfs_dev_replace_lock(dev_replace);
> +	btrfs_dev_replace_lock(dev_replace, 0);
>   	if (!dev_replace->is_valid ||
>   	    !dev_replace->item_needs_writeback) {
> -		btrfs_dev_replace_unlock(dev_replace);
> +		btrfs_dev_replace_unlock(dev_replace, 0);
>   		return 0;
>   	}
> -	btrfs_dev_replace_unlock(dev_replace);
> +	btrfs_dev_replace_unlock(dev_replace, 0);
>
>   	key.objectid = 0;
>   	key.type = BTRFS_DEV_REPLACE_KEY;
> @@ -265,7 +265,7 @@ int btrfs_run_dev_replace(struct btrfs_trans_handle *trans,
>   	ptr = btrfs_item_ptr(eb, path->slots[0],
>   			     struct btrfs_dev_replace_item);
>
> -	btrfs_dev_replace_lock(dev_replace);
> +	btrfs_dev_replace_lock(dev_replace, 1);
>   	if (dev_replace->srcdev)
>   		btrfs_set_dev_replace_src_devid(eb, ptr,
>   			dev_replace->srcdev->devid);
> @@ -288,7 +288,7 @@ int btrfs_run_dev_replace(struct btrfs_trans_handle *trans,
>   	btrfs_set_dev_replace_cursor_right(eb, ptr,
>   		dev_replace->cursor_right);
>   	dev_replace->item_needs_writeback = 0;
> -	btrfs_dev_replace_unlock(dev_replace);
> +	btrfs_dev_replace_unlock(dev_replace, 1);
>
>   	btrfs_mark_buffer_dirty(eb);
>
> @@ -357,7 +357,7 @@ int btrfs_dev_replace_start(struct btrfs_root *root,
>   	if (ret)
>   		return ret;
>
> -	btrfs_dev_replace_lock(dev_replace);
> +	btrfs_dev_replace_lock(dev_replace, 1);
>   	switch (dev_replace->replace_state) {
>   	case BTRFS_IOCTL_DEV_REPLACE_STATE_NEVER_STARTED:
>   	case BTRFS_IOCTL_DEV_REPLACE_STATE_FINISHED:
> @@ -396,7 +396,7 @@ int btrfs_dev_replace_start(struct btrfs_root *root,
>   	dev_replace->is_valid = 1;
>   	dev_replace->item_needs_writeback = 1;
>   	args->result = BTRFS_IOCTL_DEV_REPLACE_RESULT_NO_ERROR;
> -	btrfs_dev_replace_unlock(dev_replace);
> +	btrfs_dev_replace_unlock(dev_replace, 1);
>
>   	ret = btrfs_kobj_add_device(tgt_device->fs_devices, tgt_device);
>   	if (ret)
> @@ -408,7 +408,7 @@ int btrfs_dev_replace_start(struct btrfs_root *root,
>   	trans = btrfs_start_transaction(root, 0);
>   	if (IS_ERR(trans)) {
>   		ret = PTR_ERR(trans);
> -		btrfs_dev_replace_lock(dev_replace);
> +		btrfs_dev_replace_lock(dev_replace, 1);
>   		goto leave;
>   	}
>
> @@ -434,7 +434,7 @@ int btrfs_dev_replace_start(struct btrfs_root *root,
>   leave:
>   	dev_replace->srcdev = NULL;
>   	dev_replace->tgtdev = NULL;
> -	btrfs_dev_replace_unlock(dev_replace);
> +	btrfs_dev_replace_unlock(dev_replace, 1);
>   	btrfs_destroy_dev_replace_tgtdev(fs_info, tgt_device);
>   	return ret;
>   }
> @@ -473,18 +473,18 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
>   	/* don't allow cancel or unmount to disturb the finishing procedure */
>   	mutex_lock(&dev_replace->lock_finishing_cancel_unmount);
>
> -	btrfs_dev_replace_lock(dev_replace);
> +	btrfs_dev_replace_lock(dev_replace, 0);
>   	/* was the operation canceled, or is it finished? */
>   	if (dev_replace->replace_state !=
>   	    BTRFS_IOCTL_DEV_REPLACE_STATE_STARTED) {
> -		btrfs_dev_replace_unlock(dev_replace);
> +		btrfs_dev_replace_unlock(dev_replace, 0);
>   		mutex_unlock(&dev_replace->lock_finishing_cancel_unmount);
>   		return 0;
>   	}
>
>   	tgt_device = dev_replace->tgtdev;
>   	src_device = dev_replace->srcdev;
> -	btrfs_dev_replace_unlock(dev_replace);
> +	btrfs_dev_replace_unlock(dev_replace, 0);
>
>   	/*
>   	 * flush all outstanding I/O and inode extent mappings before the
> @@ -509,7 +509,7 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
>   	/* keep away write_all_supers() during the finishing procedure */
>   	mutex_lock(&root->fs_info->fs_devices->device_list_mutex);
>   	mutex_lock(&root->fs_info->chunk_mutex);
> -	btrfs_dev_replace_lock(dev_replace);
> +	btrfs_dev_replace_lock(dev_replace, 1);
>   	dev_replace->replace_state =
>   		scrub_ret ? BTRFS_IOCTL_DEV_REPLACE_STATE_CANCELED
>   			  : BTRFS_IOCTL_DEV_REPLACE_STATE_FINISHED;
> @@ -530,7 +530,7 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
>   			        rcu_str_deref(src_device->name),
>   			      src_device->devid,
>   			      rcu_str_deref(tgt_device->name), scrub_ret);
> -		btrfs_dev_replace_unlock(dev_replace);
> +		btrfs_dev_replace_unlock(dev_replace, 1);
>   		mutex_unlock(&root->fs_info->chunk_mutex);
>   		mutex_unlock(&root->fs_info->fs_devices->device_list_mutex);
>   		mutex_unlock(&uuid_mutex);
> @@ -567,7 +567,7 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
>   	list_add(&tgt_device->dev_alloc_list, &fs_info->fs_devices->alloc_list);
>   	fs_info->fs_devices->rw_devices++;
>
> -	btrfs_dev_replace_unlock(dev_replace);
> +	btrfs_dev_replace_unlock(dev_replace, 1);
>
>   	btrfs_rm_dev_replace_blocked(fs_info);
>
> @@ -651,7 +651,7 @@ void btrfs_dev_replace_status(struct btrfs_fs_info *fs_info,
>   	struct btrfs_dev_replace *dev_replace = &fs_info->dev_replace;
>   	struct btrfs_device *srcdev;
>
> -	btrfs_dev_replace_lock(dev_replace);
> +	btrfs_dev_replace_lock(dev_replace, 0);
>   	/* even if !dev_replace_is_valid, the values are good enough for
>   	 * the replace_status ioctl */
>   	args->result = BTRFS_IOCTL_DEV_REPLACE_RESULT_NO_ERROR;
> @@ -677,7 +677,7 @@ void btrfs_dev_replace_status(struct btrfs_fs_info *fs_info,
>   			div_u64(btrfs_device_get_total_bytes(srcdev), 1000));
>   		break;
>   	}
> -	btrfs_dev_replace_unlock(dev_replace);
> +	btrfs_dev_replace_unlock(dev_replace, 0);
>   }
>
>   int btrfs_dev_replace_cancel(struct btrfs_fs_info *fs_info,
> @@ -700,13 +700,13 @@ static u64 __btrfs_dev_replace_cancel(struct btrfs_fs_info *fs_info)
>   		return -EROFS;
>
>   	mutex_lock(&dev_replace->lock_finishing_cancel_unmount);
> -	btrfs_dev_replace_lock(dev_replace);
> +	btrfs_dev_replace_lock(dev_replace, 1);
>   	switch (dev_replace->replace_state) {
>   	case BTRFS_IOCTL_DEV_REPLACE_STATE_NEVER_STARTED:
>   	case BTRFS_IOCTL_DEV_REPLACE_STATE_FINISHED:
>   	case BTRFS_IOCTL_DEV_REPLACE_STATE_CANCELED:
>   		result = BTRFS_IOCTL_DEV_REPLACE_RESULT_NOT_STARTED;
> -		btrfs_dev_replace_unlock(dev_replace);
> +		btrfs_dev_replace_unlock(dev_replace, 1);
>   		goto leave;
>   	case BTRFS_IOCTL_DEV_REPLACE_STATE_STARTED:
>   	case BTRFS_IOCTL_DEV_REPLACE_STATE_SUSPENDED:
> @@ -719,7 +719,7 @@ static u64 __btrfs_dev_replace_cancel(struct btrfs_fs_info *fs_info)
>   	dev_replace->replace_state = BTRFS_IOCTL_DEV_REPLACE_STATE_CANCELED;
>   	dev_replace->time_stopped = get_seconds();
>   	dev_replace->item_needs_writeback = 1;
> -	btrfs_dev_replace_unlock(dev_replace);
> +	btrfs_dev_replace_unlock(dev_replace, 1);
>   	btrfs_scrub_cancel(fs_info);
>
>   	trans = btrfs_start_transaction(root, 0);
> @@ -742,7 +742,7 @@ void btrfs_dev_replace_suspend_for_unmount(struct btrfs_fs_info *fs_info)
>   	struct btrfs_dev_replace *dev_replace = &fs_info->dev_replace;
>
>   	mutex_lock(&dev_replace->lock_finishing_cancel_unmount);
> -	btrfs_dev_replace_lock(dev_replace);
> +	btrfs_dev_replace_lock(dev_replace, 1);
>   	switch (dev_replace->replace_state) {
>   	case BTRFS_IOCTL_DEV_REPLACE_STATE_NEVER_STARTED:
>   	case BTRFS_IOCTL_DEV_REPLACE_STATE_FINISHED:
> @@ -758,7 +758,7 @@ void btrfs_dev_replace_suspend_for_unmount(struct btrfs_fs_info *fs_info)
>   		break;
>   	}
>
> -	btrfs_dev_replace_unlock(dev_replace);
> +	btrfs_dev_replace_unlock(dev_replace, 1);
>   	mutex_unlock(&dev_replace->lock_finishing_cancel_unmount);
>   }
>
> @@ -768,12 +768,12 @@ int btrfs_resume_dev_replace_async(struct btrfs_fs_info *fs_info)
>   	struct task_struct *task;
>   	struct btrfs_dev_replace *dev_replace = &fs_info->dev_replace;
>
> -	btrfs_dev_replace_lock(dev_replace);
> +	btrfs_dev_replace_lock(dev_replace, 1);
>   	switch (dev_replace->replace_state) {
>   	case BTRFS_IOCTL_DEV_REPLACE_STATE_NEVER_STARTED:
>   	case BTRFS_IOCTL_DEV_REPLACE_STATE_FINISHED:
>   	case BTRFS_IOCTL_DEV_REPLACE_STATE_CANCELED:
> -		btrfs_dev_replace_unlock(dev_replace);
> +		btrfs_dev_replace_unlock(dev_replace, 1);
>   		return 0;
>   	case BTRFS_IOCTL_DEV_REPLACE_STATE_STARTED:
>   		break;
> @@ -786,10 +786,10 @@ int btrfs_resume_dev_replace_async(struct btrfs_fs_info *fs_info)
>   		btrfs_info(fs_info, "cannot continue dev_replace, tgtdev is missing");
>   		btrfs_info(fs_info,
>   			"you may cancel the operation after 'mount -o degraded'");
> -		btrfs_dev_replace_unlock(dev_replace);
> +		btrfs_dev_replace_unlock(dev_replace, 1);
>   		return 0;
>   	}
> -	btrfs_dev_replace_unlock(dev_replace);
> +	btrfs_dev_replace_unlock(dev_replace, 1);
>
>   	WARN_ON(atomic_xchg(
>   		&fs_info->mutually_exclusive_operation_running, 1));
> @@ -867,48 +867,58 @@ int btrfs_dev_replace_is_ongoing(struct btrfs_dev_replace *dev_replace)
>   	return 1;
>   }
>
> -void btrfs_dev_replace_lock(struct btrfs_dev_replace *dev_replace)
> +void btrfs_dev_replace_lock(struct btrfs_dev_replace *dev_replace, int rw)
>   {
> -	/* the beginning is just an optimization for the typical case */
> -	if (atomic_read(&dev_replace->nesting_level) == 0) {
> -acquire_lock:
> -		/* this is not a nested case where the same thread
> -		 * is trying to acqurire the same lock twice */
> -		mutex_lock(&dev_replace->lock);
> -		mutex_lock(&dev_replace->lock_management_lock);
> -		dev_replace->lock_owner = current->pid;
> -		atomic_inc(&dev_replace->nesting_level);
> -		mutex_unlock(&dev_replace->lock_management_lock);
> -		return;
> +	if (rw == 1) {
> +		/* write */
> +again:
> +		wait_event(dev_replace->read_lock_wq,
> +			   atomic_read(&dev_replace->blocking_readers) == 0);
> +		write_lock(&dev_replace->lock);
> +		if (atomic_read(&dev_replace->blocking_readers)) {
> +			write_unlock(&dev_replace->lock);
> +			goto again;
> +		}
> +	} else {
> +		read_lock(&dev_replace->lock);
> +		atomic_inc(&dev_replace->read_locks);
>   	}
> +}
>
> -	mutex_lock(&dev_replace->lock_management_lock);
> -	if (atomic_read(&dev_replace->nesting_level) > 0 &&
> -	    dev_replace->lock_owner == current->pid) {
> -		WARN_ON(!mutex_is_locked(&dev_replace->lock));
> -		atomic_inc(&dev_replace->nesting_level);
> -		mutex_unlock(&dev_replace->lock_management_lock);
> -		return;
> +void btrfs_dev_replace_unlock(struct btrfs_dev_replace *dev_replace, int rw)
> +{
> +	if (rw == 1) {
> +		/* write */
> +		ASSERT(atomic_read(&dev_replace->blocking_readers) == 0);
> +		write_unlock(&dev_replace->lock);
> +	} else {
> +		ASSERT(atomic_read(&dev_replace->read_locks) > 0);
> +		atomic_dec(&dev_replace->read_locks);
> +		read_unlock(&dev_replace->lock);
>   	}
> +}
>
> -	mutex_unlock(&dev_replace->lock_management_lock);
> -	goto acquire_lock;
> +/* inc blocking cnt and release read lock */
> +void btrfs_dev_replace_set_lock_blocking(
> +					struct btrfs_dev_replace *dev_replace)
> +{
> +	/* only set blocking for read lock */
> +	ASSERT(atomic_read(&dev_replace->read_locks) > 0);
> +	atomic_inc(&dev_replace->blocking_readers);
> +	read_unlock(&dev_replace->lock);
>   }
>
> -void btrfs_dev_replace_unlock(struct btrfs_dev_replace *dev_replace)
> +/* acquire read lock and dec blocking cnt */
> +void btrfs_dev_replace_clear_lock_blocking(
> +					struct btrfs_dev_replace *dev_replace)
>   {
> -	WARN_ON(!mutex_is_locked(&dev_replace->lock));
> -	mutex_lock(&dev_replace->lock_management_lock);
> -	WARN_ON(atomic_read(&dev_replace->nesting_level) < 1);
> -	WARN_ON(dev_replace->lock_owner != current->pid);
> -	atomic_dec(&dev_replace->nesting_level);
> -	if (atomic_read(&dev_replace->nesting_level) == 0) {
> -		dev_replace->lock_owner = 0;
> -		mutex_unlock(&dev_replace->lock_management_lock);
> -		mutex_unlock(&dev_replace->lock);
> -	} else {
> -		mutex_unlock(&dev_replace->lock_management_lock);
> -	}
> +	/* only set blocking for read lock */
> +	ASSERT(atomic_read(&dev_replace->read_locks) > 0);
> +	ASSERT(atomic_read(&dev_replace->blocking_readers) > 0);
> +	read_lock(&dev_replace->lock);
> +	if (atomic_dec_and_test(&dev_replace->blocking_readers) &&
> +	    waitqueue_active(&dev_replace->read_lock_wq))
> +		wake_up(&dev_replace->read_lock_wq);
>   }
>
>   void btrfs_bio_counter_inc_noblocked(struct btrfs_fs_info *fs_info)
> diff --git a/fs/btrfs/dev-replace.h b/fs/btrfs/dev-replace.h
> index 20035cb..29e3ef5 100644
> --- a/fs/btrfs/dev-replace.h
> +++ b/fs/btrfs/dev-replace.h
> @@ -34,8 +34,11 @@ int btrfs_dev_replace_cancel(struct btrfs_fs_info *fs_info,
>   void btrfs_dev_replace_suspend_for_unmount(struct btrfs_fs_info *fs_info);
>   int btrfs_resume_dev_replace_async(struct btrfs_fs_info *fs_info);
>   int btrfs_dev_replace_is_ongoing(struct btrfs_dev_replace *dev_replace);
> -void btrfs_dev_replace_lock(struct btrfs_dev_replace *dev_replace);
> -void btrfs_dev_replace_unlock(struct btrfs_dev_replace *dev_replace);
> +void btrfs_dev_replace_lock(struct btrfs_dev_replace *dev_replace, int rw);
> +void btrfs_dev_replace_unlock(struct btrfs_dev_replace *dev_replace, int rw);
> +void btrfs_dev_replace_set_lock_blocking(struct btrfs_dev_replace *dev_replace);
> +void btrfs_dev_replace_clear_lock_blocking(
> +					struct btrfs_dev_replace *dev_replace);
>
>   static inline void btrfs_dev_replace_stats_inc(atomic64_t *stat_value)
>   {
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index a9aadb2..c567a44 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -2233,9 +2233,11 @@ static void btrfs_init_dev_replace_locks(struct btrfs_fs_info *fs_info)
>   	fs_info->dev_replace.lock_owner = 0;
>   	atomic_set(&fs_info->dev_replace.nesting_level, 0);
>   	mutex_init(&fs_info->dev_replace.lock_finishing_cancel_unmount);
> -	mutex_init(&fs_info->dev_replace.lock_management_lock);
> -	mutex_init(&fs_info->dev_replace.lock);
> +	rwlock_init(&fs_info->dev_replace.lock);
> +	atomic_set(&fs_info->dev_replace.read_locks, 0);
> +	atomic_set(&fs_info->dev_replace.blocking_readers, 0);
>   	init_waitqueue_head(&fs_info->replace_wait);
> +	init_waitqueue_head(&fs_info->dev_replace.read_lock_wq);
>   }
>
>   static void btrfs_init_qgroup(struct btrfs_fs_info *fs_info)
> diff --git a/fs/btrfs/reada.c b/fs/btrfs/reada.c
> index 0e7beea..2f55591 100644
> --- a/fs/btrfs/reada.c
> +++ b/fs/btrfs/reada.c
> @@ -394,7 +394,7 @@ static struct reada_extent *reada_find_extent(struct btrfs_root *root,
>   	}
>
>   	/* insert extent in reada_tree + all per-device trees, all or nothing */
> -	btrfs_dev_replace_lock(&fs_info->dev_replace);
> +	btrfs_dev_replace_lock(&fs_info->dev_replace, 0);
>   	spin_lock(&fs_info->reada_lock);
>   	ret = radix_tree_insert(&fs_info->reada_tree, index, re);
>   	if (ret == -EEXIST) {
> @@ -402,12 +402,12 @@ static struct reada_extent *reada_find_extent(struct btrfs_root *root,
>   		BUG_ON(!re_exist);
>   		re_exist->refcnt++;
>   		spin_unlock(&fs_info->reada_lock);
> -		btrfs_dev_replace_unlock(&fs_info->dev_replace);
> +		btrfs_dev_replace_unlock(&fs_info->dev_replace, 0);
>   		goto error;
>   	}
>   	if (ret) {
>   		spin_unlock(&fs_info->reada_lock);
> -		btrfs_dev_replace_unlock(&fs_info->dev_replace);
> +		btrfs_dev_replace_unlock(&fs_info->dev_replace, 0);
>   		goto error;
>   	}
>   	prev_dev = NULL;
> @@ -454,12 +454,12 @@ static struct reada_extent *reada_find_extent(struct btrfs_root *root,
>   			BUG_ON(fs_info == NULL);
>   			radix_tree_delete(&fs_info->reada_tree, index);
>   			spin_unlock(&fs_info->reada_lock);
> -			btrfs_dev_replace_unlock(&fs_info->dev_replace);
> +			btrfs_dev_replace_unlock(&fs_info->dev_replace, 0);
>   			goto error;
>   		}
>   	}
>   	spin_unlock(&fs_info->reada_lock);
> -	btrfs_dev_replace_unlock(&fs_info->dev_replace);
> +	btrfs_dev_replace_unlock(&fs_info->dev_replace, 0);
>
>   	btrfs_put_bbio(bbio);
>   	return re;
> diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
> index 94db0fa..37d7c93 100644
> --- a/fs/btrfs/scrub.c
> +++ b/fs/btrfs/scrub.c
> @@ -3701,16 +3701,16 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start,
>   		return -EIO;
>   	}
>
> -	btrfs_dev_replace_lock(&fs_info->dev_replace);
> +	btrfs_dev_replace_lock(&fs_info->dev_replace, 0);
>   	if (dev->scrub_device ||
>   	    (!is_dev_replace &&
>   	     btrfs_dev_replace_is_ongoing(&fs_info->dev_replace))) {
> -		btrfs_dev_replace_unlock(&fs_info->dev_replace);
> +		btrfs_dev_replace_unlock(&fs_info->dev_replace, 0);
>   		mutex_unlock(&fs_info->scrub_lock);
>   		mutex_unlock(&fs_info->fs_devices->device_list_mutex);
>   		return -EINPROGRESS;
>   	}
> -	btrfs_dev_replace_unlock(&fs_info->dev_replace);
> +	btrfs_dev_replace_unlock(&fs_info->dev_replace, 0);
>
>   	ret = scrub_workers_get(fs_info, is_dev_replace);
>   	if (ret) {
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index fbe7c10..23f63fa 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -1650,12 +1650,12 @@ int btrfs_rm_device(struct btrfs_root *root, char *device_path)
>   	} while (read_seqretry(&root->fs_info->profiles_lock, seq));
>
>   	num_devices = root->fs_info->fs_devices->num_devices;
> -	btrfs_dev_replace_lock(&root->fs_info->dev_replace);
> +	btrfs_dev_replace_lock(&root->fs_info->dev_replace, 0);
>   	if (btrfs_dev_replace_is_ongoing(&root->fs_info->dev_replace)) {
>   		WARN_ON(num_devices < 1);
>   		num_devices--;
>   	}
> -	btrfs_dev_replace_unlock(&root->fs_info->dev_replace);
> +	btrfs_dev_replace_unlock(&root->fs_info->dev_replace, 0);
>
>   	if ((all_avail & BTRFS_BLOCK_GROUP_RAID10) && num_devices <= 4) {
>   		ret = BTRFS_ERROR_DEV_RAID10_MIN_NOT_MET;
> @@ -3495,12 +3495,12 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
>   	}
>
>   	num_devices = fs_info->fs_devices->num_devices;
> -	btrfs_dev_replace_lock(&fs_info->dev_replace);
> +	btrfs_dev_replace_lock(&fs_info->dev_replace, 0);
>   	if (btrfs_dev_replace_is_ongoing(&fs_info->dev_replace)) {
>   		BUG_ON(num_devices < 1);
>   		num_devices--;
>   	}
> -	btrfs_dev_replace_unlock(&fs_info->dev_replace);
> +	btrfs_dev_replace_unlock(&fs_info->dev_replace, 0);
>   	allowed = BTRFS_AVAIL_ALLOC_BIT_SINGLE;
>   	if (num_devices == 1)
>   		allowed |= BTRFS_BLOCK_GROUP_DUP;
> @@ -4938,10 +4938,10 @@ int btrfs_num_copies(struct btrfs_fs_info *fs_info, u64 logical, u64 len)
>   		ret = 1;
>   	free_extent_map(em);
>
> -	btrfs_dev_replace_lock(&fs_info->dev_replace);
> +	btrfs_dev_replace_lock(&fs_info->dev_replace, 0);
>   	if (btrfs_dev_replace_is_ongoing(&fs_info->dev_replace))
>   		ret++;
> -	btrfs_dev_replace_unlock(&fs_info->dev_replace);
> +	btrfs_dev_replace_unlock(&fs_info->dev_replace, 0);
>
>   	return ret;
>   }
> @@ -5203,10 +5203,12 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
>   	if (!bbio_ret)
>   		goto out;
>
> -	btrfs_dev_replace_lock(dev_replace);
> +	btrfs_dev_replace_lock(dev_replace, 0);
>   	dev_replace_is_ongoing = btrfs_dev_replace_is_ongoing(dev_replace);
>   	if (!dev_replace_is_ongoing)
> -		btrfs_dev_replace_unlock(dev_replace);
> +		btrfs_dev_replace_unlock(dev_replace, 0);
> +	else
> +		btrfs_dev_replace_set_lock_blocking(dev_replace);
>
>   	if (dev_replace_is_ongoing && mirror_num == map->num_stripes + 1 &&
>   	    !(rw & (REQ_WRITE | REQ_DISCARD | REQ_GET_READ_MIRRORS)) &&
> @@ -5631,8 +5633,10 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
>   		bbio->mirror_num = map->num_stripes + 1;
>   	}
>   out:
> -	if (dev_replace_is_ongoing)
> -		btrfs_dev_replace_unlock(dev_replace);
> +	if (dev_replace_is_ongoing) {
> +		btrfs_dev_replace_clear_lock_blocking(dev_replace);
> +		btrfs_dev_replace_unlock(dev_replace, 0);
> +	}
>   	free_extent_map(em);
>   	return ret;
>   }
>



  reply	other threads:[~2016-02-04  8:44 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-17  8:49 [PATCH 1/2] Btrfs: move kobj stuff out of dev_replace lock range Liu Bo
2015-07-17  8:49 ` [PATCH 2/2] Btrfs: fix lockdep deadlock warning due to dev_replace Liu Bo
2016-02-04  8:44   ` Qu Wenruo [this message]
2016-02-04 14:25     ` David Sterba
2016-02-23 12:28   ` David Sterba
2015-07-17 10:23 ` [PATCH 1/2] Btrfs: move kobj stuff out of dev_replace lock range Anand Jain
2015-07-17 11:59   ` Liu Bo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56B30F79.4060907@cn.fujitsu.com \
    --to=quwenruo@cn.fujitsu.com \
    --cc=bo.li.liu@oracle.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).