linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Anand Jain <anand.jain@oracle.com>
To: linux-btrfs@vger.kernel.org
Subject: [PATCH 15/15] btrfs: check for failed device and hot replace
Date: Mon,  9 Nov 2015 18:56:29 +0800	[thread overview]
Message-ID: <1447066589-3835-16-git-send-email-anand.jain@oracle.com> (raw)
In-Reply-To: <1447066589-3835-1-git-send-email-anand.jain@oracle.com>

This patch creates casualty_kthread to check for the failed
devices, and triggers device replace.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/ctree.h       |  1 +
 fs/btrfs/disk-io.c     | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/transaction.c |  3 ++-
 3 files changed, 70 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 4d25fd8..3e706ff 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1613,6 +1613,7 @@ struct btrfs_fs_info {
 	struct btrfs_workqueue *extent_workers;
 	struct task_struct *transaction_kthread;
 	struct task_struct *cleaner_kthread;
+	struct task_struct *casualty_kthread;
 	int thread_pool_size;
 
 	struct kobject *space_info_kobj;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 3662c0a..beefe35 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1836,6 +1836,64 @@ sleep:
 	return 0;
 }
 
+/*
+ * A kthread to check if any auto maintenance be required. This is
+ * multithread safe, and kthread is running only if
+ * fs_info->casualty_kthread is not NULL, fixme: atomic ?
+ */
+static int casualty_kthread(void *arg)
+{
+	struct btrfs_root *root = arg;
+	struct btrfs_fs_info *fs_info = root->fs_info;
+	struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
+	struct btrfs_device *device;
+	int found = 0;
+
+	if (root->fs_info->sb->s_flags & MS_RDONLY)
+		goto out;
+
+	btrfs_dev_replace_lock(&fs_info->dev_replace);
+	if (btrfs_dev_replace_is_ongoing(&fs_info->dev_replace)) {
+		btrfs_dev_replace_unlock(&fs_info->dev_replace);
+		goto out;
+	}
+	btrfs_dev_replace_unlock(&fs_info->dev_replace);
+
+	/*
+	 * Find failed device, if any. After the replace the failed
+	 * device is removed, so any failed device found here is new and
+	 * will be a candidate for the replace, if FS can't work without
+	 * the failed device then btrfs_std_error() will have put FS into
+	 * readonly
+	 */
+	/*
+	 * fixme: introduce a priority order to find failed device,
+	 * chronological order ?
+	 */
+	mutex_lock(&fs_devices->device_list_mutex);
+	rcu_read_lock();
+	list_for_each_entry_rcu(device, &fs_devices->devices, dev_list) {
+		if (device->failed) {
+			found = 1;
+			break;
+		}
+	}
+	rcu_read_unlock();
+	mutex_unlock(&fs_devices->device_list_mutex);
+
+	/*
+	 * We are using the replace code which should be interrupt-able
+	 * during unmount, and as of now there is no user land stop
+	 * request that we support
+	 */
+	if (found)
+		btrfs_auto_replace_start(root, device);
+
+out:
+	fs_info->casualty_kthread = NULL;
+	return 0;
+}
+
 static void btrfs_check_devices(struct btrfs_fs_devices *fs_devices)
 {
 	struct btrfs_fs_info *fs_info = fs_devices->fs_info;
@@ -1924,6 +1982,10 @@ static int transaction_kthread(void *arg)
 		}
 sleep:
 		btrfs_check_devices(root->fs_info->fs_devices);
+		if (!root->fs_info->casualty_kthread)
+			root->fs_info->casualty_kthread =
+				kthread_run(casualty_kthread, root,
+							"btrfs-casualty");
 
 		wake_up_process(root->fs_info->cleaner_kthread);
 		mutex_unlock(&root->fs_info->transaction_kthread_mutex);
@@ -3159,6 +3221,9 @@ fail_trans_kthread:
 	kthread_stop(fs_info->transaction_kthread);
 	btrfs_cleanup_transaction(fs_info->tree_root);
 	btrfs_free_fs_roots(fs_info);
+	if (fs_info->casualty_kthread)
+		kthread_stop(fs_info->casualty_kthread);
+
 fail_cleaner:
 	kthread_stop(fs_info->cleaner_kthread);
 
@@ -3807,6 +3872,8 @@ void close_ctree(struct btrfs_root *root)
 
 	kthread_stop(fs_info->transaction_kthread);
 	kthread_stop(fs_info->cleaner_kthread);
+	if (fs_info->casualty_kthread)
+		kthread_stop(fs_info->casualty_kthread);
 
 	fs_info->closing = 2;
 	smp_mb();
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 76354bb..ef4aaf5 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -2187,7 +2187,8 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
 	kmem_cache_free(btrfs_trans_handle_cachep, trans);
 
 	if (current != root->fs_info->transaction_kthread &&
-	    current != root->fs_info->cleaner_kthread)
+	    current != root->fs_info->cleaner_kthread &&
+	    current != root->fs_info->casualty_kthread)
 		btrfs_run_delayed_iputs(root);
 
 	return ret;
-- 
2.4.1


  parent reply	other threads:[~2015-11-09 10:57 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-09 10:56 [PATCH 00/15] btrfs: Hot spare and Auto replace Anand Jain
2015-11-09 10:56 ` [PATCH 01/15] btrfs: Introduce a new function to check if all chunks a OK for degraded mount Anand Jain
2015-11-09 10:56 ` [PATCH 02/15] btrfs: Do per-chunk check for mount time check Anand Jain
2015-11-09 10:56 ` [PATCH 03/15] btrfs: Do per-chunk degraded check for remount Anand Jain
2015-11-09 10:56 ` [PATCH 04/15] btrfs: Allow barrier_all_devices to do per-chunk device check Anand Jain
2015-11-09 10:56 ` [PATCH 05/15] btrfs: optimize btrfs_check_degradable() for calls outside of barrier Anand Jain
2015-11-09 10:56 ` [PATCH 06/15] btrfs: Cleanup num_tolerated_disk_barrier_failures Anand Jain
2015-12-05  7:16   ` Qu Wenruo
2015-11-09 10:56 ` [PATCH 07/15] btrfs: introduce device dynamic state transition to offline or failed Anand Jain
2015-11-09 10:56 ` [PATCH 08/15] btrfs: check device for critical errors and mark failed Anand Jain
2015-11-09 10:56 ` [PATCH 09/15] btrfs: block incompatible optional features at scan Anand Jain
2015-11-09 10:56 ` [PATCH 10/15] btrfs: introduce BTRFS_FEATURE_INCOMPAT_SPARE_DEV Anand Jain
2015-11-09 10:56 ` [PATCH 11/15] btrfs: add check not to mount a spare device Anand Jain
2015-11-09 10:56 ` [PATCH 12/15] btrfs: support btrfs dev scan for " Anand Jain
2015-11-09 10:56 ` [PATCH 13/15] btrfs: provide framework to get and put a " Anand Jain
2015-11-09 10:56 ` [PATCH 14/15] btrfs: introduce helper functions to perform hot replace Anand Jain
2015-11-09 10:56 ` Anand Jain [this message]
2015-11-09 10:58 ` [PATCH 0/4] btrfs-progs: Hot spare and Auto replace Anand Jain
2015-11-09 10:58   ` [PATCH 1/4] btrfs-progs: Introduce BTRFS_FEATURE_INCOMPAT_SPARE_DEV SB flags Anand Jain
2015-11-09 10:58   ` [PATCH 2/4] btrfs-progs: Introduce btrfs spare subcommand Anand Jain
2015-11-09 10:58   ` [PATCH 3/4] btrfs-progs: add fi show for spare Anand Jain
2015-11-09 10:58   ` [PATCH 4/4] btrfs-progs: add global spare device list to filesystem show Anand Jain
2015-11-09 14:09 ` [PATCH 00/15] btrfs: Hot spare and Auto replace Austin S Hemmelgarn
2015-11-09 21:29   ` Duncan
2015-11-10 12:13     ` Austin S Hemmelgarn
2015-11-13 10:17       ` Anand Jain
2015-11-13 12:25         ` Austin S Hemmelgarn
2015-11-15 18:10         ` Christoph Anton Mitterer
2015-11-12  2:15 ` Qu Wenruo
2015-11-12  6:46   ` Duncan
2015-11-12 13:04   ` Austin S Hemmelgarn
2015-11-13  1:07     ` Qu Wenruo
2015-11-13 10:20       ` Anand Jain
2015-11-14  0:54         ` Qu Wenruo
2015-11-16 13:39           ` Austin S Hemmelgarn
2015-11-12 19:08   ` Goffredo Baroncelli
2015-11-13 10:18   ` Anand Jain
2015-11-12 19:21 ` Goffredo Baroncelli
2015-11-13 10:20   ` Anand Jain
2015-11-14 11:05     ` Goffredo Baroncelli
2015-11-16 13:41 ` Austin S Hemmelgarn
2015-11-16 22:07   ` Anand Jain
2015-11-17 12:28     ` Austin S Hemmelgarn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1447066589-3835-16-git-send-email-anand.jain@oracle.com \
    --to=anand.jain@oracle.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).