From: Anand Jain <anand.jain@oracle.com>
To: linux-btrfs@vger.kernel.org
Subject: [PATCH 15/15] btrfs: check for failed device and hot replace
Date: Mon, 9 Nov 2015 18:56:29 +0800
Message-ID: <1447066589-3835-16-git-send-email-anand.jain@oracle.com>
In-Reply-To: <1447066589-3835-1-git-send-email-anand.jain@oracle.com>
This patch adds casualty_kthread, which checks for failed devices and
triggers a device replace when one is found.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
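Reviewer note: the decision flow of casualty_kthread() in the diff below
can be modelled in userspace roughly as follows. This is only a sketch
under stated assumptions: struct model_dev, struct model_fs and
pick_replace_candidate() are hypothetical names used for illustration,
they are not btrfs types or functions, and all locking is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for btrfs_device and btrfs_fs_info. */
struct model_dev {
	int devid;
	bool failed;
};

struct model_fs {
	bool read_only;		/* models sb->s_flags & MS_RDONLY */
	bool replace_ongoing;	/* models btrfs_dev_replace_is_ongoing() */
	struct model_dev *devs;
	size_t ndevs;
};

/*
 * Return the first failed device, or NULL when nothing should be done:
 * the FS is read-only, a replace is already running, or no device has
 * failed.
 */
static struct model_dev *pick_replace_candidate(struct model_fs *fs)
{
	size_t i;

	if (fs->read_only || fs->replace_ongoing)
		return NULL;

	for (i = 0; i < fs->ndevs; i++)
		if (fs->devs[i].failed)
			return &fs->devs[i];

	return NULL;
}
```

As in the patch, the first failed device in list order wins; the fixme
about a priority or chronological order is not addressed by this sketch.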
fs/btrfs/ctree.h | 1 +
fs/btrfs/disk-io.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++
fs/btrfs/transaction.c | 3 ++-
3 files changed, 70 insertions(+), 1 deletion(-)
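Reviewer note on the "fixme: atomic ?" in the casualty_kthread() comment:
one possible way to make sure only a single worker is started is to claim
a slot atomically before launching it. The userspace C11 sketch below
shows the idea. casualty_slot, casualty_try_claim() and
casualty_release() are hypothetical names, not part of this patch, and
the in-kernel primitive that would play this role is deliberately left
open.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical slot: 0 = no worker, 1 = worker claimed or running. */
static atomic_int casualty_slot;

/* Succeeds for exactly one caller until the slot is released again. */
static bool casualty_try_claim(void)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&casualty_slot, &expected, 1);
}

/* Called by the worker on exit, or by the launcher if the start failed. */
static void casualty_release(void)
{
	atomic_store(&casualty_slot, 0);
}
```

A launcher would call casualty_try_claim() and start the worker only when
it returns true; the worker would call casualty_release() as its last
step.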
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 4d25fd8..3e706ff 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1613,6 +1613,7 @@ struct btrfs_fs_info {
struct btrfs_workqueue *extent_workers;
struct task_struct *transaction_kthread;
struct task_struct *cleaner_kthread;
+ struct task_struct *casualty_kthread;
int thread_pool_size;
struct kobject *space_info_kobj;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 3662c0a..beefe35 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1836,6 +1836,64 @@ sleep:
return 0;
}
+/*
+ * A kthread to check whether any automatic maintenance is required. This
+ * is multithread safe, and the kthread is running only if
+ * fs_info->casualty_kthread is not NULL, fixme: atomic ?
+ */
+static int casualty_kthread(void *arg)
+{
+	struct btrfs_root *root = arg;
+	struct btrfs_fs_info *fs_info = root->fs_info;
+	struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
+	struct btrfs_device *device;
+	int found = 0;
+
+	if (root->fs_info->sb->s_flags & MS_RDONLY)
+		goto out;
+
+	btrfs_dev_replace_lock(&fs_info->dev_replace);
+	if (btrfs_dev_replace_is_ongoing(&fs_info->dev_replace)) {
+		btrfs_dev_replace_unlock(&fs_info->dev_replace);
+		goto out;
+	}
+	btrfs_dev_replace_unlock(&fs_info->dev_replace);
+
+	/*
+	 * Find a failed device, if any. After the replace the failed
+	 * device is removed, so any failed device found here is new and
+	 * is a candidate for the replace. If the FS can't work without
+	 * the failed device then btrfs_std_error() will have put the FS
+	 * into readonly.
+	 */
+	/*
+	 * fixme: introduce a priority order to find the failed device,
+	 * chronological order ?
+	 */
+	mutex_lock(&fs_devices->device_list_mutex);
+	rcu_read_lock();
+	list_for_each_entry_rcu(device, &fs_devices->devices, dev_list) {
+		if (device->failed) {
+			found = 1;
+			break;
+		}
+	}
+	rcu_read_unlock();
+	mutex_unlock(&fs_devices->device_list_mutex);
+
+	/*
+	 * We are using the replace code, which should be interruptible
+	 * during unmount, and as of now there is no userland stop
+	 * request that we support.
+	 */
+	if (found)
+		btrfs_auto_replace_start(root, device);
+
+out:
+	fs_info->casualty_kthread = NULL;
+	return 0;
+}
+
static void btrfs_check_devices(struct btrfs_fs_devices *fs_devices)
{
struct btrfs_fs_info *fs_info = fs_devices->fs_info;
@@ -1924,6 +1982,10 @@ static int transaction_kthread(void *arg)
}
sleep:
btrfs_check_devices(root->fs_info->fs_devices);
+ if (!root->fs_info->casualty_kthread)
+ root->fs_info->casualty_kthread =
+ kthread_run(casualty_kthread, root,
+ "btrfs-casualty");
wake_up_process(root->fs_info->cleaner_kthread);
mutex_unlock(&root->fs_info->transaction_kthread_mutex);
@@ -3159,6 +3221,9 @@ fail_trans_kthread:
kthread_stop(fs_info->transaction_kthread);
btrfs_cleanup_transaction(fs_info->tree_root);
btrfs_free_fs_roots(fs_info);
+ if (fs_info->casualty_kthread)
+ kthread_stop(fs_info->casualty_kthread);
+
fail_cleaner:
kthread_stop(fs_info->cleaner_kthread);
@@ -3807,6 +3872,8 @@ void close_ctree(struct btrfs_root *root)
kthread_stop(fs_info->transaction_kthread);
kthread_stop(fs_info->cleaner_kthread);
+ if (fs_info->casualty_kthread)
+ kthread_stop(fs_info->casualty_kthread);
fs_info->closing = 2;
smp_mb();
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 76354bb..ef4aaf5 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -2187,7 +2187,8 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
kmem_cache_free(btrfs_trans_handle_cachep, trans);
if (current != root->fs_info->transaction_kthread &&
- current != root->fs_info->cleaner_kthread)
+ current != root->fs_info->cleaner_kthread &&
+ current != root->fs_info->casualty_kthread)
btrfs_run_delayed_iputs(root);
return ret;
--
2.4.1