From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Anand Jain <anand.jain@oracle.com>, linux-btrfs@vger.kernel.org
Cc: clm@fb.com, dsterba@suse.cz
Subject: Re: [PATCH v2 00/15] Introduce device state 'failed', Hot spare and Auto replace
Date: Tue, 29 Mar 2016 13:30:13 -0400
Message-ID: <56FABBA5.4090402@gmail.com>
In-Reply-To: <1459261349-32206-1-git-send-email-anand.jain@oracle.com>
On 2016-03-29 10:22, Anand Jain wrote:
> Thanks for various comments, tests and feedback.
>
> Background: Hot spare and Auto replace:
> Hot spare is predominantly used to narrow the time window during
> which storage runs in degraded mode, when any further disk
> failure might lead to catastrophic data loss. Data center
> storage generally has a couple of disks reserved as spares.
> This is mainly an enterprise storage feature rather than a FS
> feature; I believe people acquainted with enterprise storage use
> cases will appreciate the need for it, which is why most/all
> enterprise storage has a hot spare feature.
>
> Btrfs device states:
> This patch-set adds the 'failed' state and makes provision to use
> the 'offline' state, as two new device states. To summarize the
> various device states and their meanings:
>
> /* missing: device wasn't found at the time of mount */
> int missing;
>
> /*
> * failed: device confirmed to have experienced critical
> * io failure
> */
> int failed;
>
> /*
> * offline: there is no confirmation that the disk has failed,
> * only an interim communication breakdown, so the device is
> * not necessarily a candidate for device replace.
> * The device might come back online after user intervention or
> * after block transport layer error recovery.
> */
> int offline;
>
>
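
A minimal sketch of how the three flags above might be consulted when
deciding whether a device should be auto-replaced. The struct and
function names are hypothetical, not taken from the patches, and the
rule that only 'failed' triggers a replace is an assumption drawn from
the comment that 'offline' is not necessarily a replace candidate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-device state flags quoted above. */
struct dev_state {
	int missing;	/* device wasn't found at mount time */
	int failed;	/* confirmed critical I/O failure */
	int offline;	/* interim loss of contact, may come back */
};

/*
 * Assumption: only a confirmed failure makes a device a candidate for
 * automatic replacement; 'offline' and 'missing' alone do not.
 */
static bool dev_is_replace_candidate(const struct dev_state *s)
{
	assert(s != NULL);
	return s->failed != 0;
}
```

With this rule, a device that is only offline is left alone until it
either recovers or is confirmed failed.
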
> Device state transition Tuning and visualization:
> Sysfs interfaces are planned to provide the required tuning for
> device state transition sensitivities and visualization of device
> states. However, the sysfs framework which could provide such an
> interface is still being reviewed/tested and is not ready yet. So
> for testing and debugging these features I have used an updated
> version of the procfs patch which is in the ML.
>
> [PATCH] btrfs: debug: procfs-devlist: introduce procfs interface for
> the device list for debugging
>
> I find the above patch very useful and stable as compared to sysfs
> to visualize the device state.
>
> This patch set does not depend on any of the sysfs patches as such.
>
> Cross compatibility:
> Adds a new incompatibility feature flag
> (BTRFS_FEATURE_INCOMPAT_SPARE_DEV) to manage the spare device
> when older kernels are used. It has been tested to work fine
> with older kernel/progs versions.
>
>
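
For readers unfamiliar with how an incompat bit keeps older kernels
away from a device: a kernel refuses any superblock carrying an incompat
bit it does not know about. A rough sketch of that check follows. The
bit position and function name are made up for illustration; the real
value of BTRFS_FEATURE_INCOMPAT_SPARE_DEV is whatever the patch defines.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position, for illustration only. */
#define EXAMPLE_INCOMPAT_SPARE_DEV (1ULL << 40)

/*
 * True if the superblock carries no incompat bits beyond the ones this
 * kernel supports; any unknown bit means the device must be rejected.
 */
static bool can_handle_device(uint64_t sb_incompat, uint64_t supported)
{
	return (sb_incompat & ~supported) == 0;
}
```

An older kernel, whose supported mask lacks the spare bit, fails this
check for a spare device and so never touches it.
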
> Auto replace:
> Replace happens automatically: when any write or flush fails,
> the device is marked as failed, which stops any further IO
> attempts to that device. In the next commit cycle, auto replace
> picks a spare device to replace the failed device, bringing the
> btrfs volume back to a healthy state.
>
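
The flow described above can be sketched as a small userspace
simulation. Everything here (struct sim_dev, the sim_* functions, the
spare array) is hypothetical and only mirrors the described sequence:
an I/O error marks the device failed, and the next commit cycle swaps
in a spare. It does not model the actual data rebuild that a real
replace performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simulated device: just a path and a 'failed' flag. */
struct sim_dev {
	const char *path;
	bool failed;
};

/* A write or flush error marks the device failed. */
static void sim_report_io_error(struct sim_dev *dev)
{
	assert(dev != NULL);
	dev->failed = true;
}

/*
 * Called once per simulated commit cycle: swap each failed device for
 * a spare while spares remain. Returns the number of replacements.
 */
static size_t sim_commit_auto_replace(struct sim_dev *devs, size_t ndevs,
				      const char **spares, size_t *nspares)
{
	size_t replaced = 0;

	for (size_t i = 0; i < ndevs && *nspares > 0; i++) {
		if (!devs[i].failed)
			continue;
		devs[i].path = spares[--(*nspares)];	/* take the last spare */
		devs[i].failed = false;
		replaced++;
	}
	return replaced;
}
```

If the spare pool is empty, failed devices simply stay failed until a
spare is added, which matches the volume remaining degraded.
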
> Per FSID spare vs Global spare:
> As of now only global hot spare is supported, that is, hot spare(s)
> are for all the btrfs FS in the system. However, in the future there
> will be a fs_info->no_auto_replace tunable which the user can set
> to limit the use of the global spare.
>
>
> Example use case:
> Below is an example use case of the hot spare setup.
>
> Add a spare device:
> btrfs spare add /dev/sde -f
>
> If there is a spare device which is already added before the,
> just run
>
> btrfs dev scan [/dev/sde]
>
> Which will register the spare device to the kernel.
>
> btrfs fi show
> Label: none uuid: 52f170c1-725c-457d-8cfd-d57090460091
> Total devices 2 FS bytes used 112.00KiB
> devid 1 size 2.00GiB used 417.50MiB path /dev/sdc
> devid 2 size 2.00GiB used 417.50MiB path /dev/sdd
>
> Global spare
> device size 3.00GiB path /dev/sde
>
>
> Patches:
>
> Kernel:
> First, it needs Qu's per-chunk missing device patchset, which is
> part of this set.
>
> Next, patch 6/12 brings in support to manage the transition of
> devices from online (no state) to offline OR failed state dynamically,
> on top of static device states like the current "missing" state.
>
> Next, patches 7-11/12 add support for spare devices. On kernels without
> the spare feature, the spare device is kept away. When the kernel
> supports spare devices, it refuses to mount one. Further,
> these patches provide helper functions to pick a spare device and
> release a spare device back to the spare device pool.
>
> Patch 11/12 provides functions for auto replace, mainly derived
> from the existing replace code.
> Last, 12/12 uses all these facilities, picks a failed device and
> triggers an auto replace in a kthread (casualty_kthread()).
>
>
> Progs:
> Needs the 4 patches below, which add the sub-CLI 'spare' to manage
> the spare device. As of now deleting a spare device has to be
> done using wipefs. However, in the long run we would want a proper
> btrfs command to do that job.
>
>
> V1->V2:
> Kernel:
> (Based on tests and comments provided in the ML)
> a. Now transition_kthread() wakes up the casualty_kthread to check
> for device states, instead of doing that in transition_kthread()
> itself. Cleaner and less pressure on transition_kthread().
> b. Dropped
> [PATCH 05/15] btrfs: optimize btrfs_check_degradable() for calls outside of barrier
> as it was a wrong patch and the optimization was incomplete.
> c. Merged patches
> btrfs: check for failed device and hot replace
> to
> btrfs: check device for critical errors and mark failed
> in an effort to make the changes as in (a) above.
>
> Progs:
> a. Added a call to btrfs_register_one_device() when doing btrfs
> spare add
>
>
> Anand Jain (7):
> btrfs: introduce device dynamic state transition to offline or failed
> btrfs: introduce BTRFS_FEATURE_INCOMPAT_SPARE_DEV
> btrfs: add check not to mount a spare device
> btrfs: support btrfs dev scan for spare device
> btrfs: provide framework to get and put a spare device
> btrfs: introduce helper functions to perform hot replace
> btrfs: check device for critical errors and mark failed
>
> Qu Wenruo (5):
> btrfs: Introduce a new function to check if all chunks a OK for
> degraded mount
> btrfs: Do per-chunk check for mount time check
> btrfs: Do per-chunk degraded check for remount
> btrfs: Allow barrier_all_devices to do per-chunk device check
> btrfs: Cleanup num_tolerated_disk_barrier_failures
>
> fs/btrfs/ctree.h | 8 +-
> fs/btrfs/dev-replace.c | 24 +++++
> fs/btrfs/dev-replace.h | 1 +
> fs/btrfs/disk-io.c | 256 +++++++++++++++++++++++++++++++++--------------
> fs/btrfs/disk-io.h | 4 +-
> fs/btrfs/super.c | 20 +++-
> fs/btrfs/volumes.c | 263 +++++++++++++++++++++++++++++++++++++++++++++----
> fs/btrfs/volumes.h | 27 +++++
> 8 files changed, 504 insertions(+), 99 deletions(-)
>
> Anand Jain (4):
> btrfs-progs: Introduce BTRFS_FEATURE_INCOMPAT_SPARE_DEV SB flags
> btrfs-progs: Introduce btrfs spare subcommand
> btrfs-progs: add fi show for spare
> btrfs-progs: add global spare device list to filesystem show
>
> Android.mk | 2 +-
> Makefile.in | 3 +-
> btrfs.c | 1 +
> cmds-filesystem.c | 9 ++
> cmds-spare.c | 292 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> commands.h | 2 +
> ctree.h | 4 +-
> utils.h | 1 +
> volumes.c | 4 +
> volumes.h | 2 +
> 10 files changed, 317 insertions(+), 3 deletions(-)
> create mode 100644 cmds-spare.c
>
I can't provide the same degree of testing this time that I did for the
previous version (the system I had set up with my normal testing harness
is offline for the foreseeable future). That said, I've built and
booted a kernel with these patches in a VM on my laptop and tested the
new functionality, and everything appears to work like it's supposed to
without breaking any existing code, so for the patch-set as a whole:
Tested-by: Austin S. Hemmelgarn <ahferroin7@gmail.com>