From: NeilBrown <neilb@suse.de>
To: Goldwyn Rodrigues <rgoldwyn@suse.de>
Cc: GQJiang@suse.com, linux-raid@vger.kernel.org
Subject: Re: [PATCH 5/6] md: re-add a failed disk
Date: Mon, 20 Apr 2015 11:56:40 +1000
Message-ID: <20150420115640.1eb3d371@notabene.brown>
In-Reply-To: <20150414154522.GA4105@shrek.lan>
On Tue, 14 Apr 2015 10:45:22 -0500 Goldwyn Rodrigues <rgoldwyn@suse.de> wrote:
> This adds the capability of re-adding a failed disk by
> writing "re-add" to /sys/block/mdXX/md/dev-YYY/state.
>
> This facilitates adding disks which have encountered a temporary
> error such as a network disconnection/hiccup in an iSCSI device,
> or a SAN cable disconnection which has been restored. In such
> a situation, you do not need to remove and re-add the device.
> Writing re-add to the failed device's state would add it again
> to the array and perform the recovery of only the blocks which
> were written after the device failed.
>
> This works for generic md, and is not related to clustering. However,
> this patch is to ease re-add operations listed above in clustering
> environments.
>
> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
> ---
> drivers/md/md.c | 56 +++++++++++++++++++++++++++++++++++---------------------
> 1 file changed, 35 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 9127d11..ba01605 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -2379,6 +2379,36 @@ repeat:
> }
> EXPORT_SYMBOL(md_update_sb);
>
> +static int add_bound_rdev(struct md_rdev *rdev)
> +{
> + struct mddev *mddev = rdev->mddev;
> + int err = 0;
> +
> + if (!mddev->pers->hot_remove_disk) {
> + /* If there is hot_add_disk but no hot_remove_disk
> + * then added disks for geometry changes,
> + * and should be added immediately.
> + */
> + super_types[mddev->major_version].
> + validate_super(mddev, rdev);
> + err = mddev->pers->hot_add_disk(mddev, rdev);
> + if (err) {
> + unbind_rdev_from_array(rdev);
> + export_rdev(rdev);
> + return err;
> + }
> + }
> + sysfs_notify_dirent_safe(rdev->sysfs_state);
> +
> + set_bit(MD_CHANGE_DEVS, &mddev->flags);
> + if (mddev->degraded)
> + set_bit(MD_RECOVERY_RECOVER, &mddev->recovery);
> + set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
> + md_new_event(mddev);
> + md_wakeup_thread(mddev->thread);
> + return 0;
> +}
> +
> /* words written to sysfs files may, or may not, be \n terminated.
> * We want to accept with case. For this we use cmd_match.
> */
> @@ -2568,7 +2598,10 @@ state_store(struct md_rdev *rdev, const char *buf, size_t len)
> clear_bit(Replacement, &rdev->flags);
> err = 0;
> }
> - }
> + } else if (cmd_match(buf, "re-add") && (test_bit(Faulty, &rdev->flags) || (rdev->raid_disk == -1))) {
> + clear_bit(Faulty, &rdev->flags);
> + err = add_bound_rdev(rdev);
> + }
I changed this to:
    } else if (cmd_match(buf, "re-add")) {
        if (test_bit(Faulty, &rdev->flags) && (rdev->raid_disk == -1)) {
            clear_bit(Faulty, &rdev->flags);
            err = add_bound_rdev(rdev);
        } else
            err = -EBUSY;
    }
because:
1/ I want all branches of the main if/else to be just "cmd_match...",
2/ I want to return EBUSY if 're-add' was recognised as a command but
   the device wasn't available for a re-add, and
3/ re-add can only be allowed if the device is faulty AND raid_disk is -1.
   If not faulty, re-add makes no sense.
   If raid_disk is not -1, then the device still has outstanding IO and
   we need to keep waiting for that to complete.
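As a rough userspace sketch of the three points above (this is not the kernel code: struct rdev_model, cmd_match_model() and state_store_model() are hypothetical names invented for illustration, and the -EINVAL default for an unrecognised command is an assumption about how state_store() starts out), the branch is meant to behave roughly like this:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two pieces of struct md_rdev used here. */
struct rdev_model {
	bool faulty;    /* models test_bit(Faulty, &rdev->flags) */
	int raid_disk;  /* -1 once the device no longer occupies a slot */
};

/* Accept a command with or without a trailing newline, as sysfs writes may
 * or may not be \n terminated. */
static bool cmd_match_model(const char *cmd, const char *str)
{
	while (*cmd && *str && *cmd == *str) {
		cmd++;
		str++;
	}
	if (*cmd == '\n')
		cmd++;
	return *cmd == '\0' && *str == '\0';
}

/* Returns 0 on success or a negative errno value. */
static int state_store_model(struct rdev_model *rdev, const char *buf)
{
	int err = -EINVAL;  /* assumed default: command not recognised */

	assert(rdev != NULL && buf != NULL);

	if (cmd_match_model(buf, "re-add")) {
		/* Point 1: the outer branch tests only the command word. */
		if (rdev->faulty && rdev->raid_disk == -1) {
			/* Point 3: faulty AND no longer in a slot. */
			rdev->faulty = false;
			err = 0;  /* the kernel would call add_bound_rdev() here */
		} else {
			/* Point 2: command recognised, device not eligible. */
			err = -EBUSY;
		}
	}
	return err;
}
```

With this model, a faulty device with raid_disk == -1 accepts "re-add" once and has its faulty flag cleared; repeating the write, or writing it while raid_disk is still set, gives -EBUSY rather than the generic "unrecognised command" error.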
Otherwise, patch accepted - thanks.
NeilBrown
> if (!err)
> sysfs_notify_dirent_safe(rdev->sysfs_state);
> return err ? err : len;
> @@ -5882,29 +5915,10 @@ static int add_new_disk(struct mddev *mddev, mdu_disk_info_t *info)
>
> rdev->raid_disk = -1;
> err = bind_rdev_to_array(rdev, mddev);
> - if (!err && !mddev->pers->hot_remove_disk) {
> - /* If there is hot_add_disk but no hot_remove_disk
> - * then added disks for geometry changes,
> - * and should be added immediately.
> - */
> - super_types[mddev->major_version].
> - validate_super(mddev, rdev);
> - err = mddev->pers->hot_add_disk(mddev, rdev);
> - if (err)
> - unbind_rdev_from_array(rdev);
> - }
> if (err)
> export_rdev(rdev);
> else
> - sysfs_notify_dirent_safe(rdev->sysfs_state);
> -
> - set_bit(MD_CHANGE_DEVS, &mddev->flags);
> - if (mddev->degraded)
> - set_bit(MD_RECOVERY_RECOVER, &mddev->recovery);
> - set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
> - if (!err)
> - md_new_event(mddev);
> - md_wakeup_thread(mddev->thread);
> + err = add_bound_rdev(rdev);
> if (mddev_is_clustered(mddev) &&
> (info->state & (1 << MD_DISK_CLUSTER_ADD)))
> md_cluster_ops->add_new_disk_finish(mddev);