Date: Wed, 9 Feb 2022 15:58:25 -0500
From: Red Wil
To: Wol
Cc: linux-raid
Subject: Re: Replacing all disks in a an array as a preventative measure before failing.
Message-ID: <20220209155825.1c8a3570@falcon.sitarc.ca>
References: <20220207152648.42dd311a@falcon.sitarc.ca>
X-Mailing-List: linux-raid@vger.kernel.org

On Mon, 7 Feb 2022 22:28:57 +0000
Wol wrote:

> On 07/02/2022 20:26, Red Wil wrote:
> > Hello,
> >
> > It started as the subject said:
> >   - goal was to replace all 10 disks in a R6
> >   - context and perceived constraints
> >   - soft raid (no imsm and or ddl containers)
> >   - multiple disk partition.
> >     partitions across 10 disks formed R6
> >   - downtime not an issue
> >   - minimize the number of commands
> >   - minimize disks stress
> >   - reduce the time spent with this process
> >   - difficult to add 10 spares at once in the rig
> >   - after a reshape/grow from 6 to 10 disks offset of data in raid
> >     members was all over the place from cca 10ksect to 200ksect
> >
> > Approaches/solutions and critique
> > 1- add one by one a 'spare' and 'replace' raid member
> >    critique:
> >    - seem to me long and tedious process
> >    - cannot/will not run in parallel
>
> There's not a problem running in parallel as far as mdraid is
> concerned. If you can get the spare drives into the chassis (or on
> eSATA), you can --replace several drives at once.
>
> And it pretty much just does a dd, just on the live system keeping
> you raid-safe.

If I remember correctly, when a single disk carries multiple partitions
(each belonging to a different array, obviously) and you start a
sync/resync operation on all the arrays that use that particular
spindle/disk, the operations are run sequentially. Running them in
parallel would mean head-movement stress.

>
> > 2- add all the spares at once and perform 'replace' on members
> >    critique
> >    - just tedious - lots of cli commands which can be prone to
> >      mistakes.
>
> pretty much the same as (1). Given that your sdX's are moving all
> over the place, I would work with uuids even though it's more typing,
> it's safer.
>
> > next ones assume I have all the 'spares' in the rig
> > 3- create new arrays on spares, fresh fs and copy data.
>
> Well, you could fail/replace all the old drives, but yes just
> building a new array from scratch (if you can afford the downtime) is
> probably better.

Another reason to go this route was to tune/tweak the stack
(RAID-LVM-FS).

>
> > 4- dd/ddrescue copy each drive to a new one. Advantage can be
> >    done one by one or in parallel. less commands in the terminal.
>
> Less commands? Dunno about that.
> Much safer in many ways though,
> remove the drive you're replacing, copy it, put the new one back.
> Less chance for a physical error.

Well, it's a matter of perception. For 10 disks I will have 10 dd
commands of the form "dd if=olddrive of=newdrive" or, even better,
"ddrescue olddrive newdrive logfile". Otherwise the mdadm commands
would come to 50 in total for the 10 disks, because I have 5 individual
arrays across the 10 disks.

> >
> > In the end I decided I will use route (3).
> >   - flexibility on creation
> >   - copy only what I need
> >   - old array is a sort of backup
> >
> > Question:
> > Just for my curiosity regarding (4) assuming array is offline:
> > Besides being not recommended in case of imsm/ddl containers which
> > (as far as i understood) keep some data on the hardware itself
> >
> > In case of pure soft raid is anything technical or safety related
> > that prevents a 'dd' copy of a physical hard drive to act exactly
> > as the original.
> >
> Nope. You've copied the partition byte for byte, the raid won't know
> any different.
>
> One question, though. Why are you replacing the drives? Just a
> precaution?
>
> How big are the drives? What I'd do if you're not replacing dying
> drives, is buy five or possibly six drives of twice the capacity. Do
> a --replace on those five drives. Now take two of the drives you've
> removed, raid-0 them, and now do a major re-org, adding your raid-0
> as device 6, reducing your raid to a 6-device array, and removing the
> last four old drives from the array. Assuming you've only got 10 bays
> and you've been faffing about externally as you replace drives, you
> can now use the last three drives in the chassis to create another
> two-drive raid-0, add that as a spare into your raid-6, and add your
> last drive as a spare into both your raid-0s.
>
> So you end up with a 6-device+plus-spare raid-6, and devices 6 &
> spare (your raid-0s) share a spare between them.
>
> Cheers,
> Wol

I was thinking of cutting the number of drives from 10 to 6 by using
double-size drives, but financial considerations at the time meant I
ended up with 10 slightly larger drives.

Thanks for your comments,
Red
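
For reference, the command counts compared above (10 whole-disk copies versus about 50 mdadm replace operations) can be sketched as a dry run. This is only an illustration under assumptions, not the setup actually used: the device names (sda..sdj for the old disks, sdk..sdt for the new ones), the array names md0..md4, and the layout "partition N+1 of every disk belongs to mdN" are all hypothetical. The script only prints commands and executes nothing. The mdadm lines also assume each new partition was already added to its array as a spare, which is not counted here.

```shell
#!/bin/sh
# Dry run only: prints the commands, never runs them.
# All device and array names are hypothetical placeholders.
OLD_DISKS="sda sdb sdc sdd sde sdf sdg sdh sdi sdj"
NEW_DISKS="sdk sdl sdm sdn sdo sdp sdq sdr sds sdt"
ARRAYS=5   # md0..md4; partition N+1 of every disk is assumed to be in mdN

# Route (4): one whole-disk copy per drive, array offline.
# ddrescue's third argument is its map file (the "logfile" mentioned above);
# -f is needed because the output is a block device.
copy_plan() {
    set -- $NEW_DISKS
    for old in $OLD_DISKS; do
        echo "ddrescue -f /dev/$old /dev/$1 $old.map"
        shift
    done
}

# Routes (1)/(2): one replace per (disk, array) pair.
# Assumes the new partition is already a spare in the array.
replace_plan() {
    set -- $NEW_DISKS
    for old in $OLD_DISKS; do
        part=1
        while [ "$part" -le "$ARRAYS" ]; do
            echo "mdadm /dev/md$((part - 1)) --replace /dev/$old$part --with /dev/$1$part"
            part=$((part + 1))
        done
        shift
    done
}

copy_plan
replace_plan
echo "copy commands:    $(( $(copy_plan | wc -l) ))"     # 10
echo "replace commands: $(( $(replace_plan | wc -l) ))"  # 50
```

On a real system the sdX names should not be trusted to stay put, as Wol points out, so anything beyond a dry run would want stable identifiers (UUIDs or /dev/disk/by-id paths) instead of these placeholders.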