From: David T-G <davidtg-robot@justpickone.org>
To: Linux RAID list <linux-raid@vger.kernel.org>
Subject: Re: how do i fix these RAID5 arrays?
Date: Thu, 24 Nov 2022 21:10:20 +0000
Message-ID: <20221124211019.GE19721@jpo>
In-Reply-To: <20221124032821.628cd042@nvm>
Roman, et al --
...and then Roman Mamedov said...
% On Wed, 23 Nov 2022 22:07:36 +0000
% David T-G <davidtg-robot@justpickone.org> wrote:
%
% > diskfarm:~ # mdadm -D /dev/md50
...
% > 0 9 51 0 active sync /dev/md/51
% > 1 9 52 1 active sync /dev/md/52
% > 2 9 53 2 active sync /dev/md/53
% > 3 9 54 3 active sync /dev/md/54
% > 4 9 55 4 active sync /dev/md/55
% > 5 9 56 5 active sync /dev/md/56
%
% It feels you haven't thought this through entirely. Sequential writes to this
Well, it's at least possible that I don't know what I'm doing. I'm just
a dumb ol' Sys Admin, and career-changed out of the biz a few years back
to boot. I'm certainly open to advice. Would changing the default RAID5
or RAID0 stripe size help?
...
%
% mdraid in the "linear" mode, or LVM with one large LV across all PVs (which
% are the individual RAID5 arrays), or multi-device Btrfs using "single" profile
% for data, all of those would avoid the described effect.
How is linear different from RAID0? I took a quick look but don't quite
know what I'm reading. If that's better then, hey, I'd try it (or at
least learn more).
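As a rough illustration of the difference being asked about, here is a simplified model of which member device receives each consecutive chunk under the two layouts. It is only a sketch and not md's actual code: it assumes equal-size members, the six members md51..md56 from the mdadm -D output above, and a hypothetical CHUNKS_PER_MEMBER that is kept tiny so the pattern is visible.

```python
def linear_member(chunk: int, chunks_per_member: int) -> tuple[int, int]:
    """Linear: fill member 0 completely, then member 1, and so on.

    Returns (member index, chunk offset within that member).
    """
    return chunk // chunks_per_member, chunk % chunks_per_member


def raid0_member(chunk: int, n_members: int) -> tuple[int, int]:
    """RAID0: rotate consecutive chunks across all members.

    Returns (member index, chunk offset within that member).
    """
    return chunk % n_members, chunk // n_members


N_MEMBERS = 6          # md51..md56, as in the mdadm -D output above
CHUNKS_PER_MEMBER = 4  # hypothetical; real members hold far more chunks

# Which member does each of the first eight sequential chunks land on?
linear_layout = [linear_member(c, CHUNKS_PER_MEMBER)[0] for c in range(8)]
raid0_layout = [raid0_member(c, N_MEMBERS)[0] for c in range(8)]

print("linear:", linear_layout)
print("raid0: ", raid0_layout)
```

In this model a sequential write stays on one member at a time under linear, while under RAID0 it touches every member in turn.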
I've played little enough with md, but I haven't played with LVM at all.
I imagine that it's fine to mix them since you've suggested it. Got any
pointers to a good primer? :-)
I don't want to try Btrfs. That's another area where I have no experience,
but from what I've seen and read I really don't want to go there yet.
%
% But I should clarify, the entire idea of splitting drives like this seems
% questionable to begin with, since drives more often fail entirely, not in part,
...
% complete loss of data anyway. Not to mention what you have seems like an insane
% amount of complexity.
To make a long story short, my understanding of a big problem with RAID5
is that rebuilds take a ridiculously long time as the devices get larger.
Using smaller "devices", like partitions of the actual disk, helps get
around that. If I lose an entire disk, it's no worse than replacing an
entire disk; it's half a dozen rebuilds, but at least they come in small
chunks that we can manage. If I have read errors or bad sector problems on
just a part, I can toss in a 2T disk to "spare" that piece until I get
another large drive and replace each piece.
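The arithmetic behind that reasoning can be sketched as follows. The disk size is the one the kernel reports for sdk in the dmesg output in this message; the six equal slices (sdk51..sdk56, ignoring sdk128) and the sustained rebuild rate are assumptions made purely for illustration, not measurements.

```python
# Size of sdk as reported by the kernel: 19532873728 512-byte logical blocks.
DISK_BYTES = 19532873728 * 512

N_SLICES = 6             # sdk51..sdk56; assumed equal in size
REBUILD_MB_PER_S = 150   # hypothetical sustained rebuild rate


def rebuild_hours(size_bytes: float, mb_per_s: float) -> float:
    """Time to resync size_bytes at a constant rate, in hours."""
    return size_bytes / (mb_per_s * 1_000_000) / 3600


whole_disk = rebuild_hours(DISK_BYTES, REBUILD_MB_PER_S)
per_slice = rebuild_hours(DISK_BYTES / N_SLICES, REBUILD_MB_PER_S)

print(f"whole disk in one rebuild: {whole_disk:.1f} h")
print(f"one slice:                 {per_slice:.1f} h")
print(f"all {N_SLICES} slices back to back:  {per_slice * N_SLICES:.1f} h")
```

Under these assumptions the total work for a whole-disk failure is unchanged; what shrinks is the size of each individual rebuild step, and a partial failure only needs one of them.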
As I also understand it, since I wasn't a storage engineer but did have
to automate against big shiny arrays, striping together RAID5 volumes is
pretty straightforward and pretty common. Maybe my problem is that I
need a couple of orders of magnitude more drives, though.
The whole idea is to provide fault tolerance while keeping recovery
manageable, and to keep growth simple: just add another device every once
in a while.
%
% To summarize, maybe it's better to blow away the entire thing and restart from
% the drawing board, while it's not too late? :)
I'm open to that idea as well, as long as I can understand where I'm
headed :-) But what's best?
%
% > diskfarm:~ # mdadm -D /dev/md5[13456] | egrep '^/dev|active|removed'
...
% > that are obviously the sdk (new disk) slice. If md52 were also broken,
% > I'd figure that the disk was somehow unplugged, but I don't think I can
...
% > and then re-add them to build and grow and finalize this?
%
% If you want to fix it still, without dmesg it's hard to say how this could
% have happened, but what does
%
% mdadm --re-add /dev/md51 /dev/sdk51
%
% say?
Only that it doesn't like the stale pieces:
diskfarm:~ # dmesg | egrep sdk
[ 8.238044] sd 9:2:0:0: [sdk] 19532873728 512-byte logical blocks: (10.0 TB/9.10 TiB)
[ 8.238045] sd 9:2:0:0: [sdk] 4096-byte physical blocks
[ 8.238051] sd 9:2:0:0: [sdk] Write Protect is off
[ 8.238052] sd 9:2:0:0: [sdk] Mode Sense: 00 3a 00 00
[ 8.238067] sd 9:2:0:0: [sdk] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 8.290084] sdk: sdk51 sdk52 sdk53 sdk54 sdk55 sdk56 sdk128
[ 8.290747] sd 9:2:0:0: [sdk] Attached SCSI removable disk
[ 17.920802] md: kicking non-fresh sdk51 from array!
[ 17.923119] md/raid:md52: device sdk52 operational as raid disk 3
[ 18.307507] md: kicking non-fresh sdk53 from array!
[ 18.311051] md: kicking non-fresh sdk54 from array!
[ 18.314854] md: kicking non-fresh sdk55 from array!
[ 18.317730] md: kicking non-fresh sdk56 from array!
Does it look like --re-add will be safe? [Yes, maybe I'll start over,
but clearing this problem would be a nice first step.]
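For anyone wanting to pull the same summary out of a longer log, here is a minimal sketch that sorts the md lines above into kicked and still-operational members. The regular expressions are written against the exact message wording shown in this dmesg excerpt and are an assumption about the format; other kernel versions may word these messages differently.

```python
import re

# The md-related lines from the dmesg excerpt above, copied verbatim.
DMESG = """\
[ 17.920802] md: kicking non-fresh sdk51 from array!
[ 17.923119] md/raid:md52: device sdk52 operational as raid disk 3
[ 18.307507] md: kicking non-fresh sdk53 from array!
[ 18.311051] md: kicking non-fresh sdk54 from array!
[ 18.314854] md: kicking non-fresh sdk55 from array!
[ 18.317730] md: kicking non-fresh sdk56 from array!
"""

KICKED = re.compile(r"md: kicking non-fresh (\S+) from array!")
OPERATIONAL = re.compile(
    r"md/raid:(\S+): device (\S+) operational as raid disk (\d+)"
)


def classify(log: str) -> tuple[list[str], list[str]]:
    """Return (kicked devices, operational devices) found in the log text."""
    kicked, operational = [], []
    for line in log.splitlines():
        if m := KICKED.search(line):
            kicked.append(m.group(1))
        elif m := OPERATIONAL.search(line):
            operational.append(m.group(2))
    return kicked, operational


kicked, operational = classify(DMESG)
print("kicked:     ", kicked)
print("operational:", operational)
```

This only restates what the log already says: five of the six sdk slices were dropped as non-fresh at assembly, and sdk52 stayed in md52.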
%
% --
% With respect,
% Roman
Thanks again & HAND & Happy Thanksgiving in the US
:-D
--
David T-G
See http://justpickone.org/davidtg/email/
See http://justpickone.org/davidtg/tofu.txt