From: David T-G <davidtg-robot@justpickone.org>
To: Linux RAID list <linux-raid@vger.kernel.org>
Subject: Re: how do i fix these RAID5 arrays?
Date: Thu, 24 Nov 2022 21:10:20 +0000
Message-ID: <20221124211019.GE19721@jpo>
In-Reply-To: <20221124032821.628cd042@nvm>
Roman, et al --
...and then Roman Mamedov said...
% On Wed, 23 Nov 2022 22:07:36 +0000
% David T-G <davidtg-robot@justpickone.org> wrote:
%
% > diskfarm:~ # mdadm -D /dev/md50
...
% > 0 9 51 0 active sync /dev/md/51
% > 1 9 52 1 active sync /dev/md/52
% > 2 9 53 2 active sync /dev/md/53
% > 3 9 54 3 active sync /dev/md/54
% > 4 9 55 4 active sync /dev/md/55
% > 5 9 56 5 active sync /dev/md/56
%
% It feels you haven't thought this through entirely. Sequential writes to this
...
Well, it's at least possible that I don't know what I'm doing. I'm just
a dumb ol' Sys Admin, and I career-changed out of the biz a few years
back to boot. I'm certainly open to advice. Would changing the default
RAID5 or RAID0 chunk (stripe) size help?
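For what it's worth, here's roughly how I'd check the current value, and
how a from-scratch rebuild of md50 could set it explicitly (untested,
and 512K is just an example value):

diskfarm:~ # mdadm -D /dev/md50 | grep -i chunk
diskfarm:~ # mdadm --create /dev/md50 --level=0 --chunk=512 \
        --raid-devices=6 /dev/md5[1-6]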
...
%
% mdraid in the "linear" mode, or LVM with one large LV across all PVs (which
% are the individual RAID5 arrays), or multi-device Btrfs using "single" profile
% for data, all of those would avoid the described effect.
How is linear different from RAID0? I took a quick look but don't quite
know what I'm reading. If that's better, then hey, I'd try it (or at
least learn more).
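From my quick look, the only difference at creation time seems to be the
level name: RAID0 stripes every write across all members in chunk-sized
pieces, while linear just concatenates them, filling one member before
moving on to the next. Something like this, if I read the man page right
(untested):

diskfarm:~ # mdadm --create /dev/md50 --level=linear \
        --raid-devices=6 /dev/md5[1-6]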
I've played little enough with md, but I haven't played with LVM at all.
I imagine that it's fine to mix them since you've suggested it. Got any
pointers to a good primer? :-)
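In case anyone else here is as green as I am, my (untested) reading of
the basic LVM recipe, with made-up names and assuming the RAID0 layer on
top is already gone, is

diskfarm:~ # pvcreate /dev/md5[1-6]             # each RAID5 becomes a PV
diskfarm:~ # vgcreate farm /dev/md5[1-6]        # pool them into one VG
diskfarm:~ # lvcreate -l 100%FREE -n data farm  # one big LV across the lot
diskfarm:~ # mkfs.ext4 /dev/farm/data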
I don't want to try Btrfs. That's another area where I have no
experience, and from what I've seen and read I really don't want to go
there yet.
%
% But I should clarify, the entire idea of splitting drives like this seems
% questionable to begin with, since drives more often fail entirely, not in part,
...
% complete loss of data anyway. Not to mention what you have seems like an insane
% amount of complexity.
To make a long story short, my understanding of a big problem with RAID5
is that rebuilds take a ridiculously long time as the devices get larger.
Using smaller "devices", like partitions of the actual disk, helps get
around that. If I lose an entire disk, it's no worse than replacing an
entire disk: it's half a dozen rebuilds, but at least they come in small
chunks that I can manage. If I have read errors or bad-sector problems
on just one part, I can toss in a 2T disk to "spare" that piece until I
get another large drive and can replace each piece.
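Concretely, with made-up device names (untested): if, say, sdj53 starts
throwing read errors, I'd expect to do something like

diskfarm:~ # mdadm /dev/md53 --fail /dev/sdj53 --remove /dev/sdj53
diskfarm:~ # mdadm /dev/md53 --add /dev/sdx1

where sdx1 is a slice of the spare 2T disk, and then only that one piece
has to rebuild rather than the whole 10T drive.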
As I also understand it (I wasn't a storage engineer, but I did have to
automate against big shiny arrays), striping together RAID5 volumes is
pretty straightforward and pretty common. Maybe my problem is that I
need a couple of orders of magnitude more drives, though.
The whole idea is to allow fault tolerance while also allowing recovery,
with growth kept simple by adding another device every once in a while.
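(My mental model of adding a device, untested and assuming a
hypothetical new disk sdl partitioned the same way as the others: add
its slice to each RAID5 and grow, one array at a time, e.g.

diskfarm:~ # mdadm /dev/md51 --add /dev/sdl51
diskfarm:~ # mdadm --grow /dev/md51 --raid-devices=7

and so on for md52 through md56.)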
%
% To summarize, maybe it's better to blow away the entire thing and restart from
% the drawing board, while it's not too late? :)
I'm open to that idea as well, as long as I can understand where I'm
headed :-) But what's best?
%
% > diskfarm:~ # mdadm -D /dev/md5[13456] | egrep '^/dev|active|removed'
...
% > that are obviously the sdk (new disk) slice. If md52 were also broken,
% > I'd figure that the disk was somehow unplugged, but I don't think I can
...
% > and then re-add them to build and grow and finalize this?
%
% If you want to fix it still, without dmesg it's hard to say how this could
% have happened, but what does
%
% mdadm --re-add /dev/md51 /dev/sdk51
%
% say?
Only that it doesn't like the stale pieces; dmesg shows them getting
kicked as non-fresh at boot:
diskfarm:~ # dmesg | egrep sdk
[ 8.238044] sd 9:2:0:0: [sdk] 19532873728 512-byte logical blocks: (10.0 TB/9.10 TiB)
[ 8.238045] sd 9:2:0:0: [sdk] 4096-byte physical blocks
[ 8.238051] sd 9:2:0:0: [sdk] Write Protect is off
[ 8.238052] sd 9:2:0:0: [sdk] Mode Sense: 00 3a 00 00
[ 8.238067] sd 9:2:0:0: [sdk] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 8.290084] sdk: sdk51 sdk52 sdk53 sdk54 sdk55 sdk56 sdk128
[ 8.290747] sd 9:2:0:0: [sdk] Attached SCSI removable disk
[ 17.920802] md: kicking non-fresh sdk51 from array!
[ 17.923119] md/raid:md52: device sdk52 operational as raid disk 3
[ 18.307507] md: kicking non-fresh sdk53 from array!
[ 18.311051] md: kicking non-fresh sdk54 from array!
[ 18.314854] md: kicking non-fresh sdk55 from array!
[ 18.317730] md: kicking non-fresh sdk56 from array!
Does it look like --re-add will be safe? [Yes, maybe I'll start over,
but clearing this problem would be a nice first step.]
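(Before touching anything, my plan is to compare the event counters,
e.g.

diskfarm:~ # mdadm --examine /dev/sdk51 | grep -i events
diskfarm:~ # mdadm --detail /dev/md51 | grep -i events

since, as I understand it, --re-add can only catch a member up from the
write-intent bitmap if it isn't too far behind; otherwise mdadm refuses,
and a plain --add with a full rebuild is the fallback.)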
%
% --
% With respect,
% Roman
Thanks again & HAND & Happy Thanksgiving in the US
:-D
--
David T-G
See http://justpickone.org/davidtg/email/
See http://justpickone.org/davidtg/tofu.txt