From: David T-G <davidtg-robot@justpickone.org>
To: Linux RAID list <linux-raid@vger.kernel.org>
Subject: all of my drives are spares
Date: Fri, 8 Sep 2023 02:50:35 +0000
Message-ID: <20230908025035.GB1085@jpo>

Hi, all --

After a surprise reboot the other day, I came home to find diskfarm's
RAID5 arrays all offline with all disks marked as spares.  wtf?!?

After some googling around I found (for a recent example)

  https://ronhks.hu/2021/01/07/mdadm-raid-5-all-disk-became-spare/

which shows that it has happened to others, and at least the pieces are
all there rather than completely destroyed, but before I try stopping and
reassembling each array I thought I should double check :-)
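
For reference, the stop-and-reassemble step mentioned above can be
sketched as a dry run that only prints the commands for one array.  The
helper name plan_reassemble is made up for illustration, and the member
naming (/dev/sdX plus the array suffix, e.g. /dev/sdd51) is taken from the
dump below.  Whether a plain assemble succeeds with the mismatched event
counts is not established here.

```shell
# Print (do not run) the stop/assemble commands for one array.
# Usage: plan_reassemble ARRAY_SUFFIX DISK...
# plan_reassemble is a hypothetical helper, not part of mdadm.
plan_reassemble() {
  n=$1
  shift
  members=""
  # Build the member list, e.g. /dev/sdd51 /dev/sdb51 ...
  for d in "$@"; do
    members="$members /dev/${d}${n}"
  done
  echo "mdadm --stop /dev/md${n}"
  echo "mdadm --assemble /dev/md${n}${members}"
}
```

For example, `plan_reassemble 51 sdd sdb sdc sdl sdk sdf` prints the two
commands for md51 so they can be reviewed before anything is run.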

Below is the output of a big ol' debugging run.  I tried to dump only what
is interesting :-)  [The smartctl-disks-timeout.sh script is based on the
script from the wiki site, and checks the drive timeouts and sets them as
necessary.]  I'm not sure why sd[dbc] show a missing device while sd[lkf]
are happy on each array, and I wonder what happened to md53, whose event
counters differ so widely that assembly may get interesting (and why does
md52 have such a low event count when the six of these are linear striped
into a big fat array?).

Soooooo ...  What do you guys suggest to get us back up and happy?  TIA


  diskfarm:~ # uname -a ; mdadm --version ; for D in sd{d,c,b,l,k,f} ; do mdadm -E /dev/$D ; smartctl -H -i /dev/$D | egrep 'Model|SMART' | sed -e 's/^/    /' ; done ; echo '' ; for A in 51 52 53 54 55 56 ; do egrep md$A /proc/mdstat ; mdadm -D /dev/md$A | egrep 'Version|State|Events|/dev' ; for D in sd{d,b,c,l,k,f} ; do echo $D$A ; mdadm -E /dev/$D$A | egrep 'Raid|State|Events' ; done ; echo '' ; done ; /usr/local/bin/smartctl-disks-timeout.sh

  Linux diskfarm 5.3.18-lp152.106-default #1 SMP Mon Nov 22 08:38:17 UTC 2021 (52078fe) x86_64 x86_64 x86_64 GNU/Linux
  mdadm - v4.1 - 2018-10-01
  /dev/sdd:
     MBR Magic : aa55
  Partition[0] :   4294967295 sectors at            1 (type ee)
  	Device Model:     TOSHIBA HDWR11A
  	SMART support is: Available - device has SMART capability.
  	SMART support is: Enabled
  	=== START OF READ SMART DATA SECTION ===
  	SMART overall-health self-assessment test result: PASSED
  /dev/sdc:
     MBR Magic : aa55
  Partition[0] :   4294967295 sectors at            1 (type ee)
  	Device Model:     TOSHIBA HDWR11A
  	SMART support is: Available - device has SMART capability.
  	SMART support is: Enabled
  	=== START OF READ SMART DATA SECTION ===
  	SMART overall-health self-assessment test result: PASSED
  /dev/sdb:
     MBR Magic : aa55
  Partition[0] :   4294967295 sectors at            1 (type ee)
  	Device Model:     TOSHIBA HDWR11A
  	SMART support is: Available - device has SMART capability.
  	SMART support is: Enabled
  	=== START OF READ SMART DATA SECTION ===
  	SMART overall-health self-assessment test result: PASSED
  /dev/sdl:
     MBR Magic : aa55
  Partition[0] :   4294967295 sectors at            1 (type ee)
  	Device Model:     TOSHIBA HDWR11A
  	SMART support is: Available - device has SMART capability.
  	SMART support is: Enabled
  	=== START OF READ SMART DATA SECTION ===
  	SMART overall-health self-assessment test result: PASSED
  /dev/sdk:
     MBR Magic : aa55
  Partition[0] :   4294967295 sectors at            1 (type ee)
  	Device Model:     ST20000NM007D-3DJ103
  	SMART support is: Available - device has SMART capability.
  	SMART support is: Enabled
  	=== START OF READ SMART DATA SECTION ===
  	SMART overall-health self-assessment test result: PASSED
  /dev/sdf:
     MBR Magic : aa55
  Partition[0] :   4294967295 sectors at            1 (type ee)
  	Device Model:     ST20000NM007D-3DJ103
  	SMART support is: Available - device has SMART capability.
  	SMART support is: Enabled
  	=== START OF READ SMART DATA SECTION ===
  	SMART overall-health self-assessment test result: PASSED
  
  md51 : inactive sdd51[3](S) sdb51[0](S) sdc51[1](S) sdl51[4](S) sdk51[6](S) sdf51[5](S)
  /dev/md51:
             Version : 1.2
               State : inactive
              Events : 46655
         -     259       39        -        /dev/sdl51
         -     259        9        -        /dev/sdb51
         -     259       31        -        /dev/sdk51
         -     259       16        -        /dev/sdd51
         -     259        2        -        /dev/sdc51
         -     259       23        -        /dev/sdf51
  sdd51
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 46670
     Array State : AAAA.A ('A' == active, '.' == missing, 'R' == replacing)
  sdb51
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 46670
     Array State : AAAA.A ('A' == active, '.' == missing, 'R' == replacing)
  sdc51
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 46670
     Array State : AAAA.A ('A' == active, '.' == missing, 'R' == replacing)
  sdl51
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 46655
     Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
  sdk51
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 46655
     Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
  sdf51
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 46655
     Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
  
  md52 : inactive sdd52[3](S) sdb52[0](S) sdc52[1](S) sdl52[4](S) sdk52[6](S) sdf52[5](S)
  /dev/md52:
             Version : 1.2
               State : inactive
              Events : 16482
         -     259        3        -        /dev/sdc52
         -     259       24        -        /dev/sdf52
         -     259       40        -        /dev/sdl52
         -     259       10        -        /dev/sdb52
         -     259       32        -        /dev/sdk52
         -     259       17        -        /dev/sdd52
  sdd52
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 16482
     Array State : AAAA.A ('A' == active, '.' == missing, 'R' == replacing)
  sdb52
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 16482
     Array State : AAAA.A ('A' == active, '.' == missing, 'R' == replacing)
  sdc52
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 16482
     Array State : AAAA.A ('A' == active, '.' == missing, 'R' == replacing)
  sdl52
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 16478
     Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
  sdk52
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 16478
     Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
  sdf52
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 16478
     Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
  
  md53 : inactive sdd53[3](S) sdc53[1](S) sdb53[0](S) sdl53[4](S) sdk53[6](S) sdf53[5](S)
  /dev/md53:
             Version : 1.2
               State : inactive
              Events : 41470
         -     259       33        -        /dev/sdk53
         -     259       18        -        /dev/sdd53
         -     259        4        -        /dev/sdc53
         -     259       25        -        /dev/sdf53
         -     259       41        -        /dev/sdl53
         -     259       11        -        /dev/sdb53
  sdd53
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 53337
     Array State : AAAA.A ('A' == active, '.' == missing, 'R' == replacing)
  sdb53
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 53337
     Array State : AAAA.A ('A' == active, '.' == missing, 'R' == replacing)
  sdc53
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 53337
     Array State : AAAA.A ('A' == active, '.' == missing, 'R' == replacing)
  sdl53
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 41470
     Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
  sdk53
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 41470
     Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
  sdf53
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 41470
     Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
  
  md54 : inactive sdd54[3](S) sdc54[1](S) sdb54[0](S) sdl54[4](S) sdk54[6](S) sdf54[5](S)
  /dev/md54:
             Version : 1.2
               State : inactive
              Events : 37400
         -     259        5        -        /dev/sdc54
         -     259       26        -        /dev/sdf54
         -     259       42        -        /dev/sdl54
         -     259       12        -        /dev/sdb54
         -     259       34        -        /dev/sdk54
         -     259       19        -        /dev/sdd54
  sdd54
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 37400
     Array State : AAAA.A ('A' == active, '.' == missing, 'R' == replacing)
  sdb54
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 37400
     Array State : AAAA.A ('A' == active, '.' == missing, 'R' == replacing)
  sdc54
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 37400
     Array State : AAAA.A ('A' == active, '.' == missing, 'R' == replacing)
  sdl54
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 37377
     Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
  sdk54
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 37377
     Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
  sdf54
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 37377
     Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
  
  md55 : inactive sdd55[3](S) sdc55[1](S) sdb55[0](S) sdl55[4](S) sdk55[6](S) sdf55[5](S)
  /dev/md55:
             Version : 1.2
               State : inactive
              Events : 42328
         -     259       35        -        /dev/sdk55
         -     259       20        -        /dev/sdd55
         -     259        6        -        /dev/sdc55
         -     259       27        -        /dev/sdf55
         -     259       43        -        /dev/sdl55
         -     259       13        -        /dev/sdb55
  sdd55
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 42332
     Array State : AAAA.A ('A' == active, '.' == missing, 'R' == replacing)
  sdb55
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 42332
     Array State : AAAA.A ('A' == active, '.' == missing, 'R' == replacing)
  sdc55
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 42332
     Array State : AAAA.A ('A' == active, '.' == missing, 'R' == replacing)
  sdl55
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 42328
     Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
  sdk55
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 42328
     Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
  sdf55
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 42328
     Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
  
  md56 : inactive sdd56[3](S) sdb56[0](S) sdc56[1](S) sdl56[4](S) sdk56[6](S) sdf56[5](S)
  /dev/md56:
             Version : 1.2
               State : inactive
              Events : 43091
         -     259        7        -        /dev/sdc56
         -     259       28        -        /dev/sdf56
         -     259       44        -        /dev/sdl56
         -     259       14        -        /dev/sdb56
         -     259       36        -        /dev/sdk56
         -     259       21        -        /dev/sdd56
  sdd56
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 43091
     Array State : AAAA.A ('A' == active, '.' == missing, 'R' == replacing)
  sdb56
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 43091
     Array State : AAAA.A ('A' == active, '.' == missing, 'R' == replacing)
  sdc56
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 43091
     Array State : AAAA.A ('A' == active, '.' == missing, 'R' == replacing)
  sdl56
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 43087
     Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
  sdk56
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 43087
     Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
  sdf56
       Raid Level : raid5
     Raid Devices : 6
            State : clean
           Events : 43087
     Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
  
  Drive timeouts: sda Y ; sdb Y ; sdc Y ; sdd Y ; sde Y ; sdf Y ; sdg 180 ; sdh Y ; sdi Y ; sdj Y ; sdk Y ; sdl Y ; sdm Y ;


:-D
-- 
David T-G
See http://justpickone.org/davidtg/email/
See http://justpickone.org/davidtg/tofu.txt

