Linux RAID subsystem development
* SpareMissing event, but spare not missing
@ 2013-05-15 13:51 Ole Tange
  2013-05-15 15:24 ` Roy Sigurd Karlsbakk
  0 siblings, 1 reply; 3+ messages in thread
From: Ole Tange @ 2013-05-15 13:51 UTC (permalink / raw)
  To: linux-raid

The last couple of days I have received a warning about a missing
spare for a device that has several spares.

  A SparesMissing event had been detected on md device /dev/md1.
  :
  md1 : active raid6 sdas[12](S) sdx[0] sdaq[11](S) sdau[13](S) sdah[9] sdaf[8] sdae[7] sdad[6] sdac[5] sdab[4] sdaa[3] sdz[2] sdy[1]
      31256138752 blocks super 1.2 level 6, 128k chunk, algorithm 2 [10/10] [UUUUUUUUUU]
      bitmap: 0/2 pages [0KB], 1048576KB chunk

I have tried removing one of the spares and adding it again. Same error.

$ mdadm --version
mdadm - v3.1.4 - 31st August 2010
$ uname -a
Linux orsted 3.2.0-0.bpo.1-amd64 #1 SMP Sat Feb 11 08:41:32 UTC 2012 x86_64 GNU/Linux

/Ole


The automated mail:

This is an automatically generated mail message from mdadm
running on orsted

A SparesMissing event had been detected on md device /dev/md1.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [raid6] [raid5] [raid4] [raid0]
md3 : active raid0 md1[0] md2[1]
      62512275456 blocks super 1.2 512k chunks

md2 : active raid6 sdat[10] sdaj[0] sdao[8] sdag[7] sdap[6] sdan[5] sdai[4] sdam[3] sdal[2] sdak[1]
      31256138752 blocks super 1.2 level 6, 128k chunk, algorithm 2 [10/10] [UUUUUUUUUU]
      bitmap: 0/2 pages [0KB], 1048576KB chunk

md1 : active raid6 sdas[12](S) sdx[0] sdaq[11](S) sdau[13](S) sdah[9] sdaf[8] sdae[7] sdad[6] sdac[5] sdab[4] sdaa[3] sdz[2] sdy[1]
      31256138752 blocks super 1.2 level 6, 128k chunk, algorithm 2 [10/10] [UUUUUUUUUU]
      bitmap: 0/2 pages [0KB], 1048576KB chunk

md0 : active raid6 sdw[28](S) sdd[25] sdu[19] sdv[22] sdp[26] sds[16] sdr[15] sdq[14] sdo[23] sdn[12] sdm[11] sdl[10] sdk[9] sdj[8] sdi[7] sdh[6] sdc[20] sdf[4] sde[3] sdb[21] sdt[24]
      52744776192 blocks super 1.2 level 6, 256k chunk, algorithm 2 [20/20] [UUUUUUUUUUUUUUUUUUUU]
      bitmap: 1/2 pages [4KB], 1048576KB chunk

unused devices: <none>


* Re: SpareMissing event, but spare not missing
  2013-05-15 13:51 SpareMissing event, but spare not missing Ole Tange
@ 2013-05-15 15:24 ` Roy Sigurd Karlsbakk
  2013-05-16  8:02   ` Ole Tange
  0 siblings, 1 reply; 3+ messages in thread
From: Roy Sigurd Karlsbakk @ 2013-05-15 15:24 UTC (permalink / raw)
  To: Ole Tange; +Cc: linux-raid

- Anything in dmesg?
- What does /etc/mdadm/mdadm.conf say?

Also, using 20 disks in a single RAID-6 gives you the same chances for a parity+1 error (or worse) as compared to 10 drives in RAID-5. I would really recommend using smaller (8+2?) RAID-6 sets and rather use LVM on top (which you may be doing already?). Even with proper cooling and enterprise drives, 20 drives in a single RAID-6 is asking for trouble…
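One rough way to put numbers on a comparison like this is a toy binomial model. The sketch below assumes every drive fails independently with the same probability p during the window of interest; p is a made-up parameter, not something measured in this thread. The model ignores correlated failures and read errors hit during a rebuild, so treat it as an illustration only. Under this model the ranking of the two layouts depends on the value of p chosen.

```python
from math import comb

def p_at_least(k, n, p):
    """Probability that at least k of n independent drives fail,
    each with failure probability p (binomial tail)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def p_array_loss(drives, parity, p):
    """An array is lost when more drives fail than it has parity devices."""
    return p_at_least(parity + 1, drives, p)

# p values here are purely hypothetical per-drive failure probabilities.
for p in (0.01, 0.05, 0.10):
    raid6_20 = p_array_loss(20, 2, p)   # 20 drives, RAID-6
    raid5_10 = p_array_loss(10, 1, p)   # 10 drives, RAID-5
    print(f"p={p}: RAID-6 x20 {raid6_20:.4f}  RAID-5 x10 {raid5_10:.4f}")
```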

roy


-- 
Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 98013356
roy@karlsbakk.net
http://blogg.karlsbakk.net/
GPG Public key: http://karlsbakk.net/roysigurdkarlsbakk.pubkey.txt
--
In all pedagogy it is essential that the curriculum is presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms with xenotypic etymology. In most cases, adequate and relevant synonyms exist in Norwegian.
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: SpareMissing event, but spare not missing
  2013-05-15 15:24 ` Roy Sigurd Karlsbakk
@ 2013-05-16  8:02   ` Ole Tange
  0 siblings, 0 replies; 3+ messages in thread
From: Ole Tange @ 2013-05-16  8:02 UTC (permalink / raw)
  To: Roy Sigurd Karlsbakk; +Cc: linux-raid

On Wed, May 15, 2013 at 5:24 PM, Roy Sigurd Karlsbakk <roy@karlsbakk.net> wrote:
> - Anything in dmesg?

Nope.

> - What does /etc/mdadm/mdadm.conf say?

Good call. It was generated automatically when there were 4 spares, so
it had 'spares=4' in it. There are no longer 4 spares (md1 now has 3),
which would explain the message.

I have now removed the 'spares=4' and restarted mdadm --monitor.
Tomorrow we will see if that fixed it.
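For anyone hitting the same warning, the check can be sketched as a comparison of the spares=N value on each ARRAY line in mdadm.conf with the number of devices flagged (S) in /proc/mdstat. The script below is a minimal sketch and not mdadm's own logic. The ARRAY line in EXAMPLE_CONF is a hypothetical reconstruction (the real file was not posted), and the mapping from /dev/md1 to md1 is an assumption that fails for named arrays such as /dev/md/foo.

```python
import re

def configured_spares(conf_text):
    """Map array device -> N for ARRAY lines carrying spares=N.
    Simplification: continuation lines in mdadm.conf are not handled."""
    result = {}
    for line in conf_text.splitlines():
        words = line.split()
        if len(words) < 2 or words[0].upper() != "ARRAY":
            continue
        for word in words[2:]:
            if word.lower().startswith("spares="):
                result[words[1]] = int(word.split("=", 1)[1])
    return result

def spares_in_mdstat(mdstat_text):
    """Map md name -> number of member devices flagged (S)."""
    counts = {}
    current = None
    for line in mdstat_text.splitlines():
        match = re.match(r"(md\d+)\s*:", line)
        if match:
            current = match.group(1)
            counts[current] = 0
        elif not line.strip():
            current = None          # a blank line ends the stanza
        if current:
            counts[current] += len(re.findall(r"\]\(S\)", line))
    return counts

def missing_spares(conf_text, mdstat_text):
    """Return {md name: (configured, present)} where present < configured."""
    present = spares_in_mdstat(mdstat_text)
    report = {}
    for device, wanted in configured_spares(conf_text).items():
        name = device.rsplit("/", 1)[-1]   # assumption: /dev/md1 -> md1
        if present.get(name, 0) < wanted:
            report[name] = (wanted, present.get(name, 0))
    return report

# Hypothetical config line; only 'spares=4' is known from the thread.
EXAMPLE_CONF = "ARRAY /dev/md1 metadata=1.2 spares=4\n"
# md1 stanza as posted in this thread.
EXAMPLE_MDSTAT = (
    "md1 : active raid6 sdas[12](S) sdx[0] sdaq[11](S) sdau[13](S) sdah[9] "
    "sdaf[8] sdae[7] sdad[6] sdac[5] sdab[4] sdaa[3] sdz[2] sdy[1]\n"
    "      31256138752 blocks super 1.2 level 6, 128k chunk, algorithm 2 "
    "[10/10] [UUUUUUUUUU]\n"
)

print(missing_spares(EXAMPLE_CONF, EXAMPLE_MDSTAT))  # {'md1': (4, 3)}
```

On a live system the two strings would come from reading /etc/mdadm/mdadm.conf and /proc/mdstat.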

> Also, using 20 disks in a single RAID-6 gives you the same chances for a parity+1 error (or worse) as compared to 10 drives in RAID-5. I would really recommend using smaller (8+2?) RAID-6 sets and rather use LVM on top (which you may be doing already?). Even with proper cooling and enterprise drives, 20 drives in a single RAID-6 is asking for trouble…

We are migrating to a RAID60 2x10.

The major reason for this is the time to rebuild: to rebuild one of
the 20 drives we have to read the remaining 19 drives, and performance
is slower while that happens. On our system a rebuild would take at
least 4 days.
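As a back-of-the-envelope check on that figure, the md0 line in /proc/mdstat gives enough to estimate the member size and the average rebuild rate that 4 days implies. The sketch assumes mdstat blocks are 1 KiB and that usable capacity is simply data devices times member size, ignoring metadata offsets. Since the rebuild takes "at least" 4 days, the computed rate is an upper bound on the average.

```python
KIB = 1024
SECONDS_PER_DAY = 86400

def member_size_bytes(array_blocks_kib, devices, parity_devices):
    """Approximate per-member size from the usable array size."""
    data_devices = devices - parity_devices
    return array_blocks_kib * KIB // data_devices

def implied_rebuild_rate(member_bytes, days):
    """Average bytes per second needed to rewrite one member in `days`."""
    return member_bytes / (days * SECONDS_PER_DAY)

# md0 as shown in the thread: 52744776192 blocks, RAID-6 over 20 devices.
md0_member = member_size_bytes(52744776192, devices=20, parity_devices=2)
rate = implied_rebuild_rate(md0_member, days=4)

print(f"{md0_member / 1e12:.2f} TB per member, {rate / 1e6:.1f} MB/s")
# prints "3.00 TB per member, 8.7 MB/s"
```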


/Ole
