Massive RAID-1 desync

linux-raid.vger.kernel.org archive mirror

* Massive RAID-1 desync
From: cau2jeaf1honoq @ 2015-04-24 20:47 UTC (permalink / raw)
  To: linux-raid

Something is happening here. I don't know what, but I'm having
fun trying to guess.

The root file system (ext3) is on a 4 x 30 GB RAID-1 array. A
couple of hours after boot, the kernel detected something wrong
in the file system and remounted it read-only.

Comparing the component partitions shows many differences, very
unevenly distributed:

- sda1 and sdb1 are identical except for 4 bytes in the last
  70 kB,

- sdd1 is identical to sda1 and sdb1 except for about 67,000
  differences in the last 70 kB.

- sdc1 is grossly out of sync with about 300 million differences
  with the others, all of them in the first 450 MB or so.
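For anyone wanting to reproduce this kind of comparison, here is a
minimal sketch in Python. It is not the tool used for the figures
above; the function name and the bucket size are made up for
illustration. It counts differing bytes per 1 MiB region of two
images (or component devices, read as root):

```python
from collections import Counter

def diff_histogram(path_a, path_b, bucket_size=1 << 20, chunk_size=1 << 16):
    """Return {bucket_index: differing_byte_count} for two files.

    Only the common length is compared; buckets with no differences
    are absent from the result.
    """
    counts = Counter()
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a or not b:
                break
            if a != b:
                # Slow path: only walk byte by byte when the chunks differ.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        counts[(offset + i) // bucket_size] += 1
            offset += min(len(a), len(b))
    return dict(counts)
```

Calling diff_histogram("/dev/sda1", "/dev/sdc1") and printing the
sorted keys shows at a glance whether the differences cluster at the
start, at the end, or are spread over the whole device.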

I'm not sure what to make of this. The knee-jerk thought would
be "/dev/sdc1 is the odd man out so sdc must be faulty". But
that disk participates in other arrays without problems, I don't
see anything obviously bad in its SMART data and the kernel
messages just before the remount were actually about sda.

To be honest, I don't have a clear idea of how things got where
they are. Since writing to a RAID-1 array writes the same data
to all devices, how can there be so many differences?


* Re: Massive RAID-1 desync
From: Mikael Abrahamsson @ 2015-04-25  4:02 UTC (permalink / raw)
  To: cau2jeaf1honoq; +Cc: linux-raid

On Fri, 24 Apr 2015, cau2jeaf1honoq@laposte.net wrote:

> Something is happening here. I don't know what, but I'm having
> fun trying to guess.

What kernel version are you running? Some other information would be 
interesting as well, such as what /proc/mdstat is saying, and anything 
from dmesg or similar logs leading up to this...

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se


* Re: Massive RAID-1 desync
From: Jean-Baptiste Thomas @ 2015-04-25  6:18 UTC (permalink / raw)
  To: Mikael Abrahamsson; +Cc: linux-raid

On 2015-04-25 06:02 +0200, Mikael Abrahamsson wrote:

> What kernel version are you running?

Linux 3.0.14.
mdadm 3.1.4 (2010-08-31)

> Some other information would be
> interesting as well, such as what /proc/mdstat is saying,

Good point. Curiously, nothing unusual:

md1 : active raid1 sdd1[3] sdb1[0] sdc1[2] sda1[1]
      31463232 blocks [4/4] [UUUU]

> and anything from dmesg or similar logs leading up to this...

Too late, /var/log was on the root file system and lockd has 
since helpfully flooded the kernel ring buffer.


* Re: Massive RAID-1 desync
From: NeilBrown @ 2015-04-25  7:25 UTC (permalink / raw)
  To: cau2jeaf1honoq; +Cc: linux-raid

On Fri, 24 Apr 2015 22:47:49 +0200 (CEST) cau2jeaf1honoq@laposte.net wrote:

> Something is happening here. I don't know what, but I'm having
> fun trying to guess.
> 
> The root file system (ext3) is on a 4 x 30 GB RAID-1 array. A
> couple hours after boot, the kernel detected something wrong in
> the file system and decided to remount it read-only.
> 
> Comparing the component partitions finds many differences with a
> very uneven distribution :
> 
> - sda1 and sdb1 are identical except for 4 bytes in the last
>   70 kB,

Perfectly normal.  Metadata is at the end, at least 64K from the end and 64K
aligned.

> 
> - sdd1 is identical to sda1 and sdb1 except for about 67,000
>   differences in the last 70 kB.

Following the metadata is between 60K and 124K of nothing.  It could easily
be completely different on different devices.
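To make the layout concrete, here is a small sketch of where a 0.90
superblock sits, assuming the usual rule (round the device size down
to a multiple of 64K, then step back another 64K) and a 4K
superblock. The 70 KiB of extra space in the example is a
hypothetical figure chosen for illustration; the real partition size
is not given in this thread.

```python
MD_RESERVED_BYTES = 64 * 1024  # 0.90 reserves a 64K-aligned area at the end
MD_SB_BYTES = 4096             # size of the 0.90 superblock itself

def sb_offset_090(device_bytes):
    """Byte offset of the 0.90 superblock on a device of the given size."""
    return (device_bytes & ~(MD_RESERVED_BYTES - 1)) - MD_RESERVED_BYTES

def trailing_slack(device_bytes):
    """Unused bytes between the end of the superblock and the end of the device."""
    return device_bytes - (sb_offset_090(device_bytes) + MD_SB_BYTES)

# Hypothetical partition: the 31463232 KiB reported in /proc/mdstat plus 70 KiB.
size = (31463232 + 70) * 1024
print(sb_offset_090(size) // 1024)   # superblock offset in KiB
print(trailing_slack(size) // 1024)  # unused tail in KiB
```

With these assumptions the data area ends exactly at the superblock
offset, which is why the array size in /proc/mdstat is a multiple of
64 KiB, and the unused tail ranges from 60K (device size a multiple
of 64K) to just under 124K.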


> 
> - sdc1 is grossly out of sync with about 300 million differences
>   with the others, all of them in the first 450 MB or so.

sdc1 is sick.
Maybe it has hardware problems.  Maybe some hacker broke into your machine
and wrote garbage to it.  Or maybe you triggered a bug that no one else has
ever come across (unlikely, but possible).

> 
> I'm not sure what to make of this. The knee-jerk thought would
> be "/dev/sdc1 is the odd man out so sdc must be faulty". But
> that disk participates in other arrays without problems, I don't
> see anything obviously bad in its SMART data and the kernel
> messages just before the remount were actually about sda.

And what were those messages about sda?

> 
> To be honest, I don't have a clear idea of how things got where 
> they are. Since writing to a RAID-1 array writes the same data
> to all devices, how can you have so many differences ?

Cosmic rays?  EMP?

I actually think that the most likely explanation is that someone was
careless and wrote something to sdc that they didn't mean to.  But I'm
probably wrong.  I like guessing too.

NeilBrown



* Re: Massive RAID-1 desync
From: Jean-Baptiste Thomas @ 2015-04-26  8:48 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

On 2015-04-25 17:25 +1000, NeilBrown wrote:

> Perfectly normal.  Metadata is at the end, at least 64K from the end
> and 64K aligned.

Yes. Format 0.90.

> And what were those messages about sda?

The actual messages have been displaced by lockd's rambling but, as
far as I remember, it was this sort of thing:

ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata1.00: BMDMA stat 0x4
ata1.00: failed command: READ DMA EXT
ata1.00: cmd 25/00:80:a9:54:70/00:00:74:00:00/e0 tag 0 dma 65536 in
         res 51/40:00:25:55:70/40:00:74:00:00/e0 Emask 0x9 (media error)
ata1.00: status: { DRDY ERR }
ata1.00: error: { UNC }
ata1.00: configured for UDMA/133
ata1: EH complete
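For reference, the "cmd" and "res" lines can be decoded to find which
sector was requested and which one failed. The sketch below assumes
the field order libata uses in these reports, as I understand it:
command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device.
The function name is made up, and since the log above is quoted from
memory the numbers are only illustrative.

```python
def lba48_from_taskfile(tf):
    """Extract the 48-bit LBA from a libata 'cmd' or 'res' taskfile string."""
    _cmd, low, high, _dev = tf.strip().split("/")
    _feat, _nsect, lbal, lbam, lbah = (int(x, 16) for x in low.split(":"))
    _hfeat, _hnsect, hlbal, hlbam, hlbah = (int(x, 16) for x in high.split(":"))
    return ((hlbah << 40) | (hlbam << 32) | (hlbal << 24)
            | (lbah << 16) | (lbam << 8) | lbal)

start = lba48_from_taskfile("25/00:80:a9:54:70/00:00:74:00:00/e0")
failed = lba48_from_taskfile("51/40:00:25:55:70/40:00:74:00:00/e0")
print(hex(start), hex(failed), failed - start)
```

Under that assumption the read started at LBA 0x747054a9 for 0x80
sectors (matching "dma 65536"), and the drive reported the
uncorrectable sector at 0x74705525, which is 124 sectors into the
request.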

I ran e2fsck on copies of sda1 and sdc1. They are both heavily damaged,
not just sdc1.

Looks like I'm going to have to replace a disk and see. I'd like to
avoid replacing two, though. Or going through more crashes. Does 
MD have a paranoid mode in which reading a sector from a RAID-1
device would not return successfully until it got matching data
from at least two components?


* Re: Massive RAID-1 desync
From: NeilBrown @ 2015-04-28 21:39 UTC (permalink / raw)
  To: Jean-Baptiste Thomas; +Cc: linux-raid

On Sun, 26 Apr 2015 10:48:37 +0200 (CEST) Jean-Baptiste Thomas
<cau2jeaf1honoq@laposte.net> wrote:

> On 2015-04-25 17:25 +1000, NeilBrown wrote:
> 
> > Perfectly normal.  Metadata is at the end, at least 64K from the end
> > and 64K aligned.
> 
> Yes. Format 0.90.
> 
> > And what were those messages about sda?
> 
> The actual messages have been displaced by lockd's rambling but as I
> remember, it was this sort of thing :
> 
> ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
> ata1.00: BMDMA stat 0x4
> ata1.00: failed command: READ DMA EXT
> ata1.00: cmd 25/00:80:a9:54:70/00:00:74:00:00/e0 tag 0 dma 65536 in
>          res 51/40:00:25:55:70/40:00:74:00:00/e0 Emask 0x9 (media error)
> ata1.00: status: { DRDY ERR }
> ata1.00: error: { UNC }
> ata1.00: configured for UDMA/133
> ata1: EH complete

A clean "media error" on READ should result in md rewriting the block
with data from another mirror and, if that write fails, the drive
being ejected.  I wonder if the controller got confused.

> 
> I ran e2fsck on copies of sda1 and sdc1. They are both heavily damaged,
> not just sdc1.

That is rather sad.  I'm having trouble imagining any scenario that would
result in the symptoms you are seeing.  Very odd.

> 
> Looks like I'm going to have to replace a disk and see. I'd like to
> avoid replacing two, though. Or going through more crashes. Does 
> MD have a paranoid mode in which reading a sector from a RAID-1
> device would not return successfully until it got matching data
> from at least two components ?

As mentioned separately: no.
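MD will not do this at read time, but the idea can be sketched
offline, for instance when deciding which copy of a block to trust
while salvaging data from images of the components. This is purely
illustrative Python, not anything MD implements; the function name
and the tie handling are assumptions.

```python
from collections import Counter

def quorum_read(copies, quorum=2):
    """Return the block content that at least `quorum` copies agree on.

    `copies` is a list of bytes objects, one per mirror. Raises
    ValueError if no content reaches the quorum, or if two different
    contents are equally popular (e.g. a 2-2 split on four mirrors).
    """
    ranked = Counter(copies).most_common(2)
    if not ranked or ranked[0][1] < quorum:
        raise ValueError("fewer than %d copies agree" % quorum)
    if len(ranked) > 1 and ranked[1][1] == ranked[0][1]:
        raise ValueError("tie between different contents")
    return ranked[0][0]
```

Given the same block read from each of the four components,
quorum_read([a, b, c, d]) returns the majority content and refuses to
guess when the mirrors disagree too much.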

If it were me, I'd probably be feeling suspicious of the controller at this
point.  If it is a cheap one, maybe replace it.

NeilBrown


