linux-raid.vger.kernel.org archive mirror
* mismatch_cnt of 128 and 1408.
@ 2008-03-24  7:36 Janek Kozicki
  2008-03-24 20:23 ` Oliver Martin
  2008-03-25  3:33 ` Neil Brown
  0 siblings, 2 replies; 5+ messages in thread
From: Janek Kozicki @ 2008-03-24  7:36 UTC
  To: linux-raid

Hello,

My backup box with raid5 recently suffered a series of power
failures. Now I'm doing some recovery hoping that power surges did
not damage HDDs.

I have here
- md0 a raid1 of 3*1GB=1GB (root partition) and
- md1 a raid5 of 3*500GB=1000GB (backup partition) and
- md2 a raid1 of 3*4GB=4GB (swap).

I did the following things:

1. Powered on the server; incidentally, the mount count for md1 forced
   an fsck, which found two errors.

2. /usr/share/mdadm/checkarray -a
   which runs a redundancy check of all raid partitions. It takes
   5 hours for the 1TB backup partition.

3. Afterwards, the command cat /sys/block/md?/md/mismatch_cnt gave:
   md0: 128
   md1: 0
   md2: 1408

4. touch /forcefsck ; shutdown -r now

5. then after reboot: /usr/share/mdadm/checkarray -a ; cat /sys/block/md?/md/mismatch_cnt
   md0: 128
   md1: 0
   md2: 1408
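
Note that a bare cat over the glob prints only the numbers, so the
"md0:" style labels above have to be matched up by hand. A minimal
sketch that prints each array name next to its count follows. The
function name show_mismatch_counts and its optional sysfs-root
argument (default /sys) are made up for this example, so that it can
also be tried against a scratch directory:

```shell
#!/bin/sh
# Print "<array>: <mismatch_cnt>" for every md array under a sysfs root.
# show_mismatch_counts and its optional root argument are illustrative names.
show_mismatch_counts() {
    root="${1:-/sys}"
    for f in "$root"/block/md*/md/mismatch_cnt; do
        [ -r "$f" ] || continue   # glob did not match, or file is unreadable
        # .../block/md0/md/mismatch_cnt -> md0
        dev=$(basename "$(dirname "$(dirname "$f")")")
        printf '%s: %s\n' "$dev" "$(cat "$f")"
    done
}
```

Calling show_mismatch_counts with no argument reads the live /sys tree.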

Now I am alarmed that something might be wrong with the root
partition. From other posts I remember that mismatch count for swap
partitions is allowed to be nonzero.

I checked the SMART health status of the HDDs with smartctl -H and the
drives are reported healthy.


Is a mismatch count of 128 an indication of a bad HDD? How can I
discover which one is bad?

-- 
Janek Kozicki


* Re: mismatch_cnt of 128 and 1408.
  2008-03-24  7:36 mismatch_cnt of 128 and 1408 Janek Kozicki
@ 2008-03-24 20:23 ` Oliver Martin
  2008-03-25  3:33 ` Neil Brown
  1 sibling, 0 replies; 5+ messages in thread
From: Oliver Martin @ 2008-03-24 20:23 UTC
  To: Janek Kozicki; +Cc: linux-raid

On Mon, 24 Mar 2008 08:36:51 +0100, Janek Kozicki wrote:
> 
> Now I am alarmed that something might be wrong with the root
> partition. From other posts I remember that mismatch count for swap
> partitions is allowed to be nonzero.

Apparently it can also happen when a file is truncated between the
writes to the individual disks:
http://www.mail-archive.com/linux-raid@vger.kernel.org/msg07546.html

Or could it just be that it was writing when the power failed?

> 
> I checked smart health status of HDDs with smartctl -H and the drives
> are healthy.
> 
> 
> Is mismatch count = 128 an indication of bad HDD? How to discover
> which one is bad?
> 

You could run a long SMART self-test. If the downtime doesn't matter,
you could also boot from a live CD and run badblocks -n.
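
A sketch of what those two suggestions might look like per disk. The
helper below only prints the commands, so they can be reviewed before
anything is run; the function name and the /dev/sdX names are
placeholders, not taken from this thread (the real member disks can
be listed with mdadm --detail /dev/md0):

```shell
#!/bin/sh
# Print (do not run) the health-check commands for each member disk.
# print_disk_checks and the device names are placeholders for illustration.
print_disk_checks() {
    for disk in "$@"; do
        echo "smartctl -t long $disk"       # start the long SMART self-test
        echo "smartctl -l selftest $disk"   # read the self-test log once it finishes
        echo "badblocks -nsv $disk"         # non-destructive read-write scan with progress
    done
}

print_disk_checks /dev/sda /dev/sdb /dev/sdc
```

badblocks -n needs the disk to be unused (hence the live CD), and it
rewrites every block it tests, so a current backup is advisable.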

-- 
Oliver


* Re: mismatch_cnt of 128 and 1408.
  2008-03-24  7:36 mismatch_cnt of 128 and 1408 Janek Kozicki
  2008-03-24 20:23 ` Oliver Martin
@ 2008-03-25  3:33 ` Neil Brown
  2008-03-25 12:43   ` Janek Kozicki
  2008-03-26  2:07   ` Guy Watkins
  1 sibling, 2 replies; 5+ messages in thread
From: Neil Brown @ 2008-03-25  3:33 UTC
  To: Janek Kozicki; +Cc: linux-raid

On Monday March 24, janek_listy@wp.pl wrote:
> 
> 5. then after reboot: /usr/share/mdadm/checkarray -a ; cat /sys/block/md?/md/mismatch_cnt
>    md0: 128
>    md1: 0
>    md2: 1408
> 
> Now I am alarmed that something might be wrong with the root
> partition. From other posts I remember that mismatch count for swap
> partitions is allowed to be nonzero.

Yes, the 'swap' is nothing to worry about.

As md does its checks in units of around 128 sectors, the "md0: 128"
probably just means a single sector is different between the two.
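
As a rough illustration of that arithmetic, here is a sketch that
converts a mismatch_cnt value into check units. It assumes the
granularity is exactly 128 sectors and that sectors are 512 bytes;
the function name is made up for this example:

```shell
#!/bin/sh
# Convert a mismatch_cnt value (in sectors) into the number of check units
# it covers, assuming 128-sector units of 512-byte sectors.
mismatch_units() {
    sectors="$1"
    unit=128                                   # assumed check granularity, in sectors
    units=$(( (sectors + unit - 1) / unit ))   # round up to whole units
    kib=$(( sectors * 512 / 1024 ))            # upper bound on the flagged area, in KiB
    printf '%s sectors = %s unit(s), at most %s KiB\n' "$sectors" "$units" "$kib"
}

mismatch_units 128    # the md0 value
mismatch_units 1408   # the md2 value
```

Under those assumptions, the 128 on md0 is a single flagged unit and
the 1408 on md2 is eleven of them. The count says how much was
flagged, not how many sectors inside each unit actually differ.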

I recommend

  echo repair > /sys/block/md0/md/sync_action

That will fix it.
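
A sketch of the whole sequence, for anyone scripting it. The helper
names and the optional sysfs-root argument (default /sys) are
hypothetical. It also assumes that sync_action reads back "idle" once
the pass has finished, and that mismatch_cnt after a repair still
shows what that pass found, so a follow-up check is what should
report 0:

```shell
#!/bin/sh
# Hypothetical helpers. The optional last argument overrides the sysfs root
# (default /sys) so the functions can be exercised on a scratch directory.

# Ask md to start an action ("check" or "repair") on one array.
md_start_action() {
    dev="$1"; action="$2"; root="${3:-/sys}"
    echo "$action" > "$root/block/$dev/md/sync_action"
}

# Poll until the array reports "idle", i.e. the action has finished.
md_wait_idle() {
    dev="$1"; root="${2:-/sys}"
    while [ "$(cat "$root/block/$dev/md/sync_action")" != "idle" ]; do
        sleep 10
    done
}

# Example, as root on the real system:
#   md_start_action md0 repair && md_wait_idle md0
#   md_start_action md0 check  && md_wait_idle md0
#   cat /sys/block/md0/md/mismatch_cnt
```

Progress of a running pass can also be watched in /proc/mdstat.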

It is probably worth running another 'fsck -f' after that, just to be
on the safe side.

NeilBrown


* Re: mismatch_cnt of 128 and 1408.
  2008-03-25  3:33 ` Neil Brown
@ 2008-03-25 12:43   ` Janek Kozicki
  2008-03-26  2:07   ` Guy Watkins
  1 sibling, 0 replies; 5+ messages in thread
From: Janek Kozicki @ 2008-03-25 12:43 UTC
  To: linux-raid

Neil Brown said:     (by the date of Tue, 25 Mar 2008 14:33:53 +1100)

> I recommend
>   echo repair > /sys/block/md0/md/sync_action
> That will fix it.

Great, thanks. It worked :-)

-- 
Janek Kozicki


* RE: mismatch_cnt of 128 and 1408.
  2008-03-25  3:33 ` Neil Brown
  2008-03-25 12:43   ` Janek Kozicki
@ 2008-03-26  2:07   ` Guy Watkins
  1 sibling, 0 replies; 5+ messages in thread
From: Guy Watkins @ 2008-03-26  2:07 UTC
  To: 'Neil Brown', 'Janek Kozicki'; +Cc: linux-raid

} -----Original Message-----
} From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
} owner@vger.kernel.org] On Behalf Of Neil Brown
} Sent: Monday, March 24, 2008 11:34 PM
} To: Janek Kozicki
} Cc: linux-raid@vger.kernel.org
} Subject: Re: mismatch_cnt of 128 and 1408.
} 
} On Monday March 24, janek_listy@wp.pl wrote:
} >
} > 5. then after reboot: /usr/share/mdadm/checkarray -a ; cat /sys/block/md?/md/mismatch_cnt
} >    md0: 128
} >    md1: 0
} >    md2: 1408
} >
} > Now I am alarmed that something might be wrong with the root
} > partition. From other posts I remember that mismatch count for swap
} > partitions is allowed to be nonzero.
} 
} Yes, the 'swap' is nothing to worry about.

How can a mismatch on swap be OK?  It seems the wrong data could be read
50% of the time.

Thanks,
Guy

} 
} As md does its checks in units of around 128 sectors, the "md0: 128"
} probably just means a single sector is different between the two.
} 
} I recommend
} 
}   echo repair > /sys/block/md0/md/sync_action
} 
} That will fix it.
} 
} Probably just another 'fsck -f' after that just to be on the safe
} side.
} 
} NeilBrown


