Can't resolve mismatch_count > 0 for a raid 1 array

linux-raid.vger.kernel.org archive mirror

* Can't resolve mismatch_count > 0 for a raid 1 array
@ 2009-04-08 10:03 Steven Ellis
  2009-04-08 21:49 ` Goswin von Brederlow
  2009-04-08 21:50 ` Bill Davidsen
  0 siblings, 2 replies; 6+ messages in thread
From: Steven Ellis @ 2009-04-08 10:03 UTC
  To: Linux RAID

I've resolved most of my raid issues by re-housing the affected system  
and replacing the motherboard, but across the 3 boards I've tried I  
always have an issue with my /dev/md1 array producing mismatch_count  
of 128 or 256.

System is running CentOS 5.2 with a Xen Dom0 kernel.

This md1 volume is a pair of 40GB HDs in raid1 on an IDE controller, on
which I then have a bunch of LVM volumes that are my Xen guests.

Is there any chance that these mismatch_count values are due to swap  
partitions for the Xen guests?

Steve

Steven Ellis - Technical Director
OpenMedia Limited


* Re: Can't resolve mismatch_count > 0 for a raid 1 array
  2009-04-08 10:03 Can't resolve mismatch_count > 0 for a raid 1 array Steven Ellis
@ 2009-04-08 21:49 ` Goswin von Brederlow
  2009-04-09  0:07   ` Steven Ellis
  2009-04-08 21:50 ` Bill Davidsen
  1 sibling, 1 reply; 6+ messages in thread
From: Goswin von Brederlow @ 2009-04-08 21:49 UTC
  To: Linux RAID

Steven Ellis <steven@openmedia.co.nz> writes:

> I've resolved most of my raid issues by re-housing the affected system
> and replacing the motherboard, but across the 3 boards I've tried I
> always have an issue with my /dev/md1 array producing mismatch_count
> of 128 or 256.
>
> System is running CentOS 5.2 with a Xen Dom0 kernel.
>
> This md1 volume is a pair of 40GB HDs in raid1 on an IDE controller, on
> which I then have a bunch of LVM volumes that are my Xen guests.
>
> Is there any chance that these mismatch_count values are due to swap
> partitions for the Xen guests?
>
> Steve

Have you repaired the raid or do you just check and check and check?

MfG
        Goswin


* Re: Can't resolve mismatch_count > 0 for a raid 1 array
  2009-04-08 10:03 Can't resolve mismatch_count > 0 for a raid 1 array Steven Ellis
  2009-04-08 21:49 ` Goswin von Brederlow
@ 2009-04-08 21:50 ` Bill Davidsen
  2009-04-08 22:00   ` Iustin Pop
  1 sibling, 1 reply; 6+ messages in thread
From: Bill Davidsen @ 2009-04-08 21:50 UTC
  To: Steven Ellis; +Cc: Linux RAID

Steven Ellis wrote:
> I've resolved most of my raid issues by re-housing the affected system 
> and replacing the motherboard, but across the 3 boards I've tried I 
> always have an issue with my /dev/md1 array producing mismatch_count 
> of 128 or 256.
>
> System is running CentOS 5.2 with a Xen Dom0 kernel.
>
> This md1 volume is a pair of 40GB HDs in raid1 on an IDE controller, on
> which I then have a bunch of LVM volumes that are my Xen guests.
>
> Is there any chance that these mismatch_count values are due to swap 
> partitions for the Xen guests?

That's the cause, and since md code doesn't currently have a clue which 
copy is "right" it's always a problem if you do something like suspend 
to disk. You probably don't do that with xen images, but swap and raid1 
almost always have a mismatch.

-- 
bill davidsen <davidsen@tmr.com>
  CTO TMR Associates, Inc

"You are disgraced professional losers. And by the way, give us our money back."
    - Representative Earl Pomeroy,  Democrat of North Dakota
on the A.I.G. executives who were paid bonuses  after a federal bailout.




* Re: Can't resolve mismatch_count > 0 for a raid 1 array
  2009-04-08 21:50 ` Bill Davidsen
@ 2009-04-08 22:00   ` Iustin Pop
  2009-04-09  0:13     ` Steven Ellis
  0 siblings, 1 reply; 6+ messages in thread
From: Iustin Pop @ 2009-04-08 22:00 UTC
  To: Bill Davidsen; +Cc: Steven Ellis, Linux RAID

On Wed, Apr 08, 2009 at 05:50:46PM -0400, Bill Davidsen wrote:
> Steven Ellis wrote:
>> I've resolved most of my raid issues by re-housing the affected system  
>> and replacing the motherboard, but across the 3 boards I've tried I  
>> always have an issue with my /dev/md1 array producing mismatch_count  
>> of 128 or 256.
>>
>> System is running CentOS 5.2 with a Xen Dom0 kernel.
>>
>> This md1 volume is a pair of 40GB HDs in raid1 on an IDE controller, on
>> which I then have a bunch of LVM volumes that are my Xen guests.
>>
>> Is there any chance that these mismatch_count values are due to swap  
>> partitions for the Xen guests?
>
> That's the cause, and since md code doesn't currently have a clue which  
> copy is "right" it's always a problem if you do something like suspend  
> to disk. You probably don't do that with xen images, but swap and raid1  
> almost always have a mismatch.

But only because (for non-xen guests) the raid1 code and the swap code /
data live in the same address space, and could be changed in between the
two writes.

I would be surprised if this happens for xen guests, where the address
space is not shared; once the xen guest initiates a write, dom0 gets the
data and writes it from its internal buffer, not the domU's one which
could be modified.

At least, that's how I think things happen.
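
The reasoning above can be illustrated with a toy model. This is only a sketch: bytearrays stand in for a memory page and the two mirror disks, the function name and flags are made up for the illustration, and whether dom0 really writes from a private copy is the assumption stated above, not something the model confirms.

```python
# Toy model only: a bytearray stands in for a memory page, bytes objects for
# what ends up on each mirror. Nothing here touches md or Xen.

def mirrored_write(page, mutate_between, snapshot_first):
    """Write `page` to two mirrors; optionally let its owner modify it mid-way."""
    # snapshot_first=True models writing from a private copy of the data;
    # snapshot_first=False models writing straight from the live page.
    source = bytes(page) if snapshot_first else page
    mirror_a = bytes(source)      # first write reaches disk A
    if mutate_between:
        page[0] ^= 0xFF           # the owner touches the page before write two
    mirror_b = bytes(source)      # second write reaches disk B
    return mirror_a, mirror_b

# Shared address space, no copy: the two mirrors can end up different.
a, b = mirrored_write(bytearray(b"swap page"), mutate_between=True,
                      snapshot_first=False)
shared_mismatch = a != b

# Write issued from a private copy: both mirrors receive identical data.
a, b = mirrored_write(bytearray(b"swap page"), mutate_between=True,
                      snapshot_first=True)
copied_mismatch = a != b

print(shared_mismatch, copied_mismatch)
```

In this model a mismatch appears only when the page is modified between the two writes and no private copy was taken first.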

regards,
iustin


* Re: Can't resolve mismatch_count > 0 for a raid 1 array
  2009-04-08 21:49 ` Goswin von Brederlow
@ 2009-04-09  0:07   ` Steven Ellis
  0 siblings, 0 replies; 6+ messages in thread
From: Steven Ellis @ 2009-04-09  0:07 UTC
  To: Linux RAID


On Thu, April 9, 2009 9:49 am, Goswin von Brederlow wrote:
> Steven Ellis <steven@openmedia.co.nz> writes:
>
>> I've resolved most of my raid issues by re-housing the affected system
>> and replacing the motherboard, but across the 3 boards I've tried I
>> always have an issue with my /dev/md1 array producing mismatch_count
>> of 128 or 256.
>>
>> System is running CentOS 5.2 with a Xen Dom0 kernel.
>>
>> This md1 volume is a pair of 40GB HDs in raid1 on an IDE controller, on
>> which I then have a bunch of LVM volumes that are my Xen guests.
>>
>> Is there any chance that these mismatch_count values are due to swap
>> partitions for the Xen guests?
>>
>> Steve
>
> Have you repaired the raid or do you just check and check and check?
>

Yes, I've repaired the raid several times. But when I then put the Xen
guests under load and re-run check, the errors reappear.
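
For reference, the check / repair cycle discussed here can be sketched as below. This is a minimal sketch, not a tested tool: the helper names are made up, it assumes the usual md sysfs layout (/sys/block/<array>/md/sync_action and mismatch_cnt, which is the file name behind what this thread calls mismatch_count), and it assumes the counter is in 512-byte sectors. Writing to sync_action needs root.

```python
import os

SECTOR_BYTES = 512  # assumption: the mismatch counter is in 512-byte sectors


def md_attr_path(array, attr, sysfs_root="/sys/block"):
    """Path of an md sysfs attribute, e.g. /sys/block/md1/md/mismatch_cnt."""
    return os.path.join(sysfs_root, array, "md", attr)


def start_action(array, action, sysfs_root="/sys/block"):
    """Request a 'check' or 'repair' pass by writing to sync_action."""
    if action not in ("check", "repair"):
        raise ValueError("action must be 'check' or 'repair'")
    with open(md_attr_path(array, "sync_action", sysfs_root), "w") as f:
        f.write(action + "\n")


def read_mismatch(array, sysfs_root="/sys/block"):
    """Return the mismatch counter left by the last check or repair pass."""
    with open(md_attr_path(array, "mismatch_cnt", sysfs_root)) as f:
        return int(f.read().strip())


def sectors_to_kib(sectors):
    """Convert a sector count to KiB under the 512-byte-sector assumption."""
    return sectors * SECTOR_BYTES // 1024
```

Typical use would be start_action("md1", "check"), waiting until sync_action reads "idle" again, then read_mismatch("md1"). Under the sector assumption, the counts of 128 and 256 reported in this thread correspond to 64 KiB and 128 KiB.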

--------------------------------------------
Steven Ellis - Technical Director
OpenMedia Limited - The Home of myPVR
email   - steven@openmedia.co.nz
website - http://www.openmedia.co.nz


* Re: Can't resolve mismatch_count > 0 for a raid 1 array
  2009-04-08 22:00   ` Iustin Pop
@ 2009-04-09  0:13     ` Steven Ellis
  0 siblings, 0 replies; 6+ messages in thread
From: Steven Ellis @ 2009-04-09  0:13 UTC
  To: Linux RAID


On Thu, April 9, 2009 10:00 am, Iustin Pop wrote:
> On Wed, Apr 08, 2009 at 05:50:46PM -0400, Bill Davidsen wrote:
>> Steven Ellis wrote:
>>> I've resolved most of my raid issues by re-housing the affected system
>>> and replacing the motherboard, but across the 3 boards I've tried I
>>> always have an issue with my /dev/md1 array producing mismatch_count
>>> of 128 or 256.
>>>
>>> System is running CentOS 5.2 with a Xen Dom0 kernel.
>>>
>>> This md1 volume is a pair of 40GB HDs in raid1 on an IDE controller, on
>>> which I then have a bunch of LVM volumes that are my Xen guests.
>>>
>>> Is there any chance that these mismatch_count values are due to swap
>>> partitions for the Xen guests?
>>
>> That's the cause, and since md code doesn't currently have a clue which
>> copy is "right" it's always a problem if you do something like suspend
>> to disk. You probably don't do that with xen images, but swap and raid1
>> almost always have a mismatch.
>
> But only because (for non-xen guests) the raid1 code and the swap code /
> data live in the same address space, and could be changed in between the
> two writes.
>
> I would be surprised if this happens for xen guests, where the address
> space is not shared; once the xen guest initiates a write, dom0 gets the
> data and writes it from its internal buffer, not the domU's one which
> could be modified.
>
> At least, that's how I think things happen.

The box has 5 RAID arrays:

# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md3 : active raid1 sdb1[0] sda1[1]
      976759936 blocks [2/2] [UU]

md0 : active raid1 hdb1[0] hda1[1]
      128384 blocks [2/2] [UU]

md4 : active raid1 hdb2[1] hda2[0]
      522048 blocks [2/2] [UU]

md2 : active raid5 hdh1[3] hdg1[2] hdf1[1] hde1[0]
      732587712 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]

md1 : active raid1 hdb3[0] hda3[1]
      38427392 blocks [2/2] [UU]

unused devices: <none>

md4 is the swap partition for the Xen server, and md0 is the boot
partition. Neither of these has any issues.

md2 is a raid5 set that also hasn't been reporting any issues.

md3 is a new SATA-based raid1 set that I had some issues with when using a
different motherboard, but it doesn't produce any errors, not even under
serious load.

md1 contains the root file system for my Xen server, plus the root + swap
partitions for my various Xen guests. This is the volume that is
generating the mismatch_count errors.
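
As a cross-check of the layout described above, here is a small sketch that parses the /proc/mdstat output from this message and lists which other arrays sit on the same physical disks as md1. The function names are made up for the illustration, and the trailing-digit stripping assumes hdX/sdX style device names.

```python
import re

# The /proc/mdstat output quoted earlier in this message.
MDSTAT = """\
Personalities : [raid1] [raid6] [raid5] [raid4]
md3 : active raid1 sdb1[0] sda1[1]
      976759936 blocks [2/2] [UU]

md0 : active raid1 hdb1[0] hda1[1]
      128384 blocks [2/2] [UU]

md4 : active raid1 hdb2[1] hda2[0]
      522048 blocks [2/2] [UU]

md2 : active raid5 hdh1[3] hdg1[2] hdf1[1] hde1[0]
      732587712 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]

md1 : active raid1 hdb3[0] hda3[1]
      38427392 blocks [2/2] [UU]

unused devices: <none>
"""


def parse_mdstat(text):
    """Map array name -> state, level and member partitions."""
    arrays = {}
    for line in text.splitlines():
        m = re.match(r"^(md\d+) : (\w+) (\w+) (.+)$", line)
        if m:
            name, state, level, members = m.groups()
            arrays[name] = {
                "state": state,
                "level": level,
                # drop the "[n]" role suffix from each member
                "members": sorted(re.sub(r"\[\d+\]", "", d)
                                  for d in members.split()),
            }
    return arrays


def whole_disks(members):
    """Strip partition numbers: hda3 -> hda (assumes hdX/sdX naming)."""
    return {re.sub(r"\d+$", "", m) for m in members}


def arrays_sharing_disks(arrays, name):
    """Other arrays with at least one member on the same whole disks."""
    target = whole_disks(arrays[name]["members"])
    return sorted(n for n, a in arrays.items()
                  if n != name and whole_disks(a["members"]) & target)


arrays = parse_mdstat(MDSTAT)
print(arrays_sharing_disks(arrays, "md1"))
```

With the output above this lists md0 and md4, i.e. the two arrays that share hda and hdb with md1 but are not reporting mismatches.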

Now most of my Xen guests are presented with two LVM-allocated partitions
out of md1, e.g. guest_root and guest_swap.

I do have an exception to this for one guest where I present a single LVM
partition as a virtual HD to the guest which then manages the
swap/root/home partitions itself.

I'm wondering if this presentation of a partition as a disk is the issue.

Steve



--------------------------------------------
Steven Ellis - Technical Director
OpenMedia Limited - The Home of myPVR
email   - steven@openmedia.co.nz
website - http://www.openmedia.co.nz

