From: Neil Brown <neilb@suse.de>
To: jeff stern <jas.61803+lr@gmail.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: how to deal with continuously getting more errors?
Date: Thu, 19 Jul 2007 09:23:15 +1000
Message-ID: <18078.41187.752326.634975@notabene.brown>
In-Reply-To: message from jeff stern on Saturday July 14

On Saturday July 14, jas.61803+lr@gmail.com wrote:
>
> EXTENDED DESCRIPTION OF PROBLEM
>
> i first noticed this problem when i downloaded the fedora core 7 .iso,
> and did a checksum on it, and it didn't match. with a little more
> investigating, i found that i could make a copy of any large file on
> disk, and its copy would sometimes match, sometimes not.
>
> here is a typical session:
> ------------------------------------------------------------------------------------------
> $ cp F-7-i386-DVD.iso F.iso
> $ cmp F-7-i386-DVD.iso F.iso
> F-7-i386-DVD.iso F.iso differ: byte 1033827385, line 3789612
> $ cmp F-7-i386-DVD.iso F.iso
> $ cmp F-7-i386-DVD.iso F.iso
> F-7-i386-DVD.iso F.iso differ: byte 1033827385, line 3789612
> $ cmp F-7-i386-DVD.iso F.iso
> F-7-i386-DVD.iso F.iso differ: byte 8870221, line 37265
> $ cmp F-7-i386-DVD.iso F.iso
> F-7-i386-DVD.iso F.iso differ: byte 8870221, line 37265
> $ _
> ------------------------------------------------------------------------------------------

This clearly indicates a hardware problem.

You tried in /tmp and didn't get this sort of result, so it probably
isn't RAM/CPU.

Next step is to break the raid1, mount each drive as a separate
filesystem and do the same test on each filesystem.

If one works and the other fails, then it must be something specific
to the faulty device. If they are on the same controller, it must be
the drive or the cable, so swap cables and try again.
If they are on different controllers, try swapping controllers too.

If both filesystems show the same problem, it must be something
common, maybe the controller. Try to find an alternate controller to
test with. Narrow it down to the faulty component, and replace it.

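
The per-filesystem test above can be scripted so that it runs the same
way on each drive. What follows is only a minimal sketch, not something
from the original thread: the function names and the scratch directory
argument are made up for illustration, and offsets are 0-based (cmp
reports 1-based byte numbers). It assumes the test file is large enough
that reads are not all served from the page cache.

```python
import shutil
from pathlib import Path

CHUNK = 1 << 20  # compare 1 MiB at a time


def first_difference(a, b):
    """Return the 0-based offset of the first differing byte, or None."""
    offset = 0
    with open(a, "rb") as fa, open(b, "rb") as fb:
        while True:
            ba, bb = fa.read(CHUNK), fb.read(CHUNK)
            if ba != bb:
                for i, (x, y) in enumerate(zip(ba, bb)):
                    if x != y:
                        return offset + i
                # No differing byte in the overlap: one file is shorter.
                return offset + min(len(ba), len(bb))
            if not ba:
                return None  # both files ended without a difference
            offset += len(ba)


def copy_and_compare(source, scratch_dir, rounds=5):
    """Copy source into scratch_dir once, then compare it `rounds` times.

    Returns one entry per round: None for a match, else the offset of
    the first mismatch. Varying offsets across rounds point at bad reads
    rather than a bad copy.
    """
    copy = Path(scratch_dir) / (Path(source).name + ".copy")
    shutil.copyfile(source, copy)
    return [first_difference(source, copy) for _ in range(rounds)]
```

Running copy_and_compare() once with the file and scratch directory on
the first drive's filesystem, and once on the second's, gives two result
lists that can be compared directly.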
>
>
> furthermore, i discovered that there was a way to fix them (i.e.,
> "sync" the drives). however, this fixing procedure came with a caveat.
> this caveat was something that i should have realized the importance
> of in the first place: that a RAID 1 system with only two drives is
> going to have a problem when repairing. the problem is that when
> sync'ing the drives, whenever a mismatch is found, a decision must be
> made as to which drive has the correct data: drive 1 or drive 2? and
> that apparently, it's just a toss-up, and the repair program just
> picks randomly.
>
> "WHAAAAT????????????"
>
> yeap. so, it's really better to either go with RAID 5, or to have a
> RAID 1 system with 3 or more disks.
>

This is not true at all.

If the difference is due to the drive subsystem returning bad data
(rather than indicating a read error), then no RAID system is safe.

If the difference is due to the kernel writing different data to the
two drives (as happens sometimes on swap or with memory-mapped files),
then both copies of the data are equally correct, and there isn't
really a problem.

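
For watching whether the two copies keep diverging, md exposes a
mismatch counter in sysfs after a check or repair pass. The sketch below
is an illustration only: the helper name is hypothetical, and the
directory /sys/block/md0/md in the usage note assumes the array is md0.
As noted above, a non-zero count on a raid1 holding swap or
memory-mapped files does not by itself indicate a fault.

```python
from pathlib import Path


def read_mismatch_count(md_sysfs_dir):
    """Return the integer stored in <md_sysfs_dir>/mismatch_cnt."""
    text = (Path(md_sysfs_dir) / "mismatch_cnt").read_text()
    return int(text.strip())
```

A call such as read_mismatch_count("/sys/block/md0/md") before and after
swapping a cable or controller shows whether the count is still growing.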
NeilBrown