From: Pierre Ossman <pierre-list@ossman.eu>
To: Tejun Heo <tj@kernel.org>
Cc: linux-ide@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Strange read data corruption on ext4/LVM/md
Date: Thu, 20 May 2010 11:29:45 +0200
Message-ID: <20100520112945.61bf9705@mjolnir.ossman.eu>
In-Reply-To: <4BF4F979.4070903@kernel.org>
On Thu, 20 May 2010 10:57:29 +0200
Tejun Heo <tj@kernel.org> wrote:
> On 05/20/2010 09:14 AM, Pierre Ossman wrote:
> > Note that this is a live system, so there is some chance that something
> > wrote to that area, then restored it to the previous state. I'm not
> > sure how likely that is.
> >
> > If not, then it would seem that this is a problem in either the disks,
> > the controller or the controller driver. The components are WD
> > WD1002FAEX, sil3132 and sata_sil24 respectively.
>
> There is a report that sil3124/32 recognizes FIS corruption but keeps
> using the payload anyway thus leading to data corruption when the bus
> condition on pci-e side isn't ideal. Does moving the controller to a
> different slot make a difference?
>
The machine is rather crammed right now, with one controller in each of
the three available pci-e slots (5 disks). I am running continuous tests
on the disks right now, though, to see if the problem is on all disks or
just some. If just one slot is causing problems then we should see some
results there.
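For illustration, a repeated-read test of this kind could look roughly like
the sketch below. It is not the test actually being run here; the function
names, the chunk size and the number of passes are all assumptions. It reads
the same file or device several times and reports the chunks whose checksum
changes between passes.

```python
import hashlib


def chunk_digests(path, chunk_size=1 << 20):
    """Return one SHA-256 hex digest per chunk of the file or device."""
    digests = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            digests.append(hashlib.sha256(chunk).hexdigest())
    return digests


def unstable_chunks(path, passes=3, chunk_size=1 << 20):
    """Read the target `passes` times and return the indices of chunks
    whose digest differs from the first pass."""
    baseline = chunk_digests(path, chunk_size)
    bad = set()
    for _ in range(passes - 1):
        current = chunk_digests(path, chunk_size)
        for i, (first, again) in enumerate(zip(baseline, current)):
            if first != again:
                bad.add(i)
    return sorted(bad)
```

Note that ordinary reads like these can be served from the page cache, so on
a real disk the cache would have to be bypassed or dropped between passes for
the comparison to mean anything. Running it once per disk would show whether
the unstable chunks are confined to the disks behind one controller.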
When you say FIS corruption, do you mean corruption in the sense of
randomly flipped bits? I don't know if you saw the first couple of
mails (before linux-ide was added), but the problem is data being moved
around, not just randomly changed.
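To make the distinction concrete, here is a rough sketch of how a bad read
could be classified when a known-good copy of the data is available. The
function name and the 512-byte block size are assumptions, and it only
detects data displaced by whole blocks within the compared region.

```python
def classify_mismatch(expected, actual, block_size=512):
    """Compare two byte strings block by block.

    Returns (moved, altered):
      moved   - list of (block index, index of the expected block it matches)
      altered - list of (block index, number of differing bits)
    """
    # Index every expected block by its content so that displaced
    # data can be looked up.
    index = {}
    for off in range(0, len(expected), block_size):
        block = bytes(expected[off:off + block_size])
        index.setdefault(block, []).append(off // block_size)

    moved, altered = [], []
    for off in range(0, len(actual), block_size):
        n = off // block_size
        got = bytes(actual[off:off + block_size])
        want = bytes(expected[off:off + block_size])
        if got == want:
            continue
        sources = index.get(got)
        if sources:
            # The block is intact but belongs somewhere else.
            moved.append((n, sources[0]))
        else:
            # Not a copy of any expected block: count the flipped bits.
            flipped = sum(bin(a ^ b).count("1") for a, b in zip(want, got))
            altered.append((n, flipped))
    return moved, altered
```

Many entries in `moved` and few in `altered` would point at data ending up
in the wrong place rather than at bits being flipped in transit.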
Another note is that the problem seems to worsen under load. I'm
running the dd test in the background, and that seems to make read
errors on my test files at the filesystem level more common.
I also tried disabling NCQ without any noticeable change.
Rgds
--
-- Pierre Ossman
WARNING: This correspondence is being monitored by FRA, a
Swedish intelligence agency. Make sure your server uses
encryption for SMTP traffic and consider using PGP for
end-to-end encryption.