* EIO and data corruption on XFS file system
From: GuangYang @ 2014-11-06 13:03 UTC (permalink / raw)
To: xfs@oss.sgi.com
Hello,
While working on our storage system, I ran into a question about using the XFS utilities to fix file system corruption. Our storage system keeps 3 copies of the data and checks for inconsistencies on a regular basis. We have observed two patterns so far: 1) the data is corrupted, which results in an EIO; 2) the data is still accessible, but its content has changed.
I am wondering how to fix such issues from the file system's perspective. Should we expect a hardware failure in both cases, or could some XFS repair tools help?
Thanks,
Guang
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: EIO and data corruption on XFS file system
From: Emmanuel Florac @ 2014-11-06 14:27 UTC (permalink / raw)
To: GuangYang; +Cc: xfs@oss.sgi.com
On Thu, 6 Nov 2014 13:03:45 +0000
GuangYang <yguang11@outlook.com> wrote:
> Hello,
> While working on our storage system, I ran into a question about
> using the XFS utilities to fix file system corruption. Our storage
> system keeps 3 copies of the data and checks for inconsistencies on
> a regular basis. We have observed two patterns so far: 1) the data
> is corrupted, which results in an EIO; 2) the data is still
> accessible, but its content has changed. I am wondering how to fix
> such issues from the file system's perspective. Should we expect a
> hardware failure in both cases, or could some XFS repair tools help?
XFS relies on the underlying hardware to maintain data integrity.
Recent XFS versions come with metadata checksums, which allow
detecting corruption in filesystem structures, but not in file data.
Generally speaking, if the data is corrupted because of faulty hardware,
xfs_repair can't help.
regards,
--
------------------------------------------------------------------------
Emmanuel Florac | Direction technique
| Intellique
| <eflorac@intellique.com>
| +33 1 78 94 84 02
------------------------------------------------------------------------
* RE: EIO and data corruption on XFS file system
From: GuangYang @ 2014-11-10 2:54 UTC (permalink / raw)
To: Emmanuel Florac; +Cc: xfs@oss.sgi.com
Thanks Emmanuel. That clarifies things a lot.
Thanks,
Guang
* Re: EIO and data corruption on XFS file system
From: Stan Hoeppner @ 2014-11-06 17:55 UTC (permalink / raw)
To: GuangYang, xfs@oss.sgi.com
On 11/06/2014 07:03 AM, GuangYang wrote:
> Hello,
> While working on our storage system, I ran into a question about using
> the XFS utilities to fix file system corruption. Our storage system
> keeps 3 copies of the data and checks for inconsistencies on a regular
> basis. We have observed two patterns so far: 1) the data is corrupted,
> which results in an EIO; 2) the data is still accessible, but its
> content has changed.
>
> I am wondering how to fix such issues from the file system's
> perspective. Should we expect a hardware failure in both cases, or
> could some XFS repair tools help?
xfs_repair addresses filesystem metadata inconsistencies/corruption. It
does not inspect file contents. If a file is corrupted xfs_repair can
do nothing for you. If you're seeing inconsistencies in file data the
problem is not XFS but something else, possibly hardware, as you suspect.
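
Since xfs_repair does not look at file contents, the comparison between copies has to happen above the filesystem. As a rough sketch only, assuming the three copies can be read back as bytes, a majority vote over content digests can point at the copy that differs. The function names and the replica labels below are hypothetical and are not taken from the poster's storage system.

```python
import hashlib
from collections import Counter


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of one replica's contents."""
    return hashlib.sha256(data).hexdigest()


def find_bad_replicas(replicas: dict[str, bytes]) -> list[str]:
    """Return the names of replicas whose digest differs from the majority.

    Raises ValueError when no digest is shared by more than half of the
    replicas, because the good copy cannot then be chosen by voting.
    """
    if not replicas:
        return []
    digests = {name: digest(data) for name, data in replicas.items()}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes * 2 <= len(replicas):
        raise ValueError("no majority among replicas")
    return sorted(name for name, d in digests.items() if d != winner)
```

Voting only says which copy disagrees; it does not say why. The hardware questions in this reply still have to be answered separately.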
What hardware is this? What RAID controllers?
Cheers,
Stan
* RE: EIO and data corruption on XFS file system
From: GuangYang @ 2014-11-10 2:57 UTC (permalink / raw)
To: Stan Hoeppner, xfs@oss.sgi.com
Thanks Stan. My comments inline...
----------------------------------------
> Date: Thu, 6 Nov 2014 11:55:56 -0600
> From: stan@hardwarefreak.com
> To: yguang11@outlook.com; xfs@oss.sgi.com
> Subject: Re: EIO and data corruption on XFS file system
>
> On 11/06/2014 07:03 AM, GuangYang wrote:
>> Hello,
>> While working on our storage system, I ran into a question about using
>> the XFS utilities to fix file system corruption. Our storage system
>> keeps 3 copies of the data and checks for inconsistencies on a regular
>> basis. We have observed two patterns so far: 1) the data is corrupted,
>> which results in an EIO; 2) the data is still accessible, but its
>> content has changed.
>>
>> I am wondering how to fix such issues from the file system's
>> perspective. Should we expect a hardware failure in both cases, or
>> could some XFS repair tools help?
>
> xfs_repair addresses filesystem metadata inconsistencies/corruption. It
> does not inspect file contents. If a file is corrupted xfs_repair can
> do nothing for you. If you're seeing inconsistencies in file data the
> problem is not XFS but something else, possibly hardware, as you suspect.
>
> What hardware is this? What RAID controllers?
We are using the following disk:
Disk: sdl (hpsa0): 6.0TB (23%) RAID-0 == 1 x 6TB 7.2K SATA 300MB/s ATA-ST6000NM0024-1HT
and controller:
Disk-Control: hpsa0: HP Smart Array P410, FW 6.40, Cache on 0B/512MB (R/W), BBU
>
> Cheers,
> Stan
>
* Re: EIO and data corruption on XFS file system
From: Emmanuel Florac @ 2014-11-10 14:54 UTC (permalink / raw)
To: GuangYang; +Cc: Stan Hoeppner, xfs@oss.sgi.com
On Mon, 10 Nov 2014 02:57:47 +0000
GuangYang <yguang11@outlook.com> wrote:
> Disk-Control: hpsa0: HP Smart Array P410, FW 6.40, Cache on
> 0B/512MB (R/W), BBU
Ah, the dreaded SmartArray... People have all sorts of problems with
these. Apparently they're quite terrible, alas.
--
------------------------------------------------------------------------
Emmanuel Florac | Direction technique
| Intellique
| <eflorac@intellique.com>
| +33 1 78 94 84 02
------------------------------------------------------------------------
* Re: EIO and data corruption on XFS file system
From: Dave Chinner @ 2014-11-06 21:54 UTC (permalink / raw)
To: GuangYang; +Cc: xfs@oss.sgi.com
On Thu, Nov 06, 2014 at 01:03:45PM +0000, GuangYang wrote:
> Hello,
> While working on our storage system, I ran into a question about
> using the XFS utilities to fix file system corruption. Our storage
> system keeps 3 copies of the data and checks for inconsistencies on
> a regular basis. We have observed two patterns so far:
>
> 1) the data is corrupted, which results in an EIO,
Data corruption doesn't trigger EIO errors. EIO errors from the
underlying storage might cause data corruption, but the only thing
that can detect bad data is the application itself, not the kernel.
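
To make the distinction concrete, here is a minimal sketch of application-level detection, assuming the application stores a SHA-256 digest next to each object when it writes it. The helper names (`write_with_digest`, `check`) and the `.sha256` sidecar file are illustrative assumptions, not something the poster's system is known to use. A failed read surfaces as an `OSError` carrying `errno.EIO`, while silently changed content reads back without error and only shows up as a digest mismatch.

```python
import errno
import hashlib


def write_with_digest(path: str, data: bytes) -> None:
    """Write data and store its SHA-256 digest in a sidecar file."""
    with open(path, "wb") as f:
        f.write(data)
    with open(path + ".sha256", "w") as f:
        f.write(hashlib.sha256(data).hexdigest())


def check(path: str) -> str:
    """Classify a file as 'ok', 'eio' or 'silent-corruption'."""
    try:
        with open(path, "rb") as f:
            data = f.read()
    except OSError as e:
        if e.errno == errno.EIO:
            return "eio"  # the storage stack reported a failed read
        raise
    with open(path + ".sha256") as f:
        expected = f.read().strip()
    actual = hashlib.sha256(data).hexdigest()
    return "ok" if actual == expected else "silent-corruption"
```

A sidecar file is only one possible place for the digest; an extended attribute or the application's own metadata store would serve the same purpose.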
> 2) the data is still accessible, but its content has changed.
Again, data being incorrect is generally not a filesystem issue
unless there's a bug somewhere in the filesystem IO path. You'll
need to give us a *lot* more information about your storage and
application workload if you think XFS is corrupting data. Start
with:
http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* RE: EIO and data corruption on XFS file system
From: GuangYang @ 2014-11-10 2:59 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs@oss.sgi.com
Thanks Dave. Yes, you are right: it seems the problem comes from the hardware and from system power cycling (so that some I/O transactions get lost).
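
For writes that must survive a power cycle, one common application-side precaution is to flush explicitly before treating the data as stored. The sketch below is an assumption about how that could look on a POSIX system, with a hypothetical helper name (`durable_write`); it is not what the storage system in this thread does. It also cannot help if the controller or disk acknowledges a flush without actually persisting the data.

```python
import os


def durable_write(path: str, data: bytes) -> None:
    """Replace path with data so that it survives a crash or power loss,
    as far as the kernel and the storage stack honour flush requests."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()              # push Python's buffer to the kernel
        os.fsync(f.fileno())   # ask the kernel to flush to stable storage
    os.replace(tmp, path)      # atomic rename on POSIX filesystems
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)       # persist the rename itself
    finally:
        os.close(dir_fd)
```

Writing to a temporary name and renaming means a reader sees either the old content or the new content, never a partially written file.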