* Corrupted files
@ 2014-09-09 15:21 Leslie Rhorer
2014-09-09 15:50 ` Sean Caron
` (2 more replies)
0 siblings, 3 replies; 35+ messages in thread
From: Leslie Rhorer @ 2014-09-09 15:21 UTC (permalink / raw)
To: xfs
Hello,
I have an issue with my primary RAID array. I have 13T of data on the
array, and I suffered a major array failure. I was able to rebuild the
array, but some data was lost. Of course I have backups, so after
running xfs_repair, I ran an rsync job to recover the lost data. Most
of it was recovered, but there are several files that cannot be read,
deleted, or overwritten. I have run xfs_repair several times, but
every attempt to access these files still reports "cannot stat
XXXXXXXX: Structure needs cleaning". I don't need to recover the
data directly, as it does reside on the backup, but I need to clear the
file structure so I can write the files back to the filesystem. How do
I proceed?
* Re: Corrupted files
2014-09-09 15:21 Corrupted files Leslie Rhorer
@ 2014-09-09 15:50 ` Sean Caron
2014-09-09 16:03 ` Sean Caron
2014-09-09 16:08 ` Emmanuel Florac
2014-09-09 22:06 ` Dave Chinner
2 siblings, 1 reply; 35+ messages in thread
From: Sean Caron @ 2014-09-09 15:50 UTC (permalink / raw)
To: Leslie Rhorer, Sean Caron; +Cc: xfs@oss.sgi.com

Hi Leslie,

If you have a full backup, I would STRONGLY recommend wiping the old
filesystem and restoring your backups onto a totally fresh XFS, rather
than repairing the original filesystem and then filling in the blanks
from backup with a file-diff tool like rsync.

You will probably hear various opinions here about xfs_repair; my
personal opinion is that xfs_repair is a program made available for the
unwary to further scramble their data and make a hash of the
filesystem... In my first-hand experience managing ~7 PB of XFS storage
and growing, I have NEVER seen xfs_repair (yes, even the "newest
version") do anything positive. It's basically a data scrambler.

At this point, you will never get this filesystem back to anything I'd
consider a production-grade, trustworthy data repository. Any further
runs of xfs_repair will either do nothing or make the situation worse.
Fortunately you followed best practice and kept backups, so you don't
really need xfs_repair anyway, right?

Best,

Sean

P.S. No backups? Still don't even think about running xfs_repair.
ESPECIALLY don't think about running xfs_repair. Try mounting
read-only; if that doesn't work, mount read-only with log recovery
disabled (the "norecovery" mount option) and scavenge what you can.
Write off the rest; that's the cost of doing business without backups.
Running xfs_repair (especially as a first-line step) will only make it
worse, and on big filesystems the run time can extend to weeks...
Don't keep your users down any longer than you need to running a
program that won't really help you. Just scavenge it, reformat, and
turn it back around.
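P.P.S. The scavenge itself is nothing fancy. Roughly this (a sketch
only; the device and destination paths are examples, adjust to taste):

    # try a plain read-only mount first
    mount -o ro /dev/md0 /mnt/rescue

    # if that dies replaying the journal, skip log recovery entirely
    mount -o ro,norecovery /dev/md0 /mnt/rescue

    # copy out whatever is readable; the files that error out are the
    # ones you write off
    rsync -av /mnt/rescue/ /mnt/spare-array/rescue/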
* Re: Corrupted files
2014-09-09 15:50 ` Sean Caron
@ 2014-09-09 16:03 ` Sean Caron
2014-09-09 22:24 ` Eric Sandeen
0 siblings, 1 reply; 35+ messages in thread
From: Sean Caron @ 2014-09-09 16:03 UTC (permalink / raw)
To: Leslie Rhorer, Sean Caron; +Cc: xfs@oss.sgi.com

OK, let me retract just a tiny fraction of what I said originally;
thinking about it further, there was _one_ time I was able to use
xfs_repair to successfully recover a "lightly bruised" XFS and return
it to service. But in that case the fault was very minor, and I always
check first with:

    xfs_repair [-L] -n -v <filesystem>

and give the output a good looking-over before proceeding further. If
it won't run without zeroing the log, you can take that as a sign that
things are getting dire... I wouldn't bother to run xfs_repair "for
real" if the trial output looked even slightly non-trivial, in cases
of underlying array failure or massive filesystem corruption, and I'd
never run it without mounting and scavenging first (unless I had a
very recent full backup).

Barring rare cases, xfs_repair is bad juju.

Best,

Sean
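P.S. Spelled out, the cautious sequence looks something like this
(illustrative; the device name is just an example):

    # no-modify dry run: writes nothing, only reports what it *would* do
    xfs_repair -n -v /dev/md0 2>&1 | tee /tmp/repair-dryrun.log

    # read that log over carefully; if it is long or scary, stop here
    # and scavenge instead

    # only with a good backup (or after scavenging) run it for real
    xfs_repair -v /dev/md0

    # -L zeroes a dirty log, discarding the metadata updates held in
    # it; treat it strictly as a last resort
    xfs_repair -L -v /dev/md0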
* Re: Corrupted files
2014-09-09 16:03 ` Sean Caron
@ 2014-09-09 22:24 ` Eric Sandeen
2014-09-09 22:57 ` Sean Caron
2014-09-10 0:48 ` Leslie Rhorer
0 siblings, 2 replies; 35+ messages in thread
From: Eric Sandeen @ 2014-09-09 22:24 UTC (permalink / raw)
To: Sean Caron, Leslie Rhorer; +Cc: xfs@oss.sgi.com

On 9/9/14 11:03 AM, Sean Caron wrote:
> Barring rare cases, xfs_repair is bad juju.

No, it's not. It is the appropriate tool to use for filesystem repair.

But it is not the appropriate tool for recovery from mangled storage.

I've actually been running a filesystem fuzzer over xfs images,
randomly corrupting data and testing repair, 1000s of times over. It
does remarkably well.

If you scramble your raid, which means your block device is no longer
an xfs filesystem, but is instead a random tangle of bits and pieces
of other things, of course xfs_repair won't do well, but it's not the
right tool for the job at that stage.

-Eric
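(For anyone curious, a loop with the same general shape as that
fuzz-and-repair test is easy to sketch at home. This is only an
illustration of the idea, not the actual harness used above; paths and
iteration counts are arbitrary:)

    # fuzz a small scratch image and let repair chew on it, repeatedly
    for i in $(seq 1 100); do
        # fresh 512 MB filesystem image each round
        mkfs.xfs -f -d file,name=/tmp/scratch.img,size=512m >/dev/null

        # stomp three random 512-byte sectors with garbage
        for j in 1 2 3; do
            dd if=/dev/urandom of=/tmp/scratch.img bs=512 count=1 \
               seek=$(( (RANDOM * 32768 + RANDOM) % 1048576 )) \
               conv=notrunc 2>/dev/null
        done

        # -f tells xfs_repair the target is an image file, not a device
        xfs_repair -f /tmp/scratch.img >/dev/null 2>&1 \
            || echo "round $i: repair exited nonzero"
    done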
* Re: Corrupted files
2014-09-09 22:24 ` Eric Sandeen
@ 2014-09-09 22:57 ` Sean Caron
2014-09-10 1:00 ` Roger Willcocks
2014-09-10 5:09 ` Eric Sandeen
2014-09-10 0:48 ` Leslie Rhorer
1 sibling, 2 replies; 35+ messages in thread
From: Sean Caron @ 2014-09-09 22:57 UTC (permalink / raw)
To: Eric Sandeen, Sean Caron; +Cc: Leslie Rhorer, xfs@oss.sgi.com

Hey, just sharing some hard-won (believe me) professional experience.
I have seen xfs_repair take a bad situation and make it worse many
times. I don't know that a filesystem fuzzer or any other simulation
can ever truly reproduce users absolutely pounding the tar out of a
system. There seems to be a real disconnect between what developers
are able to test and observe directly, and what happens in a very
high-throughput production environment.

Best,

Sean
* Re: Corrupted files
2014-09-09 22:57 ` Sean Caron
@ 2014-09-10 1:00 ` Roger Willcocks
2014-09-10 1:23 ` Leslie Rhorer
2014-09-10 5:09 ` Eric Sandeen
1 sibling, 1 reply; 35+ messages in thread
From: Roger Willcocks @ 2014-09-10 1:00 UTC (permalink / raw)
To: Sean Caron; +Cc: Roger Willcocks, Eric Sandeen, Leslie Rhorer, xfs@oss.sgi.com

I normally watch quietly from the sidelines, but I think it's
important to get some balance here; our customers between them run
many hundreds of multi-terabyte arrays, and when something goes badly
awry it generally falls to me to sort it out. In my experience,
xfs_repair does exactly what it says on the tin.

I can recall only a couple of instances where we elected to reformat
and reload from backups, and both were due to human error: somebody
deleted the wrong raid unit during routine maintenance and then tried
to fix it up themselves.

In theory, of course, xfs_repair shouldn't be needed if the write
barriers work properly (it's a journalled filesystem), but low-level
corruption does creep in due to power failures and kernel crashes, and
it is this which xfs_repair is intended to address; not massive data
corruption due to failed hardware or careless users.

--
Roger
* Re: Corrupted files
2014-09-10 1:00 ` Roger Willcocks
@ 2014-09-10 1:23 ` Leslie Rhorer
0 siblings, 0 replies; 35+ messages in thread
From: Leslie Rhorer @ 2014-09-10 1:23 UTC (permalink / raw)
To: Roger Willcocks, Sean Caron; +Cc: Eric Sandeen, xfs@oss.sgi.com

On 9/9/2014 8:00 PM, Roger Willcocks wrote:
> I normally watch quietly from the sidelines, but I think it's
> important to get some balance here

That is almost always wise advice. Shooting from the hip often has
regrettable consequences, yet being too cautious can have its down
side, too. In this case, things are working very well at the moment,
and the apparent issues are reasonably small, so there is no need for
panic.

> our customers between them run many hundreds of multi-terabyte
> arrays, and when something goes badly awry it generally falls to me
> to sort it out. In my experience, xfs_repair does exactly what it
> says on the tin.

I couldn't say. This is only the second time I have ever had an array
drop, and the first time it was completely unrecoverable. Less than 5
minutes after I had started a reshape from RAID5 to RAID6, there was a
protracted power outage. I shut down the system cleanly and restarted
the reshape after the outage. The recovery had only been running a few
minutes when the system suffered a kernel panic; I never did find out
why. Every single structure on the array larger than the stripe size
(16K, I think) was garbage.

> it is this which xfs_repair is intended to address; not massive data
> corruption due to failed hardware or careless users.

Oh, yeah, like losing 3 out of 8 drives in the array after a drive
controller replacement...
* Re: Corrupted files
2014-09-09 22:57 ` Sean Caron
2014-09-10 1:00 ` Roger Willcocks
@ 2014-09-10 5:09 ` Eric Sandeen
1 sibling, 0 replies; 35+ messages in thread
From: Eric Sandeen @ 2014-09-10 5:09 UTC (permalink / raw)
To: Sean Caron; +Cc: Leslie Rhorer, xfs@oss.sgi.com

On 9/9/14 5:57 PM, Sean Caron wrote:
> Hey, just sharing some hard-won (believe me) professional
> experience. I have seen xfs_repair take a bad situation and make it
> worse many times. [...]

Fair enough, but I don't want to let stand an assertion that you
should avoid xfs_repair at all (most) costs.

It, like almost any software, has some bugs, but they don't get fixed
if they don't get well reported. We do our best to improve it when we
get useful reports from users - usually including a metadata dump -
and we beat on it as best we can in the lab.

"pounding the tar out of a filesystem" should not, in general, require
an xfs_repair run. ;)

Yes, it's always good advice to do a dry run before committing to a
repair, in case something goes off the rails. But most times I've seen
things go very very badly were when the storage device under the
filesystem was no longer consistent, and the filesystem really had no
pieces to pick up.

-Eric
* Re: Corrupted files
2014-09-09 22:24 ` Eric Sandeen
2014-09-09 22:57 ` Sean Caron
@ 2014-09-10 0:48 ` Leslie Rhorer
2014-09-10 1:10 ` Roger Willcocks
1 sibling, 1 reply; 35+ messages in thread
From: Leslie Rhorer @ 2014-09-10 0:48 UTC (permalink / raw)
To: Eric Sandeen, Sean Caron; +Cc: xfs@oss.sgi.com

On 9/9/2014 5:24 PM, Eric Sandeen wrote:
> No, it's not. It is the appropriate tool to use for filesystem
> repair.
>
> But it is not the appropriate tool for recovery from mangled storage.

It's not all that mangled. Of the more than 52,000 files on the backup
server array, only 5758 were missing from the primary array, and most
of those were lost to the corruption of just a couple of directories,
where every file in the directory was lost along with the directory
itself. Several directories and a scattering of individual files had
been deliberately deleted prior to the failure but not yet purged from
the backup. Most were small files; only 29 were larger than 1G. All
5758 were easily recovered. The only ones remaining at issue are 3
files which cannot be read, written, or deleted. The rest have been
read and their checksums successfully computed and compared. With only
50K files in question, I am confident any checksum collisions are of
insignificant probability. Someone is going to have to do a lot of
talking to convince me rsync can read two copies of what should be the
same data and come up with the same checksum value for both, while
other applications would be able to successfully read one of the files
and not the other.

I really don't think Draconian measures are required. Even if it turns
out they are, the existence of the backup allows for a good deal of
fiddling with the main filesystem before one is compelled to give up
and start fresh. This especially since a small amount of the data on
the main array had not yet been backed up to the secondary array.
These e-mails, for example. The rsync job that backs up the main array
runs every morning at 04:00, so files created that day were not backed
up, and for safety I have since changed the backup array filesystem to
read-only, so nothing created since then is backed up.

> I've actually been running a filesystem fuzzer over xfs images,
> randomly corrupting data and testing repair, 1000s of times over.
> It does remarkably well.
>
> If you scramble your raid [...] of course xfs_repair won't do well,
> but it's not the right tool for the job at that stage.

This is nowhere near that stage. A few sectors here and there were
lost because 3 drives were kicked from the array while write
operations were underway. I had to force-reassemble the array, which
lost some data. The vast majority of the data is clearly intact,
including most of the filesystem structures. Far less than 1% of the
data was lost or corrupted.
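P.S. For anyone wanting to replicate the comparison: rsync can do the
checksum sweep by itself. This is the general shape of it (paths
illustrative, not my exact job):

    # -r recurse; -c force full checksums instead of size/mtime tests;
    # -n dry run, so it only lists the files that differ
    rsync -rcn --out-format='%n' /Backup/RAID/ /RAID/ > /tmp/differs.txt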
* Re: Corrupted files
2014-09-10 0:48 ` Leslie Rhorer
@ 2014-09-10 1:10 ` Roger Willcocks
2014-09-10 1:31 ` Leslie Rhorer
0 siblings, 1 reply; 35+ messages in thread
From: Roger Willcocks @ 2014-09-10 1:10 UTC (permalink / raw)
To: Leslie Rhorer; +Cc: Sean Caron, Roger Willcocks, Eric Sandeen, xfs@oss.sgi.com

On 10 Sep 2014, at 01:48, Leslie Rhorer <lrhorer@mygrande.net> wrote:

> The only ones remaining at issue are 3 files which cannot be read,
> written, or deleted.

The most straightforward fix would be to note down the inode numbers
of the three files and then use xfs_db to clear the inodes; then run
xfs_repair again. See:

http://xfs.org/index.php/XFS_FAQ#Q:_How_to_get_around_a_bad_inode_repair_is_unable_to_clean_up

but before that, try running the latest (3.2.1, I think) xfs_repair.

--
Roger
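The sequence is roughly this (a sketch only; double-check the FAQ
recipe and xfs_db(8) before writing anything, and substitute your own
inode numbers and device):

    # with the filesystem still mounted, readdir (unlike stat) works,
    # so plain ls -i recovers the inode numbers of the stuck files
    ls -i /RAID/path/to/broken/directory

    # then unmount and inspect the suspect inode, read-only first
    xfs_db -r -c 'inode <inum>' -c 'print' /dev/md0

    # clearing is done in expert (-x) mode with the 'write' commands
    # given in the FAQ entry above, followed by another repair pass
    xfs_db -x /dev/md0
    xfs_repair /dev/md0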
* Re: Corrupted files
2014-09-10 1:10 ` Roger Willcocks
@ 2014-09-10 1:31 ` Leslie Rhorer
2014-09-10 14:24 ` Emmanuel Florac
0 siblings, 1 reply; 35+ messages in thread
From: Leslie Rhorer @ 2014-09-10 1:31 UTC (permalink / raw)
To: Roger Willcocks; +Cc: Sean Caron, Eric Sandeen, xfs@oss.sgi.com

On 9/9/2014 8:10 PM, Roger Willcocks wrote:
> The most straightforward fix would be to note down the inode numbers
> of the three files and then use xfs_db to clear the inodes; then run
> xfs_repair again.

That sounds reasonable. If no one has any sounder advice, I think I
will try that.

> but before that, try running the latest (3.2.1, I think) xfs_repair.

I am always hesitant to run anything outside the distro package; I've
had problems in the past with doing so. 3.1.7 is pretty close, so
unless there is a really solid reason to use 3.2.1 over 3.1.7, I think
I will stick with the distro version and try the above. Can you or
anyone else give a reason why 3.2.1 would work when 3.1.7 would not?
More importantly, is there some reason 3.1.7 would make things worse
where 3.2.1 would not? If not, then I can always try 3.1.7 first and
fall back to 3.2.1 if that does not help.
* Re: Corrupted files
2014-09-10 1:31 ` Leslie Rhorer
@ 2014-09-10 14:24 ` Emmanuel Florac
2014-09-10 14:49 ` Sean Caron
0 siblings, 1 reply; 35+ messages in thread
From: Emmanuel Florac @ 2014-09-10 14:24 UTC (permalink / raw)
To: Leslie Rhorer; +Cc: Eric Sandeen, Roger Willcocks, Sean Caron, xfs@oss.sgi.com

On Tue, 09 Sep 2014 20:31:07 -0500, Leslie Rhorer
<lrhorer@mygrande.net> wrote:

> More importantly, is there some reason 3.1.7 would make things worse
> where 3.2.1 would not? If not, then I can always try 3.1.7 first and
> fall back to 3.2.1 if that does not help.

I don't know about these particular versions; however, in the past I
have confirmed that a later version of xfs_repair performed much
better (in particular, it salvaged more files into lost+found).

At some point in the distant past, some versions of xfs_repair were
buggy and would happily throw away TB of perfectly sane data... I had
this very problem once, on Christmas eve in 2005, IIRC :/

--
Emmanuel Florac | Intellique | <eflorac@intellique.com> | +33 1 78 94 84 02
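(For reference, trying a newer xfs_repair without disturbing the
distro package is fairly low-risk, since the freshly built binary can
be run straight from the source tree in no-modify mode. Roughly, with
the repo URL and tag as they stood around this time, so verify both
before use:)

    git clone git://oss.sgi.com/xfs/cmds/xfsprogs.git
    cd xfsprogs
    git checkout v3.2.1
    make
    # run the new binary in dry-run mode directly, no install needed
    ./repair/xfs_repair -n -v /dev/md0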
* Re: Corrupted files
2014-09-10 14:24 ` Emmanuel Florac
@ 2014-09-10 14:49 ` Sean Caron
0 siblings, 0 replies; 35+ messages in thread
From: Sean Caron @ 2014-09-10 14:49 UTC (permalink / raw)
To: Emmanuel Florac, Sean Caron; +Cc: Roger Willcocks, Eric Sandeen, Leslie Rhorer, xfs@oss.sgi.com

I don't want to bloviate too much and drag this completely off topic,
especially since the OP's query is resolved, but please allow me just
one anecdote :)

Earlier this year, I had one of our project file servers (450 TB) go
down. It didn't go down because the array spuriously lost a bunch of
disks; it was simply your usual sort of Linux kernel panic... you go
to the console and it's just a black screen and unresponsive, or maybe
you can see the tail end of a backtrace and it's unresponsive. So, OK,
issue a quick remote IPMI reboot of the machine, it comes up... I'm in
single user mode, bringing up each sub-RAID6 in our RAID60 by hand, no
problem. Bring up the top-level RAID0. OK. Then I go to mount the
XFS... no go. Apparently the log somehow got corrupted in the crash?
So I try to mount ro, no dice, but I _can_ mount ro,norecovery, and I
see good files here! Thank goodness. I start scavenging to a spare
host...

A few weeks later, after the scavenge was done, I did a few xfs_repair
runs just for the sake of experimentation. Using both in dry-run mode,
I tried the version that shipped with Ubuntu 12.04, as well as the
latest xfs_repair I could pull from the source tree. I redirected the
output of both runs to file and watched them with 'tail -f'. Diffing
the output when they were done, it didn't look like they were behaving
much differently. Both files held thousands or tens of thousands of
lines of output: bad this, bad that... (I always run in verbose mode.)

Since the filesystem was hosed anyway and I was going to rebuild it, I
decided to let the new xfs_repair run "for real", just to see what
would happen, for kicks. And who knows? Maybe I could recover even
more than I already had...? (I wasn't just totally wasting time.) I
think it took maybe a week for it to run on a 450 TB volume? At least
a week. Maybe I was being a teensy bit hyperbolic in my previous
descriptions of runtime, LOL.

After it was done? ... almost everything was obliterated. I had tens
of millions of zero-length files, and tens of millions of bits of
anonymous scrambled junk in lost+found. So, I chuckled a bit (thankful
for my hard-won previous experience) before reformatting the array and
copying back the results of my scavenging.

Just by ro-mounting and copying what I could, I was able to save
around 90% of the data by volume on the array (it was a little more
than half full when it failed... ~290 TB? There was only ~30 TB that I
couldn't salvage); good clean files that passed validation from their
respective users. I think 80-90% recovery rates are very commonly
achievable just by mounting ro,norecovery and getting what you can
with cp -R or rsync, given that there wasn't grievous failure of the
underlying storage system.

If I had depended on xfs_repair, or blithely run it as a first line of
response as the documentation might intimate (hey, it's called
xfs_repair, right?), like people casually run fsck or CHKDSK... I
would have been hosed, big time.

Best,

Sean
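P.S. The dry-run comparison is trivial to reproduce if you ever find
yourself in the same spot (a sketch; the binary locations are
examples):

    # distro binary
    xfs_repair -n -v /dev/md0 > /tmp/repair-distro.log 2>&1

    # freshly built binary, straight out of the xfsprogs source tree
    ~/src/xfsprogs/repair/xfs_repair -n -v /dev/md0 > /tmp/repair-new.log 2>&1

    diff -u /tmp/repair-distro.log /tmp/repair-new.log | less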
* Re: Corrupted files
2014-09-09 15:21 Corrupted files Leslie Rhorer
2014-09-09 15:50 ` Sean Caron
@ 2014-09-09 16:08 ` Emmanuel Florac
2014-09-09 22:06 ` Dave Chinner
2 siblings, 0 replies; 35+ messages in thread
From: Emmanuel Florac @ 2014-09-09 16:08 UTC (permalink / raw)
To: Leslie Rhorer; +Cc: xfs

On Tue, 09 Sep 2014 10:21:37 -0500, Leslie Rhorer
<lrhorer@mygrande.net> wrote:

> I have tried running xfs_repair several times, but any attempt to
> access these files still reports "cannot stat XXXXXXXX: Structure
> needs cleaning".

I won't agree with Sean here(1). Most of the time xfs_repair ends with
the expected result; however, many distros (particularly CentOS) ship
positively ancient versions. You'd do better to grab a recent version
(3.1 or later).

(1) In particular on the "run for weeks" part. I've never had
xfs_repair take more than a couple of hours, even on badly damaged
filesystems in the hundreds-of-terabytes range.

--
Emmanuel Florac | Intellique | <eflorac@intellique.com> | +33 1 78 94 84 02
* Re: Corrupted files
2014-09-09 15:21 Corrupted files Leslie Rhorer
2014-09-09 15:50 ` Sean Caron
2014-09-09 16:08 ` Emmanuel Florac
@ 2014-09-09 22:06 ` Dave Chinner
2014-09-10 1:12 ` Leslie Rhorer
2 siblings, 1 reply; 35+ messages in thread
From: Dave Chinner @ 2014-09-09 22:06 UTC (permalink / raw)
To: Leslie Rhorer; +Cc: xfs

On Tue, Sep 09, 2014 at 10:21:37AM -0500, Leslie Rhorer wrote:
> I have an issue with my primary RAID array. I have 13T of data on
> the array, and I suffered a major array failure. [...] I need to
> clear the file structure so I can write the files back to the
> filesystem. How do I proceed?

Firstly, more information is required, namely versions and actual
error messages:

http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F

dmesg, in particular, should tell us what the corruption being
encountered is when stat fails.

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Corrupted files
2014-09-09 22:06 ` Dave Chinner
@ 2014-09-10 1:12 ` Leslie Rhorer
2014-09-10 1:25 ` Sean Caron
2014-09-10 1:53 ` Dave Chinner
0 siblings, 2 replies; 35+ messages in thread
From: Leslie Rhorer @ 2014-09-10 1:12 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs

On 9/9/2014 5:06 PM, Dave Chinner wrote:
> Firstly, more information is required, namely versions and actual
> error messages:

Indubitably:

RAID-Server:/# xfs_repair -V
xfs_repair version 3.1.7
RAID-Server:/# uname -r
3.2.0-4-amd64

4.0 GHz FX-8350 eight-core processor

RAID-Server:/# cat /proc/meminfo /proc/mounts /proc/partitions
MemTotal:        8099916 kB
MemFree:         5786420 kB
Buffers:          112684 kB
Cached:           457020 kB
SwapCached:            0 kB
Active:           521800 kB
Inactive:         457268 kB
Active(anon):     276648 kB
Inactive(anon):   140180 kB
Active(file):     245152 kB
Inactive(file):   317088 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:      12623740 kB
SwapFree:       12623740 kB
Dirty:                20 kB
Writeback:             0 kB
AnonPages:        409488 kB
Mapped:            47576 kB
Shmem:              7464 kB
Slab:             197100 kB
SReclaimable:     112644 kB
SUnreclaim:        84456 kB
KernelStack:        2560 kB
PageTables:         8468 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    16673696 kB
Committed_AS:    1010172 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      339140 kB
VmallocChunk:   34359395308 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:       65532 kB
DirectMap2M:     5120000 kB
DirectMap1G:     3145728 kB
rootfs / rootfs rw 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
udev /dev devtmpfs rw,relatime,size=10240k,nr_inodes=1002653,mode=755 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,noexec,relatime,size=809992k,mode=755 0 0
/dev/disk/by-uuid/fa5c404a-bfcb-43de-87ed-e671fda1ba99 / ext4 rw,relatime,errors=remount-ro,user_xattr,barrier=1,data=ordered 0 0
tmpfs /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
tmpfs /run/shm tmpfs rw,nosuid,nodev,noexec,relatime,size=4144720k 0 0
/dev/md1 /boot ext2 rw,relatime,errors=continue 0 0
rpc_pipefs /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
Backup:/Backup /Backup nfs rw,relatime,vers=3,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.1.51,mountvers=3,mountport=39597,mountproto=tcp,local_lock=none,addr=192.168.1.51 0 0
Backup:/var/www /var/www/backup nfs rw,relatime,vers=3,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.1.51,mountvers=3,mountport=39597,mountproto=tcp,local_lock=none,addr=192.168.1.51 0 0
/dev/md0 /RAID xfs rw,relatime,attr2,delaylog,sunit=2048,swidth=12288,noquota 0 0
major minor  #blocks  name

   8        0  125034840 sda
   8        1      96256 sda1
   8        2  112305152 sda2
   8        3   12632064 sda3
   8       16  125034840 sdb
   8       17      96256 sdb1
   8       18  112305152 sdb2
   8       19   12632064 sdb3
   8       48 3907018584 sdd
   8       32 3907018584 sdc
   8       64 1465138584 sde
   8       80 1465138584 sdf
   8       96 1465138584 sdg
   8      112 3907018584 sdh
   8      128 3907018584 sdi
   8      144 3907018584 sdj
   8      160 3907018584 sdk
   9        1      96192 md1
   9        2  112239488 md2
   9        3   12623744 md3
   9        0 23441319936 md0
   9       10 4395021312 md10

RAID-Server:/# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid1] [raid0]
md10 : active raid0 sdf[0] sde[2] sdg[1]
      4395021312 blocks super 1.2 512k chunks

md0 : active raid6 md10[12] sdc[13] sdk[10] sdj[11] sdi[15] sdh[8] sdd[9]
      23441319936 blocks super 1.2 level 6, 1024k chunk, algorithm 2 [8/7] [UUU_UUUU]
      bitmap: 29/30 pages [116KB], 65536KB chunk

md3 : active (auto-read-only) raid1 sda3[0] sdb3[1]
      12623744 blocks super 1.2 [3/2] [UU_]
      bitmap: 1/1 pages [4KB], 65536KB chunk

md2 : active raid1 sda2[0] sdb2[1]
      112239488 blocks super 1.2 [3/2] [UU_]
      bitmap: 1/1 pages [4KB], 65536KB chunk

md1 : active raid1 sda1[0] sdb1[1]
      96192 blocks [3/2] [UU_]
      bitmap: 1/1 pages [4KB], 65536KB chunk

unused devices: <none>

Six of the drives are 4T spindles (a mixture of makes and models). The
three drives comprising md10 are WD 1.5T green drives; they are in
place to take over the function of one of the kicked 4T drives. Md1,
md2, and md3 are not data drives and are not suffering any issue.

I'm not sure what is meant by "write cache status" in this context.
The machine has been rebooted more than once during recovery, and the
FS has been unmounted and xfs_repair run several times.

I don't know what the acronym BBWC stands for.

RAID-Server:/# xfs_info /dev/md0
meta-data=/dev/md0               isize=256    agcount=43, agsize=137356288 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=5860329984, imaxpct=5
         =                       sunit=256    swidth=1536 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=8 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

The system performs just fine, other than the aforementioned, with
loads in excess of 3Gbps. That is internal only; the LAN link is only
1Gbps, so no external request exceeds about 950Mbps.

> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
>
> dmesg, in particular, should tell us what the corruption being
> encountered is when stat fails.

RAID-Server:/# ls "/RAID/DVD/Big Sleep, The (1945)/VIDEO_TS/VTS_01_1.VOB"
ls: cannot access /RAID/DVD/Big Sleep, The (1945)/VIDEO_TS/VTS_01_1.VOB: Structure needs cleaning
RAID-Server:/# dmesg | tail -n 30
...
[192173.363981] XFS (md0): corrupt dinode 41006, extent total = 1, nblocks = 0.
[192173.363988] ffff8802338b8e00: 49 4e 81 b6 02 02 00 00 00 00 03 e8 00 00 03 e8  IN..............
[192173.363996] XFS (md0): Internal error xfs_iformat(1) at line 319 of file /build/linux-eKuxrT/linux-3.2.60/fs/xfs/xfs_inode.c.  Caller 0xffffffffa0509318
[192173.363999]
[192173.364062] Pid: 10813, comm: ls Not tainted 3.2.0-4-amd64 #1 Debian 3.2.60-1+deb7u3
[192173.364065] Call Trace:
[192173.364097]  [<ffffffffa04d3731>] ? xfs_corruption_error+0x54/0x6f [xfs]
[192173.364134]  [<ffffffffa0509318>] ? xfs_iread+0x9f/0x177 [xfs]
[192173.364170]  [<ffffffffa0508efa>] ? xfs_iformat+0xe3/0x462 [xfs]
[192173.364204]  [<ffffffffa0509318>] ? xfs_iread+0x9f/0x177 [xfs]
[192173.364240]  [<ffffffffa0509318>] ? xfs_iread+0x9f/0x177 [xfs]
[192173.364268]  [<ffffffffa04d6ebe>] ? xfs_iget+0x37c/0x56c [xfs]
[192173.364300]  [<ffffffffa04e13b4>] ? xfs_lookup+0xa4/0xd3 [xfs]
[192173.364328]  [<ffffffffa04d9e5a>] ? xfs_vn_lookup+0x3f/0x7e [xfs]
[192173.364344]  [<ffffffff81102de9>] ? d_alloc_and_lookup+0x3a/0x60
[192173.364357]  [<ffffffff8110388d>] ? walk_component+0x219/0x406
[192173.364370]  [<ffffffff81104721>] ? path_lookupat+0x7c/0x2bd
[192173.364383]  [<ffffffff81036628>] ? should_resched+0x5/0x23
[192173.364396]  [<ffffffff8134f144>] ? _cond_resched+0x7/0x1c
[192173.364408]  [<ffffffff8110497e>] ? do_path_lookup+0x1c/0x87
[192173.364420]  [<ffffffff81106407>] ? user_path_at_empty+0x47/0x7b
[192173.364434]  [<ffffffff813533d8>] ? do_page_fault+0x30a/0x345
[192173.364448]  [<ffffffff810d6a04>] ? mmap_region+0x353/0x44a
[192173.364460]  [<ffffffff810fe45a>] ? vfs_fstatat+0x32/0x60
[192173.364471]  [<ffffffff810fe590>] ? sys_newstat+0x12/0x2b
[192173.364483]  [<ffffffff813509f5>] ? page_fault+0x25/0x30
[192173.364495]  [<ffffffff81355452>] ? system_call_fastpath+0x16/0x1b
[192173.364503] XFS (md0): Corruption detected. Unmount and run xfs_repair

That last line, by the way, is why I ran umount and xfs_repair.
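P.S. Since dmesg names the inode, it can also be examined directly
with xfs_db (a read-only sketch; run it with the filesystem
unmounted):

    # dump the on-disk core of the inode dmesg is complaining about;
    # the mismatch should be visible in core.nextents vs. core.nblocks
    xfs_db -r -c 'inode 41006' -c 'print' /dev/md0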
* Re: Corrupted files
2014-09-10 1:12 ` Leslie Rhorer
@ 2014-09-10 1:25 ` Sean Caron
2014-09-10 1:43 ` Leslie Rhorer
2014-09-10 1:53 ` Dave Chinner
1 sibling, 1 reply; 35+ messages in thread
From: Sean Caron @ 2014-09-10 1:25 UTC (permalink / raw)
To: Leslie Rhorer, Sean Caron; +Cc: xfs@oss.sgi.com

Hi Leslie,

You really don't want to be running "green" anything in an array...
that is a ticking time bomb just waiting to go off... let me tell
you... At my installation, a predecessor had procured a large number
of green drives because they were very inexpensive, and regrets were
had by all. Lousy performance, lots of spurious ejection/RAID
gremlins, and the failure rate on the WDC Greens is just appalling...

BBWC stands for Battery Backed Write Cache; this is a feature of
hardware RAID cards. It is just what it says on the tin: a bit
(usually half a gig, or a gig, or two...) of nonvolatile cache that
retains writes to the array in case of power failure, etc. If you have
BBWC enabled but your battery is dead, bad things can happen. Not
applicable for JBOD software RAID.

I hold firm to my beliefs on xfs_repair :) As I say, you'll see a
variety of opinions here.

Best,

Sean
* Re: Corrupted files
2014-09-10 1:25 ` Sean Caron
@ 2014-09-10 1:43 ` Leslie Rhorer
2014-09-10 14:31 ` Emmanuel Florac
0 siblings, 1 reply; 35+ messages in thread
From: Leslie Rhorer @ 2014-09-10 1:43 UTC (permalink / raw)
To: Sean Caron; +Cc: xfs@oss.sgi.com

On 9/9/2014 8:25 PM, Sean Caron wrote:
> You really don't want to be running "green" anything in an array...
> that is a ticking time bomb just waiting to go off...

The alternative is nothing at all. I am not a company, just a guy with
a couple of arrays at his house. Not a rich guy, either. I've had
these arrays since 2001 with only one other mass drive failure; that
one was recoverable, and those were not "green" drives, either. (Four
Seagate drives all suddenly decided they did not want to be part of
the array, so md kicked all four simultaneously. After that, they
would not stay up as part of the array long enough to be mounted. I
was able to read all four with dd_rescue and get the array back online
without a single lost file.)

Note also these arrays are not usually under any sort of massive load.
The bulk of the data is video files, written once at about 80MBps and
then read back one at a time at about 4MBps.

> Lousy performance, lots of spurious ejection/RAID gremlins, and the
> failure rate on the WDC Greens is just appalling...

None of the failed drives were WD Green. All three, and the previous
four, were Seagate. I realize that is not a large statistical sample.

> BBWC stands for Battery Backed Write Cache; this is a feature of
> hardware RAID cards

Ah, yes. This array does not have a BBWC controller. The backup array
does, actually, but the battery backup is disabled.

> If you have BBWC enabled but your battery is dead, bad things can
> happen. Not applicable for JBOD software RAID.

Exactly. All the arrays are JBOD / mdadm.
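(The dd_rescue step generalizes: when drives are dropping out of an
array but are still mostly readable, clone each one onto a healthy
spare before reassembling. With GNU ddrescue, a close cousin of the
dd_rescue tool named above, the recipe looks roughly like this; device
names are illustrative:)

    # copy the failing disk to a fresh one, keeping a map file so an
    # interrupted run can resume where it left off
    ddrescue -f /dev/sdX /dev/sdY /root/sdX.map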
* Re: Corrupted files
2014-09-10 1:43 ` Leslie Rhorer
@ 2014-09-10 14:31 ` Emmanuel Florac
2014-09-10 14:52 ` Grozdan
` (3 more replies)
0 siblings, 4 replies; 35+ messages in thread
From: Emmanuel Florac @ 2014-09-10 14:31 UTC (permalink / raw)
To: Leslie Rhorer; +Cc: Sean Caron, xfs@oss.sgi.com

On Tue, 09 Sep 2014 20:43:08 -0500, Leslie Rhorer
<lrhorer@mygrande.net> wrote:

> None of the failed drives were WD Green. All three, and the previous
> four, were Seagate. I realize that is not a large statistical
> sample.

If you're interested in large statistical samples: out of a grand
total of 4000 1 TB Seagate Barracuda ES.2 drives, I had to replace
2100 over the course of 3 years. I still have a couple of hundred of
these unfortunate pieces of crap in service, and they still account
for the vast majority of unexpected RAID malfunctions, urgent
replacements, late-night calls, and other "interesting side
activities".

I wouldn't buy anything labeled Seagate nowadays. Their drives have
been the worst train wreck since the dreaded 9 GB Micropolis back in
1994 (or was it 1995?).

--
Emmanuel Florac | Intellique | <eflorac@intellique.com> | +33 1 78 94 84 02
* Re: Corrupted files
2014-09-10 14:31 ` Emmanuel Florac
@ 2014-09-10 14:52 ` Grozdan
2014-09-10 15:12 ` Emmanuel Florac
1 sibling, 1 reply; 35+ messages in thread
From: Grozdan @ 2014-09-10 14:52 UTC (permalink / raw)
To: Emmanuel Florac; +Cc: Sean Caron, Leslie Rhorer, xfs@oss.sgi.com

On Wed, Sep 10, 2014 at 4:31 PM, Emmanuel Florac
<eflorac@intellique.com> wrote:

> If you're interested in large statistical samples: out of a grand
> total of 4000 1 TB Seagate Barracuda ES.2 drives, I had to replace
> 2100 over the course of 3 years. [...]

Funny, because our servers (105 of them) have all run on Seagate
drives for a few years now, and I have yet to see one fail or cause
other problems. But then again, we use Constellation disks, not
Barracudas. At home I also use both Barracudas and Constellations, and
have likewise yet to see a problem with them.

--
Yours truly
* Re: Corrupted files
2014-09-10 14:52 ` Grozdan
@ 2014-09-10 15:12 ` Emmanuel Florac
2014-09-10 15:32 ` Grozdan
0 siblings, 1 reply; 35+ messages in thread
From: Emmanuel Florac @ 2014-09-10 15:12 UTC (permalink / raw)
To: Grozdan; +Cc: Sean Caron, Leslie Rhorer, xfs@oss.sgi.com

On Wed, 10 Sep 2014 16:52:27 +0200, Grozdan <neutrino8@gmail.com>
wrote:

> Funny, because our servers (105 of them) have all run on Seagate
> drives for a few years now, and I have yet to see one fail or cause
> other problems. But then again, we use Constellation disks, not
> Barracudas.

Yes, we replaced most failed Barracudas with Constellations at a later
stage (because the "certified repaired" Barracudas aren't any
better...), and these work fine so far. However, why would I give
Seagate my hard-earned money after they cost me so dearly for years? :)

--
Emmanuel Florac | Intellique | <eflorac@intellique.com> | +33 1 78 94 84 02
* Re: Corrupted files
  2014-09-10 15:12 ` Emmanuel Florac
@ 2014-09-10 15:32 ` Grozdan
  0 siblings, 0 replies; 35+ messages in thread
From: Grozdan @ 2014-09-10 15:32 UTC (permalink / raw)
To: Emmanuel Florac; +Cc: Sean Caron, Leslie Rhorer, xfs@oss.sgi.com

On Wed, Sep 10, 2014 at 5:12 PM, Emmanuel Florac
<eflorac@intellique.com> wrote:
> On Wed, 10 Sep 2014 16:52:27 +0200,
> Grozdan <neutrino8@gmail.com> wrote:
>
>> Funny, because our servers (105 of them) have all been running on
>> Seagate drives for a few years now, and I have yet to see one fail or
>> cause other problems. But then again, we use Constellation disks, not
>> Barracudas. At home I also use both Barracudas and Constellations, and
>> I have yet to see a problem with them either.
>>
>
> Yes, we replaced most failed Barracudas with Constellations at a later
> stage (because the "certified repaired" Barracudas aren't any
> better...), and these work fine so far. However, why would I give
> Seagate my hard-earned money after they cost me so dearly for years? :)

Oh, you are correct about the money. If it had happened to us, I'd
think twice about that too. The biggest problems we've had thus far
were with Samsung disks. I haven't seen such a high failure rate in all
my life: about 70% of the 100 disks we got failed within a year. Too
bad Seagate took them over. I can only hope that Seagate's
manufacturing and QA don't suffer because of that.

-- 
Yours truly

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread
* Re: Corrupted files
  2014-09-10 14:31 ` Emmanuel Florac
  2014-09-10 14:52 ` Grozdan
@ 2014-09-10 14:54 ` Sean Caron
  2014-09-10 23:18 ` Leslie Rhorer
  2014-09-11 13:24 ` Greg Freemyer
  3 siblings, 0 replies; 35+ messages in thread
From: Sean Caron @ 2014-09-10 14:54 UTC (permalink / raw)
To: Emmanuel Florac, Sean Caron; +Cc: Leslie Rhorer, xfs@oss.sgi.com

[-- Attachment #1.1: Type: text/plain, Size: 2182 bytes --]

I am probably overseeing a similar number (3-4000) of Hitachi A7K3000s,
A7K2000s and WDC RE4s, and I see maybe a few failures a month. When we
are building a new machine and get a fresh shipment in, we see perhaps
a 10% failure rate right out of the box. Those that survive the burn-in
usually do pretty well.

Man, you have my sympathy with that failure rate in excess of 50%...
even the WDC Greens weren't THAT bad (although it probably got close as
we got closer and closer to EOLing them... and they had been moved to
third-tier "backup storage" status by that point). Thankfully they are
gone now, LOL.

You're right, especially in large installations, it's critical to do
your homework on drives, pick a good candidate, validate it and then
run with them. Even with the good ones, you've gotta keep a watchful
eye... "when you buy them in bulk, they fail in bulk".

Best,

Sean

On Wed, Sep 10, 2014 at 10:31 AM, Emmanuel Florac
<eflorac@intellique.com> wrote:

> On Tue, 09 Sep 2014 20:43:08 -0500,
> Leslie Rhorer <lrhorer@mygrande.net> wrote:
>
> > 	None of the failed drives were WD green. All three and the
> > previous four were Seagate. I realize that is not a large
> > statistical sample.
> >
>
> If you're interested in large statistical samples: out of a grand total
> of 4000 1 TB Seagate Barracuda ES2 drives, I had to replace 2100 over
> the course of 3 years. I still have a couple of hundred of these
> unfortunate pieces of crap in service, and they still represent the
> vast majority of unexpected RAID malfunctions, urgent replacements,
> late night calls and other "interesting side activities".
>
> I wouldn't buy anything labeled Seagate nowadays. Their drives have
> been the worst train wreck since the dreaded 9 GB Micropolis back in
> 1994 (or was it 1995?).
>

[-- Attachment #1.2: Type: text/html, Size: 2947 bytes --]

[-- Attachment #2: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread
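The burn-in Sean mentions can be approximated with stock Linux tools.
A minimal sketch, assuming smartmontools and e2fsprogs are installed;
/dev/sdX is a placeholder, and it must be a drive holding no data yet,
since the badblocks write test is destructive:

    smartctl -t long /dev/sdX    # start an extended SMART self-test
    smartctl -a /dev/sdX         # read the results once it completes
    badblocks -wsv /dev/sdX      # destructive write+verify pass over the whole drive

Drives that come through a pass like this without reallocated or
pending sectors are reasonable candidates for an array; the exact
recipe varies from shop to shop.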
* Re: Corrupted files
  2014-09-10 14:31 ` Emmanuel Florac
  2014-09-10 14:52 ` Grozdan
  2014-09-10 14:54 ` Sean Caron
@ 2014-09-10 23:18 ` Leslie Rhorer
  2014-09-11 13:24 ` Greg Freemyer
  3 siblings, 0 replies; 35+ messages in thread
From: Leslie Rhorer @ 2014-09-10 23:18 UTC (permalink / raw)
To: Emmanuel Florac; +Cc: Sean Caron, xfs@oss.sgi.com

On 9/10/2014 9:31 AM, Emmanuel Florac wrote:
> On Tue, 09 Sep 2014 20:43:08 -0500,
> Leslie Rhorer <lrhorer@mygrande.net> wrote:
>
>> 	None of the failed drives were WD green. All three and the
>> previous four were Seagate. I realize that is not a large
>> statistical sample.
>>
>
> If you're interested in large statistical samples: out of a grand total
> of 4000 1 TB Seagate Barracuda ES2 drives, I had to replace 2100 over
> the course of 3 years.

	That's a good-sized statistical sample. Oddly enough, perhaps, the
ones that failed on me were also 1T Barracuda drives, and my failure
rate was 40%.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread
* Re: Corrupted files
  2014-09-10 14:31 ` Emmanuel Florac
  ` (2 preceding siblings ...)
  2014-09-10 23:18 ` Leslie Rhorer
@ 2014-09-11 13:24 ` Greg Freemyer
  2014-09-12  7:06 ` Emmanuel Florac
  3 siblings, 1 reply; 35+ messages in thread
From: Greg Freemyer @ 2014-09-11 13:24 UTC (permalink / raw)
To: Emmanuel Florac; +Cc: Sean Caron, Leslie Rhorer, xfs@oss.sgi.com

On Wed, Sep 10, 2014 at 10:31 AM, Emmanuel Florac
<eflorac@intellique.com> wrote:
> On Tue, 09 Sep 2014 20:43:08 -0500,
> Leslie Rhorer <lrhorer@mygrande.net> wrote:
>
>> 	None of the failed drives were WD green. All three and the
>> previous four were Seagate. I realize that is not a large
>> statistical sample.
>>
>
> If you're interested in large statistical samples: out of a grand total
> of 4000 1 TB Seagate Barracuda ES2 drives, I had to replace 2100 over
> the course of 3 years. I still have a couple of hundred of these
> unfortunate pieces of crap in service, and they still represent the
> vast majority of unexpected RAID malfunctions, urgent replacements,
> late night calls and other "interesting side activities".
>
> I wouldn't buy anything labeled Seagate nowadays. Their drives have
> been the worst train wreck since the dreaded 9 GB Micropolis back in
> 1994 (or was it 1995?).

I buy about 100 drives a year, but I don't work them very hard. Just
lots of data to store, and I need to keep my data sets segregated for
legal reasons. I don't use RAID, just lots of individual disks, with
most data maintained redundantly.

About 4 years ago (or maybe 5), Seagate had a catastrophic drive
situation. I can remember buying a batch of 10 drives and having 8 of
them fail in the first 2 months. The bad part was they mostly survived
a 10-hour burn-in, so they tended to fail with real data on them. I
had one case (at a minimum) that summer where I put the data on 3
different Seagate drives and all 3 failed. Fortunately, I was able to
swap the disk controller card from one of the working drives into one
of the dead drives and recover the data.

Regardless, ignoring the summer of discontent, I find Seagate to be my
preferred drives.

FYI: In June I bought 30 or so WD Elements drives to try them out.
These are not the green drives, just bare-bones WD drives. None of
them were DOA, but 3 failed within 4 weeks, so a 10% failure rate in
the first month. Only one of them had unique data on it, so I had to
recreate that data. Fortunately, the source of the data was still
available. All of those drives have been pulled out of routine
service.

Greg

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread
* Re: Corrupted files
  2014-09-11 13:24 ` Greg Freemyer
@ 2014-09-12  7:06 ` Emmanuel Florac
  0 siblings, 0 replies; 35+ messages in thread
From: Emmanuel Florac @ 2014-09-12  7:06 UTC (permalink / raw)
To: Greg Freemyer; +Cc: Sean Caron, Leslie Rhorer, xfs@oss.sgi.com

On Thu, 11 Sep 2014 09:24:04 -0400, you wrote:

> Regardless, ignoring the summer of discontent, I find Seagate to be my
> preferred drives.

Nowadays I only buy HGST drives. The 3 TB models aren't as reliable as
the 1, 2, 4 and 6 TB ones, but generally speaking the failure rate is
extremely low (on the order of a few failures a year among several
thousand units).

-- 
------------------------------------------------------------------------
Emmanuel Florac     |   Direction technique
                    |   Intellique
                    |   <eflorac@intellique.com>
                    |   +33 1 78 94 84 02
------------------------------------------------------------------------

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread
* Re: Corrupted files
  2014-09-10  1:12 ` Leslie Rhorer
  2014-09-10  1:25 ` Sean Caron
@ 2014-09-10  1:53 ` Dave Chinner
  2014-09-10  3:10 ` Leslie Rhorer
  1 sibling, 1 reply; 35+ messages in thread
From: Dave Chinner @ 2014-09-10  1:53 UTC (permalink / raw)
To: Leslie Rhorer; +Cc: xfs

On Tue, Sep 09, 2014 at 08:12:38PM -0500, Leslie Rhorer wrote:
> On 9/9/2014 5:06 PM, Dave Chinner wrote:
> >Firstly, more information is required, namely versions and actual
> >error messages:
>
> 	Indubitably:
>
> RAID-Server:/# xfs_repair -V
> xfs_repair version 3.1.7
> RAID-Server:/# uname -r
> 3.2.0-4-amd64

Ok, so a relatively old xfs_repair. That's important - read on....

> 4.0 GHz FX-8350 eight core processor
>
> RAID-Server:/# cat /proc/meminfo /proc/mounts /proc/partitions
> MemTotal:        8099916 kB
....
> /dev/md0 /RAID xfs
> rw,relatime,attr2,delaylog,sunit=2048,swidth=12288,noquota 0 0

FWIW, you don't need sunit=2048,swidth=12288 in the mount options -
they are stored on disk and the mount options are only necessary to
change the on-disk values.

> Personalities : [raid6] [raid5] [raid4] [raid1] [raid0]
> md10 : active raid0 sdf[0] sde[2] sdg[1]
>       4395021312 blocks super 1.2 512k chunks
>
> md0 : active raid6 md10[12] sdc[13] sdk[10] sdj[11] sdi[15] sdh[8] sdd[9]
>       23441319936 blocks super 1.2 level 6, 1024k chunk, algorithm 2
>       [8/7] [UUU_UUUU]
>       bitmap: 29/30 pages [116KB], 65536KB chunk
>
> md3 : active (auto-read-only) raid1 sda3[0] sdb3[1]
>       12623744 blocks super 1.2 [3/2] [UU_]
>       bitmap: 1/1 pages [4KB], 65536KB chunk
>
> md2 : active raid1 sda2[0] sdb2[1]
>       112239488 blocks super 1.2 [3/2] [UU_]
>       bitmap: 1/1 pages [4KB], 65536KB chunk
>
> md1 : active raid1 sda1[0] sdb1[1]
>       96192 blocks [3/2] [UU_]
>       bitmap: 1/1 pages [4KB], 65536KB chunk
>
> unused devices: <none>
>
> 	Six of the drives are 4T spindles (a mixture of makes and models).
> The three drives comprising MD10 are WD 1.5T green drives. These
> are in place to take over the function of one of the kicked 4T
> drives. Md1, 2, and 3 are not data drives and are not suffering any
> issue.

Ok, that's creative. But when you need another drive in the array
and you don't have the right spares.... ;)

> 	I'm not sure what is meant by "write cache status" in this context.
> The machine has been rebooted more than once during recovery and the
> FS has been umounted and xfs_repair run several times.

Start here and read the next few entries:

http://xfs.org/index.php/XFS_FAQ#Q:_What_is_the_problem_with_the_write_cache_on_journaled_filesystems.3F

> 	I don't know for what the acronym BBWC stands.

"battery backed write cache". If you're not using a hardware RAID
controller, it's unlikely you have one. The difference between a
drive write cache and a BBWC is that the BBWC is non-volatile - it
does not get lost when power drops.

> RAID-Server:/# xfs_info /dev/md0
> meta-data=/dev/md0           isize=256    agcount=43, agsize=137356288 blks
>          =                   sectsz=512   attr=2
> data     =                   bsize=4096   blocks=5860329984, imaxpct=5
>          =                   sunit=256    swidth=1536 blks
> naming   =version 2          bsize=4096   ascii-ci=0
> log      =internal           bsize=4096   blocks=521728, version=2
>          =                   sectsz=512   sunit=8 blks, lazy-count=1
> realtime =none               extsz=4096   blocks=0, rtextents=0

Ok, that all looks pretty good, and the sunit/swidth match the mount
options you set so you definitely don't need the mount options...

> 	The system performs just fine, other than the aforementioned, with
> loads in excess of 3Gbps. That is internal only. The LAN link is
> only 1Gbps, so no external request exceeds about 950Mbps.

> > http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
> >
> > dmesg, in particular, should tell us what the corruption being
> > encountered is when stat fails.
>
> RAID-Server:/# ls "/RAID/DVD/Big Sleep, The (1945)/VIDEO_TS/VTS_01_1.VOB"
> ls: cannot access /RAID/DVD/Big Sleep, The (1945)/VIDEO_TS/VTS_01_1.VOB:
> Structure needs cleaning
> RAID-Server:/# dmesg | tail -n 30
> ...
> [192173.363981] XFS (md0): corrupt dinode 41006, extent total = 1, nblocks = 0.
> [192173.363988] ffff8802338b8e00: 49 4e 81 b6 02 02 00 00 00 00 03 e8 00 00 03 e8  IN..............
> [192173.363996] XFS (md0): Internal error xfs_iformat(1) at line 319 of
> file /build/linux-eKuxrT/linux-3.2.60/fs/xfs/xfs_inode.c. Caller
> 0xffffffffa0509318
> [192173.363999]
> [192173.364062] Pid: 10813, comm: ls Not tainted 3.2.0-4-amd64 #1 Debian 3.2.60-1+deb7u3
> [192173.364065] Call Trace:
> [192173.364097]  [<ffffffffa04d3731>] ? xfs_corruption_error+0x54/0x6f [xfs]
> [192173.364134]  [<ffffffffa0509318>] ? xfs_iread+0x9f/0x177 [xfs]
> [192173.364170]  [<ffffffffa0508efa>] ? xfs_iformat+0xe3/0x462 [xfs]
> [192173.364204]  [<ffffffffa0509318>] ? xfs_iread+0x9f/0x177 [xfs]
> [192173.364240]  [<ffffffffa0509318>] ? xfs_iread+0x9f/0x177 [xfs]
> [192173.364268]  [<ffffffffa04d6ebe>] ? xfs_iget+0x37c/0x56c [xfs]
> [192173.364300]  [<ffffffffa04e13b4>] ? xfs_lookup+0xa4/0xd3 [xfs]
> [192173.364328]  [<ffffffffa04d9e5a>] ? xfs_vn_lookup+0x3f/0x7e [xfs]
> [192173.364344]  [<ffffffff81102de9>] ? d_alloc_and_lookup+0x3a/0x60
> [192173.364357]  [<ffffffff8110388d>] ? walk_component+0x219/0x406
> [192173.364370]  [<ffffffff81104721>] ? path_lookupat+0x7c/0x2bd
> [192173.364383]  [<ffffffff81036628>] ? should_resched+0x5/0x23
> [192173.364396]  [<ffffffff8134f144>] ? _cond_resched+0x7/0x1c
> [192173.364408]  [<ffffffff8110497e>] ? do_path_lookup+0x1c/0x87
> [192173.364420]  [<ffffffff81106407>] ? user_path_at_empty+0x47/0x7b
> [192173.364434]  [<ffffffff813533d8>] ? do_page_fault+0x30a/0x345
> [192173.364448]  [<ffffffff810d6a04>] ? mmap_region+0x353/0x44a
> [192173.364460]  [<ffffffff810fe45a>] ? vfs_fstatat+0x32/0x60
> [192173.364471]  [<ffffffff810fe590>] ? sys_newstat+0x12/0x2b
> [192173.364483]  [<ffffffff813509f5>] ? page_fault+0x25/0x30
> [192173.364495]  [<ffffffff81355452>] ? system_call_fastpath+0x16/0x1b
> [192173.364503] XFS (md0): Corruption detected. Unmount and run xfs_repair
>
> 	That last line, by the way, is why I ran umount and xfs_repair.

Right, that's the correct thing to do, but sometimes there are
issues that repair doesn't handle properly. This *was* one of them,
and it was fixed by commit e1f43b4 ("repair: update extent count
after zapping duplicate blocks") which was added to xfs_repair
v3.1.8.

IOWs, upgrading xfsprogs to the latest release and re-running
xfs_repair should fix this error.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread
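For anyone following along: the drive write cache state Dave is asking
about can be inspected from Linux. A minimal sketch, assuming ATA
drives and the hdparm utility; /dev/sdX is a placeholder, and SAS/SCSI
drives would need sdparm or a vendor tool instead:

    hdparm -W /dev/sdX      # report whether the volatile write cache is enabled
    hdparm -W 0 /dev/sdX    # disable it -- the safe setting without a BBWC
    hdparm -W 1 /dev/sdX    # re-enable it

With no battery behind the cache, anything still sitting in it is lost
on a power failure, which is exactly the failure mode the FAQ entry
above describes.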
* Re: Corrupted files
  2014-09-10  1:53 ` Dave Chinner
@ 2014-09-10  3:10 ` Leslie Rhorer
  2014-09-10  3:33 ` Dave Chinner
  0 siblings, 1 reply; 35+ messages in thread
From: Leslie Rhorer @ 2014-09-10  3:10 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs

On 9/9/2014 8:53 PM, Dave Chinner wrote:
> On Tue, Sep 09, 2014 at 08:12:38PM -0500, Leslie Rhorer wrote:
>> On 9/9/2014 5:06 PM, Dave Chinner wrote:
>>> Firstly, more information is required, namely versions and actual
>>> error messages:
>>
>> Indubitably:
>>
>> RAID-Server:/# xfs_repair -V
>> xfs_repair version 3.1.7
>> RAID-Server:/# uname -r
>> 3.2.0-4-amd64
>
> Ok, so a relatively old xfs_repair. That's important - read on....

	OK, a good reason is a good reason.

>> 4.0 GHz FX-8350 eight core processor
>>
>> RAID-Server:/# cat /proc/meminfo /proc/mounts /proc/partitions
>> MemTotal:        8099916 kB
> ....
>> /dev/md0 /RAID xfs
>> rw,relatime,attr2,delaylog,sunit=2048,swidth=12288,noquota 0 0
>
> FWIW, you don't need sunit=2048,swidth=12288 in the mount options -
> they are stored on disk and the mount options are only necessary to
> change the on-disk values.

	They aren't. Those were created automatically, whether at creation
time or at mount time, I don't know, but the filesystem was created
with

mkfs.xfs /dev/md0

and fstab contains:

/dev/md0        /RAID   xfs     rw      1       2

>> 	Six of the drives are 4T spindles (a mixture of makes and models).
>> The three drives comprising MD10 are WD 1.5T green drives. These
>> are in place to take over the function of one of the kicked 4T
>> drives. Md1, 2, and 3 are not data drives and are not suffering any
>> issue.
>
> Ok, that's creative. But when you need another drive in the array
> and you don't have the right spares.... ;)

	Yes, but I wasn't really expecting to need 3 spares this soon or
suddenly. These are fairly new drives, and with 33% of the array being
parity, the sudden need for 3 extra drives just is not too likely.
That, plus I have quite a few 1.5 and 1.0T drives lying around in case
of sudden emergency. This isn't the first time I've replaced a single
drive temporarily with a RAID0. The performance is actually better, of
course, and for the 3 or 4 days it takes to get a new drive, it's
really not an issue. Since I have a full online backup system plus a
regularly updated off-site backup, the risk is quite minimal. This is
an exercise in mild inconvenience, not an emergency failure. If this
were a commercial system, it would be another matter, but I know for a
fact there are a very large number of home NAS solutions in place that
are less robust than this one. I personally know quite a few people
who never do backups, at all.

>> 	I'm not sure what is meant by "write cache status" in this context.
>> The machine has been rebooted more than once during recovery and the
>> FS has been umounted and xfs_repair run several times.
>
> Start here and read the next few entries:
>
> http://xfs.org/index.php/XFS_FAQ#Q:_What_is_the_problem_with_the_write_cache_on_journaled_filesystems.3F

	I knew that, but I still don't see the relevance in this context.
There is no battery backup on the drive controller or the drives, and
the drives have all been powered down and back up several times.
Anything in any cache right now would be from some operation in the
last few minutes, not four days ago.

>> 	I don't know for what the acronym BBWC stands.
>
> "battery backed write cache". If you're not using a hardware RAID
> controller, it's unlikely you have one.

	See my previous. I do have one (a 3Ware 9650E, given to me by a
friend when his company switched to zfs for their server). It's not on
this system. This array is on a HighPoint RocketRAID 2722.

> The difference between a
> drive write cache and a BBWC is that the BBWC is non-volatile - it
> does not get lost when power drops.

	Yeah, I'm aware, thanks. I just didn't cotton to the acronym.

>> RAID-Server:/# xfs_info /dev/md0
>> meta-data=/dev/md0           isize=256    agcount=43, agsize=137356288 blks
>>          =                   sectsz=512   attr=2
>> data     =                   bsize=4096   blocks=5860329984, imaxpct=5
>>          =                   sunit=256    swidth=1536 blks
>> naming   =version 2          bsize=4096   ascii-ci=0
>> log      =internal           bsize=4096   blocks=521728, version=2
>>          =                   sectsz=512   sunit=8 blks, lazy-count=1
>> realtime =none               extsz=4096   blocks=0, rtextents=0
>
> Ok, that all looks pretty good, and the sunit/swidth match the mount
> options you set so you definitely don't need the mount options...

	Yeah, I didn't set them. What did, I don't really know for certain.
See above.

>> [192173.364460]  [<ffffffff810fe45a>] ? vfs_fstatat+0x32/0x60
>> [192173.364471]  [<ffffffff810fe590>] ? sys_newstat+0x12/0x2b
>> [192173.364483]  [<ffffffff813509f5>] ? page_fault+0x25/0x30
>> [192173.364495]  [<ffffffff81355452>] ? system_call_fastpath+0x16/0x1b
>> [192173.364503] XFS (md0): Corruption detected. Unmount and run xfs_repair
>>
>> 	That last line, by the way, is why I ran umount and xfs_repair.
>
> Right, that's the correct thing to do, but sometimes there are
> issues that repair doesn't handle properly. This *was* one of them,
> and it was fixed by commit e1f43b4 ("repair: update extent count
> after zapping duplicate blocks") which was added to xfs_repair
> v3.1.8.
>
> IOWs, upgrading xfsprogs to the latest release and re-running
> xfs_repair should fix this error.

	OK. I'll scarf the source and compile. All I need is to git clone
git://oss.sgi.com/xfs/xfs and git://oss.sgi.com/xfs/cmds/xfsprogs,
right?

	I've never used git on a package maintained in my distro. Will I
have issues when I upgrade to Debian Jessie in a few months, since
this is not being managed by apt / dpkg? It looks like Jessie has
3.2.1 of xfsprogs.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread
* Re: Corrupted files
  2014-09-10  3:10 ` Leslie Rhorer
@ 2014-09-10  3:33 ` Dave Chinner
  2014-09-10  4:14 ` Leslie Rhorer
  2014-09-10  4:51 ` Leslie Rhorer
  0 siblings, 2 replies; 35+ messages in thread
From: Dave Chinner @ 2014-09-10  3:33 UTC (permalink / raw)
To: Leslie Rhorer; +Cc: xfs

On Tue, Sep 09, 2014 at 10:10:45PM -0500, Leslie Rhorer wrote:
> On 9/9/2014 8:53 PM, Dave Chinner wrote:
> >On Tue, Sep 09, 2014 at 08:12:38PM -0500, Leslie Rhorer wrote:
> >>On 9/9/2014 5:06 PM, Dave Chinner wrote:
> >>>Firstly, more information is required, namely versions and actual
> >>>error messages:
> >>
> >> Indubitably:
> >>
> >>RAID-Server:/# xfs_repair -V
> >>xfs_repair version 3.1.7
> >>RAID-Server:/# uname -r
> >>3.2.0-4-amd64
> >
> >Ok, so a relatively old xfs_repair. That's important - read on....
>
> OK, a good reason is a good reason.
>
> >>4.0 GHz FX-8350 eight core processor
> >>
> >>RAID-Server:/# cat /proc/meminfo /proc/mounts /proc/partitions
> >>MemTotal:        8099916 kB
> >....
> >>/dev/md0 /RAID xfs
> >>rw,relatime,attr2,delaylog,sunit=2048,swidth=12288,noquota 0 0
> >
> >FWIW, you don't need sunit=2048,swidth=12288 in the mount options -
> >they are stored on disk and the mount options are only necessary to
> >change the on-disk values.
>
> They aren't. Those were created automatically, whether at creation
> time or at mount time, I don't know, but the filesystem was created
> with

Ah, my mistake. Normally it's only mount options in that code - I
forgot that we report sunit/swidth unconditionally if it is set in
the superblock.

> >> I'm not sure what is meant by "write cache status" in this context.
> >>The machine has been rebooted more than once during recovery and the
> >>FS has been umounted and xfs_repair run several times.
> >
> >Start here and read the next few entries:
> >
> >http://xfs.org/index.php/XFS_FAQ#Q:_What_is_the_problem_with_the_write_cache_on_journaled_filesystems.3F
>
> I knew that, but I still don't see the relevance in this context.
> There is no battery backup on the drive controller or the drives,
> and the drives have all been powered down and back up several times.
> Anything in any cache right now would be from some operation in the
> last few minutes, not four days ago.

There is no direct relevance to your situation, but for a lot of
other common problems it definitely is. That's why we ask people to
report it with all the other information about their system.

> >> I don't know for what the acronym BBWC stands.
> >
> >"battery backed write cache". If you're not using a hardware RAID
> >controller, it's unlikely you have one.
>
> See my previous. I do have one (a 3Ware 9650E, given to me by a
> friend when his company switched to zfs for their server). It's not
> on this system. This array is on a HighPoint RocketRAID 2722.

Ok. We have seen over time that those 3ware controllers can do
strange things in error conditions - we've had reports of entire
hardware luns dying and being completely unrecoverable after a
disk was kicked out due to an error. I can't comment on the
highpoint controller - either not many people use them or they just
don't report problems if they do. Either way, if you aren't running
the latest firmware I'd suggest updating it, as these problems were
typically fixed by newer firmware releases.

> >>[192173.364460]  [<ffffffff810fe45a>] ? vfs_fstatat+0x32/0x60
> >>[192173.364471]  [<ffffffff810fe590>] ? sys_newstat+0x12/0x2b
> >>[192173.364483]  [<ffffffff813509f5>] ? page_fault+0x25/0x30
> >>[192173.364495]  [<ffffffff81355452>] ? system_call_fastpath+0x16/0x1b
> >>[192173.364503] XFS (md0): Corruption detected. Unmount and run xfs_repair
> >>
> >> That last line, by the way, is why I ran umount and xfs_repair.
> >
> >Right, that's the correct thing to do, but sometimes there are
> >issues that repair doesn't handle properly. This *was* one of them,
> >and it was fixed by commit e1f43b4 ("repair: update extent count
> >after zapping duplicate blocks") which was added to xfs_repair
> >v3.1.8.
> >
> >IOWs, upgrading xfsprogs to the latest release and re-running
> >xfs_repair should fix this error.
>
> OK. I'll scarf the source and compile. All I need is to git clone
> git://oss.sgi.com/xfs/xfs and git://oss.sgi.com/xfs/cmds/xfsprogs,
> right?

Just clone git://oss.sgi.com/xfs/cmds/xfsprogs and check out the
v3.2.1 tag and build that.

> I've never used git on a package maintained in my distro. Will I
> have issues when I upgrade to Debian Jessie in a few months, since
> this is not being managed by apt / dpkg? It looks like Jessie has
> 3.2.1 of xfsprogs.

If you're using Debian you can build Debian packages directly from
the git tree via "make deb" (I use it all the time for pushing
new builds to my test machines), and so when you upgrade to Jessie it
should just replace your custom-built package correctly...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread
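Putting Dave's instructions together, the whole sequence is roughly the
following. This is a sketch assuming a Debian box with git and the
usual build dependencies (build-essential, autotools and friends)
already installed; the repository URL and the v3.2.1 tag are the ones
given above:

    git clone git://oss.sgi.com/xfs/cmds/xfsprogs
    cd xfsprogs
    git checkout v3.2.1
    make          # builds the tools inside the source tree
    make deb      # or, on Debian, roll .deb packages instead

The "make deb" target is the one Dave refers to; it produces packages
that dpkg/apt can track, which is why a later distro upgrade replaces
them cleanly.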
* Re: Corrupted files
  2014-09-10  3:33 ` Dave Chinner
@ 2014-09-10  4:14 ` Leslie Rhorer
  2014-09-10  4:22 ` Leslie Rhorer
  1 sibling, 1 reply; 35+ messages in thread
From: Leslie Rhorer @ 2014-09-10  4:14 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs

On 9/9/2014 10:33 PM, Dave Chinner wrote:
> There is no direct relevance to your situation, but for a lot of
> other common problems it definitely is. That's why we ask people to
> report it with all the other information about their system.

	Yeah, understood.

> Ok. We have seen over time that those 3ware controllers can do
> strange things in error conditions - we've had reports of entire
> hardware luns dying and being completely unrecoverable after a
> disk was kicked out due to an error.

	Oof. That's not good. It's stable right now. I'm considering a
different controller at some point. I may accelerate that process.

> I can't comment on the
> highpoint controller - either not many people use them or they just
> don't report problems if they do. Either way, if you aren't running
> the latest firmware I'd suggest updating it, as these problems were
> typically fixed by newer firmware releases.

	As a matter of fact, I was going to do just that. I have to reboot
the system in DOS (of all things), since they don't have a linux
loader. I've got to arrange a convenient time.

>> OK. I'll scarf the source and compile. All I need is to git clone
>> git://oss.sgi.com/xfs/xfs and git://oss.sgi.com/xfs/cmds/xfsprogs,
>> right?
>
> Just clone git://oss.sgi.com/xfs/cmds/xfsprogs and check out the
> v3.2.1 tag and build that.

	OK, I'm doing something wrong, I think. It's been over a decade
since I compiled a kernel. It makes me a little nervous.

>> I've never used git on a package maintained in my distro. Will I
>> have issues when I upgrade to Debian Jessie in a few months, since
>> this is not being managed by apt / dpkg? It looks like Jessie has
>> 3.2.1 of xfsprogs.
>
> If you're using Debian you can build Debian packages directly from
> the git tree via "make deb" (I use it all the time for pushing

	Um, is that make deb-pkg, perhaps? I'm not seeing a "deb" in the
package targets.

> new builds to my test machines), and so when you upgrade to Jessie it
> should just replace your custom-built package correctly...

	`make deb` finds no install target. If I run `make menuconfig` it
complains about there being no ncurses. Libncurses5 is installed, and I
don't know what else I should get. `make oldconfig` seems to work. Am I
headed in the right direction? There are quite a few configuration
targets, and I am not sure which one to choose. There are also a number
of questions asked by the oldconfig target (and presumably the same for
other config targets), and I'm unsure how to answer. I definitely don't
want to make an error and potentially wind up with an unbootable
system.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread
* Re: Corrupted files
  2014-09-10  4:14 ` Leslie Rhorer
@ 2014-09-10  4:22 ` Leslie Rhorer
  2014-09-10 14:34 ` Emmanuel Florac
  0 siblings, 1 reply; 35+ messages in thread
From: Leslie Rhorer @ 2014-09-10  4:22 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs

>>> OK. I'll scarf the source and compile. All I need is to git clone
>>> git://oss.sgi.com/xfs/xfs and git://oss.sgi.com/xfs/cmds/xfsprogs,
>>> right?
>>
>> Just clone git://oss.sgi.com/xfs/cmds/xfsprogs and check out the
>> v3.2.1 tag and build that.

	Oops! Hold on. I didn't read that closely enough. You were saying I
only need to compile xfsprogs. That's working.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread
* Re: Corrupted files
  2014-09-10  4:22 ` Leslie Rhorer
@ 2014-09-10 14:34 ` Emmanuel Florac
  0 siblings, 0 replies; 35+ messages in thread
From: Emmanuel Florac @ 2014-09-10 14:34 UTC (permalink / raw)
To: Leslie Rhorer; +Cc: xfs

On Tue, 09 Sep 2014 23:22:03 -0500,
Leslie Rhorer <lrhorer@mygrande.net> wrote:

> 	Oops! Hold on. I didn't read that closely enough. You were saying I
> only need to compile xfsprogs. That's working.
>

You don't need to install the resulting binaries either. xfs_repair
will happily run from the source directory:

./xfs_repair /dev/blah ...

-- 
------------------------------------------------------------------------
Emmanuel Florac     |   Direction technique
                    |   Intellique
                    |   <eflorac@intellique.com>
                    |   +33 1 78 94 84 02
------------------------------------------------------------------------

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread
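Concretely, the freshly built binary sits in the repair/ subdirectory
of the xfsprogs source tree. A cautious sequence, assuming /dev/md0 is
the filesystem from earlier in the thread and it is unmounted first:

    umount /RAID
    cd xfsprogs/repair
    ./xfs_repair -n /dev/md0    # -n: no-modify mode, only reports what would be fixed
    ./xfs_repair /dev/md0       # then the real repair

The -n dry run is a cheap way to confirm the new version recognizes
the damage before letting it write anything to the device.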
* Re: Corrupted files
  2014-09-10  3:33 ` Dave Chinner
  2014-09-10  4:14 ` Leslie Rhorer
@ 2014-09-10  4:51 ` Leslie Rhorer
  2014-09-10  5:23 ` Dave Chinner
  1 sibling, 1 reply; 35+ messages in thread
From: Leslie Rhorer @ 2014-09-10  4:51 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs

On 9/9/2014 10:33 PM, Dave Chinner wrote:
> On Tue, Sep 09, 2014 at 10:10:45PM -0500, Leslie Rhorer wrote:
>> On 9/9/2014 8:53 PM, Dave Chinner wrote:
>>> On Tue, Sep 09, 2014 at 08:12:38PM -0500, Leslie Rhorer wrote:
>>>> On 9/9/2014 5:06 PM, Dave Chinner wrote:
>>>>> Firstly, more information is required, namely versions and actual
>>>>> error messages:
>>>>
>>>> Indubitably:
>>>>
>>>> RAID-Server:/# xfs_repair -V
>>>> xfs_repair version 3.1.7
>>>> RAID-Server:/# uname -r
>>>> 3.2.0-4-amd64
>>>
>>> Ok, so a relatively old xfs_repair. That's important - read on....
>>
>> OK, a good reason is a good reason.
>>
>>>> 4.0 GHz FX-8350 eight core processor
>>>>
>>>> RAID-Server:/# cat /proc/meminfo /proc/mounts /proc/partitions
>>>> MemTotal:        8099916 kB
>>> ....
>>>> /dev/md0 /RAID xfs
>>>> rw,relatime,attr2,delaylog,sunit=2048,swidth=12288,noquota 0 0
>>>
>>> FWIW, you don't need sunit=2048,swidth=12288 in the mount options -
>>> they are stored on disk and the mount options are only necessary to
>>> change the on-disk values.
>>
>> They aren't. Those were created automatically, whether at creation
>> time or at mount time, I don't know, but the filesystem was created
>> with
>
> Ah, my mistake. Normally it's only mount options in that code - I
> forgot that we report sunit/swidth unconditionally if it is set in
> the superblock.
>
>>>> I'm not sure what is meant by "write cache status" in this context.
>>>> The machine has been rebooted more than once during recovery and the
>>>> FS has been umounted and xfs_repair run several times.
>>>
>>> Start here and read the next few entries:
>>>
>>> http://xfs.org/index.php/XFS_FAQ#Q:_What_is_the_problem_with_the_write_cache_on_journaled_filesystems.3F
>>
>> I knew that, but I still don't see the relevance in this context.
>> There is no battery backup on the drive controller or the drives,
>> and the drives have all been powered down and back up several times.
>> Anything in any cache right now would be from some operation in the
>> last few minutes, not four days ago.
>
> There is no direct relevance to your situation, but for a lot of
> other common problems it definitely is. That's why we ask people to
> report it with all the other information about their system.
>
>>>> I don't know for what the acronym BBWC stands.
>>>
>>> "battery backed write cache". If you're not using a hardware RAID
>>> controller, it's unlikely you have one.
>>
>> See my previous. I do have one (a 3Ware 9650E, given to me by a
>> friend when his company switched to zfs for their server). It's not
>> on this system. This array is on a HighPoint RocketRAID 2722.
>
> Ok. We have seen over time that those 3ware controllers can do
> strange things in error conditions - we've had reports of entire
> hardware luns dying and being completely unrecoverable after a
> disk was kicked out due to an error. I can't comment on the
> highpoint controller - either not many people use them or they just
> don't report problems if they do. Either way, if you aren't running
> the latest firmware I'd suggest updating it, as these problems were
> typically fixed by newer firmware releases.
>
>>>> [192173.364460]  [<ffffffff810fe45a>] ? vfs_fstatat+0x32/0x60
>>>> [192173.364471]  [<ffffffff810fe590>] ? sys_newstat+0x12/0x2b
>>>> [192173.364483]  [<ffffffff813509f5>] ? page_fault+0x25/0x30
>>>> [192173.364495]  [<ffffffff81355452>] ? system_call_fastpath+0x16/0x1b
>>>> [192173.364503] XFS (md0): Corruption detected. Unmount and run xfs_repair
>>>>
>>>> That last line, by the way, is why I ran umount and xfs_repair.
>>>
>>> Right, that's the correct thing to do, but sometimes there are
>>> issues that repair doesn't handle properly. This *was* one of them,
>>> and it was fixed by commit e1f43b4 ("repair: update extent count
>>> after zapping duplicate blocks") which was added to xfs_repair
>>> v3.1.8.
>>>
>>> IOWs, upgrading xfsprogs to the latest release and re-running
>>> xfs_repair should fix this error.
>>
>> OK. I'll scarf the source and compile. All I need is to git clone
>> git://oss.sgi.com/xfs/xfs and git://oss.sgi.com/xfs/cmds/xfsprogs,
>> right?
>
> Just clone git://oss.sgi.com/xfs/cmds/xfsprogs and check out the
> v3.2.1 tag and build that.
>
>> I've never used git on a package maintained in my distro. Will I
>> have issues when I upgrade to Debian Jessie in a few months, since
>> this is not being managed by apt / dpkg? It looks like Jessie has
>> 3.2.1 of xfsprogs.
>
> If you're using Debian you can build Debian packages directly from
> the git tree via "make deb" (I use it all the time for pushing
> new builds to my test machines), and so when you upgrade to Jessie it
> should just replace your custom-built package correctly...
>
> Cheers,
>
> Dave.

	Thanks a ton, Dave (and everyone else who helped). That seems to
have worked just fine. The three grunged entries are gone and the
system is happily copying over the backups. Now I'll run another rsync
with checksum to make sure everything is good before putting the
backup into production. I'm also going to upgrade the controller BIOS
just in case.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread
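The checksum compare Leslie describes can be done as an rsync dry run,
so nothing on either side is modified. A sketch -- /backup/RAID is a
placeholder for the backup mount point; /RAID is the restored array as
above:

    rsync -avcn /backup/RAID/ /RAID/

The -c flag forces full-content checksums instead of the default
size/mtime comparison, and -n (dry run) lists any differing or missing
files without transferring them.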
* Re: Corrupted files
  2014-09-10  4:51 ` Leslie Rhorer
@ 2014-09-10  5:23 ` Dave Chinner
  2014-09-11  5:47 ` Leslie Rhorer
  0 siblings, 1 reply; 35+ messages in thread
From: Dave Chinner @ 2014-09-10  5:23 UTC (permalink / raw)
To: Leslie Rhorer; +Cc: xfs

On Tue, Sep 09, 2014 at 11:51:42PM -0500, Leslie Rhorer wrote:
> On 9/9/2014 10:33 PM, Dave Chinner wrote:
> >On Tue, Sep 09, 2014 at 10:10:45PM -0500, Leslie Rhorer wrote:
> >>On 9/9/2014 8:53 PM, Dave Chinner wrote:
> >>>On Tue, Sep 09, 2014 at 08:12:38PM -0500, Leslie Rhorer wrote:
> >>>>On 9/9/2014 5:06 PM, Dave Chinner wrote:
> >> I've never used git on a package maintained in my distro. Will I
> >>have issues when I upgrade to Debian Jessie in a few months, since
> >>this is not being managed by apt / dpkg? It looks like Jessie has
> >>3.2.1 of xfsprogs.
> >
> >If you're using Debian you can build Debian packages directly from
> >the git tree via "make deb" (I use it all the time for pushing
> >new builds to my test machines), and so when you upgrade to Jessie it
> >should just replace your custom-built package correctly...
>
> Thanks a ton, Dave (and everyone else who helped). That seems to
> have worked just fine. The three grunged entries are gone and the
> system is happily copying over the backups. Now I'll run another
> rsync with checksum to make sure everything is good before putting
> the backup into production. I'm also going to upgrade the
> controller BIOS just in case.

Good to hear. Hopefully everything will check out. Just yell if you
need more help. ;)

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread
* Re: Corrupted files
  2014-09-10  5:23 ` Dave Chinner
@ 2014-09-11  5:47 ` Leslie Rhorer
  0 siblings, 0 replies; 35+ messages in thread
From: Leslie Rhorer @ 2014-09-11  5:47 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs

On 9/10/2014 12:23 AM, Dave Chinner wrote:
> On Tue, Sep 09, 2014 at 11:51:42PM -0500, Leslie Rhorer wrote:
>> On 9/9/2014 10:33 PM, Dave Chinner wrote:
>>> On Tue, Sep 09, 2014 at 10:10:45PM -0500, Leslie Rhorer wrote:
>>>> On 9/9/2014 8:53 PM, Dave Chinner wrote:
>>>>> On Tue, Sep 09, 2014 at 08:12:38PM -0500, Leslie Rhorer wrote:
>>>>>> On 9/9/2014 5:06 PM, Dave Chinner wrote:
>>>> I've never used git on a package maintained in my distro. Will I
>>>> have issues when I upgrade to Debian Jessie in a few months, since
>>>> this is not being managed by apt / dpkg? It looks like Jessie has
>>>> 3.2.1 of xfsprogs.
>>>
>>> If you're using Debian you can build Debian packages directly from
>>> the git tree via "make deb" (I use it all the time for pushing
>>> new builds to my test machines), and so when you upgrade to Jessie it
>>> should just replace your custom-built package correctly...
>>
>> Thanks a ton, Dave (and everyone else who helped). That seems to
>> have worked just fine. The three grunged entries are gone and the
>> system is happily copying over the backups. Now I'll run another
>> rsync with checksum to make sure everything is good before putting
>> the backup into production. I'm also going to upgrade the
>> controller BIOS just in case.
>
> Good to hear. Hopefully everything will check out. Just yell if you
> need more help. ;)

	Thanks. The rsync compare just finished on the non-volatile areas
of the file system without a single mismatch and no missing files.
That's good enough for me.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread
end of thread, other threads:[~2014-09-12  7:06 UTC | newest]

Thread overview: 35+ messages
2014-09-09 15:21 Corrupted files Leslie Rhorer
2014-09-09 15:50 ` Sean Caron
2014-09-09 16:03 ` Sean Caron
2014-09-09 22:24 ` Eric Sandeen
2014-09-09 22:57 ` Sean Caron
2014-09-10  1:00 ` Roger Willcocks
2014-09-10  1:23 ` Leslie Rhorer
2014-09-10  5:09 ` Eric Sandeen
2014-09-10  0:48 ` Leslie Rhorer
2014-09-10  1:10 ` Roger Willcocks
2014-09-10  1:31 ` Leslie Rhorer
2014-09-10 14:24 ` Emmanuel Florac
2014-09-10 14:49 ` Sean Caron
2014-09-09 16:08 ` Emmanuel Florac
2014-09-09 22:06 ` Dave Chinner
2014-09-10  1:12 ` Leslie Rhorer
2014-09-10  1:25 ` Sean Caron
2014-09-10  1:43 ` Leslie Rhorer
2014-09-10 14:31 ` Emmanuel Florac
2014-09-10 14:52 ` Grozdan
2014-09-10 15:12 ` Emmanuel Florac
2014-09-10 15:32 ` Grozdan
2014-09-10 14:54 ` Sean Caron
2014-09-10 23:18 ` Leslie Rhorer
2014-09-11 13:24 ` Greg Freemyer
2014-09-12  7:06 ` Emmanuel Florac
2014-09-10  1:53 ` Dave Chinner
2014-09-10  3:10 ` Leslie Rhorer
2014-09-10  3:33 ` Dave Chinner
2014-09-10  4:14 ` Leslie Rhorer
2014-09-10  4:22 ` Leslie Rhorer
2014-09-10 14:34 ` Emmanuel Florac
2014-09-10  4:51 ` Leslie Rhorer
2014-09-10  5:23 ` Dave Chinner
2014-09-11  5:47 ` Leslie Rhorer