public inbox for linux-xfs@vger.kernel.org
From: Leslie Rhorer <lrhorer@mygrande.net>
To: Martin Papik <mp6058@gmail.com>
Cc: xfs@oss.sgi.com
Subject: Re: XFS File system in trouble
Date: Mon, 20 Jul 2015 03:35:09 -0500
Message-ID: <55ACB2BD.6050601@mygrande.net>
In-Reply-To: <55ACABD7.8000500@gmail.com>

On 7/20/2015 3:05 AM, Martin Papik wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
>
>
> Since you've already found one HW related fault, would you consider
> booting into memtest for a couple of passes just to be on the safe
> side.

	I did that after confirming the one stick of memory was bad.  Twice.  I 
got over 20,000 errors on the bad stick, and 0 on the good one.  I also 
swapped the locations on the motherboard, and the bad stick still failed 
while the good one passed 100%.

> And did you by any chance look at SMART if applicable and
> possibly running a test on the drives.

	Yes.  SMART found no errors, but think about it.  Every time tar tries 
to create a directory while untarring that file in that location, the 
file system croaks.  Not when reading, and not when writing, except 
when it creates a directory.  When I create the directory manually, the 
process gets past that point and fails later on during a different 
directory create.  The array 
remains intact when reading, and dmesg shows no drive errors.  I've 
re-synced the array, which reads every byte on all 8 drives without a 
single mismatch - several times.  To my knowledge, no read has ever 
failed except after the filesystem goes offline.  I thought reads were 
failing during the CRC checks, but that was a red herring.
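
	For anyone who wants to narrow down which directory create trips 
the failure, here is a rough sketch, not what I actually ran.  It 
assumes a text file with one directory path per line (for example, 
taken from a tar listing), and the function name first_failed_mkdir 
and both of its arguments are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: first_failed_mkdir LISTFILE DESTDIR
# Reads directory paths (one per line) from LISTFILE, creates each one
# under DESTDIR in order, and prints the first path that cannot be created.
first_failed_mkdir() {
    list=$1
    dest=$2
    while IFS= read -r dir; do
        if ! mkdir -p -- "$dest/$dir" 2>/dev/null; then
            echo "$dir"      # report the directory whose creation failed
            return 1
        fi
    done < "$list"
    return 0                 # every directory was created
}
```

	The listing could be produced with something like 
tar -tf ARCHIVE | grep '/$' > dirs.txt, where ARCHIVE and dirs.txt are 
placeholders.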

> Another test I sometimes do
> when I'm unsure about disks is "cat /dev/sda > /dev/null" (i.e. a
> whole disk read test)

echo repair > /sys/block/md0/md/sync_action reads not one drive, but 
every byte on all 8 drives.
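
	A minimal sketch of checking the result of such a pass, assuming 
the array is md0 and the kernel exposes the usual sync_action and 
mismatch_cnt files under /sys/block/md0/md.  The function name 
md_scrub_status and its directory argument are hypothetical, added 
only so the check can be pointed at any directory:

```shell
#!/bin/sh
# Hypothetical helper: md_scrub_status [MD_DIR]
# MD_DIR defaults to /sys/block/md0/md (an assumption about the array name).
# Prints the current sync action and mismatch count, and succeeds only
# when the pass has finished (idle) and no mismatches were counted.
md_scrub_status() {
    md_dir=${1:-/sys/block/md0/md}
    action=$(cat "$md_dir/sync_action") || return 2
    mismatches=$(cat "$md_dir/mismatch_cnt") || return 2
    echo "sync_action=$action mismatch_cnt=$mismatches"
    [ "$action" = "idle" ] && [ "$mismatches" -eq 0 ]
}
```

	Run with no argument once the repair pass has finished; a 
non-zero exit status means the pass is still running or mismatches 
were counted.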

> and see (dmesg) if any errors show up, unless

	'Nary one, and no mismatches.

> you're willing to run badblocks in a read-write nondestructive mode.
> In my experience the read test or badblocks can be run simultaneously
> with smartctl -t long. But as a start I'd look at smartctl --all
> /dev/sd? and see if there are any bad signs. I hope this helps. Good luck
>
>
> On 07/20/2015 10:41 AM, Leslie Rhorer wrote:
>> On 7/19/2015 6:27 PM, Dave Chinner wrote:
>>> On Sat, Jul 18, 2015 at 08:02:50PM -0500, Leslie Rhorer wrote:
>>>>
>>>> I found the problem with md5sum (and probably nfs, as well).
>>>> One of the memory modules in the server was bad.  The problem
>>>> with XFS persists.  Every time tar tried to create the
>>>> directory:
>>>
>>> Now you need to run xfs_repair.
>>
>> I do that every time the array implodes.  It makes no difference.
>> It never mentions cleaning the structure tar says needs cleaning,
>> and the next time I run tar on that file, the filesystem craters.
>>
>> _______________________________________________ xfs mailing list
>> xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1
>
> iQIcBAEBCgAGBQJVrKuzAAoJELsEaSRwbVYrdjoP/3n1W9YtcpdiDoylp6tDYcjF
> vEVz7IWLv2cOky8Lp+0WAZ4Z0WMhcutFzT571H1Vc+jT/UgO25pQHa3yLYTboPuZ
> +tBidVUycs7ZIr9QCZFs2uPQ/7YstamB+F7paCTMKtOJJr5CZLiYX4iyJ9sFmWVY
> UFPAIhyoqD5CFgoaAkwCmk50kNiT0aPM7egizIUVEt14cWuxZxMN0NIJ5b0WJfAk
> qtNQjstVI/xYDgsImm2ZAm19SfOG9ltm2G9zafRr6lR6rRtXjtZX8zEg0l/o9XUw
> OifghjoSup8OCzvX6+4+Soj/3mCKZv4rkBm3exf4YzfQ9eVG6Ktele2rLIs1sl3O
> hUrZUNEl8hYGJeb5gBHFV/TLWDMMwNde/6JiBVy0V8EbDF1lvR4jYpUwThOE0jyL
> ZbzZe4N/B0qvB1OpLDkHrMVm9NPtDkfXdTtM2kRmo5955xtkK09yHF/v64kz7IKc
> 2rM5pOwTR6HWE8RF2j9UujgPjw6nEUuY01TvIMGYzMfkJTI+sVjeDQfwnPG8tzIa
> x4uLa4vTrBD5IaICjAmQiY69qqmt5Vg42G4latZVTYQLelvWQ774mXZfgfT/GtbT
> RKzVwvYowWr/EBhtp7ix/1rWANTFiX0lxOPnRmUFvu8UJnyZhR0/EYbJYy1+jTt7
> O7hZMfAayQBsnVcSK1JC
> =3Ubd
> -----END PGP SIGNATURE-----
>
