From: Martin Papik <mp6058@gmail.com>
To: Leslie Rhorer <lrhorer@mygrande.net>
Cc: xfs@oss.sgi.com
Subject: Re: XFS File system in trouble
Date: Mon, 20 Jul 2015 11:52:38 +0300
Message-ID: <55ACB6D6.2000100@gmail.com>
In-Reply-To: <55ACB2BD.6050601@mygrande.net>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
I just wanted to make sure, since I didn't catch any mention of these
checks. Based on your thoroughness, I assume you also ran memtest after
the RAM replacement. What I'd try next in your situation is to boot a
different kernel version (possibly a different distro), from a DVD or
a USB stick, and see whether the errors are the same. What do you think?
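
To compare the two boots, it might help to save the XFS-related kernel
messages under each kernel and diff them afterwards. A minimal sketch,
assuming a POSIX shell; the function name filter_xfs_lines and the log
file names are made up for illustration, and the filter simply keeps
any line that mentions "xfs", since I don't know the exact wording of
the messages you are seeing:

```shell
# Keep only the lines on stdin that mention XFS (case-insensitive).
# "|| true" makes an empty result count as success rather than an error.
filter_xfs_lines() {
    grep -i 'xfs' || true
}

# Hypothetical usage, run once under each kernel (dmesg may need root):
#   dmesg | filter_xfs_lines > "xfs-$(uname -r).log"
# then compare the two files, for example:
#   diff xfs-KERNEL_A.log xfs-KERNEL_B.log
```

If the same messages show up under an unrelated kernel build, that would
point away from a kernel-specific bug and towards the on-disk state.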
On 07/20/2015 11:35 AM, Leslie Rhorer wrote:
> On 7/20/2015 3:05 AM, Martin Papik wrote:
>
> Since you've already found one HW-related fault, would you consider
> booting into memtest for a couple of passes just to be on the safe
> side?
>
>> I did that after confirming the one stick of memory was bad.
>> Twice. I got over 20,000 errors on the bad stick, and 0 on the
>> good one. I also swapped the locations on the motherboard, and
>> the bad stick still failed while the good one passed 100%.
>
> And did you by any chance look at SMART, if applicable, and possibly
> run a test on the drives?
>
>> Yes. SMART found no errors, but think about it. Every time tar
>> tries to create a directory when untarring that file in that
>> location, the file system croaks when it tries to create a
>> directory. Not when reading and not when writing other than when
>> it creates a directory. When I create the directory manually, the
>> process quits failing at that point and fails later on during a
>> different directory create. The array remains intact when
>> reading, and dmesg shows no drive errors. I've re-synced the
>> array, which reads every byte on all 8 drives without a single
>> mismatch - several times. To my knowledge, no read has ever
>> failed except after the filesystem goes offline. I thought
>> reads were failing during the CRC checks, but that was a red
>> herring.
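
For reference, the result of such an md scrub can be read back from
sysfs. A small sketch, assuming the standard md sysfs files sync_action
and mismatch_cnt; the helper names and the optional sysfs-root argument
are mine, added only so the helpers can be pointed at a test directory:

```shell
# Print the mismatch count md recorded during the last check/repair pass.
# $1: array name (e.g. md0); $2: sysfs root, defaults to /sys/block
md_mismatch_cnt() {
    cat "${2:-/sys/block}/$1/md/mismatch_cnt"
}

# Print the array's current sync action (e.g. idle, check, repair).
md_sync_action() {
    cat "${2:-/sys/block}/$1/md/sync_action"
}

# Hypothetical usage, as root, with md0 as in the thread:
#   echo check > /sys/block/md0/md/sync_action   # scrub, count mismatches only
#   md_sync_action md0     # poll until it prints "idle" again
#   md_mismatch_cnt md0    # 0 means no inconsistencies were found
```

Writing "check" instead of "repair" only counts mismatches and does not
rewrite anything, which is the gentler option when the goal is diagnosis.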
>
> Another test I sometimes do when I'm unsure about disks is "cat
> /dev/sda > /dev/null" (i.e. a whole disk read test)
>
>> echo repair > /sys/block/md0/md/sync_action reads not one drive,
>> but every byte on all 8 drives.
>
> and see (dmesg) if any errors show up, unless
>
>> 'Nary one, and no mismatches.
>
> you're willing to run badblocks in a read-write nondestructive
> mode. In my experience the read test or badblocks can be run
> simultaneously with smartctl -t long. But as a start I'd look at
> smartctl --all /dev/sd? and see if there are any bad signs. I hope
> this helps. Good luck
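
The whole-disk read test above can be wrapped so that a failed read
shows up as a non-zero exit status. A rough sketch, assuming GNU dd
(for the bs=1M spelling); the function name is made up and /dev/sdX is
a placeholder for the real device:

```shell
# Read every byte of a block device (or file) and discard the data.
# Returns dd's exit status: non-zero if the path cannot be opened or a
# read fails. dd's transfer statistics on stderr are suppressed.
read_whole_device() {
    dd if="$1" of=/dev/null bs=1M 2>/dev/null
}

# Hypothetical usage, as root, one drive at a time:
#   read_whole_device /dev/sdX && echo "read completed without errors"
#   dmesg | tail                  # look for I/O errors logged meanwhile
#   smartctl -t long /dev/sdX     # long self-test, can run alongside
#   smartctl --all /dev/sdX       # review attributes and test results
```

This only reads, so it is safe on a live disk, but unlike badblocks in
read-write mode it will not exercise the write path.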
>
>
> On 07/20/2015 10:41 AM, Leslie Rhorer wrote:
>>>> On 7/19/2015 6:27 PM, Dave Chinner wrote:
>>>>> On Sat, Jul 18, 2015 at 08:02:50PM -0500, Leslie Rhorer
>>>>> wrote:
>>>>>>
>>>>>> I found the problem with md5sum (and probably nfs, as
>>>>>> well). One of the memory modules in the server was bad.
>>>>>> The problem with XFS persists. Every time tar tried to
>>>>>> create the directory:
>>>>>
>>>>> Now you need to run xfs_repair.
>>>>
>>>> I do that every time the array implodes. It makes no
>>>> difference. It never mentions cleaning the structure tar
>>>> says needs cleaning, and the next time I run tar on that
>>>> file, the filesystem craters.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAEBCgAGBQJVrLbVAAoJELsEaSRwbVYr/JoQAKGcNBTtswnSJ9SYpBQMc8aO
m2WQaHzLDPkSPLWYeWSGc3clPuf4FdP3A9bDcclCnVV/Ex0WJiCalYfa1Zqpnq5P
BinRp1w/cbfTTazLspFT9ySuoloOqNXTPz0MB4uxRTnIDb3Hcahw0O6HhOuZixW3
ocaEOXqVs1cc4YzPwT4Z9aWBEX3ZutMvxNKM4VWT1m8aoRZ3eJMPUKHN04PDUKyT
4Mwilypg9R6r6iberZ9zVwFy0LerElg9Cb90AGLNpyGCutGbOZH7VsoBUTnAmh2E
dz4uruFU0x8n87MQccXfSvZQIWG16UDxwjQjEiD4EHtRhYYTNVgq2V8ak94u8w99
0p5WG5+dEnVV0Qgjk2DaZy305LP+5oc2D9GkXJgGTFjMPVV3+9Tnq/XDlm2Hgxn8
hq2q0DoPDQVFMzNLxpGCJfuIdAO3o7z/1rjHpeP2Ol6pPw+hT8SQMehTBU4vMlcp
SeZzg485rVtQrWtXVJaRhITAQWSvQxjm9QqLAMdon0oxdKAPZIOtQgr8oEGKgfr7
mknqFPon7sa0c4nAZT7DtTOS+OATbTnYAoUqIuxRf4NCD7dbFUQrccU4/peEE4/H
SPzOfgOiAArOVZwWEc7JvydpcKqaEUzYb2KyzsGJFuJHZodrSTzXmUMg/Muc+iQ5
Ao/NeFe/1flevZ060ZEX
=1/q4
-----END PGP SIGNATURE-----
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs