From: Martin Papik <mp6058@gmail.com>
To: Leslie Rhorer <lrhorer@mygrande.net>
Cc: xfs@oss.sgi.com
Subject: Re: XFS File system in trouble
Date: Mon, 20 Jul 2015 11:52:38 +0300
Message-ID: <55ACB6D6.2000100@gmail.com>
In-Reply-To: <55ACB2BD.6050601@mygrande.net>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
Just wanted to make sure, since I didn't catch any mention of these
checks. Based on your thoroughness, I assume you also ran memtest after
the RAM replacement. What I'd try next in your situation is to boot a
different kernel version (possibly a different distro) and see whether
the errors are the same. I'd try something bootable from a DVD or a
USB stick. What do you think?
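To make the comparison between kernels easier, something along these
lines might help. It is only a minimal sketch: it captures the kernel log
with dmesg and keeps the lines that look filesystem- or array-related, so
the output from each boot can be saved and diffed. The keyword pattern
and the function names are my own assumptions, not anything taken from
your logs, so adjust them to whatever the failing kernel actually prints.

```python
import re
import subprocess

# Assumed keywords (hypothetical, not taken from the actual logs):
# anything mentioning XFS, an I/O error, or an md device.
PATTERN = re.compile(r"xfs|i/o error|md\d+", re.IGNORECASE)


def filter_fs_messages(lines):
    """Return only the lines matching PATTERN, with trailing whitespace removed."""
    return [line.rstrip() for line in lines if PATTERN.search(line)]


def read_kernel_log():
    """Capture the current kernel log via dmesg (may require root on some systems)."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    # Print the filtered log; redirect to a file per kernel and diff the files.
    for line in filter_fs_messages(read_kernel_log()):
        print(line)
```

Run it once under each kernel right after the tar failure, redirect the
output to a file on a different disk, and compare the two files.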
On 07/20/2015 11:35 AM, Leslie Rhorer wrote:
> On 7/20/2015 3:05 AM, Martin Papik wrote:
>
> Since you've already found one HW-related fault, would you consider
> booting into memtest for a couple of passes just to be on the safe
> side?
>
>> I did that after confirming the one stick of memory was bad.
>> Twice. I got over 20,000 errors on the bad stick, and 0 on the
>> good one. I also swapped the locations on the motherboard, and
>> the bad stick still failed while the good one passed 100%.
>
> And did you by any chance look at SMART, if applicable, and possibly
> run a test on the drives?
>
>> Yes. SMART found no errors, but think about it. Every time tar
>> tries to create a directory when untarring that file in that
>> location, the file system croaks when it tries to create a
>> directory. Not when reading and not when writing other than when
>> it creates a directory. When I create the directory manually, the
>> process quits failing at that point and fails later on during a
>> different directory create. The array remains intact when
>> reading, and dmesg shows no drive errors. I've re-synced the
>> array, which reads every byte on all 8 drives without a single
>> mismatch - several times. To my knowledge, no read has ever
>> failed except after the filesystem goes offline. I thought
>> reads were failing during the CRC checks, but that was a red
>> herring.
>
> Another test I sometimes do when I'm unsure about disks is "cat
> /dev/sda > /dev/null" (i.e. a whole disk read test)
>
>> echo repair > /sys/block/md0/md/sync_action reads not one drive,
>> but every byte on all 8 drives.
>
> and see (dmesg) if any errors show up, unless
>
>> 'Nary one, and no mismatches.
>
> you're willing to run badblocks in a read-write nondestructive
> mode. In my experience the read test or badblocks can be run
> simultaneously with smartctl -t long. But as a start I'd look at
> smartctl --all /dev/sd? and see if there are any bad signs. I hope
> this helps. Good luck
>
>
> On 07/20/2015 10:41 AM, Leslie Rhorer wrote:
>>>> On 7/19/2015 6:27 PM, Dave Chinner wrote:
>>>>> On Sat, Jul 18, 2015 at 08:02:50PM -0500, Leslie Rhorer
>>>>> wrote:
>>>>>>
>>>>>> I found the problem with md5sum (and probably nfs, as
>>>>>> well). One of the memory modules in the server was bad.
>>>>>> The problem with XFS persists. Every time tar tried to
>>>>>> create the directory:
>>>>>
>>>>> Now you need to run xfs_repair.
>>>>
>>>> I do that every time the array implodes. It makes no
>>>> difference. It never mentions cleaning the structure tar
>>>> says needs cleaning, and the next time I run tar on that
>>>> file, the filesystem craters.
>>>>
>>>> _______________________________________________ xfs mailing
>>>> list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs
>
>>
>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAEBCgAGBQJVrLbVAAoJELsEaSRwbVYr/JoQAKGcNBTtswnSJ9SYpBQMc8aO
m2WQaHzLDPkSPLWYeWSGc3clPuf4FdP3A9bDcclCnVV/Ex0WJiCalYfa1Zqpnq5P
BinRp1w/cbfTTazLspFT9ySuoloOqNXTPz0MB4uxRTnIDb3Hcahw0O6HhOuZixW3
ocaEOXqVs1cc4YzPwT4Z9aWBEX3ZutMvxNKM4VWT1m8aoRZ3eJMPUKHN04PDUKyT
4Mwilypg9R6r6iberZ9zVwFy0LerElg9Cb90AGLNpyGCutGbOZH7VsoBUTnAmh2E
dz4uruFU0x8n87MQccXfSvZQIWG16UDxwjQjEiD4EHtRhYYTNVgq2V8ak94u8w99
0p5WG5+dEnVV0Qgjk2DaZy305LP+5oc2D9GkXJgGTFjMPVV3+9Tnq/XDlm2Hgxn8
hq2q0DoPDQVFMzNLxpGCJfuIdAO3o7z/1rjHpeP2Ol6pPw+hT8SQMehTBU4vMlcp
SeZzg485rVtQrWtXVJaRhITAQWSvQxjm9QqLAMdon0oxdKAPZIOtQgr8oEGKgfr7
mknqFPon7sa0c4nAZT7DtTOS+OATbTnYAoUqIuxRf4NCD7dbFUQrccU4/peEE4/H
SPzOfgOiAArOVZwWEc7JvydpcKqaEUzYb2KyzsGJFuJHZodrSTzXmUMg/Muc+iQ5
Ao/NeFe/1flevZ060ZEX
=1/q4
-----END PGP SIGNATURE-----