From: Joe Landman <joe.landman@gmail.com>
To: xfs@oss.sgi.com
Subject: Re: XFS recovery resumes...
Date: Sun, 18 Aug 2013 18:57:10 -0400
Message-ID: <52115146.6070507@gmail.com>
In-Reply-To: <240990.4028.1376863911761.JavaMail.root@benjamin.baylink.com>

On 08/18/2013 06:11 PM, Jay Ashworth wrote:
> ----- Original Message -----
>> From: "Joe Landman" <joe.landman@gmail.com>
>
>>> You need at least 497MB RAM to run with prefetching enabled.
>>
>> ^^^^^
>>
>> This is 1/2 GB ram, and you didn't specify the memory options of the
>> xfs_repair ... so I'm going to guess at this point that you ran out of
>> ram. Paging while running xfs_repair is no fun.
>>
>> How much ram do you have in this box? Next question is, is this an ECC
>> memory box?
>
> 512M.  It's a *very* old KT6V based board, and when we tried to expand
> it several years back, it went bat-guano with any more than half a gig.

Ahhh .... ok.  Got it.


>
>> Not sure if you are hitting a bug as much as running into something
>> else like a hardware limit (RAM) or a memory stick issue.
>
> Well, the upstream cause was a 7 year old Antec power supply that
> finally died, about a month ago, slowly.

Ok.  I've had power supplies take down memory in the past.  You might be 
hitting a bad memory cell courtesy of the PS.

>
>> Do you have EDAC (or mcelog) on? Any errors from this?
>
> I don't have mcelog on, and no, the memory isn't registered, but a
> 4-pass run of Memtest+ came up clean, so I'm speculating that the

I meant ECC, not registered (which is just buffered).  ECC does a parity 
computation on some number of bits, and provides you a rough "good/bad" 
binary state of a particular area of memory.  If the parity bits stored 
don't match what is computed on read, then odds are that something is 
wrong.  It's not foolproof, but it's a good mechanism to catch potential 
errors.
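
As a rough illustration of the stored-versus-recomputed check described
above, here is a minimal sketch using a single even-parity bit per word.
This is a simplification: ECC DIMMs typically use a multi-bit code that
can also correct single-bit errors, and the function names here
(parity_bit, store, check) are made up for the example.

```python
def parity_bit(word: int) -> int:
    """Even-parity bit: 1 if the number of set bits in word is odd."""
    return bin(word).count("1") % 2


def store(word: int) -> tuple[int, int]:
    """Simulate a write: keep the data together with its parity bit."""
    return word, parity_bit(word)


def check(word: int, stored_parity: int) -> bool:
    """Simulate a read: recompute parity and compare with the stored bit."""
    return parity_bit(word) == stored_parity


data, p = store(0b10110010)
one_flip = data ^ (1 << 3)   # a single flipped bit
two_flips = data ^ 0b11      # two flipped bits cancel out in the parity

print(check(data, p))        # True: data is intact
print(check(one_flip, p))    # False: the single-bit error is caught
print(check(two_flips, p))   # True: a double flip slips past plain parity
```

The last case is why a parity-style check is "not foolproof": an even
number of flipped bits in the same word leaves the parity unchanged.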

We've had cases where Memtest(*) reported everything fine, yet I was 
able to generate ECC errors in a few minutes by running a memory 
intensive app.  Memtest does do some hardware exercise, but it's not 
usually hitting memory the way apps do.  That difference can be 
significant.  This is in part why the day job stopped using memtest for 
testing a number of years ago.  We now run heavy duty electronic 
structure codes, and pi/e/... computations for burn in.
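
On a box that does have ECC and the EDAC driver loaded, one way to see
whether a burn-in run like that is producing memory errors is to read
the per-controller counters the kernel exposes in sysfs.  A minimal
sketch, assuming the usual /sys/devices/system/edac/mc layout with
ce_count (corrected) and ue_count (uncorrected) files; the function name
edac_error_counts is made up for the example, and on a machine without
EDAC it simply returns an empty dict.

```python
from pathlib import Path


def edac_error_counts(root: str = "/sys/devices/system/edac/mc") -> dict:
    """Return {controller: {"ce_count": n, "ue_count": n}} from EDAC sysfs.

    A counter is None if its file is missing or unreadable.
    """
    counts = {}
    for mc in sorted(Path(root).glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                entry[name] = int((mc / name).read_text().strip())
            except (OSError, ValueError):
                entry[name] = None
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    # Run before and after a memory-heavy job and compare the numbers.
    for controller, entry in edac_error_counts().items():
        print(controller, entry)
```

A corrected-error count that climbs during the run points at marginal
memory even when a pattern tester came up clean.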

> *continuing* problem isn't hardware; I'm pretty sure it was just the
> failing 12V rail on the dying PS.  I just have to clean up after it
> enough to get *one* of these 2 drives cleaned off, then I can make a
> new FS, and play musical files.

Ahhh ...

I was running a Plex server on an old machine for a while.  I had to 
shift over to a beefier box with ECC RAM and more CPUs.  Right now my 
Plex server has 8 CPUs, 24 GB RAM, and about 1TB of disk (old).  Once 
you start doing recoding on the fly (multi-resolution output), you need 
the RAM and processor power.

>
> Or, I may just go grab a 3TB external after all.  :-)

If you do that, and you still hit the error, chances are you will need 
to swap out your MB and CPU/RAM for something newer (not to mention the 
PS).  I'd recommend ECC based systems if at all possible.  XFS can and 
will get very unhappy if bits are flipped on its data structures while 
you are making changes to the file system.

--

Joe

>
> Cheers,
> -- jra
>

