public inbox for linux-xfs@vger.kernel.org
From: Jay Ashworth <jra@baylink.com>
To: xfs@oss.sgi.com
Subject: Re: XFS recovery resumes...
Date: Sun, 18 Aug 2013 19:21:31 -0400 (EDT)
Message-ID: <24872711.4036.1376868091485.JavaMail.root@benjamin.baylink.com>
In-Reply-To: <52115146.6070507@gmail.com>

----- Original Message -----
> From: "Joe Landman" <joe.landman@gmail.com>

> Ok. I've had power supplies take down memory in the past. You might be
> hitting a bad memory cell courtesy of the PS.

Possibly, though see below.

> >> Do you have EDAC (or mcelog) on? Any errors from this?
> >
> > I don't have mcelog on, and no, the memory isn't registered, but a
> > 4-pass run of Memtest+ came up clean, so I'm speculating that the
> 
> Not registered (which is just buffered), but ECC. ECC does a parity
> computation on some number of bits, and provides you a rough "good/bad"
> binary state of a particular area of memory. If the parity bits stored
> don't match what is computed on read, then odds are that something is
> wrong. It's not foolproof, but it's a good mechanism to catch potential
> errors.

Sure.  In my experience, all ECC is registered/buffered, and no non-ECC
is, so I use it as shorthand.  No possible chance this northbridge would
do ECC, no.  :-)

> We've had cases where Memtest(*) reported everything fine, yet I was
> able to generate ECC errors in a few minutes by running a memory
> intensive app. Memtest does do some hardware exercise, but it's not
> usually hitting memory the way apps do. That difference can be
> significant. This is in part why the day job stopped using memtest for
> testing a number of years ago. We now run heavy duty electronic
> structure codes, and pi/e/... computations for burn in.

Fair point.  I did also run the non-+ version of Memtest, which I
understand uses a different algorithm, and a couple other things
I found on the UBCD, so I'm *relatively* confident I don't have a
running RAM problem, though as you say, not 100%.
 
> > *continuing* problem isn't hardware; I'm pretty sure it was just the
> > failing 12V rail on the dying PS. I just have to clean up after it
> > enough to get *one* of these 2 drives cleaned off, then I can make a
> > new FS, and play musical files.
> 
> Ahhh ...
> 
> I was running a Plex server on an old machine for a while. I had to
> shift over to a beefier box with ECC ram and more CPUs. Right now my
> Plex server has 8 cpus, 24 GB RAM, and about 1TB of disk (old). Once
> you start doing recoding on the fly (multi-resolution output), you
> need the ram and processor power.
> 
> >
> > Or, I may just go grab a 3TB external after all. :-)
> 
> If you do that, and you still hit the error, chances are you might
> need to swap out your MB and CPU/RAM to something newer (not to mention the
> PS). I'd recommend ECC based systems if at all possible. Xfs can and
> will get very unhappy if bits are flipped on its data structures while
> you are making changes to the file system.

As it happens, Dave helped me clean up a mess 4 or 5 years ago, where
a *wire opened up* on the PATA cable, and all my data structures had
a missing bit.  Ghod was that a mess.

We did end up getting the drive.  So assuming I can reliably read the
big drive (I have a 3T, a 2T, and a 1T all with different problems),
I'm going to move all the files from it to the new 3T I just bought,
and then play musical files down the chain one at a time.

Thank ghod the new season hasn't started yet.  ;-)

Thanks for the help, Joe. 

Oh, and the script that Stan was so worried about?  It's all 
rm and mv commands.  5859 of them.

Cheers,
-- jra
-- 
Jay R. Ashworth                  Baylink                       jra@baylink.com
Designer                     The Things I Think                       RFC 2100
Ashworth & Associates     http://baylink.pitas.com         2000 Land Rover DII
St Petersburg FL USA               #natog                      +1 727 647 1274

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
