From: Jay Ashworth <jra@baylink.com>
To: xfs@oss.sgi.com
Subject: Re: XFS recovery resumes...
Date: Sun, 18 Aug 2013 19:21:31 -0400 (EDT)
Message-ID: <24872711.4036.1376868091485.JavaMail.root@benjamin.baylink.com>
In-Reply-To: <52115146.6070507@gmail.com>

----- Original Message -----
> From: "Joe Landman" <joe.landman@gmail.com>

> Ok. I've had power supplies take down memory in the past. You might be
> hitting a bad memory cell courtesy of the PS.

Possibly, though see below.

> >> Do you have EDAC (or mcelog) on? Any errors from this?
> >
> > I don't have mcelog on, and no, the memory isn't registered, but a
> > 4-pass run of Memtest+ came up clean, so I'm speculating that the
> 
> Not registered (which is just buffered), but ECC. ECC does a parity
> computation on some number of bits, and provides you a rough "good/bad"
> binary state of a particular area of memory. If the parity bits stored
> don't match what is computed on read, then odds are that something is
> wrong. It's not foolproof, but it's a good mechanism to catch potential
> errors.

Sure.  In my experience, all ECC is registered/buffered, and no non-ECC
is, so I use it as shorthand.  No possible chance this northbridge would
do ECC, no.  :-)
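
For anyone following along, the parity idea Joe describes can be sketched
roughly as below. This is an illustration only, assuming a single even-parity
bit per stored word; real ECC DIMMs typically use a SECDED Hamming-style code
over 64-bit words, which can also correct single-bit errors. The function
names (parity_bit, store, check) are made up for the example.

```python
def parity_bit(word: int) -> int:
    """Return 1 if the word has an odd number of set bits, else 0."""
    return bin(word).count("1") % 2


def store(word: int) -> tuple[int, int]:
    """Simulate a write: keep the word together with its parity bit."""
    return word, parity_bit(word)


def check(word: int, stored_parity: int) -> bool:
    """Simulate a read: recompute parity and compare with the stored bit."""
    return parity_bit(word) == stored_parity


data, p = store(0b10110010)

flipped = data ^ (1 << 3)  # one bit flipped in "memory"
double = data ^ 0b11       # two bits flipped in "memory"

print(check(data, p))      # True: nothing changed
print(check(flipped, p))   # False: single-bit flip is caught
print(check(double, p))    # True: double flip slips through
```

The last case is the "not foolproof" part: an even number of flipped bits
leaves plain parity unchanged, which is one reason real ECC uses a stronger
code than a single parity bit.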

> We've had cases where Memtest(*) reported everything fine, yet I was
> able to generate ECC errors in a few minutes by running a memory
> intensive app. Memtest does do some hardware exercise, but it's not
> usually hitting memory the way apps do. That difference can be
> significant. This is in part why the day job stopped using memtest for
> testing a number of years ago. We now run heavy duty electronic
> structure codes, and pi/e/... computations for burn in.

Fair point.  I did also run the non-+ version of Memtest, which I
understand uses a different algorithm, and a couple other things
I found on the UBCD, so I'm *relatively* confident I don't have a
running RAM problem, though as you say, not 100%.
 
> > *continuing* problem isn't hardware; I'm pretty sure it was just the
> > failing 12V rail on the dying PS. I just have to clean up after it
> > enough to get *one* of these 2 drives cleaned off, then I can make a
> > new FS, and play musical files.
> 
> Ahhh ...
> 
> I was running a Plex server on an old machine for a while. I had to
> shift over to a beefier box with ECC ram and more CPUs. Right now my
> Plex server has 8 cpus, 24 GB RAM, and about 1TB of disk (old). Once
> you start doing recoding on the fly (multi-resolution output), you
> need the ram and processor power.
> 
> >
> > Or, I may just go grab a 3TB external after all. :-)
> 
> If you do that, and you still hit the error, chances are you might
> need to swap out your MB and CPU/RAM to something newer (not to mention the
> PS). I'd recommend ECC based systems if at all possible. Xfs can and
> will get very unhappy if bits are flipped on its data structures while
> you are making changes to the file system.

As it happens, Dave helped me clean up a mess 4 or 5 years ago, where
a *wire opened up* on the PATA cable, and all my data structures had
a missing bit.  Ghod was that a mess.

We did end up getting the drive.  So assuming I can reliably read the
big drive (I have a 3T, a 2T, and a 1T all with different problems),
I'm going to move all the files from it to the new 3T I just bought,
and then play musical files down the chain one at a time.

Thank ghod the new season hasn't started yet.  ;-)

Thanks for the help, Joe. 

Oh, and the script that Stan was so worried about?  It's all 
rm and mv commands.  5859 of them.

Cheers,
-- jra
-- 
Jay R. Ashworth                  Baylink                       jra@baylink.com
Designer                     The Things I Think                       RFC 2100
Ashworth & Associates     http://baylink.pitas.com         2000 Land Rover DII
St Petersburg FL USA               #natog                      +1 727 647 1274

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
