From: Jay Ashworth <jra@baylink.com>
To: xfs@oss.sgi.com
Subject: Re: XFS recovery resumes...
Date: Sat, 24 Aug 2013 19:43:23 -0400 (EDT)
Message-ID: <20493414.4932.1377387802955.JavaMail.root@benjamin.baylink.com>
In-Reply-To: <5211BF74.9060605@hardwarefreak.com>

----- Original Message -----
> From: "Stan Hoeppner" <stan@hardwarefreak.com>

> Joe appears to have hit the nail on the head WRT this being a hardware
> problem. This error confirms it. It would appear that when the Antec
> PSU went South it damaged a motherboard device, possibly a VRM, probably
> a cap or two, or more. Maybe damaged a DRAM cell or few that work fine
> with memtest86+ but not with the access pattern generated by your XFS
> workload.

Well, it appears you may be right. 

I'd got all the data off that 3T with no read failures, and then remade
the filesystem.

I had to use -f because it saw the old one, but I don't know if that's
pertinent here or not.

Anyroad, I made the new filesystem, with whatever mkfs.xfs's defaults are
for a 3T filesystem in 3.1.11, and then started rsyncing the 2TB drive onto
it, so I could fix that one.

Got 88GB in, and did the same thing:

===========================================
Aug 22 13:34:13 duckling kernel: [67215.008867] XFS (sda1): Corruption detected. Unmount and run xfs_repair
Aug 22 13:34:13 duckling kernel: [67215.008899] XFS (sda1): Internal error xfs_trans_cancel at line 1467 of file /home/abuild/rpmbuild/BUILD/kernel-default-3.4.47/linux-3.4/fs/xfs/xfs_trans.c.  Caller 0xe3d9349d
Aug 22 13:34:13 duckling kernel: [67215.008903]
Aug 22 13:34:13 duckling kernel: [67215.008910] Pid: 4122, comm: rsync Not tainted 3.4.47-2.38-default #1
Aug 22 13:34:13 duckling kernel: [67215.008914] Call Trace:
Aug 22 13:34:13 duckling kernel: [67215.008946]  [<c0205349>] try_stack_unwind+0x199/0x1b0
Aug 22 13:34:13 duckling kernel: [67215.008959]  [<c02041c7>] dump_trace+0x47/0xf0
Aug 22 13:34:13 duckling kernel: [67215.008968]  [<c02053ab>] show_trace_log_lvl+0x4b/0x60
Aug 22 13:34:13 duckling kernel: [67215.008975]  [<c02053d8>] show_trace+0x18/0x20
Aug 22 13:34:13 duckling kernel: [67215.008986]  [<c06825ba>] dump_stack+0x6d/0x72
Aug 22 13:34:13 duckling kernel: [67215.009137]  [<e3dd2d47>] xfs_trans_cancel+0xe7/0x110 [xfs]
Aug 22 13:34:13 duckling kernel: [67215.009426]  [<e3d9349d>] xfs_create+0x22d/0x570 [xfs]
Aug 22 13:34:13 duckling kernel: [67215.009551]  [<e3d8aafa>] xfs_vn_mknod+0x8a/0x170 [xfs]
Aug 22 13:34:13 duckling kernel: [67215.009624]  [<c032ce03>] vfs_create+0xa3/0x130
Aug 22 13:34:13 duckling kernel: [67215.009634]  [<c032f215>] do_last+0x6b5/0x7e0
Aug 22 13:34:13 duckling kernel: [67215.009644]  [<c032f42a>] path_openat+0xaa/0x360
Aug 22 13:34:13 duckling kernel: [67215.009652]  [<c032f7ce>] do_filp_open+0x2e/0x80
Aug 22 13:34:13 duckling kernel: [67215.009664]  [<c032133e>] do_sys_open+0xee/0x1d0
Aug 22 13:34:13 duckling kernel: [67215.009673]  [<c0321450>] sys_open+0x30/0x40
Aug 22 13:34:13 duckling kernel: [67215.009687]  [<c069331c>] sysenter_do_call+0x12/0x28
Aug 22 13:34:13 duckling kernel: [67215.009719]  [<b76bb430>] 0xb76bb42f
Aug 22 13:34:13 duckling kernel: [67215.009726] XFS (sda1): xfs_do_force_shutdown(0x8) called from line 1468 of file /home/abuild/rpmbuild/BUILD/kernel-default-3.4.47/linux-3.4/fs/xfs/xfs_trans.c.  Return address = 0xe3dd2d5f
Aug 22 13:34:13 duckling kernel: [67215.034952] XFS (sda1): Corruption of in-memory data detected.  Shutting down filesystem
Aug 22 13:34:13 duckling kernel: [67215.034966] XFS (sda1): Please umount the filesystem and rectify the problem(s)
===========================================

Followed by the obligatory:

Aug 22 13:35:37 duckling kernel: [67299.040080] XFS (sda1): xfs_log_force: error 5 returned.

a lot.
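
For anyone wanting to put a number on "a lot", here is a minimal sketch that tallies XFS kernel messages per device from syslog-style lines. It assumes the line layout shown in the excerpt above (timestamp, host, "kernel:", bracketed uptime, "XFS (dev):"); the function name tally_xfs_messages is hypothetical and not part of any XFS tooling. For context, errno 5 on Linux is EIO, which is what one would expect log forces to return once the filesystem has been shut down.

```python
import re
import sys
from collections import Counter

# Matches syslog-style kernel lines such as:
#   Aug 22 13:35:37 duckling kernel: [67299.040080] XFS (sda1): xfs_log_force: error 5 returned.
# Assumption: the log uses exactly this layout; other syslog formats need a different pattern.
XFS_LINE = re.compile(
    r"kernel: \[\s*(?P<uptime>\d+\.\d+)\] XFS \((?P<dev>[^)]+)\): (?P<msg>.*)$"
)


def tally_xfs_messages(lines):
    """Count XFS kernel messages, keyed by (device, message text)."""
    counts = Counter()
    for line in lines:
        match = XFS_LINE.search(line.rstrip("\n"))
        if match:
            counts[(match.group("dev"), match.group("msg").strip())] += 1
    return counts


if __name__ == "__main__":
    # Reads log lines from standard input and prints the most frequent messages first.
    for (dev, msg), n in tally_xfs_messages(sys.stdin).most_common():
        print(f"{n:6d}  {dev}  {msg}")
```

Feeding the system log to the script on standard input gives one line per distinct message, most frequent first, which makes it easy to see when the first corruption report appeared relative to the flood of xfs_log_force errors.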

> I'd first try manually clocking the DIMMs down a bit, from 400 to 333,
> or 333 to 266, whichever is called for. IIRC that VIA Northbridge has
> decoupled CPU and DRAM buses so you should be able to clock the DRAM
> down without affecting CPU frequency. If the problem persists, swap the
> DIMMs if you have some on hand or can get them really cheap like $10
> for a pair. 

I'll try swapping it; this mobo has always gotten whacky if we went over 512M,
which is why we haven't. 

I don't know if I can manually reclock the ram, though I might can turn the 
waitstates up.

> If that doesn't fix it, this may be a viable inexpensive
> solution:
> 
> http://www.newegg.com/Product/Product.aspx?Item=N82E16813186215
> http://www.newegg.com/Product/Product.aspx?Item=N82E16819103888
> http://www.newegg.com/Product/Product.aspx?Item=N82E16820145252
> 
> $109 to replace your central electronics complex. This is the least
> expensive quality set of parts with good feature set I could come up
> with at Newegg, to take the sting out of dropping cash on a forced
> upgrade. $15 more for the Foxconn AM3 board w/HDMI if you have a newer
> TV or AV receiver.

Well, I can live without HDMI, but my present MS-7021 mobo has 5 PCI
slots, and I'm using all of them: 2 PVR-150s, a PVR-500, and a SiI
4-port raid (which will talk to 2 and 3TB drives; the motherboard SATA
won't even see them).

I forget what's in 5, but I think it was the only VGA card I had with
S-Video out.

So, while that's a damn nice price point, it will require me to buy
a bunch of Ethernet tuners as well.  <sigh>

I'll try the RAM.  It's really odd, though, that the badblocks workload 
and both memtests couldn't find a problem, if it is the memory plane...

Cheers,
-- jra
-- 
Jay R. Ashworth                  Baylink                       jra@baylink.com
Designer                     The Things I Think                       RFC 2100
Ashworth & Associates     http://baylink.pitas.com         2000 Land Rover DII
St Petersburg FL USA               #natog                      +1 727 647 1274

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
