From: Eric Sandeen <sandeen@sandeen.net>
To: Stan Hoeppner <stan@hardwarefreak.com>
Cc: xfs@oss.sgi.com
Subject: Re: Stalled xfs_repair on 100TB filesystem
Date: Tue, 02 Mar 2010 18:44:29 -0600
Message-ID: <4B8DB0ED.5040109@sandeen.net>
In-Reply-To: <4B8DAECA.50701@hardwarefreak.com>

Stan Hoeppner wrote:
> Jason Vagalatos put forth on 3/2/2010 11:22 AM:
>> Hello,
>> On Friday 2/26 I started an xfs_repair on a 100TB filesystem:
>>
>> #> nohup xfs_repair -v -l /dev/logfs-sessions/logdev /dev/logfs-sessions/sessions > /root/xfs_repair.out.logfs1.sjc.02262010 &
>>
>> I've been monitoring the process with 'top' and tailing the output file from the redirect above. I believe the repair has "stalled". While the process was making progress, 'top' showed almost all physical memory consumed and 12.6G of virtual memory consumed by xfs_repair. It made it all the way to Phase 6 and has been sitting at agno = 14 for almost 48 hours. xfs_repair's memory consumption has stopped growing, but the process is still "running" and consuming 100% CPU:
>
> Here's how another user solved this xfs_repair "hanging" problem. I say
> "hang" because "stall" didn't return the right Google results.
>
> http://marc.info/?l=linux-xfs&m=120600321509730&w=2
>
> Excerpt:
>
> "In between I created a test filesystem 360GB with 120million inodes on it.
> xfs_repair without options is unable to complete. If I run xfs_repair -o
> bhash=8192 the repair process terminates normally (the filesystem is
> actually ok)."
>
> Unfortunately it appears you'll have to start the repair over again.
>
FWIW, Jason - which xfsprogs version are you running? This patch went in a while back:
> [PATCH] libxfs: increase hash chain depth when we run out of slots
> A couple people reported xfs_repair hangs after
> "Traversing filesystem ..." in xfs_repair. This happens
> when all slots in the cache are full and referenced, and the
> loop in cache_node_get() which tries to shake unused entries
> fails to find any - it just keeps upping the priority and goes
> forever.
>
> This can be worked around by restarting xfs_repair with
> -P and/or "-o bhash=<largersize>" for older xfs_repair.
>
> I started down the path of increasing the number of hash buckets
> on the fly, but Barry suggested simply increasing the max allowed
> depth which is much simpler (thanks!)
>
> Resizing the hash lengths does mean that cache_report ends up with
> most things in the "greater-than" category:
>
> ...
> Hash buckets with 23 entries 3 ( 3%)
> Hash buckets with 24 entries 3 ( 3%)
> Hash buckets with >24 entries 50 ( 85%)
>
> but I think I'll save that fix for another patch unless there's
> real concern right now.
>
> I tested this on the metadump image provided by Tomek.
>
> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
> Reported-by: Tomek Kruszona <bloodyscarion@gmail.com>
> Reported-by: Riku Paananen <riku.paananen@helsinki.fi>
> ---
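
If your xfsprogs turns out to predate that patch, the workaround from the patch description could be put together roughly as below. This is only a sketch: the device paths are copied from your original command, and bhash=8192 is simply the value from the excerpt Stan quoted, not something tuned for a 100TB filesystem. It prints the command for review rather than running it.

```shell
# Sketch only. Device paths come from the original command in this thread;
# BHASH=8192 is the value from the quoted excerpt, an assumption, not a tuned size.
LOGDEV=/dev/logfs-sessions/logdev
DATADEV=/dev/logfs-sessions/sessions
BHASH=8192

# Report the installed xfsprogs version, if xfs_repair is on the PATH.
if command -v xfs_repair >/dev/null 2>&1; then
    xfs_repair -V
fi

# -P disables prefetching; -o bhash=N overrides the buffer cache hash size.
REPAIR_CMD="xfs_repair -v -P -o bhash=$BHASH -l $LOGDEV $DATADEV"

# Print the command instead of executing it, so it can be checked first.
echo "$REPAIR_CMD"
```

Once the printed command looks right, it can be run under nohup with output redirected the same way as before.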
-Eric