From: Eric Sandeen <sandeen@sandeen.net>
To: Stan Hoeppner <stan@hardwarefreak.com>
Cc: xfs@oss.sgi.com
Subject: Re: Stalled xfs_repair on 100TB filesystem
Date: Tue, 02 Mar 2010 18:44:29 -0600 [thread overview]
Message-ID: <4B8DB0ED.5040109@sandeen.net> (raw)
In-Reply-To: <4B8DAECA.50701@hardwarefreak.com>
Stan Hoeppner wrote:
> Jason Vagalatos put forth on 3/2/2010 11:22 AM:
>> Hello,
>> On Friday 2/26 I started an xfs_repair on a 100TB filesystem:
>>
>> #> nohup xfs_repair -v -l /dev/logfs-sessions/logdev /dev/logfs-sessions/sessions > /root/xfs_repair.out.logfs1.sjc.02262010 &
>>
>> I've been monitoring the process with 'top' and tailing the output file from the redirect above. I believe the repair has "stalled". When the process was running 'top' showed almost all physical memory consumed and 12.6G of virt memory consumed by xfs_repair. It made it all the way to Phase 6 and has been sitting at agno = 14 for almost 48 hours. The memory consumption of xfs_repair has ceased but the process is still "running" and consuming 100% CPU:
>
> Here's how another user solved this xfs_repair "hanging" problem. I say
> "hang" because "stall" didn't return the right Google results.
>
> http://marc.info/?l=linux-xfs&m=120600321509730&w=2
>
> Excerpt:
>
> "In between I created a test filesystem, 360GB with 120 million inodes on it.
> xfs_repair without options is unable to complete. If I run xfs_repair -o
> bhash=8192 the repair process terminates normally (the filesystem is
> actually ok)."
>
> Unfortunately it appears you'll have to start the repair over again.
>
FWIW, Jason - which xfsprogs version are you running? This patch went in a while back:
> [PATCH] libxfs: increase hash chain depth when we run out of slots
> A couple people reported xfs_repair hangs after
> "Traversing filesystem ..." in xfs_repair. This happens
> when all slots in the cache are full and referenced, and the
> loop in cache_node_get() which tries to shake unused entries
> fails to find any - it just keeps upping the priority and goes
> forever.
>
> This can be worked around by restarting xfs_repair with
> -P and/or "-o bhash=<largersize>" for older xfs_repair.
>
> I started down the path of increasing the number of hash buckets
> on the fly, but Barry suggested simply increasing the max allowed
> depth, which is much simpler (thanks!).
>
> Resizing the hash lengths does mean that cache_report ends up with
> most things in the "greater-than" category:
>
> ...
> Hash buckets with 23 entries 3 ( 3%)
> Hash buckets with 24 entries 3 ( 3%)
> Hash buckets with >24 entries 50 ( 85%)
>
> but I think I'll save that fix for another patch unless there's
> real concern right now.
>
> I tested this on the metadump image provided by Tomek.
>
> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
> Reported-by: Tomek Kruszona <bloodyscarion@gmail.com>
> Reported-by: Riku Paananen <riku.paananen@helsinki.fi>
> ---
-Eric
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
Thread overview: 6+ messages (newest: 2010-03-03 0:43 UTC)
2010-03-02 17:22 Stalled xfs_repair on 100TB filesystem Jason Vagalatos
2010-03-03 0:25 ` Dave Chinner
2010-03-03 0:35 ` Stan Hoeppner
2010-03-03 0:44 ` Eric Sandeen [this message]
2010-03-03 1:15 ` Jason Vagalatos
2010-03-03 2:08 ` Eric Sandeen