public inbox for linux-xfs@vger.kernel.org
* [PATCH] libxfs: increase hash chain depth when we run out of slots
@ 2009-09-17 16:06 Eric Sandeen
  2009-09-17 18:09 ` Christoph Hellwig
From: Eric Sandeen @ 2009-09-17 16:06 UTC (permalink / raw)
  To: xfs-oss; +Cc: Tomek Kruszona, Riku Paananen

A couple of people reported that xfs_repair hangs after printing
"Traversing filesystem ...".  This happens when all slots in the
cache are full and referenced, and the loop in cache_node_get()
which tries to shake unused entries fails to find any - it just
keeps raising the priority and loops forever.

This can be worked around by restarting xfs_repair with -P, and/or
with "-o bhash=<largersize>" on older xfs_repair versions.

I started down the path of increasing the number of hash buckets
on the fly, but Barry suggested simply increasing the maximum allowed
chain depth, which is much simpler (thanks!).

Raising the maximum chain depth does mean that cache_report ends up
with most buckets in the "greater-than" category:

...
Hash buckets with  23 entries      3 (  3%)
Hash buckets with  24 entries      3 (  3%)
Hash buckets with >24 entries     50 ( 85%)

but I think I'll save that fix for another patch unless there's
real concern right now.

I tested this on the metadump image provided by Tomek.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reported-by: Tomek Kruszona <bloodyscarion@gmail.com>
Reported-by: Riku Paananen <riku.paananen@helsinki.fi>
---

diff --git a/libxfs/cache.c b/libxfs/cache.c
index 48f91d7..56b24e7 100644
--- a/libxfs/cache.c
+++ b/libxfs/cache.c
@@ -83,6 +83,18 @@ cache_init(
 }
 
 void
+cache_expand(
+	struct cache *		cache)
+{
+	pthread_mutex_lock(&cache->c_mutex);
+#ifdef CACHE_DEBUG
+	fprintf(stderr, "doubling cache size to %d\n", 2 * cache->c_maxcount);
+#endif
+	cache->c_maxcount *= 2;
+	pthread_mutex_unlock(&cache->c_mutex);
+}
+
+void
 cache_walk(
 	struct cache *		cache,
 	cache_walk_t		visit)
@@ -344,6 +356,15 @@ cache_node_get(
 		if (node)
 			break;
 		priority = cache_shake(cache, priority, 0);
+		/*
+		 * We start at priority 0; if we free CACHE_SHAKE_COUNT entries
+		 * we get back the same priority, otherwise priority+1.  Past
+		 * CACHE_MAX_PRIORITY every slot is full and referenced; grow.
+		 */
+		if (priority > CACHE_MAX_PRIORITY) {
+			priority = 0;
+			cache_expand(cache);
+		}
 	}
 
 	node->cn_hashidx = hashidx;

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: [PATCH] libxfs: increase hash chain depth when we run out of slots
  2009-09-17 16:06 [PATCH] libxfs: increase hash chain depth when we run out of slots Eric Sandeen
@ 2009-09-17 18:09 ` Christoph Hellwig
  2009-09-17 19:02   ` Eric Sandeen
From: Christoph Hellwig @ 2009-09-17 18:09 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Tomek Kruszona, Riku Paananen, xfs-oss

On Thu, Sep 17, 2009 at 11:06:16AM -0500, Eric Sandeen wrote:
> I tested this on the metadump image provided by Tomek.

How large is that image?  I really think we need to start collecting
these images for regression testing.


The patch looks good to me,


Reviewed-by: Christoph Hellwig <hch@lst.de>



* Re: [PATCH] libxfs: increase hash chain depth when we run out of slots
  2009-09-17 18:09 ` Christoph Hellwig
@ 2009-09-17 19:02   ` Eric Sandeen
From: Eric Sandeen @ 2009-09-17 19:02 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Tomek Kruszona, Eric Sandeen, Riku Paananen, xfs-oss

Christoph Hellwig wrote:
> On Thu, Sep 17, 2009 at 11:06:16AM -0500, Eric Sandeen wrote:
>> I tested this on the metadump image provided by Tomek.
> 
> How large is that image?  I really think we need to start collecting
> these images for regression testing.

zipped metadump is 170M; unzipped 1.1G.

Crafting a special test fs somehow might be better; maybe with an
artificially low bhash size or something ... yeah, I know.  I'm not
sure how to manage the regression testing; working backwards to a
minimal testcase on these images would be extremely time-consuming,
I'm afraid, if it's possible at all.

> The patch looks good to me,

thanks for the review

-Eric


