public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Tweak tracing allocation sizes
@ 2008-09-02  5:48 Lachlan McIlroy
  2008-09-02  5:56 ` Dave Chinner
  0 siblings, 1 reply; 9+ messages in thread
From: Lachlan McIlroy @ 2008-09-02  5:48 UTC (permalink / raw)
  To: xfs-dev, xfs-oss

The size of a single ktrace entry is 16 pointers so 128 bytes.  For the case
of XFS_RW_KTRACE_SIZE which is 128 entries this equates to 16KB and on a system
with 4KB pages that is under memory pressure this can stall that process for a
significant time while it hunts for 4 free pages.  Cutting this value back to
32 means it will only need one page.

Initialize the ktrace system with a zone size of 32 so calls to ktrace_alloc()
that want 32 trace entries (ie 1 page) will go through the ktrace entry zone.
Bump INODE_TRACE_SIZE from 16 to 32 since if we are going to allocate half a
page we might as well give it a full page and have it allocate from the zone
too.

More can be done here but these changes reduce the liklihood of hitting
deadlocks due to memory pressure.

--- a/fs/xfs/linux-2.6/xfs_lrw.h	2008-09-02 15:28:27.000000000 +1000
+++ b/fs/xfs/linux-2.6/xfs_lrw.h	2008-08-22 14:50:55.000000000 +1000
@@ -28,7 +28,7 @@ struct xfs_iomap;
 /*
  * Defines for the trace mechanisms in xfs_lrw.c.
  */
-#define	XFS_RW_KTRACE_SIZE	128
+#define	XFS_RW_KTRACE_SIZE	32
 
 #define	XFS_READ_ENTER		1
 #define	XFS_WRITE_ENTER		2

--- a/fs/xfs/linux-2.6/xfs_super.c	2008-09-02 15:28:27.000000000 +1000
+++ b/fs/xfs/linux-2.6/xfs_super.c	2008-08-22 14:52:58.000000000 +1000
@@ -2151,7 +2151,7 @@ init_xfs_fs(void)
 
 	printk(message);
 
-	ktrace_init(64);
+	ktrace_init(32);
 	vn_init();
 	xfs_dir_startup();
 
--- a/fs/xfs/linux-2.6/xfs_vnode.h	2008-09-02 15:28:27.000000000 +1000
+++ b/fs/xfs/linux-2.6/xfs_vnode.h	2008-08-22 14:52:14.000000000 +1000
@@ -126,7 +126,7 @@ static inline void vn_atime_to_time_t(st
  */
 #if defined(XFS_INODE_TRACE)
 
-#define	INODE_TRACE_SIZE	16		/* number of trace entries */
+#define	INODE_TRACE_SIZE	32		/* number of trace entries */
 #define	INODE_KTRACE_ENTRY	1
 #define	INODE_KTRACE_EXIT	2
 #define	INODE_KTRACE_HOLD	3

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Tweak tracing allocation sizes
  2008-09-02  5:48 [PATCH] Tweak tracing allocation sizes Lachlan McIlroy
@ 2008-09-02  5:56 ` Dave Chinner
  2008-09-02  6:12   ` Lachlan McIlroy
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Chinner @ 2008-09-02  5:56 UTC (permalink / raw)
  To: Lachlan McIlroy; +Cc: xfs-dev, xfs-oss

On Tue, Sep 02, 2008 at 03:48:46PM +1000, Lachlan McIlroy wrote:
> The size of a single ktrace entry is 16 pointers so 128 bytes.  For the case
> of XFS_RW_KTRACE_SIZE which is 128 entries this equates to 16KB and on a system
> with 4KB pages that is under memory pressure this can stall that process for a
> significant time while it hunts for 4 free pages.  Cutting this value back to
> 32 means it will only need one page.

That will effectively render that type of tracing useless - 32
entries is not enough history to capture enough
read/write/map/invalidate trace events to be meaningful. In the past
I've often had to increase this to 256 or 512 entries to be able to
capture the events necessary to debug problems...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Tweak tracing allocation sizes
  2008-09-02  5:56 ` Dave Chinner
@ 2008-09-02  6:12   ` Lachlan McIlroy
  2008-09-02  6:27     ` Dave Chinner
  2008-09-02  6:50     ` Andi Kleen
  0 siblings, 2 replies; 9+ messages in thread
From: Lachlan McIlroy @ 2008-09-02  6:12 UTC (permalink / raw)
  To: Lachlan McIlroy, xfs-dev, xfs-oss

Dave Chinner wrote:
> On Tue, Sep 02, 2008 at 03:48:46PM +1000, Lachlan McIlroy wrote:
>> The size of a single ktrace entry is 16 pointers so 128 bytes.  For the case
>> of XFS_RW_KTRACE_SIZE which is 128 entries this equates to 16KB and on a system
>> with 4KB pages that is under memory pressure this can stall that process for a
>> significant time while it hunts for 4 free pages.  Cutting this value back to
>> 32 means it will only need one page.
> 
> That will effectively render that type of tracing useless - 32
> entries is not enough history to capture enough
> read/write/map/invalidate trace events to be meaningful. In the past
> I've often had to increase this to 256 or 512 entries to be able to
> capture the events necessary to debug problems...

A system that constantly locks up and/or stalls is useless too.  Allocating
4 or more pages for every inode just taxes the system.  Can you offer an
alternative - maybe a very large global trace buffer that is allocated at mount
time and shared by all inodes?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Tweak tracing allocation sizes
  2008-09-02  6:12   ` Lachlan McIlroy
@ 2008-09-02  6:27     ` Dave Chinner
  2008-09-02 21:54       ` Christoph Hellwig
  2008-09-02  6:50     ` Andi Kleen
  1 sibling, 1 reply; 9+ messages in thread
From: Dave Chinner @ 2008-09-02  6:27 UTC (permalink / raw)
  To: Lachlan McIlroy; +Cc: xfs-dev, xfs-oss

On Tue, Sep 02, 2008 at 04:12:14PM +1000, Lachlan McIlroy wrote:
> Dave Chinner wrote:
>> On Tue, Sep 02, 2008 at 03:48:46PM +1000, Lachlan McIlroy wrote:
>>> The size of a single ktrace entry is 16 pointers so 128 bytes.  For the case
>>> of XFS_RW_KTRACE_SIZE which is 128 entries this equates to 16KB and on a system
>>> with 4KB pages that is under memory pressure this can stall that process for a
>>> significant time while it hunts for 4 free pages.  Cutting this value back to
>>> 32 means it will only need one page.
>>
>> That will effectively render that type of tracing useless - 32
>> entries is not enough history to capture enough
>> read/write/map/invalidate trace events to be meaningful. In the past
>> I've often had to increase this to 256 or 512 entries to be able to
>> capture the events necessary to debug problems...
>
> A system that constantly locks up and/or stalls is useless too.  Allocating
> 4 or more pages for every inode just taxes the system.  Can you offer an
> alternative

Buy more memory for your test machine? (I can't beleive I'm saying that
to an SGI guy ;)

How about using the SLUB allocator rather than SLAB and tweaking
it's settings to do order 2 allocations for every slab so most
allocations are doing order 2 allocations?

> - maybe a very large global trace buffer that is allocated at mount
> time and shared by all inodes?

Sure, we've got that for various other trace types (e.g. the "vnode"
trace). You'd need to add idbg stuff for filtering based on the
inode the buffer belongs to....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Tweak tracing allocation sizes
  2008-09-02  6:12   ` Lachlan McIlroy
  2008-09-02  6:27     ` Dave Chinner
@ 2008-09-02  6:50     ` Andi Kleen
  2008-09-02 21:55       ` Christoph Hellwig
  1 sibling, 1 reply; 9+ messages in thread
From: Andi Kleen @ 2008-09-02  6:50 UTC (permalink / raw)
  To: lachlan; +Cc: xfs-dev, xfs-oss

Lachlan McIlroy <lachlan@sgi.com> writes:

> Dave Chinner wrote:
>> On Tue, Sep 02, 2008 at 03:48:46PM +1000, Lachlan McIlroy wrote:
>>> The size of a single ktrace entry is 16 pointers so 128 bytes.  For the case
>>> of XFS_RW_KTRACE_SIZE which is 128 entries this equates to 16KB and on a system
>>> with 4KB pages that is under memory pressure this can stall that process for a
>>> significant time while it hunts for 4 free pages.  Cutting this value back to
>>> 32 means it will only need one page.
>> That will effectively render that type of tracing useless - 32
>> entries is not enough history to capture enough
>> read/write/map/invalidate trace events to be meaningful. In the past
>> I've often had to increase this to 256 or 512 entries to be able to
>> capture the events necessary to debug problems...
>
> A system that constantly locks up and/or stalls is useless too.  Allocating
> 4 or more pages for every inode just taxes the system.  Can you offer an
> alternative - maybe a very large global trace buffer that is allocated at mount
> time and shared by all inodes?

You could use vmalloc(). While that is also not fast it will at least
not stall.

-Andi

-- 
ak@linux.intel.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Tweak tracing allocation sizes
  2008-09-02  6:27     ` Dave Chinner
@ 2008-09-02 21:54       ` Christoph Hellwig
  0 siblings, 0 replies; 9+ messages in thread
From: Christoph Hellwig @ 2008-09-02 21:54 UTC (permalink / raw)
  To: Lachlan McIlroy, xfs-dev, xfs-oss

On Tue, Sep 02, 2008 at 04:27:22PM +1000, Dave Chinner wrote:
> > - maybe a very large global trace buffer that is allocated at mount
> > time and shared by all inodes?
> 
> Sure, we've got that for various other trace types (e.g. the "vnode"
> trace). You'd need to add idbg stuff for filtering based on the
> inode the buffer belongs to....

That's probably the best idea.  Currently all these per-object ktrace
buffers do really large no-MAYFAIL allocation all over.  They are in
fact the remaining reason not to simply directly call vmalloc for
large allocations instead of all our current mess in the kmem_
functions.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Tweak tracing allocation sizes
  2008-09-02  6:50     ` Andi Kleen
@ 2008-09-02 21:55       ` Christoph Hellwig
  2008-09-03  7:06         ` Andi Kleen
  0 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2008-09-02 21:55 UTC (permalink / raw)
  To: Andi Kleen; +Cc: lachlan, xfs-dev, xfs-oss

On Tue, Sep 02, 2008 at 08:50:52AM +0200, Andi Kleen wrote:
> > alternative - maybe a very large global trace buffer that is allocated at mount
> > time and shared by all inodes?
> 
> You could use vmalloc(). While that is also not fast it will at least
> not stall.

In fact kmem_alloc first tries vmalloc, and then falls back to slab
when it fails.  See fs/xfs/linux-2.6/kmem.c:kmem_alloc().

Yes, it's all a big mess..

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Tweak tracing allocation sizes
  2008-09-02 21:55       ` Christoph Hellwig
@ 2008-09-03  7:06         ` Andi Kleen
  2008-09-03 12:06           ` Christoph Hellwig
  0 siblings, 1 reply; 9+ messages in thread
From: Andi Kleen @ 2008-09-03  7:06 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Andi Kleen, lachlan, xfs-dev, xfs-oss

On Tue, Sep 02, 2008 at 05:55:16PM -0400, Christoph Hellwig wrote:
> On Tue, Sep 02, 2008 at 08:50:52AM +0200, Andi Kleen wrote:
> > > alternative - maybe a very large global trace buffer that is allocated at mount
> > > time and shared by all inodes?
> > 
> > You could use vmalloc(). While that is also not fast it will at least
> > not stall.
> 
> In fact kmem_alloc first tries vmalloc, and then falls back to slab
> when it fails.  See fs/xfs/linux-2.6/kmem.c:kmem_alloc().

You mean the other way around? 

Anyways it sounds like like MAX_SLAB_SIZE is just too big.

-Andi

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Tweak tracing allocation sizes
  2008-09-03  7:06         ` Andi Kleen
@ 2008-09-03 12:06           ` Christoph Hellwig
  0 siblings, 0 replies; 9+ messages in thread
From: Christoph Hellwig @ 2008-09-03 12:06 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Christoph Hellwig, lachlan, xfs-dev, xfs-oss

On Wed, Sep 03, 2008 at 09:06:54AM +0200, Andi Kleen wrote:
> On Tue, Sep 02, 2008 at 05:55:16PM -0400, Christoph Hellwig wrote:
> > On Tue, Sep 02, 2008 at 08:50:52AM +0200, Andi Kleen wrote:
> > > > alternative - maybe a very large global trace buffer that is allocated at mount
> > > > time and shared by all inodes?
> > > 
> > > You could use vmalloc(). While that is also not fast it will at least
> > > not stall.
> > 
> > In fact kmem_alloc first tries vmalloc, and then falls back to slab
> > when it fails.  See fs/xfs/linux-2.6/kmem.c:kmem_alloc().
> 
> You mean the other way around? 

No, take a look at the function.  I guess the intent is that vmalloc
can fail due to a full vmalloc area and kmalloc could theoretically
still scucceed.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2008-09-03 12:05 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-09-02  5:48 [PATCH] Tweak tracing allocation sizes Lachlan McIlroy
2008-09-02  5:56 ` Dave Chinner
2008-09-02  6:12   ` Lachlan McIlroy
2008-09-02  6:27     ` Dave Chinner
2008-09-02 21:54       ` Christoph Hellwig
2008-09-02  6:50     ` Andi Kleen
2008-09-02 21:55       ` Christoph Hellwig
2008-09-03  7:06         ` Andi Kleen
2008-09-03 12:06           ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox