public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* [patch, debug, 2/2] Use power-of-2 size ktrace buffers
@ 2008-02-18 23:01 David Chinner
  2008-02-22  4:08 ` Christoph Hellwig
  0 siblings, 1 reply; 4+ messages in thread
From: David Chinner @ 2008-02-18 23:01 UTC (permalink / raw)
  To: xfs-dev; +Cc: xfs-oss

Now that the ktrace_enter() code is using atomics,
the non-power-of-2 buffer sizes - which require modulus
operations to get the index - are showing up as using
substantial CPU in the profiles.

Force the buffer sizes to be rounded up to the nearest
power of two and use masking rather than modulus operations
to convert the index counter to the buffer index. This
reduces ktrace_enter overhead to 8% of a CPU time, and
again almost halves the trace intensive test runtime.

Signed-off-by: Dave Chinner <dgc@sgi.com>
---
 fs/xfs/support/ktrace.c |   22 ++++++++++++++--------
 fs/xfs/support/ktrace.h |    1 +
 2 files changed, 15 insertions(+), 8 deletions(-)

Index: 2.6.x-xfs-new/fs/xfs/support/ktrace.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/support/ktrace.c	2008-02-18 15:33:13.849314747 +1100
+++ 2.6.x-xfs-new/fs/xfs/support/ktrace.c	2008-02-18 15:33:16.920921133 +1100
@@ -24,7 +24,7 @@ static int          ktrace_zentries;
 void __init
 ktrace_init(int zentries)
 {
-	ktrace_zentries = zentries;
+	ktrace_zentries = roundup_pow_of_two(zentries);
 
 	ktrace_hdr_zone = kmem_zone_init(sizeof(ktrace_t),
 					"ktrace_hdr");
@@ -47,13 +47,16 @@ ktrace_uninit(void)
  * ktrace_alloc()
  *
  * Allocate a ktrace header and enough buffering for the given
- * number of entries.
+ * number of entries. Round the number of entries up to a
+ * power of 2 so we can do fast masking to get the index from
+ * the atomic index counter.
  */
 ktrace_t *
 ktrace_alloc(int nentries, unsigned int __nocast sleep)
 {
 	ktrace_t        *ktp;
 	ktrace_entry_t  *ktep;
+	int		entries;
 
 	ktp = (ktrace_t*)kmem_zone_alloc(ktrace_hdr_zone, sleep);
 
@@ -70,11 +73,12 @@ ktrace_alloc(int nentries, unsigned int 
 	/*
 	 * Special treatment for buffers with the ktrace_zentries entries
 	 */
-	if (nentries == ktrace_zentries) {
+	entries = roundup_pow_of_two(nentries);
+	if (entries == ktrace_zentries) {
 		ktep = (ktrace_entry_t*)kmem_zone_zalloc(ktrace_ent_zone,
 							    sleep);
 	} else {
-		ktep = (ktrace_entry_t*)kmem_zalloc((nentries * sizeof(*ktep)),
+		ktep = (ktrace_entry_t*)kmem_zalloc((entries * sizeof(*ktep)),
 							    sleep | KM_LARGE);
 	}
 
@@ -91,7 +95,9 @@ ktrace_alloc(int nentries, unsigned int 
 	}
 
 	ktp->kt_entries  = ktep;
-	ktp->kt_nentries = nentries;
+	ktp->kt_nentries = entries;
+	ASSERT(is_power_of_2(entries));
+	ktp->kt_index_mask = entries - 1;
 	atomic_set(&ktp->kt_index, 0);
 	ktp->kt_rollover = 0;
 	return ktp;
@@ -160,7 +166,7 @@ ktrace_enter(
 	 * Grab an entry by pushing the index up to the next one.
 	 */
 	index = atomic_add_return(1, &ktp->kt_index);
-	index = (index - 1) % ktp->kt_nentries;
+	index = (index - 1) & ktp->kt_index_mask;
 	if (!ktp->kt_rollover && index == ktp->kt_nentries - 1)
 		ktp->kt_rollover = 1;
 
@@ -197,7 +203,7 @@ ktrace_nentries(
 	if (ktp == NULL)
 		return 0;
 
-	index = atomic_read(&ktp->kt_index) % ktp->kt_nentries;
+	index = atomic_read(&ktp->kt_index) & ktp->kt_index_mask;
 	return (ktp->kt_rollover ? ktp->kt_nentries : index);
 }
 
@@ -223,7 +229,7 @@ ktrace_first(ktrace_t   *ktp, ktrace_sna
 	int             nentries;
 
 	if (ktp->kt_rollover)
-		index = atomic_read(&ktp->kt_index) % ktp->kt_nentries;
+		index = atomic_read(&ktp->kt_index) & ktp->kt_index_mask;
 	else
 		index = 0;
 
Index: 2.6.x-xfs-new/fs/xfs/support/ktrace.h
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/support/ktrace.h	2008-02-18 15:32:28.911073469 +1100
+++ 2.6.x-xfs-new/fs/xfs/support/ktrace.h	2008-02-18 15:33:16.920921133 +1100
@@ -31,6 +31,7 @@ typedef struct ktrace_entry {
 typedef struct ktrace {
 	int		kt_nentries;	/* number of entries in trace buf */
 	atomic_t	kt_index;	/* current index in entries */
+	unsigned int	kt_index_mask;
 	int		kt_rollover;
 	ktrace_entry_t	*kt_entries;	/* buffer of entries */
 } ktrace_t;

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [patch, debug, 2/2] Use power-of-2 size ktrace buffers
  2008-02-18 23:01 [patch, debug, 2/2] Use power-of-2 size ktrace buffers David Chinner
@ 2008-02-22  4:08 ` Christoph Hellwig
  2008-02-22  4:30   ` David Chinner
  0 siblings, 1 reply; 4+ messages in thread
From: Christoph Hellwig @ 2008-02-22  4:08 UTC (permalink / raw)
  To: David Chinner; +Cc: xfs-dev, xfs-oss

On Tue, Feb 19, 2008 at 10:01:12AM +1100, David Chinner wrote:
> Now that the ktrace_enter() code is using atomics,
> the non-power-of-2 buffer sizes - which require modulus
> operations to get the index - are showing up as using
> substantial CPU in the profiles.
> 
> Force the buffer sizes to be rounded up to the nearest
> power of two and use masking rather than modulus operations
> to convert the index counter to the buffer index. This
> reduces ktrace_enter overhead to 8% of a CPU time, and
> again almost halves the trace intensive test runtime.

Looks fine aswell.  You might aswell kill this zentries stuff and always
use kmalloc instead of caches as the power of two multiples of
sizeof(u64) should always have matching caches available.

While we're at it the ktrace stuff should simply go away mid-term
with all the tracing stuff in mainline now.  As a start all macros
calling into ktrace should become markers which allow always bulding
them in with almost zero overhead (a single no-op instruction at their
callsite) and allowing to load the actual tracing module later.  As
a second step ktrace should be replaced with something based on the
various trace thingies floating around allowing to read out the trace
buffer from userspace instead of having to rely on kdb.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [patch, debug, 2/2] Use power-of-2 size ktrace buffers
  2008-02-22  4:08 ` Christoph Hellwig
@ 2008-02-22  4:30   ` David Chinner
  2008-02-23  0:17     ` Christoph Hellwig
  0 siblings, 1 reply; 4+ messages in thread
From: David Chinner @ 2008-02-22  4:30 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: David Chinner, xfs-dev, xfs-oss

On Thu, Feb 21, 2008 at 11:08:07PM -0500, Christoph Hellwig wrote:
> On Tue, Feb 19, 2008 at 10:01:12AM +1100, David Chinner wrote:
> > Now that the ktrace_enter() code is using atomics,
> > the non-power-of-2 buffer sizes - which require modulus
> > operations to get the index - are showing up as using
> > substantial CPU in the profiles.
> > 
> > Force the buffer sizes to be rounded up to the nearest
> > power of two and use masking rather than modulus operations
> > to convert the index counter to the buffer index. This
> > reduces ktrace_enter overhead to 8% of a CPU time, and
> > again almost halves the trace intensive test runtime.
> 
> Looks fine aswell.  You might aswell kill this zentries stuff and always
> use kmalloc instead of caches as the power of two multiples of
> sizeof(u64) should always have matching caches available.

Ok, I'll cook up another patch for that and send it out.

> While we're at it the ktrace stuff should simply go away mid-term
> with all the tracing stuff in mainline now.  As a start all macros
> calling into ktrace should become markers which allow always bulding
> them in with almost zero overhead (a single no-op instruction at their
> callsite) and allowing to load the actual tracing module later.

Sure, that could be done.

> As
> a second step ktrace should be replaced with something based on the
> various trace thingies floating around allowing to read out the trace
> buffer from userspace instead of having to rely on kdb.

Yeah, that make sense for tracing from userspace. But most of the
time I find a need for the tracing is when the machine has crashed
or assert failed. Hence I don't see that we can really remove the
kdb side of things. Perhaps two different tracing modules could
be done.....

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [patch, debug, 2/2] Use power-of-2 size ktrace buffers
  2008-02-22  4:30   ` David Chinner
@ 2008-02-23  0:17     ` Christoph Hellwig
  0 siblings, 0 replies; 4+ messages in thread
From: Christoph Hellwig @ 2008-02-23  0:17 UTC (permalink / raw)
  To: David Chinner; +Cc: Christoph Hellwig, xfs-dev, xfs-oss

On Fri, Feb 22, 2008 at 03:30:20PM +1100, David Chinner wrote:
> Ok, I'll cook up another patch for that and send it out.

Yeah, making that a separate patch is fine with me.  It's just a logical
next step.

> > a second step ktrace should be replaced with something based on the
> > various trace thingies floating around allowing to read out the trace
> > buffer from userspace instead of having to rely on kdb.
> 
> Yeah, that make sense for tracing from userspace. But most of the
> time I find a need for the tracing is when the machine has crashed
> or assert failed. Hence I don't see that we can really remove the
> kdb side of things. Perhaps two different tracing modules could
> be done.....

It should still be one tracing buffer, with both a userspace space
interface and one from the various kernel debuggers.  Adding the
trace buffers to core dumps in addition would be even better.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2008-02-23  0:16 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-02-18 23:01 [patch, debug, 2/2] Use power-of-2 size ktrace buffers David Chinner
2008-02-22  4:08 ` Christoph Hellwig
2008-02-22  4:30   ` David Chinner
2008-02-23  0:17     ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox