* [patch, debug, 2/2] Use power-of-2 size ktrace buffers
@ 2008-02-18 23:01 David Chinner
2008-02-22 4:08 ` Christoph Hellwig
0 siblings, 1 reply; 4+ messages in thread
From: David Chinner @ 2008-02-18 23:01 UTC (permalink / raw)
To: xfs-dev; +Cc: xfs-oss
Now that the ktrace_enter() code is using atomics,
the non-power-of-2 buffer sizes - which require modulus
operations to get the index - are showing up as using
substantial CPU in the profiles.
Force the buffer sizes to be rounded up to the nearest
power of two and use masking rather than modulus operations
to convert the index counter to the buffer index. This
reduces ktrace_enter overhead to 8% of a CPU time, and
again almost halves the trace intensive test runtime.
Signed-off-by: Dave Chinner <dgc@sgi.com>
---
fs/xfs/support/ktrace.c | 22 ++++++++++++++--------
fs/xfs/support/ktrace.h | 1 +
2 files changed, 15 insertions(+), 8 deletions(-)
Index: 2.6.x-xfs-new/fs/xfs/support/ktrace.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/support/ktrace.c 2008-02-18 15:33:13.849314747 +1100
+++ 2.6.x-xfs-new/fs/xfs/support/ktrace.c 2008-02-18 15:33:16.920921133 +1100
@@ -24,7 +24,7 @@ static int ktrace_zentries;
void __init
ktrace_init(int zentries)
{
- ktrace_zentries = zentries;
+ ktrace_zentries = roundup_pow_of_two(zentries);
ktrace_hdr_zone = kmem_zone_init(sizeof(ktrace_t),
"ktrace_hdr");
@@ -47,13 +47,16 @@ ktrace_uninit(void)
* ktrace_alloc()
*
* Allocate a ktrace header and enough buffering for the given
- * number of entries.
+ * number of entries. Round the number of entries up to a
+ * power of 2 so we can do fast masking to get the index from
+ * the atomic index counter.
*/
ktrace_t *
ktrace_alloc(int nentries, unsigned int __nocast sleep)
{
ktrace_t *ktp;
ktrace_entry_t *ktep;
+ int entries;
ktp = (ktrace_t*)kmem_zone_alloc(ktrace_hdr_zone, sleep);
@@ -70,11 +73,12 @@ ktrace_alloc(int nentries, unsigned int
/*
* Special treatment for buffers with the ktrace_zentries entries
*/
- if (nentries == ktrace_zentries) {
+ entries = roundup_pow_of_two(nentries);
+ if (entries == ktrace_zentries) {
ktep = (ktrace_entry_t*)kmem_zone_zalloc(ktrace_ent_zone,
sleep);
} else {
- ktep = (ktrace_entry_t*)kmem_zalloc((nentries * sizeof(*ktep)),
+ ktep = (ktrace_entry_t*)kmem_zalloc((entries * sizeof(*ktep)),
sleep | KM_LARGE);
}
@@ -91,7 +95,9 @@ ktrace_alloc(int nentries, unsigned int
}
ktp->kt_entries = ktep;
- ktp->kt_nentries = nentries;
+ ktp->kt_nentries = entries;
+ ASSERT(is_power_of_2(entries));
+ ktp->kt_index_mask = entries - 1;
atomic_set(&ktp->kt_index, 0);
ktp->kt_rollover = 0;
return ktp;
@@ -160,7 +166,7 @@ ktrace_enter(
* Grab an entry by pushing the index up to the next one.
*/
index = atomic_add_return(1, &ktp->kt_index);
- index = (index - 1) % ktp->kt_nentries;
+ index = (index - 1) & ktp->kt_index_mask;
if (!ktp->kt_rollover && index == ktp->kt_nentries - 1)
ktp->kt_rollover = 1;
@@ -197,7 +203,7 @@ ktrace_nentries(
if (ktp == NULL)
return 0;
- index = atomic_read(&ktp->kt_index) % ktp->kt_nentries;
+ index = atomic_read(&ktp->kt_index) & ktp->kt_index_mask;
return (ktp->kt_rollover ? ktp->kt_nentries : index);
}
@@ -223,7 +229,7 @@ ktrace_first(ktrace_t *ktp, ktrace_sna
int nentries;
if (ktp->kt_rollover)
- index = atomic_read(&ktp->kt_index) % ktp->kt_nentries;
+ index = atomic_read(&ktp->kt_index) & ktp->kt_index_mask;
else
index = 0;
Index: 2.6.x-xfs-new/fs/xfs/support/ktrace.h
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/support/ktrace.h 2008-02-18 15:32:28.911073469 +1100
+++ 2.6.x-xfs-new/fs/xfs/support/ktrace.h 2008-02-18 15:33:16.920921133 +1100
@@ -31,6 +31,7 @@ typedef struct ktrace_entry {
typedef struct ktrace {
int kt_nentries; /* number of entries in trace buf */
atomic_t kt_index; /* current index in entries */
+ unsigned int kt_index_mask;
int kt_rollover;
ktrace_entry_t *kt_entries; /* buffer of entries */
} ktrace_t;
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: [patch, debug, 2/2] Use power-of-2 size ktrace buffers
2008-02-18 23:01 [patch, debug, 2/2] Use power-of-2 size ktrace buffers David Chinner
@ 2008-02-22 4:08 ` Christoph Hellwig
2008-02-22 4:30 ` David Chinner
0 siblings, 1 reply; 4+ messages in thread
From: Christoph Hellwig @ 2008-02-22 4:08 UTC (permalink / raw)
To: David Chinner; +Cc: xfs-dev, xfs-oss
On Tue, Feb 19, 2008 at 10:01:12AM +1100, David Chinner wrote:
> Now that the ktrace_enter() code is using atomics,
> the non-power-of-2 buffer sizes - which require modulus
> operations to get the index - are showing up as using
> substantial CPU in the profiles.
>
> Force the buffer sizes to be rounded up to the nearest
> power of two and use masking rather than modulus operations
> to convert the index counter to the buffer index. This
> reduces ktrace_enter overhead to 8% of a CPU time, and
> again almost halves the trace intensive test runtime.
Looks fine aswell. You might aswell kill this zentries stuff and always
use kmalloc instead of caches as the power of two multiples of
sizeof(u64) should always have matching caches available.
While we're at it the ktrace stuff should simply go away mid-term
with all the tracing stuff in mainline now. As a start all macros
calling into ktrace should become markers which allow always bulding
them in with almost zero overhead (a single no-op instruction at their
callsite) and allowing to load the actual tracing module later. As
a second step ktrace should be replaced with something based on the
various trace thingies floating around allowing to read out the trace
buffer from userspace instead of having to rely on kdb.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [patch, debug, 2/2] Use power-of-2 size ktrace buffers
2008-02-22 4:08 ` Christoph Hellwig
@ 2008-02-22 4:30 ` David Chinner
2008-02-23 0:17 ` Christoph Hellwig
0 siblings, 1 reply; 4+ messages in thread
From: David Chinner @ 2008-02-22 4:30 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: David Chinner, xfs-dev, xfs-oss
On Thu, Feb 21, 2008 at 11:08:07PM -0500, Christoph Hellwig wrote:
> On Tue, Feb 19, 2008 at 10:01:12AM +1100, David Chinner wrote:
> > Now that the ktrace_enter() code is using atomics,
> > the non-power-of-2 buffer sizes - which require modulus
> > operations to get the index - are showing up as using
> > substantial CPU in the profiles.
> >
> > Force the buffer sizes to be rounded up to the nearest
> > power of two and use masking rather than modulus operations
> > to convert the index counter to the buffer index. This
> > reduces ktrace_enter overhead to 8% of a CPU time, and
> > again almost halves the trace intensive test runtime.
>
> Looks fine aswell. You might aswell kill this zentries stuff and always
> use kmalloc instead of caches as the power of two multiples of
> sizeof(u64) should always have matching caches available.
Ok, I'll cook up another patch for that and send it out.
> While we're at it the ktrace stuff should simply go away mid-term
> with all the tracing stuff in mainline now. As a start all macros
> calling into ktrace should become markers which allow always bulding
> them in with almost zero overhead (a single no-op instruction at their
> callsite) and allowing to load the actual tracing module later.
Sure, that could be done.
> As
> a second step ktrace should be replaced with something based on the
> various trace thingies floating around allowing to read out the trace
> buffer from userspace instead of having to rely on kdb.
Yeah, that make sense for tracing from userspace. But most of the
time I find a need for the tracing is when the machine has crashed
or assert failed. Hence I don't see that we can really remove the
kdb side of things. Perhaps two different tracing modules could
be done.....
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [patch, debug, 2/2] Use power-of-2 size ktrace buffers
2008-02-22 4:30 ` David Chinner
@ 2008-02-23 0:17 ` Christoph Hellwig
0 siblings, 0 replies; 4+ messages in thread
From: Christoph Hellwig @ 2008-02-23 0:17 UTC (permalink / raw)
To: David Chinner; +Cc: Christoph Hellwig, xfs-dev, xfs-oss
On Fri, Feb 22, 2008 at 03:30:20PM +1100, David Chinner wrote:
> Ok, I'll cook up another patch for that and send it out.
Yeah, making that a separate patch is fine with me. It's just a logical
next step.
> > a second step ktrace should be replaced with something based on the
> > various trace thingies floating around allowing to read out the trace
> > buffer from userspace instead of having to rely on kdb.
>
> Yeah, that make sense for tracing from userspace. But most of the
> time I find a need for the tracing is when the machine has crashed
> or assert failed. Hence I don't see that we can really remove the
> kdb side of things. Perhaps two different tracing modules could
> be done.....
It should still be one tracing buffer, with both a userspace space
interface and one from the various kernel debuggers. Adding the
trace buffers to core dumps in addition would be even better.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2008-02-23 0:16 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-02-18 23:01 [patch, debug, 2/2] Use power-of-2 size ktrace buffers David Chinner
2008-02-22 4:08 ` Christoph Hellwig
2008-02-22 4:30 ` David Chinner
2008-02-23 0:17 ` Christoph Hellwig
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox