public inbox for linux-kernel@vger.kernel.org
* RFC: NUMA modifications to cyclictest
@ 2010-01-19 23:14 Clark Williams
  2010-01-20  6:51 ` Thomas Gleixner
  0 siblings, 1 reply; 4+ messages in thread
From: Clark Williams @ 2010-01-19 23:14 UTC (permalink / raw)
  To: RT; +Cc: LKML, Carsten Emde, John Kacur, Thomas Gleixner

RT-ers,

Lately we've been struggling with some performance issues on high-core
count (>16 cores) NUMA machines with the RT kernel. During the course
of troubleshooting this issue, we tried using the 'numactl' program to
constrain our measurement testing tool (rteval) to a particular memory
node, rather than letting everything float. Doing so showed marked
improvement in both max latency and jitter.  While this doesn't solve
our performance problems, I thought it might make sense to have a --numa
mode for cyclictest that complements the --smp mode just added.

The big difference here is that when using --numa, each measurement
thread (one per cpu) has its stack allocated from the memory node
associated with its cpu. Also, the major data structures for each
thread (parameter block, statistics block and histogram) are allocated
from the appropriate node. This is done with calls into libnuma,
which means this will add a dependency on libnuma. 
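
A minimal sketch of what node-local allocation for one measurement
thread could look like. It is an illustration only, not code from the
numa branch: struct node_buf, struct thread_param, run_worker_for_cpu()
and the HAVE_LIBNUMA build macro are all hypothetical names, and the
worker body is a placeholder. Only numa_available(), numa_node_of_cpu(),
numa_alloc_onnode() and numa_free() are real libnuma calls. Without
HAVE_LIBNUMA the sketch falls back to ordinary page-aligned memory.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
#ifdef HAVE_LIBNUMA              /* hypothetical macro set by the build system */
#include <numa.h>
#endif

#define STACK_SIZE (256 * 1024)  /* illustrative size, a multiple of the page size */

/* One allocation that remembers how it has to be freed. */
struct node_buf {
    void  *ptr;
    size_t size;
    int    from_numa;            /* 1 if ptr came from numa_alloc_onnode() */
};

/* Hypothetical stand-in for a per-thread parameter/statistics block. */
struct thread_param {
    int  cpu;
    long cycles;
};

/* Allocate `size` bytes on the memory node of `cpu` when libnuma is usable,
 * otherwise fall back to page-aligned heap memory. Returns 0 on success. */
static int node_buf_alloc(struct node_buf *b, size_t size, int cpu)
{
    assert(size > 0);
    b->ptr = NULL;
    b->size = size;
    b->from_numa = 0;
#ifdef HAVE_LIBNUMA
    if (numa_available() != -1) {
        int node = numa_node_of_cpu(cpu);
        if (node >= 0)
            b->ptr = numa_alloc_onnode(size, node);
        if (b->ptr) {
            b->from_numa = 1;
            return 0;
        }
    }
#else
    (void)cpu;                   /* no NUMA support: placement is up to the kernel */
#endif
    if (posix_memalign(&b->ptr, (size_t)sysconf(_SC_PAGESIZE), size) != 0) {
        b->ptr = NULL;
        return -1;
    }
    return 0;
}

static void node_buf_free(struct node_buf *b)
{
#ifdef HAVE_LIBNUMA
    if (b->from_numa) {
        numa_free(b->ptr, b->size);
        b->ptr = NULL;
        return;
    }
#endif
    free(b->ptr);
    b->ptr = NULL;
}

static void *worker(void *arg)
{
    struct thread_param *p = arg;
    p->cycles = 1000;            /* placeholder for the real measurement loop */
    return NULL;
}

/* Run one worker whose stack and parameter block were allocated for `cpu`.
 * Returns the worker's cycle count, or -1 on failure. */
long run_worker_for_cpu(int cpu)
{
    struct node_buf stack, pbuf;
    pthread_attr_t attr;
    pthread_t tid;
    long cycles = -1;

    if (node_buf_alloc(&stack, STACK_SIZE, cpu) != 0)
        return -1;
    if (node_buf_alloc(&pbuf, sizeof(struct thread_param), cpu) != 0) {
        node_buf_free(&stack);
        return -1;
    }
    struct thread_param *p = pbuf.ptr;
    p->cpu = cpu;
    p->cycles = 0;

    if (pthread_attr_init(&attr) == 0) {
        /* Hand the pre-allocated memory to the thread as its stack. */
        if (pthread_attr_setstack(&attr, stack.ptr, stack.size) == 0 &&
            pthread_create(&tid, &attr, worker, p) == 0) {
            pthread_join(tid, NULL);
            cycles = p->cycles;
        }
        pthread_attr_destroy(&attr);
    }
    node_buf_free(&pbuf);        /* safe: the thread has been joined */
    node_buf_free(&stack);
    return cycles;
}
```

The sketch leaves out pinning the thread to its cpu, which a real
measurement thread would also need for the node-local memory to stay
local.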

The intent is to measure latency on a NUMA system in the same way a
well-written RT application would run on a NUMA machine, that is,
minimizing off-node memory references.

If you're interested in looking at this, please pull the numa branch
from my git repo at:

	git://git.kernel.org/pub/scm/linux/kernel/git/clrkwllms/rt-tests.git

and let me know if you find bugs or disagree with the approach. 

Thanks,
Clark


* Re: RFC: NUMA modifications to cyclictest
  2010-01-19 23:14 RFC: NUMA modifications to cyclictest Clark Williams
@ 2010-01-20  6:51 ` Thomas Gleixner
  2010-01-20 13:37   ` Clark Williams
  0 siblings, 1 reply; 4+ messages in thread
From: Thomas Gleixner @ 2010-01-20  6:51 UTC (permalink / raw)
  To: Clark Williams; +Cc: RT, LKML, Carsten Emde, John Kacur

On Tue, 19 Jan 2010, Clark Williams wrote:
> RT-ers,
> 
> Lately we've been struggling with some performance issues on high-core
> count (>16 cores) NUMA machines with the RT kernel. During the course
> of troubleshooting this issue, we tried using the 'numactl' program to
> constrain our measurement testing tool (rteval) to a particular memory
> node, rather than letting everything float. Doing so showed marked
> improvement in both max latency and jitter.  While this doesn't solve
> our performance problems, I thought it might make sense to have a --numa
> mode for cyclictest that complements the --smp mode just added.
> 
> The big difference here is that when using --numa, each measurement
> thread (one per cpu) has its stack allocated from the memory node
> associated with its cpu. Also, the major data structures for each
> thread (parameter block, statistics block and histogram) are allocated
> from the appropriate node. This is done with calls into libnuma,
> which means this will add a dependency on libnuma. 

That might cause some trouble for embedded folks. :(
 
> The intent is to measure latency on a NUMA system in the same way a
> well-written RT application would run on a NUMA machine, that is,
> minimizing off-node memory references.

Agreed.

	tglx

* Re: RFC: NUMA modifications to cyclictest
  2010-01-20  6:51 ` Thomas Gleixner
@ 2010-01-20 13:37   ` Clark Williams
  2010-01-20 15:54     ` Nikita V. Youshchenko
  0 siblings, 1 reply; 4+ messages in thread
From: Clark Williams @ 2010-01-20 13:37 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: RT, LKML, Carsten Emde, John Kacur

On Wed, 20 Jan 2010 07:51:41 +0100 (CET)
Thomas Gleixner <tglx@linutronix.de> wrote:

> On Tue, 19 Jan 2010, Clark Williams wrote:
> > RT-ers,
> > 
> > Lately we've been struggling with some performance issues on high-core
> > count (>16 cores) NUMA machines with the RT kernel. During the course
> > of troubleshooting this issue, we tried using the 'numactl' program to
> > constrain our measurement testing tool (rteval) to a particular memory
> > node, rather than letting everything float. Doing so showed marked
> > improvement in both max latency and jitter.  While this doesn't solve
> > our performance problems, I thought it might make sense to have a --numa
> > mode for cyclictest that complements the --smp mode just added.
> > 
> > The big difference here is that when using --numa, each measurement
> > thread (one per cpu) has its stack allocated from the memory node
> > associated with its cpu. Also, the major data structures for each
> > thread (parameter block, statistics block and histogram) are allocated
> > from the appropriate node. This is done with calls into libnuma,
> > which means this will add a dependency on libnuma. 
> 
> That might cause some trouble for embedded folks. :(

Yeah, that's why I sent the RFC; I wanted to see who would hate me for
it :).

Carsten already told me off-list that one of his build machines didn't
have numa.h, so I'm going to have to rearrange the build a bit.

As much as I hate to say it, I think the best option is to use autoconf
to detect if libnuma is available on the build platform and to take
appropriate steps if it's not. 

The other idea I toyed with was dynamic loading of libnuma so there's
not an install dependency for the libnuma package with the rt-tests
package. I only use five functions from libnuma, so that's not too bad
a set of function pointers to manage. Hmmm, that probably won't work
very well, since I'll still have to include numa.h. Sigh...
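
For illustration, the dynamic-loading idea could be sketched roughly
as below. This is not code from the numa branch: struct numa_ops and
the helper names are made up, "libnuma.so.1" is an assumed soname, and
the four libnuma functions shown are not necessarily the five mentioned
above. The prototypes are declared locally as function pointer types,
which may avoid including numa.h at build time, at the cost of keeping
them in sync with libnuma by hand.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Locally declared pointer types mirroring the libnuma prototypes. */
typedef int   (*numa_available_fn)(void);
typedef int   (*numa_node_of_cpu_fn)(int cpu);
typedef void *(*numa_alloc_onnode_fn)(size_t size, int node);
typedef void  (*numa_free_fn)(void *start, size_t size);

/* Hypothetical table of the entry points resolved at run time. */
struct numa_ops {
    void                *handle;      /* NULL when libnuma is not usable */
    numa_available_fn    available;
    numa_node_of_cpu_fn  node_of_cpu;
    numa_alloc_onnode_fn alloc_onnode;
    numa_free_fn         free_mem;
};

/* Drop the library. Free all NUMA allocations before calling this. */
void numa_ops_unload(struct numa_ops *ops)
{
    if (ops->handle)
        dlclose(ops->handle);
    memset(ops, 0, sizeof(*ops));
}

/* Returns 1 if libnuma was loaded and reports NUMA support, else 0. */
int numa_ops_load(struct numa_ops *ops)
{
    memset(ops, 0, sizeof(*ops));
    ops->handle = dlopen("libnuma.so.1", RTLD_NOW | RTLD_LOCAL);
    if (!ops->handle)
        return 0;

    /* The usual idiom for storing dlsym() results in function pointers. */
    *(void **)&ops->available    = dlsym(ops->handle, "numa_available");
    *(void **)&ops->node_of_cpu  = dlsym(ops->handle, "numa_node_of_cpu");
    *(void **)&ops->alloc_onnode = dlsym(ops->handle, "numa_alloc_onnode");
    *(void **)&ops->free_mem     = dlsym(ops->handle, "numa_free");

    if (!ops->available || !ops->node_of_cpu || !ops->alloc_onnode ||
        !ops->free_mem || ops->available() == -1) {
        numa_ops_unload(ops);
        return 0;
    }
    return 1;
}

/* Allocate on the node of `cpu` if possible, otherwise use malloc().
 * *from_numa records which allocator was used. */
void *alloc_for_cpu(const struct numa_ops *ops, size_t size, int cpu,
                    int *from_numa)
{
    assert(size > 0);
    *from_numa = 0;
    if (ops->handle) {
        int node = ops->node_of_cpu(cpu);
        void *p = node >= 0 ? ops->alloc_onnode(size, node) : NULL;
        if (p) {
            *from_numa = 1;
            return p;
        }
    }
    return malloc(size);
}

/* Release memory obtained from alloc_for_cpu(). */
void free_for_cpu(const struct numa_ops *ops, void *p, size_t size,
                  int from_numa)
{
    if (from_numa)
        ops->free_mem(p, size);
    else
        free(p);
}
```

On a machine without libnuma installed, numa_ops_load() simply returns
0 and every allocation goes through malloc(), so the binary carries no
install-time dependency on the libnuma package.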

Clark


* Re: RFC: NUMA modifications to cyclictest
  2010-01-20 13:37   ` Clark Williams
@ 2010-01-20 15:54     ` Nikita V. Youshchenko
  0 siblings, 0 replies; 4+ messages in thread
From: Nikita V. Youshchenko @ 2010-01-20 15:54 UTC (permalink / raw)
  To: Clark Williams; +Cc: Thomas Gleixner, RT, LKML, Carsten Emde, John Kacur

> The other idea I toyed with was dynamic loading of libnuma so there's
> not an install dependency for the libnuma package with the rt-tests
> package. I only use five functions from libnuma, so that's not too bad
> a set of function pointers to manage. Hmmm, that probably won't work
> very well, since I'll still have to include numa.h. Sigh...

That's overkill.

We are not talking about a package that will be installed on millions
of systems in binary form.

rt-tests is used by developers; developers can compile it with the
options they need.

Nikita
