* LTTng 0.108 provides many performance improvements
@ 2009-03-13 23:28 Mathieu Desnoyers
  2009-03-14 16:26 ` [ltt-dev] " Mathieu Desnoyers
  0 siblings, 1 reply; 2+ messages in thread
From: Mathieu Desnoyers @ 2009-03-13 23:28 UTC (permalink / raw)
  To: ltt-dev, linux-kernel; +Cc: mbligh

Hi,

I just released LTTng 0.108. The time had come to do a bit of performance
tuning using oprofile.

Basically, the tbench workload, under flight recorder tracing, went
from a 52% slowdown with the previous LTTng to a 32% slowdown with LTTng
0.108 on my test machine (8-core x86_64, 16 GB RAM).

Modifications:

- Inlined the fast paths. Modularity is now provided by the build system
  rather than by callbacks, so the choice between lockless and locked
  buffer management must be made at compile time. The "transport" option
  is still there, but it currently does not do much. I'd like to keep it
  around because it will eventually be used to specify where the
  information is sent (e.g. physical pages, contiguous or non-contiguous,
  video card memory...) rather than to select the buffer management
  mechanism. The slow paths are now done in function calls.

- Fixed a false sharing problem. It looks like the kzalloc_node()
  allocator, used to allocate the commit counters, does not align the
  allocated memory on cache lines.
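
The message does not say how the false sharing was fixed, so the following is only a user-space sketch of the general remedy: give each per-CPU counter its own cache line. The structure name, the helper name and the 64-byte line size are assumptions for illustration, not LTTng code. In the kernel, the equivalent padding is normally expressed with the `____cacheline_aligned` annotation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_LINE_SIZE 64	/* assumption: common x86_64 cache line size */

/*
 * Hypothetical per-CPU commit counter. The member alignment raises the
 * alignment of the whole struct, so sizeof() is padded to one full line
 * and two counters can never share a cache line.
 */
struct commit_counter {
	_Alignas(CACHE_LINE_SIZE) unsigned long count;
};

/* Allocate zeroed counters, one per CPU, starting on a line boundary. */
static struct commit_counter *alloc_commit_counters(size_t ncpus)
{
	size_t bytes = ncpus * sizeof(struct commit_counter);
	struct commit_counter *c;

	assert(ncpus > 0);
	/* C11 aligned_alloc(): size must be a multiple of the alignment. */
	c = aligned_alloc(CACHE_LINE_SIZE, bytes);
	if (c)
		memset(c, 0, bytes);
	return c;
}
```

With this layout, a CPU incrementing its own counter no longer invalidates the line holding its neighbour's counter, at the cost of 64 bytes per counter instead of 8.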

Therefore I think the new code will be _much_ easier to optimize,
because the fast paths are very well identified and much smaller than
they were before. I reduced the tracer's stack usage, register usage
and instruction cache footprint.
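
The fast path / slow path split described above can be sketched as follows. This is a simplified user-space illustration, not the LTTng reserve code: the buffer structure, the function names and the "restart at offset 0" policy are all assumptions, and the `noinline` attribute and `__builtin_expect()` are GCC/Clang extensions.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_SIZE 4096	/* hypothetical buffer size */

struct trace_buf {
	size_t offset;		/* next free byte */
	size_t switches;	/* number of times the slow path ran */
};

/*
 * Slow path: kept out of line so that its code is not duplicated at
 * every call site. Here it simply restarts the buffer at offset 0.
 */
static __attribute__((noinline)) size_t reserve_slow(struct trace_buf *b,
						     size_t len)
{
	b->switches++;
	b->offset = len;
	return 0;
}

/*
 * Fast path: small enough to be inlined into the caller. It is a direct
 * call chosen at build time, not a call through a function pointer.
 */
static inline size_t reserve(struct trace_buf *b, size_t len)
{
	size_t start = b->offset;

	assert(len <= BUF_SIZE);
	if (__builtin_expect(start + len > BUF_SIZE, 0))
		return reserve_slow(b, len);	/* rare case */
	b->offset = start + len;
	return start;
}
```

Keeping the common case to a load, a compare and a store is what lowers the stack, register and instruction cache pressure at each instrumented site.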

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [ltt-dev] LTTng 0.108 provides many performance improvements
  2009-03-13 23:28 LTTng 0.108 provides many performance improvements Mathieu Desnoyers
@ 2009-03-14 16:26 ` Mathieu Desnoyers
  0 siblings, 0 replies; 2+ messages in thread
From: Mathieu Desnoyers @ 2009-03-14 16:26 UTC (permalink / raw)
  To: ltt-dev, linux-kernel; +Cc: mbligh

* Mathieu Desnoyers (compudj@krystal.dyndns.org) wrote:
> Hi,
> 
> I just released LTTng 0.108. Time had come to do a bit of performance
> tuning using oprofile.
> 
> Basically, the tbench workload, under flight recorder tracing, passed
> from a 52 % slowdown with previous lttng to a 32 % slowdown with lttng
> 0.108 on my test machine (8-cores x86_64, 16GB ram).
> 

Down to a 30% performance impact by using a pointer array instead of a
linked list to manage the buffer pages (it's in 0.110). I ensure that
accesses to the array will never cause vmalloc faults. (I think the
kernel could potentially fall back to vmalloc'd memory if the array is
too large to be allocated with kmalloc, but I'm unsure about this, as I
cannot find the offending code in 2.6.29-rc7.) I call
vmalloc_sync_all() after allocating the array to make sure the page
tables are populated (it's just safer).
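
To illustrate why a pointer array helps, here is a minimal user-space sketch, assuming 4 KiB pages; the structure and function names are hypothetical and calloc() stands in for the kernel allocators. Finding the page behind a buffer offset becomes a shift and one array load, where a linked list would have to be walked node by node.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BUF_PAGE_SHIFT 12	/* assumption: 4 KiB pages */
#define BUF_PAGE_SIZE ((size_t)1 << BUF_PAGE_SHIFT)

struct paged_buf {
	size_t npages;
	char **pages;	/* pages[i] points to the i-th buffer page */
};

static void buf_free(struct paged_buf *b)
{
	for (size_t i = 0; i < b->npages; i++)
		free(b->pages[i]);	/* free(NULL) is a no-op */
	free(b->pages);
	b->pages = NULL;
	b->npages = 0;
}

/* Returns 0 on success, -1 if any allocation fails. */
static int buf_init(struct paged_buf *b, size_t npages)
{
	b->npages = 0;
	b->pages = calloc(npages, sizeof(*b->pages));
	if (!b->pages)
		return -1;
	b->npages = npages;
	for (size_t i = 0; i < npages; i++) {
		b->pages[i] = calloc(1, BUF_PAGE_SIZE);
		if (!b->pages[i]) {
			buf_free(b);
			return -1;
		}
	}
	return 0;
}

/* Translate a buffer offset into an address in constant time. */
static char *buf_addr(const struct paged_buf *b, size_t offset)
{
	size_t idx = offset >> BUF_PAGE_SHIFT;

	assert(idx < b->npages);
	return b->pages[idx] + (offset & (BUF_PAGE_SIZE - 1));
}
```

The lookup cost no longer depends on how far into the buffer the write lands, which matters when every traced event goes through it.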

Mathieu


