public inbox for linux-kernel@vger.kernel.org
* [ANNOUNCE] pahole and other DWARF2 utilities
@ 2006-10-30 21:33 Arnaldo Carvalho de Melo
  2006-10-31  4:33 ` Andrew Morton
  0 siblings, 1 reply; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2006-10-30 21:33 UTC (permalink / raw)
  To: linux-kernel; +Cc: lwn

Hi,

	I've been working on some DWARF2 utilities and thought it was
about time I announced them to the community, so that what is already
available can be used by people interested in reducing structure sizes
and otherwise taking advantage of the information available in the ELF
sections of files compiled with 'gcc -g' or, in the case of the kernel,
with CONFIG_DEBUG_INFO enabled. Here is a description of the tools:

pahole: Poke-a-Hole is a tool to find holes in structures, holes
being defined as the space between members of structures due to alignment
rules, which could be used for new struct entries or to reorganize
existing structures to reduce their size. Without further ado, let's see
what that means:

[acme@newtoy net-2.6]$ pahole kernel/sched.o task_struct
/* include2/asm/system.h:11 */
struct task_struct {
        volatile long int       state;          /*     0     4 */
        struct thread_info *    thread_info;    /*     4     4 */
        atomic_t                usage;          /*     8     4 */
        long unsigned int       flags;          /*    12     4 */

	<SNIP>

        short unsigned int         ioprio;      /*    52     2 */

        /* XXX 2 bytes hole, try to pack */

        long unsigned int          sleep_avg;   /*    56     4 */

	<SNIP>

        unsigned char              fpu_counter; /*   388     1 */

        /* XXX 3 bytes hole, try to pack */

        int                        oomkilladj;  /*   392     4 */

	<SNIP>

}; /* size: 1312, sum members: 1287, holes: 3, sum holes: 13, padding: 12 */

	It doesn't use any source code files, just the DWARF2
information in ELF sections, inserted by 'gcc -g', to print out the
above information. For now its main use is to show where the holes are
that can be used to reduce the struct size, which is even more useful as
we transition to 64-bit architectures, where such holes are more
frequent, as we can see in this example:

[acme@newtoy ~]$ file kdump.debug
kdump.debug: ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV),
dynamically linked (uses shared libs), not stripped
[acme@newtoy ~]$ pahole kdump.debug _IO_FILE | head -7
/* /usr/include/stdio.h:46 */
struct _IO_FILE {
        int   _flags;               /*     0     4 */

        /* XXX 4 bytes hole, try to pack */

        char *_IO_read_ptr;         /*     8     8 */
[acme@newtoy ~]$


	The columns in the comments are (offset, sizeof(member)).
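
The (offset, size) columns can be reproduced for a small struct without
pahole. What follows is only a minimal Python sketch and not how pahole
works: pahole reads DWARF2, while this uses ctypes, which lays structures
out with the native C alignment rules. IOFileHead and find_holes are
hypothetical names, and the struct merely mimics the first two members of
struct _IO_FILE.

```python
import ctypes


class IOFileHead(ctypes.Structure):
    # Hypothetical stand-in for the first two members of struct _IO_FILE.
    _fields_ = [
        ("flags", ctypes.c_int),
        ("read_ptr", ctypes.c_char_p),
    ]


def find_holes(struct_type):
    """Return ([(member_before_hole, hole_size), ...], trailing_padding)."""
    holes = []
    end = 0      # first byte not yet covered by a member
    prev = None  # name of the member that precedes the current one
    for name, _ctype in struct_type._fields_:
        field = getattr(struct_type, name)
        if field.offset > end:
            holes.append((prev, field.offset - end))
        end = field.offset + field.size
        prev = name
    padding = ctypes.sizeof(struct_type) - end
    return holes, padding


holes, padding = find_holes(IOFileHead)
for name, _ctype in IOFileHead._fields_:
    field = getattr(IOFileHead, name)
    print(f"{name:16} /* {field.offset:5} {field.size:5} */")
print(f"/* size: {ctypes.sizeof(IOFileHead)}, holes: {holes}, padding: {padding} */")
```

On an x86-64 machine this should report a 4-byte hole after flags, in line
with the pahole output above. On a 32-bit ABI, where pointers are 4-byte
aligned, no hole is expected.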

	Tons more information is available in the DWARF2 ELF sections,
making it possible to use it for other purposes, and that's where the
next dwarf comes in, pfunct:

[acme@newtoy net-2.6]$ pfunct net/ipv4/tcp_ipv4.o tcp_v4_rcv
/* /pub/scm/linux/kernel/git/acme/net-2.6/net/ipv4/tcp_ipv4.c:1054 */
int tcp_v4_rcv(struct sk_buff * skb);
/* size: 2175 */

	pfunct uses the DWARF2 information to get function details, such
as the full prototype and the function size, which allows us to do some
more interesting queries, such as:

[acme@newtoy net-2.6]$ pfunct --size net/ipv4/netfilter/ip_conntrack.ko \
    | sort -k 2 -nr | head -10
tcp_packet: 3349
ip_conntrack_in: 1146
icmp_error: 874
ip_conntrack_expect_related: 804
__ip_conntrack_confirm: 586
tcp_new: 527
ip_conntrack_init: 525
tcp_error: 508
ip_conntrack_helper_unregister: 482
ip_conntrack_alloc: 469
[acme@newtoy net-2.6]$

	The top ten functions by size (in bytes) in any ELF file with
debugging information!
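
The same ranking can be done in a script instead of with sort and head.
This is a minimal sketch that assumes the "name: size" line format shown
in the output above; top_functions is a hypothetical helper, not part of
pfunct, and the sample rows are copied from that output.

```python
def top_functions(pfunct_output, count=10):
    """Parse 'name: size' lines and return the largest functions first."""
    sizes = []
    for line in pfunct_output.splitlines():
        name, sep, size = line.rpartition(":")
        # Skip anything that is not a 'name: <decimal size>' line.
        if sep and size.strip().isdigit():
            sizes.append((name.strip(), int(size)))
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:count]


# A few rows from the ip_conntrack.ko listing, deliberately out of order.
sample = """\
tcp_new: 527
tcp_packet: 3349
icmp_error: 874
ip_conntrack_in: 1146
"""

for name, size in top_functions(sample, count=3):
    print(f"{name}: {size}")
```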

	The code is available in a git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/pahole.git

	To browse the cset comments, which may well provide hints on how
this thingy can be useful:

http://www.kernel.org/git/?p=linux/kernel/git/acme/pahole.git;a=summary

	Further ideas on how to use the DWARF2 information include tools
that will show where inlines are being used, how much code is added by
inline functions, possibly rewriting asm-offsets.c, converting ostra
(callgraph tool) to use this information, correlating valgrind's
cachegrind information to suggest struct member reorganization to
exploit cacheline locality and more.

	Documentation is very much a disaster, but I guess the current
state of things is useful for interested hackers, so I thought it
was time to announce this.

	Ideas for additional tools are more than welcome!

- Arnaldo
Mandriva Labs
http://www.mandriva.com


* Re: [ANNOUNCE] pahole and other DWARF2 utilities
  2006-10-30 21:33 [ANNOUNCE] pahole and other DWARF2 utilities Arnaldo Carvalho de Melo
@ 2006-10-31  4:33 ` Andrew Morton
  2006-10-31 16:05   ` Thiago Galesi
                     ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Andrew Morton @ 2006-10-31  4:33 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-kernel, lwn

On Mon, 30 Oct 2006 18:33:19 -0300
Arnaldo Carvalho de Melo <acme@mandriva.com> wrote:

> Hi,
> 
> 	I've been working on some DWARF2 utilities and thought that it
> is about time I announce it to the community, so that what is already
> available can be used by people interested in reducing structure sizes
> and otherwise taking advantage of the information available in the elf
> sections of files compiled with 'gcc -g' or in the case of the kernel
> with CONFIG_DEBUG_INFO enabled, so here it goes the description of said
> tools:
> 
> pahole: Poke-a-Hole is a tool to find out holes in structures, holes
> being defined as the space between members of structures due to alignment
> rules that could be used for new struct entries or to reorganize
> existing structures to reduce its size, without more ado lets see what
> that means:
> 
> ...
>
> 	Further ideas on how to use the DWARF2 information include tools
> that will show where inlines are being used, how much code is added by
> inline functions,

It would be quite useful to be able to identify inlined functions which are
good candidates for uninlining.



* Re: [ANNOUNCE] pahole and other DWARF2 utilities
  2006-10-31  4:33 ` Andrew Morton
@ 2006-10-31 16:05   ` Thiago Galesi
  2006-10-31 17:28     ` Arnaldo Carvalho de Melo
  2006-10-31 17:22   ` Arnaldo Carvalho de Melo
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Thiago Galesi @ 2006-10-31 16:05 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Arnaldo Carvalho de Melo, linux-kernel, lwn

> >       Further ideas on how to use the DWARF2 information include tools
> > that will show where inlines are being used, how much code is added by
> > inline functions,
>
> It would be quite useful to be able to identify inlined functions which are
> good candidates for uninlining.
>
> -

Arnaldo, can't we get a call count for functions? (yes, it is not a
run-time call count, but rather, how many times the function is called
in the code) I guess this would help with finding candidates for
inlining or uninlining.

-- 
-
Thiago Galesi


* Re: [ANNOUNCE] pahole and other DWARF2 utilities
  2006-10-31  4:33 ` Andrew Morton
  2006-10-31 16:05   ` Thiago Galesi
@ 2006-10-31 17:22   ` Arnaldo Carvalho de Melo
  2006-10-31 20:45     ` Arnaldo Carvalho de Melo
  2006-11-03 15:51   ` Arnaldo Carvalho de Melo
  2006-11-03 19:07   ` Arnaldo Carvalho de Melo
  3 siblings, 1 reply; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2006-10-31 17:22 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, lwn

On Mon, Oct 30, 2006 at 08:33:34PM -0800, Andrew Morton wrote:
> On Mon, 30 Oct 2006 18:33:19 -0300
> Arnaldo Carvalho de Melo <acme@mandriva.com> wrote:
> 
> > 	I've been working on some DWARF2 utilities and thought that it
> > is about time I announce it to the community, so that what is already
> > available can be used by people interested in reducing structure sizes
> > and otherwise taking advantage of the information available in the elf
> > sections of files compiled with 'gcc -g' or in the case of the kernel
> > with CONFIG_DEBUG_INFO enabled, so here it goes the description of said
> > tools:
> > 
> > pahole: Poke-a-Hole is a tool to find out holes in structures, holes
> > being defined as the space between members of structures due to alignment
> > rules that could be used for new struct entries or to reorganize
> > existing structures to reduce its size, without more ado lets see what
> > that means:
> > 
> > ...
> >
> > 	Further ideas on how to use the DWARF2 information include tools
> > that will show where inlines are being used, how much code is added by
> > inline functions,
> 
> It would be quite useful to be able to identify inlined functions which are
> good candidates for uninlining.

I'm working on making good use of this information:

--------------- 8< --------------

3.3.8.2 Concrete Inlined Instances

Each inline expansion of an inlinable subroutine is represented by a
debugging information entry with the tag DW_TAG_inlined_subroutine.
Each such entry should be a direct child of the entry that represents
the scope within which the inlining occurs.

--------------- 8< --------------

To write this tool:

<Ralf> So imagine a tool which says function x was inlined y times
bloating the code by z bytes :)

:-)

- Arnaldo


* Re: [ANNOUNCE] pahole and other DWARF2 utilities
  2006-10-31 16:05   ` Thiago Galesi
@ 2006-10-31 17:28     ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2006-10-31 17:28 UTC (permalink / raw)
  To: Thiago Galesi; +Cc: Andrew Morton, linux-kernel, lwn

On Tue, Oct 31, 2006 at 02:05:06PM -0200, Thiago Galesi wrote:
> >>       Further ideas on how to use the DWARF2 information include tools
> >> that will show where inlines are being used, how much code is added by
> >> inline functions,
> >
> >It would be quite useful to be able to identify inlined functions which are
> >good candidates for uninlining.
> >
> >-
> 
> Arnaldo, can't we get a call count for functions? (yes, it is not a
> run-time call count, but rather, how many times the function is called
> in the code) I guess this would help for this purpose of finding
> candidates for inlining, uninlining.

At least for inline expansions, yes. For normal function calls I have to
study the DWARF2 documentation more, but I guess it's feasible.

- Arnaldo


* Re: [ANNOUNCE] pahole and other DWARF2 utilities
  2006-10-31 17:22   ` Arnaldo Carvalho de Melo
@ 2006-10-31 20:45     ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2006-10-31 20:45 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, lwn

On Tue, Oct 31, 2006 at 02:22:37PM -0300, Arnaldo Carvalho de Melo wrote:
> On Mon, Oct 30, 2006 at 08:33:34PM -0800, Andrew Morton wrote:
> > On Mon, 30 Oct 2006 18:33:19 -0300
> > Arnaldo Carvalho de Melo <acme@mandriva.com> wrote:
> > 
> > > 	I've been working on some DWARF2 utilities and thought that it
> > > is about time I announce it to the community, so that what is already
> > > available can be used by people interested in reducing structure sizes
> > > and otherwise taking advantage of the information available in the elf
> > > sections of files compiled with 'gcc -g' or in the case of the kernel
> > > with CONFIG_DEBUG_INFO enabled, so here it goes the description of said
> > > tools:
> > > 
> > > pahole: Poke-a-Hole is a tool to find out holes in structures, holes
> > > > being defined as the space between members of structures due to alignment
> > > rules that could be used for new struct entries or to reorganize
> > > existing structures to reduce its size, without more ado lets see what
> > > that means:
> > > 
> > > ...
> > >
> > > 	Further ideas on how to use the DWARF2 information include tools
> > > that will show where inlines are being used, how much code is added by
> > > inline functions,
> > 
> > It would be quite useful to be able to identify inlined functions which are
> > good candidates for uninlining.

For now people can take a look at:

http://oops.merseine.nu:81/acme/net.ipv4.tcp.o.pahole

Where all the types in headers included from net/ipv4/tcp.c that have
holes can be seen, for instance:

/* /pub/scm/linux/kernel/git/acme/net-2.6/include/linux/dqblk_xfs.h:143 */
struct fs_quota_stat {
        __s8             qs_version;           /*     0     1 */

        /* XXX 1 bytes hole, try to pack */

        __u16            qs_flags;             /*     2     2 */
        __s8             qs_pad;               /*     4     1 */

        /* XXX 3 bytes hole, try to pack */

        fs_qfilestat_t   qs_uquota;            /*     8    20 */
        fs_qfilestat_t   qs_gquota;            /*    28    20 */
        __u32            qs_incoredqs;         /*    48     4 */
        __s32            qs_btimelimit;        /*    52     4 */
        __s32            qs_itimelimit;        /*    56     4 */
        __s32            qs_rtbtimelimit;      /*    60     4 */
        __u16            qs_bwarnlimit;        /*    64     2 */
        __u16            qs_iwarnlimit;        /*    66     2 */
}; /* size: 68, sum members: 64, holes: 2, sum holes: 4 */


	See? Two holes that can be combined, reducing the size of this
struct by 4 bytes, just by moving qs_pad so that it is defined right
before qs_flags. Many more holes are there to harvest :-)

	Of course, past mistakes in structs that are exported to
userspace have to be kept as they are, and in other cases the layout is
deliberate, for instance when members are grouped for cacheline locality.
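
The effect of moving qs_pad can be checked on a cut-down version of the
struct. This is a minimal sketch under stated assumptions: QuotaStatHead
and QuotaStatHeadReordered are hypothetical, shortened stand-ins for
struct fs_quota_stat (the fs_qfilestat_t members are left out and a __u32
follows qs_pad directly), and ctypes is used because it applies the native
C alignment rules.

```python
import ctypes


class QuotaStatHead(ctypes.Structure):
    # Hypothetical shortened struct, members in the original order.
    _fields_ = [
        ("qs_version", ctypes.c_int8),
        ("qs_flags", ctypes.c_uint16),
        ("qs_pad", ctypes.c_int8),
        ("qs_incoredqs", ctypes.c_uint32),
    ]


class QuotaStatHeadReordered(ctypes.Structure):
    # Same members, with qs_pad moved to sit right before qs_flags.
    _fields_ = [
        ("qs_version", ctypes.c_int8),
        ("qs_pad", ctypes.c_int8),
        ("qs_flags", ctypes.c_uint16),
        ("qs_incoredqs", ctypes.c_uint32),
    ]


def layout(struct_type):
    """Return [(member, offset, size), ...] in declaration order."""
    rows = []
    for name, _ctype in struct_type._fields_:
        field = getattr(struct_type, name)
        rows.append((name, field.offset, field.size))
    return rows


for struct_type in (QuotaStatHead, QuotaStatHeadReordered):
    print(struct_type.__name__, "size:", ctypes.sizeof(struct_type))
    for name, offset, size in layout(struct_type):
        print(f"    {name:14} /* {offset:5} {size:5} */")
```

With the usual 2-byte alignment for 16-bit and 4-byte alignment for 32-bit
integers, the first layout has a 1-byte and a 3-byte hole, and the
reordered one has none, so the size drops by the same 4 bytes as in the
full struct.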

- Arnaldo


* Re: [ANNOUNCE] pahole and other DWARF2 utilities
  2006-10-31  4:33 ` Andrew Morton
  2006-10-31 16:05   ` Thiago Galesi
  2006-10-31 17:22   ` Arnaldo Carvalho de Melo
@ 2006-11-03 15:51   ` Arnaldo Carvalho de Melo
  2006-11-03 19:07   ` Arnaldo Carvalho de Melo
  3 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2006-11-03 15:51 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, lwn

On Mon, Oct 30, 2006 at 08:33:34PM -0800, Andrew Morton wrote:
> On Mon, 30 Oct 2006 18:33:19 -0300
> Arnaldo Carvalho de Melo <acme@mandriva.com> wrote:
> 
> > Hi,
> > 
> > 	I've been working on some DWARF2 utilities and thought that it
> > is about time I announce it to the community, so that what is already
> > available can be used by people interested in reducing structure sizes
> > and otherwise taking advantage of the information available in the elf
> > sections of files compiled with 'gcc -g' or in the case of the kernel
> > with CONFIG_DEBUG_INFO enabled, so here it goes the description of said
> > tools:
> > 
> > pahole: Poke-a-Hole is a tool to find out holes in structures, holes
> > being defined as the space between members of structures due to alignment
> > rules that could be used for new struct entries or to reorganize
> > existing structures to reduce its size, without more ado lets see what
> > that means:
> > 
> > ...
> >
> > 	Further ideas on how to use the DWARF2 information include tools
> > that will show where inlines are being used, how much code is added by
> > inline functions,
> 
> It would be quite useful to be able to identify inlined functions which are
> good candidates for uninlining.

Getting there, the next step is per-CU (Compilation Unit, i.e. .o file)
inlining stats :-)

Ah, the sizes are different because sometimes just some parts of inline
functions are "sourced", as indicated by the DW_AT_ranges DWARF
attribute.

Repo continues at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/pahole.git

Another suggestion was for a stack hole finding tool, similar to what
pahole does for structs :-)

Another example, this time for schedule():

http://oops.merseine.nu:81/acme/schedule.inlines.txt

Regards,

- Arnaldo

commit a42afe1acffc5e57ab504c008b8b75c124bf07de
Author: Arnaldo Carvalho de Melo <acme@mandriva.com>
Date:   Fri Nov 3 12:41:19 2006 -0300

    [CLASSES]: Add support for DW_TAG_inlined_subroutine

    Output of pfunct using this information (all for a make allyesconfig build):

    Top 5 functions by size of inlined functions in net/ipv4:

    [acme@newtoy guinea_pig-2.6]$ pfunct -I net/ipv4/built-in.o | sort -k3 -nr | head -5
    ip_route_input: 19 7086
    tcp_ack: 33 6415
    do_ip_vs_set_ctl: 23 4193
    q931_help: 8 3822
    ip_defrag: 19 3318
    [acme@newtoy guinea_pig-2.6]$

    And by number of inline expansions:

    [acme@newtoy guinea_pig-2.6]$ pfunct -I net/ipv4/built-in.o | sort -k2 -nr | head -5
    dump_packet: 35 905
    tcp_v4_rcv: 34 1773
    tcp_recvmsg: 34 928
    tcp_ack: 33 6415
    tcp_rcv_established: 31 1195
    [acme@newtoy guinea_pig-2.6]$

    And the list of expansions on a specific function:

    [acme@newtoy guinea_pig-2.6]$ pfunct -i net/ipv4/built-in.o tcp_v4_rcv
    /* net/ipv4/tcp_ipv4.c:1054 */
    int tcp_v4_rcv(struct sk_buff * skb);
    /* size: 2189, variables: 8, goto labels: 6, inline expansions: 34 (1773 bytes) */

    /* inline expansions in tcp_v4_rcv:
    current_thread_info: 8
    pskb_may_pull: 36
    pskb_may_pull: 29
    tcp_v4_checksum_init: 139
    __fswab32: 2
    __fswab32: 2
    inet_iif: 12
    __inet_lookup: 292
    __fswab16: 20
    inet_ehashfn: 25
    inet_ehash_bucket: 18
    prefetch: 4
    prefetch: 4
    prefetch: 4
    sock_hold: 4
    xfrm4_policy_check: 59
    nf_reset: 66
    sk_filter: 135
    __skb_trim: 20
    get_softnet_dma: 68
    tcp_prequeue: 257
    sk_add_backlog: 40
    sock_put: 27
    xfrm4_policy_check: 46
    tcp_checksum_complete: 29
    current_thread_info: 8
    sock_put: 20
    xfrm4_policy_check: 50
    tcp_checksum_complete: 29
    current_thread_info: 8
    inet_iif: 9
    inet_lookup_listener: 36
    inet_twsk_put: 114
    tcp_v4_timewait_ack: 153
    */
    [acme@newtoy guinea_pig-2.6]$

    Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>


* Re: [ANNOUNCE] pahole and other DWARF2 utilities
  2006-10-31  4:33 ` Andrew Morton
                     ` (2 preceding siblings ...)
  2006-11-03 15:51   ` Arnaldo Carvalho de Melo
@ 2006-11-03 19:07   ` Arnaldo Carvalho de Melo
  2006-11-04 21:03     ` Top 100 inline functions (make allyesconfig) was " Arnaldo Carvalho de Melo
  3 siblings, 1 reply; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2006-11-03 19:07 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, lwn

On Mon, Oct 30, 2006 at 08:33:34PM -0800, Andrew Morton wrote:
> On Mon, 30 Oct 2006 18:33:19 -0300
> Arnaldo Carvalho de Melo <acme@mandriva.com> wrote:
> 
> > 	Further ideas on how to use the DWARF2 information include tools
> > that will show where inlines are being used, how much code is added by
> > inline functions,
> 
> It would be quite useful to be able to identify inlined functions which are
> good candidates for uninlining.

Top 50 inline functions expanded more than once, by sum of their
expansions, in a vmlinux file built for qemu (most things are modules).
Columns are (inline function name, number of times it was expanded, sum in
bytes of its expansions, number of source files where expansions occurred):

[acme@newtoy guinea_pig-2.6]$ pfunct --total_inline_stats \
    ../../acme/OUTPUT/qemu/net-2.6/vmlinux | grep -v ': 1 ' | sort -k3 -nr | \
    head -50

get_current                        676   5732 155
xfrm_selector_match                  6   4778   2
__memcpy                           177   4326  89
kmalloc                            185   3991 119
__constant_c_memset                113   3556  69
__constant_c_and_count_memset      225   3161 156
prefetch                           333   2915 101
__ext3_journal_dirty_metadata       44   2810   6
skb_put                             34   2650  27
module_put                          80   2613  42
strcmp                             108   2506  49
__ext3_journal_get_write_access     41   2482   6
down                                57   2253  19
__fswab16                           96   2172  33
dst_release                         34   2130  23
list_add_tail                       88   2030  67
kzalloc                             89   2007  76
__constant_memcpy                  146   1930 118
tcp_done                             8   1918   4
brelse                             128   1897  16
__nlmsg_put                         21   1856  13
INIT_LIST_HEAD                     226   1848  88
pci_read_config_byte                54   1802   9
list_del_init                      103   1782  39
ip_rt_put                           27   1692  12
pci_read_config_word                50   1675  11
strlen                             108   1671  64
__xfrm6_selector_match               3   1615   2
__skb_trim                          25   1604  21
do_follow_link                       2   1543   1
strncmp                             48   1533  22
__xfrm4_selector_match               6   1525   2
outb_p                             136   1518   9
tcp_set_state                       14   1501   5
find_group_orlov                     2   1456   2
inet_twsk_put                       16   1448   5
pci_write_config_byte               38   1433  10
up                                  68   1372  19
pci_read_config_dword               42   1357  12
raw_local_irq_restore              366   1292  88
skb_tailroom                        62   1239  23
set_bit                            155   1232  68
put_task_struct                     53   1227  11
print_irq_desc                       2   1206   1
skb_trim                            14   1192  13
__do_follow_link                     2   1190   1
nf_hook_thresh                      16   1164   8
dget                                47   1147  19
__raw_local_irq_save               314   1145  85
__fswab32                          130   1117  28

- Arnaldo


* Top 100 inline functions (make allyesconfig) was Re: [ANNOUNCE] pahole and other DWARF2 utilities
  2006-11-03 19:07   ` Arnaldo Carvalho de Melo
@ 2006-11-04 21:03     ` Arnaldo Carvalho de Melo
  2006-11-05  6:30       ` Adrian Bunk
  0 siblings, 1 reply; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2006-11-04 21:03 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, lwn

On Fri, Nov 03, 2006 at 04:07:29PM -0300, Arnaldo Carvalho de Melo wrote:
> On Mon, Oct 30, 2006 at 08:33:34PM -0800, Andrew Morton wrote:
> > On Mon, 30 Oct 2006 18:33:19 -0300
> > Arnaldo Carvalho de Melo <acme@mandriva.com> wrote:
> > 
> > > 	Further ideas on how to use the DWARF2 information include tools
> > > that will show where inlines are being used, how much code is added by
> > > inline functions,
> > 
> > It would be quite useful to be able to identify inlined functions which are
> > good candidates for uninlining.
> 
> Top 50 inline functions expanded more than once by sum of its expansions
> in a vmlinux file built for qemu, most things are modules, columns are
> (inline function name, number of times it was expanded, sum in bytes of
> its expansions, number of source files where expansions occurred):
> 
> [acme@newtoy guinea_pig-2.6]$ pfunct --total_inline_stats
> ../../acme/OUTPUT/qemu/net-2.6/vmlinux | grep -v ': 1 ' | sort -k3 -nr |
> head -50
> 
> get_current                        676   5732 155

Ok, this time for a 'make allyesconfig' build, top 100. For the list of
all 6021 inline functions that were expanded more than once in this 281
MB vmlinux image, download the 93 KB file at:

http://oops.merseine.nu:81/acme/vmlinux.allyesconfig.inlines.txt.gz

totsz = Total size of all expansions for this inline function
nrexp = Number of times this function was expanded (inlined)
avgsz = Average expansion size
nrsrc = Number of source files where this function was expanded
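
How the four columns relate can be sketched in a few lines. This is an
illustration only, not pfunct's implementation: inline_stats is a
hypothetical helper, the sample rows are a handful of expansions from the
tcp_v4_rcv listing earlier in the thread (all in net/ipv4/tcp_ipv4.c), and
the truncating division for avgsz is an assumption that happens to agree
with the table (for instance 65077 / 8180 gives 7).

```python
from collections import defaultdict

# (inline function, expansion size in bytes, source file of the expansion)
expansions = [
    ("current_thread_info", 8, "net/ipv4/tcp_ipv4.c"),
    ("pskb_may_pull", 36, "net/ipv4/tcp_ipv4.c"),
    ("pskb_may_pull", 29, "net/ipv4/tcp_ipv4.c"),
    ("__fswab32", 2, "net/ipv4/tcp_ipv4.c"),
    ("__fswab32", 2, "net/ipv4/tcp_ipv4.c"),
    ("prefetch", 4, "net/ipv4/tcp_ipv4.c"),
    ("prefetch", 4, "net/ipv4/tcp_ipv4.c"),
    ("prefetch", 4, "net/ipv4/tcp_ipv4.c"),
]


def inline_stats(rows):
    """Return [(name, totsz, nrexp, avgsz, nrsrc), ...] sorted by totsz."""
    totsz = defaultdict(int)
    nrexp = defaultdict(int)
    sources = defaultdict(set)
    for name, size, source in rows:
        totsz[name] += size
        nrexp[name] += 1
        sources[name].add(source)
    stats = [
        (name, totsz[name], nrexp[name], totsz[name] // nrexp[name],
         len(sources[name]))
        for name in totsz
    ]
    return sorted(stats, key=lambda row: row[1], reverse=True)


print(f"{'name':24} {'totsz':>6} {'nrexp':>6} {'avgsz':>6} {'nrsrc':>6}")
for name, tot, nr, avg, nsrc in inline_stats(expansions):
    print(f"{name:24} {tot:6} {nr:6} {avg:6} {nsrc:6}")
```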

Some cases are bogus due to namespace collisions. I'll work on mangling
the function name with the file where it was defined, but most should
be ok.

- Arnaldo

        name                             totsz / nrexp = avgsz  nrsrc
        -------------------------------------------------------------
     1	outb                             65077    8180       7    505
     2	__fswab16                        58827    2313      25    459
     3	__memcpy                         54640    2241      24   1141
     4	writel                           49011    5989       8    364
     5	__constant_c_and_count_memset    42967    3163      13   1708
     6	skb_put                          41512     741      56    416
     7	kmalloc                          38684    2345      16   1366
     8	__constant_memcpy                35716    2536      14   1633
     9	get_current                      31233    4058       7    886
    10	cfi_build_cmd                    28439     131     217      7
    11	strcmp                           23324    1008      23    329
    12	kzalloc                          22413    1326      16   1014
    13	current_thread_info              21816    2815       7   1319
    14	readl                            21575    3953       5    363
    15	__constant_c_memset              21014     732      28    578
    16	strcpy                           20681    1420      14    611
    17	__fswab32                        19797    3566       5    442
    18	init_hw                          18441       7    2634     12
    19	strncmp                          18199     596      30    212
    20	writeb                           17825    2611       6    205
    21	INIT_LIST_HEAD                   15476    1713       9    746
    22	__OUTPLL                         15399     125     123      4
    23	inb                              15174    3246       4    499
    24	snd_echo_create                  15098       5    3019      5
    25	NCR5380_information_transfer     14699       6    2449      6
    26	__INPLL                          14541     117     124      4
    27	outw                             14475    1437      10    135
    28	up                               14467     796      18    189
    29	down                             13710     396      34    152
    30	do_write_buffer                  13674       3    4558      3
    31	pci_write_config_byte            13069     527      24    176
    32	outb_p                           12659    1241      10     97
    33	strlen                           12658     931      13    435
    34	load_firmware                    12616       7    1802     13
    35	pci_read_config_byte             12369     560      22    218
    36	cfi_send_gen_cmd                 11559      41     281      5
    37	module_put                       11462     265      43    203
    38	skb_push                         11297     259      43    182
    39	set_bit                          11160    1296       8    676
    40	readb                            11015    1728       6    219
    41	radeon_pll_errata_after_data     10855     127      85      4
    42	skb_pull                         10812     303      35    187
    43	outl                             10718    1151       9    128
    44	netif_wake_queue                 10355     348      29    173
    45	add_timer                         9896     390      25    252
    46	pci_free_consistent               9834     376      26    130
    47	clear_bit                         9654     962      10    600
    48	list_add_tail                     9570     712      13    434
    49	__fswab64                         9553     521      18     97
    50	ahd_outb                          9538     295      32      4
    51	hscx_int_main                     9526      12     793     12
    52	prefetch                          9467    1764       5    720
    53	pci_read_config_dword             9247     425      21    163
    54	pci_write_config_dword            9116     402      22    140
    55	dev_alloc_skb                     9072     221      41    209
    56	constant_test_bit                 8764    1955       4   1034
    57	dev_kfree_skb_irq                 8617      98      87    104
    58	writew                            8569    1148       7    128
    59	ahc_outb                          8426     261      32      6
    60	netif_stop_queue                  8364     495      16    187
    61	brelse                            8354     822      10    155
    62	skb_reserve                       8187     444      18    320
    63	ahc_inb                           8182     303      27      6
    64	pci_read_config_word              8174     359      22    178
    65	skb_trim                          8153     112      72    101
    66	i_size_read                       8071     181      44     67
    67	list_del_init                     7916     461      17    210
    68	pci_map_single                    7756     186      41    103
    69	jedec_reset                       7689       6    1281      1
    70	WriteHSCXCMDR                     7576      78      97     17
    71	frontend_init                     7352       7    1050      7
    72	le_key_k_type                     7302      60     121     12
    73	pci_write_config_word             7123     284      25    137
    74	dst_release                       7059     125      56     71
    75	ahd_set_modes                     6930      55     126      4
    76	strncpy                           6651     296      22    153
    77	usb_serial_debug_data             6380      61     104     25
    78	pci_alloc_consistent              6352     214      29    129
    79	atomic_inc                        6235    1022       6    683
    80	readw                             6170    1001       6    122
    81	block_til_ready                   6146      10     614     10
    82	load_dsp                          6141       5    1228     12
    83	test_and_set_bit                  6096     605      10    394
    84	try_module_get                    6065     176      34    151
    85	input_report_key                  5939     208      28     65
    86	load_module                       5933       2    2966      2
    87	usb_fill_bulk_urb                 5846     114      51     62
    88	skb_queue_head_init               5783     151      38    110
    89	ahd_inb                           5739     208      27      4
    90	device_init                       5689       2    2844      2
    91	sctp_add_cmd_sf                   5665     197      28      2
    92	pci_set_drvdata                   5595     496      11    268
    93	skb_tailroom                      5432     313      17    117
    94	port_detect                       5419       2    2709      2
    95	dequeue_rx                        5339       3    1779      3
    96	dev_to_shost                      5338     167      31     22
    97	skb_header_pointer                5289     120      44     53
    98	prism2_init_local_data            5226       3    1742      3
    99	sb_bread                          5216     164      31     87
   100	strchr                            5171     195      26    115


* Re: Top 100 inline functions (make allyesconfig) was Re: [ANNOUNCE] pahole and other DWARF2 utilities
  2006-11-04 21:03     ` Top 100 inline functions (make allyesconfig) was " Arnaldo Carvalho de Melo
@ 2006-11-05  6:30       ` Adrian Bunk
  2006-11-05 16:42         ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 11+ messages in thread
From: Adrian Bunk @ 2006-11-05  6:30 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Andrew Morton, linux-kernel, lwn

On Sat, Nov 04, 2006 at 06:03:32PM -0300, Arnaldo Carvalho de Melo wrote:
> On Fri, Nov 03, 2006 at 04:07:29PM -0300, Arnaldo Carvalho de Melo wrote:
> > On Mon, Oct 30, 2006 at 08:33:34PM -0800, Andrew Morton wrote:
> > > On Mon, 30 Oct 2006 18:33:19 -0300
> > > Arnaldo Carvalho de Melo <acme@mandriva.com> wrote:
> > > 
> > > > 	Further ideas on how to use the DWARF2 information include tools
> > > > that will show where inlines are being used, how much code is added by
> > > > inline functions,
> > > 
> > > It would be quite useful to be able to identify inlined functions which are
> > > good candidates for uninlining.
> > 
> > Top 50 inline functions expanded more than once by sum of its expansions
> > in a vmlinux file built for qemu, most things are modules, columns are
> > (inline function name, number of times it was expanded, sum in bytes of
> > its expansions, number of source files where expansions occurred):
> > 
> > [acme@newtoy guinea_pig-2.6]$ pfunct --total_inline_stats
> > ../../acme/OUTPUT/qemu/net-2.6/vmlinux | grep -v ': 1 ' | sort -k3 -nr |
> > head -50
> > 
> > get_current                        676   5732 155
> 
> Ok, this time for a 'make allyesconfig' build, top 100, for the list of
> all 6021 inline functions that were expanded more than once in this 281
> MB vmlinux image download the 93 KB files at:
>...

Thanks, this is interesting data.

One thing you could do to improve the result:

allyesconfig turns on all debugging options, and there might be functions 
that are significantly larger due to this fact.

Unsetting *DEBUG* options in the .config might bring a better focus 
on the real-world problems.
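
A minimal sketch of what unsetting those options could look like,
assuming the usual .config syntax ("CONFIG_FOO=y" when set,
"# CONFIG_FOO is not set" when unset). The function name and the KEEP
set are hypothetical. CONFIG_DEBUG_INFO is kept on the assumption that
the DWARF2 sections are still wanted for the tools. Matching on the
name alone is crude: it will also catch options that merely contain
"DEBUG", so the result should be reviewed and then passed through
'make oldconfig'.

```python
import re

# Assumption: keep debug info enabled so the DWARF2 sections are emitted.
KEEP = {"CONFIG_DEBUG_INFO"}

# Only boolean/tristate options that are currently enabled (=y or =m).
_OPTION = re.compile(r"^(CONFIG_\w*DEBUG\w*)=(?:y|m)$")


def unset_debug_options(config_text: str, keep=KEEP) -> str:
    """Return config_text with enabled *DEBUG* options marked as not set."""
    out = []
    for line in config_text.splitlines():
        m = _OPTION.match(line)
        if m and m.group(1) not in keep:
            out.append(f"# {m.group(1)} is not set")
        else:
            out.append(line)
    return "\n".join(out) + "\n"


sample = (
    "CONFIG_DEBUG_KERNEL=y\n"
    "CONFIG_DEBUG_INFO=y\n"
    "CONFIG_NET=y\n"
    "# CONFIG_USB_DEBUG is not set\n"
)
print(unset_debug_options(sample), end="")
```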

> - Arnaldo
>...

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed



* Re: Top 100 inline functions (make allyesconfig) was Re: [ANNOUNCE] pahole and other DWARF2 utilities
  2006-11-05  6:30       ` Adrian Bunk
@ 2006-11-05 16:42         ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2006-11-05 16:42 UTC (permalink / raw)
  To: Adrian Bunk; +Cc: Andrew Morton, linux-kernel, lwn

On Sun, Nov 05, 2006 at 07:30:37AM +0100, Adrian Bunk wrote:
> On Sat, Nov 04, 2006 at 06:03:32PM -0300, Arnaldo Carvalho de Melo wrote:
> > On Fri, Nov 03, 2006 at 04:07:29PM -0300, Arnaldo Carvalho de Melo wrote:
> > > On Mon, Oct 30, 2006 at 08:33:34PM -0800, Andrew Morton wrote:
> > > > On Mon, 30 Oct 2006 18:33:19 -0300
> > > > Arnaldo Carvalho de Melo <acme@mandriva.com> wrote:
> > > > 
> > > > > 	Further ideas on how to use the DWARF2 information include tools
> > > > > that will show where inlines are being used, how much code is added by
> > > > > inline functions,
> > > > 
> > > > It would be quite useful to be able to identify inlined functions which are
> > > > good candidates for uninlining.
> > > 
> > > Top 50 inline functions expanded more than once by sum of its expansions
> > > in a vmlinux file built for qemu, most things are modules, columns are
> > > (inline function name, number of times it was expanded, sum in bytes of
> > > its expansions, number of source files where expansions occurred):
> > > 
> > > [acme@newtoy guinea_pig-2.6]$ pfunct --total_inline_stats
> > > ../../acme/OUTPUT/qemu/net-2.6/vmlinux | grep -v ': 1 ' | sort -k3 -nr |
> > > head -50
> > > 
> > > get_current                        676   5732 155
> > 
> > Ok, this time for a 'make allyesconfig' build, top 100, for the list of
> > all 6021 inline functions that were expanded more than once in this 281
> > MB vmlinux image download the 93 KB files at:
> >...
> 
> Thanks, this is interesting data.
> 
> One thing you could do to improve the result:
> 
> allyesconfig turns on all debugging options, and there might be functions 
> that are significantly larger due to this fact.
> 
> Unsetting *DEBUG* options in the .config might bring a better focus 
> on the real-world problems.

Sure thing. I did it with allyesconfig to see if the tools were able to
handle that much data. It's not perfect, far from it, but it works on my
notebook :-) Nevertheless, it's already a data point for lots of
interesting cases.

One thing I'll do is get the debug rpms from, say, Mandriva, Fedora,
etc. and use them as more down-to-earth guinea pigs. For that I'll add
support for multiple files, not just for a single ELF file containing
multiple objects. Using the config files shipped by major distros is
also on my TODO list, of course enabling the extra config options needed
to get the DWARF2 ELF sections the tools rely on. These sections don't
affect the binary; they are just extra ELF sections that the 'strip(1)'
tool loves :-)

Stay tuned,

- Arnaldo


