From: William Lee Irwin III <wli@holomorphy.com>
To: Denis Vlasenko <vda@port.imtp.ilyichevsk.odessa.ua>
Cc: linux-kernel@vger.kernel.org, Alan Cox <alan@lxorguk.ukuu.org.uk>,
Larry McVoy <lm@bitmover.com>
Subject: Re: top hogs CPU in 2.6: kallsyms_lookup is very slow
Date: Fri, 17 Sep 2004 06:33:30 -0700 [thread overview]
Message-ID: <20040917133330.GR9106@holomorphy.com> (raw)
In-Reply-To: <200409171532.58898.vda@port.imtp.ilyichevsk.odessa.ua>
On Friday 17 September 2004 14:03, William Lee Irwin III wrote:
>> To address this in a meaningful way, we're going to have to get some
>> profiling data. The built-in kernel profiler should suffice, though you
>> may want to run the test for a longer, fixed period of time (I
>> recommend making the test run as long as 60s and recording the number
>> of operations completed). Also, please snapshot the profile state with
>> readprofile(1) immediately before and after the microbenchmark runs on
>> both kernels. This should only require booting into the kernels you've
>> already built with an additional commandline parameter, e.g. profile=2.
On Fri, Sep 17, 2004 at 03:34:59PM +0300, Denis Vlasenko wrote:
> I have profile=2. Profiles were collected with this script:
> ./openclose100000
> readprofile -m System.map -r
> ./openclose100000
> ./openclose100000
> ./openclose100000
> ./openclose100000
> readprofile -m System.map | sort -r
Excellent!
On Fri, Sep 17, 2004 at 03:34:59PM +0300, Denis Vlasenko wrote:
> 2.4:
> 2994 total 0.0013
> 620 link_path_walk 0.2892
> 285 d_lookup 1.2076
> 156 dput 0.4815
> 118 kmem_cache_free 0.7564
> 109 system_call 1.9464
> 106 kmem_cache_alloc 0.5096
> 103 strncpy_from_user 1.2875
[...]
> 2.6:
> 3200 total 0.0013
> 790 link_path_walk 0.2759
> 287 __d_lookup 1.2756
> 277 get_empty_filp 1.4503
> 146 strncpy_from_user 1.8250
> 110 dput 0.3254
> 109 system_call 2.4773
Very odd that get_empty_filp() and strncpy_from_user() should be so
high on your profiles. They're not tremendously different between 2.4
and 2.6. It may be the case that 2.6 makes some inappropriate choice of
algorithms for usercopying on systems of your vintage. get_empty_filp()
is more mysterious still.
What was the relative performance of 2.4 vs. 2.6? e.g. 2.6 completed
some percentage of the number of operations as 2.4 in the given time?
Additive differential profile follows. I wonder if certain symbols'
hits should be folded together for a proper comparison.
8.9688% __d_lookup
6.8526% get_empty_filp
3.9794% link_path_walk
2.8750% do_lookup
2.8621% find_trylock_page
1.8750% may_open
1.7188% follow_mount
1.6875% __fput
1.5216% path_lookup
1.2812% find_next_zero_bit
1.1223% strncpy_from_user
0.9117% locks_remove_flock
0.9042% dentry_open
0.8750% file_ra_state_init
0.8125% page_put_link
0.7188% syscall_exit
0.5271% read_cache_page
0.4310% fs_may_remount_ro
0.4279% filp_close
0.3794% filp_open
0.3750% page_follow_link_light
0.3125% file_kill
0.2812% zap_pte_range
0.2812% eventpoll_init_file
0.1875% eventpoll_release_file
0.1617% locks_remove_posix
0.1250% syscall_call
0.1250% __copy_to_user_ll
0.0938% detach_mnt
0.0873% clear_user
0.0625% read_cache_pages
0.0625% page_remove_rmap
0.0464% file_move
0.0312% zap_pmd_range
0.0312% unmap_vmas
0.0312% pte_alloc_one
0.0312% process_backlog
0.0312% free_page_and_swap_cache
0.0312% free_hot_cold_page
0.0312% find_get_page
0.0312% dst_alloc
0.0312% do_no_page
0.0312% do_anonymous_page
0.0312% __up_write
0.0140% write_profile
0.0000% ret_from_sys_call
0.0000% path_walk
0.0000% path_init
0.0000% page_follow_link
0.0000% inet_rtm_newrule
0.0000% fib_lookup
0.0000% do_truncate
0.0000% d_lookup
0.0000% copy_page_range
0.0000% check_mnt
0.0000% cached_lookup
0.0000% __generic_copy_to_user
0.0000% __free_pages
0.0000% __find_get_page
0.0000% __constant_c_and_count_memset
-0.0065% do_wp_page
-0.0334% inet_rtm_newrule
-0.0334% fib_lookup
-0.0334% copy_page_range
-0.0829% update_atime
-0.0958% dnotify_flush
-0.1002% check_mnt
-0.1065% getname
-0.1336% do_truncate
-0.2004% path_walk
-0.2344% system_call
-0.3674% __generic_copy_to_user
-0.3759% sys_open
-0.4008% ret_from_sys_call
-0.5495% mark_page_accessed
-0.5645% path_release
-0.6280% page_getlink
-0.8142% get_unused_fd
-0.8350% __find_get_page
-0.8684% __free_pages
-0.9177% sys_close
-0.9286% fd_install
-1.5191% lookup_mnt
-1.5890% vfs_permission
-1.7729% dput
-1.8336% open_namei
-1.8842% kmem_cache_alloc
-1.9361% permission
-2.1376% cached_lookup
-2.2850% kmem_cache_free
-2.4048% path_init
-2.6386% page_follow_link
-2.6386% __constant_c_and_count_memset
-2.8228% fput
-9.5190% d_lookup
-- wli
next prev parent reply other threads:[~2004-09-17 13:33 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2004-09-16 11:28 top hogs CPU in 2.6: kallsyms_lookup is very slow Denis Vlasenko
2004-09-16 11:45 ` William Lee Irwin III
2004-09-16 11:57 ` Denis Vlasenko
2004-09-16 12:17 ` William Lee Irwin III
2004-09-16 12:31 ` Denis Vlasenko
2004-09-17 8:57 ` Denis Vlasenko
2004-09-17 11:03 ` William Lee Irwin III
2004-09-17 12:34 ` Denis Vlasenko
2004-09-17 13:33 ` William Lee Irwin III [this message]
2004-09-17 20:36 ` Denis Vlasenko
2004-09-17 21:32 ` William Lee Irwin III
[not found] ` <200409172155.29561.vda@port.imtp.ilyichevsk.odessa.ua>
2004-09-18 13:57 ` [PATCH] trivial patch for 2.4: always inline __constant_* Denis Vlasenko
2004-09-18 14:03 ` [PATCH] trivial patch for 2.4: resolve megaraid_info name collision Denis Vlasenko
2004-09-18 14:02 ` Marcelo Tosatti
2004-09-18 15:05 ` [PATCH] trivial patch for 2.4: always inline __constant_* Marcelo Tosatti
2004-09-17 9:01 ` top hogs CPU in 2.6: kallsyms_lookup is very slow Denis Vlasenko
2004-09-17 10:55 ` William Lee Irwin III
2004-09-17 11:27 ` Paulo Marques
2004-09-17 11:21 ` Denis Vlasenko
2004-09-17 12:10 ` William Lee Irwin III
2004-09-17 13:04 ` Denis Vlasenko
2004-09-17 13:55 ` Larry McVoy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20040917133330.GR9106@holomorphy.com \
--to=wli@holomorphy.com \
--cc=alan@lxorguk.ukuu.org.uk \
--cc=linux-kernel@vger.kernel.org \
--cc=lm@bitmover.com \
--cc=vda@port.imtp.ilyichevsk.odessa.ua \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox