* Re: [Linux-ia64] What's taking all the system time..?
2002-01-30 5:48 [Linux-ia64] What's taking all the system time..? Duraid Madina
@ 2002-01-30 6:21 ` Stephane Eranian
2002-01-30 6:26 ` Niels Christiansen
` (13 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: Stephane Eranian @ 2002-01-30 6:21 UTC (permalink / raw)
To: linux-ia64
Duraid,
>
> I've got a big (production) code that depends on a simple 'graph'
> library in C++. To try it out on Itanium, I stripped it down and whipped up
> a tiny 'benchmark' of the graph library. When I tried this on an Itanium
> system here (2.4.9-SMP, gcc/g++ 2.96/3.0.3 and intel c++ 6.0) the code runs
> for about 30 seconds, 20 of which is reported as 'system' time. The thing
> is, this is (supposed to be) a CPU-bound job: on my home x86 system, it runs
> in about 3 seconds, and with negligible system time.
>
> 'top' and 'time' show the gobs of system time being used, but
> 'strace' picks up nothing out of the ordinary (0.014 seconds system time,
> dominated by calls to brk(), which is expected).
>
Just out of curiosity, have you looked at the system log in /var/log/messages?
I am guessing your program does some floating-point operations.
Have you seen something about floating-point software assist? Another
potential cause could be unaligned accesses. That you can also see
in the system log.
--
-Stephane
* Re: [Linux-ia64] What's taking all the system time..?
From: Niels Christiansen @ 2002-01-30 6:26 UTC (permalink / raw)
To: linux-ia64
> You can grab the source (18kB) here:
>
> http://www.idesign.fl.net.au/why_systime/why_systime.tar.gz
>
> just 'make' (be sure 'gcc' and 'g++' point to a recent enough
> version of gcc/g++ - preferably 3.0 or later) and then run 'a.out'.
I tried using gcc 3.0.2 on a RedHat 7.2 4-way with 2.4.16 kernel
but got errors as shown below. I have a profiling tool (currently
internal use only) which I intended to use but I need the make to
succeed first...
You say you run on an SMP. How many processors?
... Niels
-------------------------------------------------------------------
[nchris@tollway why_systime]$ make
gcc -c mt19937b-int.c
g++ Graph.cpp bench.cpp mt19937b-int.o
In file included from BlockTree.cpp:17,
from Graph.h:7,
from Graph.cpp:2:
BlockTree.h:121:8: warning: extra tokens at end of #endif directive
/tmp/ccC11dFe.s: Assembler messages:
/tmp/ccC11dFe.s:15538: Error: Unknown opcode `data16.ua @iplt
(_ZN5basic13ResourceCacheI4Edge4DCSOE10AddNewPageEv#)'
/tmp/ccC11dFe.s:15539: Error: Unknown opcode `data16.ua @iplt
(_ZN5basic13ResourceCacheI4Edge4DCSOE10RemovePageEPNS_17ResourceCachePageE#)'
/tmp/ccC11dFe.s:15548: Error: Unknown opcode `data16.ua @iplt
(_ZN5basic13ResourceCacheI6Vertex4DCSOE10AddNewPageEv#)'
/tmp/ccC11dFe.s:15549: Error: Unknown opcode `data16.ua @iplt
(_ZN5basic13ResourceCacheI6Vertex4DCSOE10RemovePageEPNS_17ResourceCachePageE#)'
/tmp/ccC11dFe.s:15558: Error: Unknown opcode `data16.ua @iplt
(_ZN5basic11MemoryCacheI4DCSOE10AddNewPageEv#)'
/tmp/ccC11dFe.s:15559: Error: Unknown opcode `data16.ua @iplt
(_ZN5basic11MemoryCacheI4DCSOE10RemovePageEPNS_17ResourceCachePageE#)'
In file included from BlockTree.cpp:17,
from Graph.h:7,
from bench.cpp:3:
BlockTree.h:121:8: warning: extra tokens at end of #endif directive
/tmp/ccC11dFe.s: Assembler messages:
/tmp/ccC11dFe.s:3233: Error: Unknown opcode `data16.ua @iplt
(_ZN5basic13ResourceCacheI4Edge4DCSOE10AddNewPageEv#)'
/tmp/ccC11dFe.s:3234: Error: Unknown opcode `data16.ua @iplt
(_ZN5basic13ResourceCacheI4Edge4DCSOE10RemovePageEPNS_17ResourceCachePageE#)'
/tmp/ccC11dFe.s:3243: Error: Unknown opcode `data16.ua @iplt
(_ZN5basic13ResourceCacheI6Vertex4DCSOE10AddNewPageEv#)'
/tmp/ccC11dFe.s:3244: Error: Unknown opcode `data16.ua @iplt
(_ZN5basic13ResourceCacheI6Vertex4DCSOE10RemovePageEPNS_17ResourceCachePageE#)'
make: *** [default] Error 1
[nchris@tollway why_systime]$ gcc -v
Reading specs from /usr/lib/gcc-lib/ia64-unknown-linux/3.0.2/specs
Configured with: /usr/src/gcc-3.0.2/configure --prefix /usr --mandir
/usr/share/man
Thread model: single
gcc version 3.0.2
[nchris@tollway why_systime]$ g++ -v
Reading specs from /usr/lib/gcc-lib/ia64-unknown-linux/3.0.2/specs
Configured with: /usr/src/gcc-3.0.2/configure --prefix /usr --mandir
/usr/share/man
Thread model: single
gcc version 3.0.2
[nchris@tollway why_systime]$
* Re: [Linux-ia64] What's taking all the system time..?
From: duraid @ 2002-01-30 6:44 UTC (permalink / raw)
To: linux-ia64
Quoting Niels Christiansen <nchr@us.ibm.com>:
> I tried using gcc 3.0.2 on a RedHat 7.2 4-way with 2.4.16 kernel
> but got errors as shown below. I have a profiling tool (currently
> internal use only) which I intended to use but I need the make to
> succeed first...
Okay I'll come clean: these are the compilers I've tested:
gcc version 2.96 20000731 (Debian GNU/Linux IA64 experimental)
Reading specs from /usr/lib/gcc-lib/ia64-linux/3.0.3/specs
Configured with: ../src/configure -v
--enable-languages=c,c++,java,f77,proto,objc --prefix=/usr --infodir=/share/info
--mandir=/share/man --enable-shared --with-gnu-as --with-gnu-ld
--with-system-zlib --enable-long-long --enable-nls --without-included-gettext
--disable-checking --enable-threads=posix --enable-java-gc=boehm
--with-cpp-install-dir=bin --enable-objc-gc ia64-linux
Thread model: posix
gcc version 3.0.3
and finally:
Intel(R) C++ Itanium(TM) Compiler for Itanium(TM)-based applications
Version 6.0 Beta, Build 20011129
The program builds OK with all three of those, but the source needs a bit of
twiddling to work with the Intel compiler.
> You say you run on an SMP. How many processors?
4 processors in the box, but it's just a single-threaded code.
Duraid
P.S. looks like your binutils is a bit out of date?
-------------------------------------------------
This mail sent through IMP: http://horde.org/imp/
* Re: [Linux-ia64] What's taking all the system time..?
From: duraid @ 2002-01-30 6:46 UTC (permalink / raw)
To: linux-ia64
Quoting Stephane Eranian <eranian@frankl.hpl.hp.com>:
> Just out of curiosity, have you looked at the system log in /var/log/messages?
permission denied ;)
> I am guessing your program does some floating point operations.
> Have you seen something about floating-point software assist?
No flops in this code!
> Another potential cause could be unaligned accesses. That you can also see
> in the system log.
'dmesg' works and sure enough, I get unaligned accesses. But not enough to get
*that* sort of a performance hit, surely? To explain what I'm seeing, each
unaligned access would have to cost 0.05 seconds or so!
Duraid
* Re: [Linux-ia64] What's taking all the system time..?
From: Stephane Eranian @ 2002-01-30 6:54 UTC (permalink / raw)
To: linux-ia64
On Wed, Jan 30, 2002 at 05:46:45PM +1100, duraid@fl.net.au wrote:
> No flops in this code!
>
Sometimes the compiler uses floating-point registers for integer
operations for speed reasons.
> > Another potential cause could be unaligned accesses. That you can also see
> > in the system log.
>
> 'dmesg' works and sure enough, I get unaligned accesses. But not enough to get
> *that* sort of a performance hit, surely? To explain what I'm seeing, each
> unaligned access would have to cost 0.05 seconds or so!
The kernel DOES NOT generate a printk() for every unaligned access you
get; it's throttled. Get the address from the syslog and check the code.
You may be casting ints into pointers. Linux/ia64 uses the LP64 data
model: longs and pointers are 64 bits, but ints are only 32 bits.
--
-Stephane
* Re: [Linux-ia64] What's taking all the system time..?
From: Niels Christiansen @ 2002-01-30 6:58 UTC (permalink / raw)
To: linux-ia64
This is the output I got on a different IA64 system (with gcc 3.0.3),
where the program compiled. You definitely have an alignment problem.
edges poptime walktime edges found
a.out(14573): unaligned access to 0x600000000000ba4c, ip=0x40000000000074a1
a.out(14573): unaligned access to 0x600000000000ba3c, ip=0x40000000000074c0
a.out(14573): unaligned access to 0x600000000000ba4c, ip=0x4000000000006df1
a.out(14573): unaligned access to 0x600000000000ba4c, ip=0x4000000000006f01
a.out(14573): unaligned access to 0x600000000091c15c, ip=0x4000000000006fc0
a.out(14573): unaligned access to 0x600000000091c15c, ip=0x4000000000007020
a.out(14573): unaligned access to 0x60000000011528ac, ip=0x4000000000007301
a.out(14573): unaligned access to 0x600000000054dbfc, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x600000000054db3c, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x600000000054da7c, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x600000000054d2fc, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x600000000071307c, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x6000000000712e9c, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x6000000001ccdf4c, ip=0x40000000000025d1
a.out(14573): unaligned access to 0x6000000001ccdf5c, ip=0x40000000000026a1
a.out(14573): unaligned access to 0x60000000001ec9cc, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x60000000001dd93c, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x60000000001dd87c, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x60000000001d9bec, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x6000000000db02ac, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x6000000000d7406c, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x6000000000d73fac, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x6000000000d73eec, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x60000000007915bc, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x60000000007914fc, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x60000000006a08bc, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x60000000006a07fc, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x60000000013b54fc, ip=0x4000000000001d50
a.out(14573): unaligned access to 0x60000000013b550c, ip=0x4000000000001c60
a.out(14573): unaligned access to 0x60000000013b550c, ip=0x4000000000001d50
a.out(14573): unaligned access to 0x600000000065bb9c, ip=0x4000000000000f80
a.out(14573): unaligned access to 0x6000000000ddd24c, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x6000000000dd5a4c, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x6000000000dd598c, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x6000000000dd58cc, ip=0x4000000000005dd0
200000 36000000 12000000 1468642 -
Done!
This is the profile for your process:
LTC Linprof Version 1.0.3 using Symlib/Linux Version 1.0.3 -- Jan 24 2002
15:06:09
Details for Process 14573 Task 14573 /home/nchris/test/why_systime/a.out
---- Listed by function ----
Ticks %of Tid %of Tot Mode Function [Source] {binary}
------- ------- ------- ------ --------------------------------------
10283 21.72 5.21 user __do_global_ctors_aux
[/usr/src/build/45425-ia64/BUILD/glibc-2.2.4/build-ia64-linux/csu/crti.S]
{/home/nchris/test/why_systime/a.out}
5908 12.48 3.00 kernel __copy_user []
4791 10.12 2.43 kernel ia64_handle_unaligned []
4696 9.92 2.38 kernel __cyg_profile_func_enter []
4471 9.44 2.27 kernel dispatch_unaligned_handler [ivt.S]
4416 9.33 2.24 kernel __cyg_profile_func_exit []
3065 6.47 1.55 kernel save_switch_stack []
1576 3.33 0.80 kernel emulate_load_int [unaligned.c]
1238 2.62 0.63 kernel setreg [unaligned.c]
1123 2.37 0.57 kernel load_switch_stack [entry.S]
925 1.95 0.47 kernel within_logging_rate_limit [unaligned.c]
640 1.35 0.32 kernel ia64_prepare_handle_unaligned []
546 1.15 0.28 user _ZN14VertexIterator7GetNextEv []
{/home/nchris/test/why_systime/a.out}
468 0.99 0.24 user _ZN4Edge16AddVertexMappingEP6VertexP5Graphb
[] {/home/nchris/test/why_systime/a.out}
460 0.97 0.23 user _ZN6Vertex14AddEdgeMappingEP4EdgeP5Graphb
[] {/home/nchris/test/why_systime/a.out}
458 0.97 0.23 user ibase []
{/home/nchris/test/why_systime/a.out}
456 0.96 0.23 user _ZNK14VertexIterator5GetAtEv []
{/home/nchris/test/why_systime/a.out}
364 0.77 0.18 kernel ia64_leave_kernel []
224 0.47 0.11 user _ZNK4Edge17GetVertexIteratorEv []
{/home/nchris/test/why_systime/a.out}
196 0.41 0.10 user _Z21PrintRelatedVertexIDsP5GraphP6Vertex []
{/home/nchris/test/why_systime/a.out}
148 0.31 0.08 user _Z8PopulateP5Graph []
{/home/nchris/test/why_systime/a.out}
126 0.27 0.06 user _ZNK12EdgeIterator5GetAtEv []
{/home/nchris/test/why_systime/a.out}
113 0.24 0.06 kernel emulate_store_int [unaligned.c]
110 0.23 0.06 user _ZN12EdgeIterator7GetNextEv []
{/home/nchris/test/why_systime/a.out}
76 0.16 0.04 user __umoddi3 [] {/usr/lib/libgcc_s.so.1}
74 0.16 0.04 user _ZN4EdgeC1Ev []
{/home/nchris/test/why_systime/a.out}
71 0.15 0.04 kernel getreg [unaligned.c]
46 0.10 0.02 kernel clear_page []
46 0.10 0.02 user _ZN5Graph16AddVertexMappingEP4EdgeP6Vertexb
[] {/home/nchris/test/why_systime/a.out}
27 0.06 0.01 user main []
{/home/nchris/test/why_systime/a.out}
21 0.04 0.01 user _ZN5Graph7AddEdgeEjj []
{/home/nchris/test/why_systime/a.out}
21 0.04 0.01 kernel _ltr_64 []
20 0.04 0.01 user __divdi3 [] {/usr/lib/libgcc_s.so.1}
20 0.04 0.01 user _ZN5Graph9GetVertexEj []
{/home/nchris/test/why_systime/a.out}
17 0.04 0.01 user _ZN5Graph9AddVertexEj []
{/home/nchris/test/why_systime/a.out}
13 0.03 0.01 kernel rmqueue [page_alloc.c]
11 0.02 0.01 user _ZN14VertexIteratorC1EPK4Edge []
{/home/nchris/test/why_systime/a.out}
10 0.02 0.01 user __gmon_start__ []
{/home/nchris/test/why_systime/a.out}
6 0.01 0.00 kernel __free_pages_ok [page_alloc.c]
6 0.01 0.00 user memmove [] {/lib/libc-2.2.4.so}
4 0.01 0.00 kernel do_wp_page [memory.c]
4 0.01 0.00 user _ZN6VertexC1Ev []
{/home/nchris/test/why_systime/a.out}
3 0.01 0.00 kernel handle_mm_fault []
3 0.01 0.00 user _ZN12EdgeIteratorC1EPK6Vertex []
{/home/nchris/test/why_systime/a.out}
3 0.01 0.00 kernel sys_ioctl []
3 0.01 0.00 kernel find_vma_prev []
3 0.01 0.00 user _dl_lookup_symbol [] {/lib/ld-2.2.4.so}
3 0.01 0.00 kernel ltr_spin_unlock []
2 0.00 0.00 user _ZNK6Vertex15GetEdgeIteratorEv []
{/home/nchris/test/why_systime/a.out}
2 0.00 0.00 kernel __scsi_end_request [scsi_lib.c]
2 0.00 0.00 kernel __lru_cache_del []
1 0.00 0.00 kernel unlock_page []
1 0.00 0.00 user chunk_alloc [malloc.c] {/lib/libc-2.2.4.so}
1 0.00 0.00 kernel ltr_spin_lock_entry []
1 0.00 0.00 user _dl_relocate_object [] {/lib/ld-2.2.4.so}
1 0.00 0.00 kernel scsi_old_done []
1 0.00 0.00 user _dl_lookup_versioned_symbol []
{/lib/ld-2.2.4.so}
1 0.00 0.00 kernel memset []
1 0.00 0.00 kernel __alloc_pages []
1 0.00 0.00 kernel batch_entropy_store []
1 0.00 0.00 kernel ia64_do_page_fault []
1 0.00 0.00 user chunk_free [malloc.c] {/lib/libc-2.2.4.so}
1 0.00 0.00 kernel page_fault [ivt.S]
1 0.00 0.00 user _dl_fini [libgcc2.c] {/lib/ld-2.2.4.so}
1 0.00 0.00 kernel demine_args [ivt.S]
1 0.00 0.00 user __malloc [fde-glibc.c] {/lib/libc-2.2.4.so}
1 0.00 0.00 user __libc_malloc [] {/lib/libc-2.2.4.so}
1 0.00 0.00 user __umoddi3 [libgcc2.c] {/lib/ld-2.2.4.so}
1 0.00 0.00 user __sbrk [] {/lib/libc-2.2.4.so}
1 0.00 0.00 kernel do_anonymous_page [memory.c]
1 0.00 0.00 kernel __up_write []
1 0.00 0.00 kernel do_no_page [memory.c]
------- ------- -------
47339 100.00 24.00
---- Listed by binary ----
Ticks %of Tid %of Tot Binary
------- ------- ------- --------------------------------------
33515 70.80 16.99 Kernel
13710 28.96 6.95 /home/nchris/test/why_systime/a.out
96 0.20 0.05 /usr/lib/libgcc_s.so.1
11 0.02 0.01 /lib/libc-2.2.4.so
7 0.01 0.00 /lib/ld-2.2.4.so
--------------------------------------------------------
LTC Linprof Version 1.0.3 - Totals by Process and TaskID
--------------------------------------------------------
----Kernel----- -----User------ -----Total-----
Process Task Ticks %of Tot Ticks %of Tot Ticks %of Tot
Executing...
------- ------- ------- ------- ------- ------- ------- -------
-------------------
0 0 82288 41.72 0 0.00 82288 41.72
0 9 3 0.00 0 0.00 3 0.00
1 1 0 0.00 1 0.00 1 0.00
/lib/ld-2.2.4.so
10 10 49722 25.21 0 0.00 49722 25.21
12 12 27 0.01 0 0.00 27 0.01
543 543 163 0.08 3 0.00 166 0.08
/lib/ld-2.2.4.so
548 548 4 0.00 1 0.00 5 0.00
/lib/ld-2.2.4.so
814 814 1 0.00 2 0.00 3 0.00
/lib/ld-2.2.4.so
14452 14452 6 0.00 0 0.00 6 0.00
/lib/ld-2.2.4.so
138 138 1154 0.59 0 0.00 1154 0.59
139 139 30 0.02 0 0.00 30 0.02
140 140 21 0.01 0 0.00 21 0.01
14515 14515 5 0.00 1 0.00 6 0.00
/lib/ld-2.2.4.so
14572 14572 9 0.00 0 0.00 9 0.00
/lib/ld-2.2.4.so
14573 14573 33515 16.99 13824 7.01 47339 24.00
/home/nchris/test/why_systime/a.out
14574 14574 2 0.00 0 0.00 2 0.00
/lib/ld-2.2.4.so
14575 14575 4925 2.50 0 0.00 4925 2.50
/lib/ld-2.2.4.so
14576 14576 3582 1.82 0 0.00 3582 1.82
/lib/ld-2.2.4.so
14577 14577 3850 1.95 0 0.00 3850 1.95
/lib/ld-2.2.4.so
14578 14578 3449 1.75 0 0.00 3449 1.75
/lib/ld-2.2.4.so
14579 14579 1 0.00 0 0.00 1 0.00
/lib/ld-2.2.4.so
14580 14580 56 0.03 584 0.30 640 0.32
/usr/bin/perl
14581 14581 8 0.00 2 0.00 10 0.01
/usr/sbin/lintrace
* Re: [Linux-ia64] What's taking all the system time..?
From: Niels Christiansen @ 2002-01-30 7:02 UTC (permalink / raw)
To: linux-ia64
> P.S. looks like your binutils is a bit out of date?
Could well be. I had to grab the source and recompile
because of a bug (which turned out to be xmalloc but I
didn't know at the time). So I'm not really sure what
I'm using -- just that it works for me :-)
The system that compiled okay has the binutils it came with.
... Niels
* Re: [Linux-ia64] What's taking all the system time..?
From: Niels Christiansen @ 2002-01-30 7:27 UTC (permalink / raw)
To: linux-ia64
Oops, forgot to mention these two lines. They represent trace
overhead because my kernel is instrumented to trace all function
entry and exit points. All it means is that 9.92 + 9.33 percent
of the time spent in your process is trace overhead. It should
not skew the relative weight of the other profile data.
4696 9.92 2.38 kernel __cyg_profile_func_enter []
4416 9.33 2.24 kernel __cyg_profile_func_exit []
I also checked lock stats. You wait for and hold locks less
than 150 msecs. What I can't explain is why the system is 60%
busy (although I know that ACPI handlers keep one processor
100% busy).
-----------------------------------------------------------------
LTC Whydle Version 1.0.3 *** PROCESSOR TIME DISTRIBUTION
-----------------------------------------------------------------
Total TID-0 Other non-IRQ
CPU Idle % Busy % IRQ % IRQ % Busy %
--- ------- ------- ------- ------- -------
0 62.649 37.351 0.000 0.000 37.351
1 43.502 56.498 0.000 0.000 56.498
2 0.000 100.000 0.000 0.000 100.000
3 55.906 44.094 0.000 0.000 44.094
------- ------- ------- ------- -------
40.514 59.486 0.000 0.000 59.486
... Niels
* Re: [Linux-ia64] What's taking all the system time..?
From: duraid @ 2002-01-30 9:36 UTC (permalink / raw)
To: linux-ia64
Quoting Stephane Eranian <eranian@frankl.hpl.hp.com>:
> On Wed, Jan 30, 2002 at 05:46:45PM +1100, duraid@fl.net.au wrote:
> > No flops in this code!
> >
> Sometimes the compiler uses floating-point registers for integer
> operations for speed reasons.
Granted, but (God willing) nobody will ever encounter a compiler crazy enough
to emit FP code that needs software assistance!
> The kernel DOES NOT generate a printk() for every unaligned access you
> get; it's throttled.
That's what my intuition told me. But the sysadmin assured me that the opposite
was true. Bad sysadmin.
> You may be casting ints into pointers. Linux/ia64 uses the LP64 data
> model: longs and pointers are 64 bits, but ints are only 32 bits.
#@!%#@!% (I knew that already, but !#%#@%#%) Is it just an urban myth, or was
it once declared upon high that longs shall always be 32 bits, long longs shall
always be 64 bits, and ints shall be Whatever Length Is Most Efficiently Dealt
With By The Hardware(tm)?
Whoever chose LP64 should be sent to Guantanamo Bay for some "those blindfolds
are for their own safety" interrogation!
Duraid
* RE: [Linux-ia64] What's taking all the system time..? The unaligned accesses are!
From: Duraid Madina @ 2002-01-30 10:04 UTC (permalink / raw)
To: linux-ia64
Okay, sorry for not figuring this one out *before* spamming the list:
the unaligned accesses were really killing things. Now everything's
running five times faster. Yay!
Duraid
* RE: [Linux-ia64] What's taking all the system time..?
From: Boehm, Hans @ 2002-01-30 17:43 UTC (permalink / raw)
To: linux-ia64
> Is it just an
> urban myth, or was
> it once declared upon high that longs shall always be 32
> bits, ...
I'm not sure it's even an urban myth. The first (1978) edition of K&R lists
sizes of integral types on some sample hardware. Longs were 36 bits on a
Honeywell 6000. Longs have never uniformly been 32 bits.
Hans
* Re: [Linux-ia64] What's taking all the system time..?
From: Grant Grundler @ 2002-01-30 17:49 UTC (permalink / raw)
To: linux-ia64
duraid@fl.net.au wrote:
> it once declared upon high that longs shall always be 32 bits, long longs
> shall always be 64 bits, and ints shall be Whatever Length Is Most
> Efficiently Dealt With By The Hardware(tm)?
I believe the "long long" was a non-standard implementation for
32-bit compilers to declare/access 64-bit data structures.
And for LP64, ints are guaranteed to be 32 bits, or it would have been
called ILP64.
grant
* RE: [Linux-ia64] What's taking all the system time..?
From: Niels Christiansen @ 2002-01-30 18:04 UTC (permalink / raw)
To: linux-ia64
> > Is it just an urban myth, or was it once declared upon
> > high that longs shall always be 32 bits, ...
>
> I'm not sure it's even an urban myth. The first (1978)
> edition of K&R lists sizes of integral types on some
> sample hardware. Longs were 36 bits on a Honeywell 6000.
> Longs have never uniformly been 32 bits.
Could be an urban myth, but that neither makes it right
nor makes sense. As for Guantanamo Bay, don't wish it upon
anybody.
... Niels
* Re: [Linux-ia64] What's taking all the system time..?
From: Jim Wilson @ 2002-01-30 21:57 UTC (permalink / raw)
To: linux-ia64
>/tmp/ccC11dFe.s:15538: Error: Unknown opcode `data16.ua @iplt
You need a newer binutils if you want to use gcc 3.0.[23] to compile C++ code.
New IA-64 assembler syntax was introduced last summer, because it was not
possible to fully implement the C++ ABI with the original assembler syntax.
This new assembler syntax is used starting with gcc 3.0.2, and is not supported
by any binutils older than last summer. That unfortunately means that binutils
2.11 is not new enough; you have to use an unreleased binutils from the FSF
development source tree.
Jim
* Re: [Linux-ia64] What's taking all the system time..?
From: David Mosberger @ 2002-02-01 21:28 UTC (permalink / raw)
To: linux-ia64
>>>>> On Wed, 30 Jan 2002 20:36:48 +1100, duraid@fl.net.au said:
>> The kernel DOES NOT generate a printk() for every unaligned
>> access you get; it's throttled.
Duraid> That's what my intuition told me. But the sysadmin assured
Duraid> me that the opposite was true. Bad sysadmin.
If you want to catch all unaligned faults, you can do:
prctl --unaligned=signal gdb PROGNAME
This way, gdb will stop (with a SIGBUS signal) when an unaligned
access occurs.
--david