public inbox for linux-ia64@vger.kernel.org
* [Linux-ia64] What's taking all the system time..?
@ 2002-01-30  5:48 Duraid Madina
  2002-01-30  6:21 ` Stephane Eranian
                   ` (14 more replies)
  0 siblings, 15 replies; 16+ messages in thread
From: Duraid Madina @ 2002-01-30  5:48 UTC (permalink / raw)
  To: linux-ia64

Hi guys,

	I've got a big (production) code that depends on a simple 'graph'
library in C++. To try it out on Itanium, I stripped it down and whipped up
a tiny 'benchmark' of the graph library. When I tried this on an Itanium
system here (2.4.9-SMP, gcc/g++ 2.96/3.0.3 and intel c++ 6.0) the code runs
for about 30 seconds, 20 of which are reported as 'system' time. The thing
is, this is (supposed to be) a CPU-bound job: on my home x86 system, it runs
in about 3 seconds, and with negligible system time.

	'top' and 'time' show the gobs of system time being used, but
'strace' picks up nothing out of the ordinary (0.014 seconds of system time,
dominated by calls to brk(), which is expected).

	I'm hoping someone with a spare moment could try and reproduce this,
and perhaps let me know what's going on? Right now, I'm guessing libc
brokenness on ia64, but perhaps it's VM suckage? :\ Alas, on the box I have
access to, I can't profile the code because adding -pg when compiling leads
to a binary that segfaults at startup (ditto with the Intel compiler, so I'm
guessing the profile libraries are borked).

	You can grab the source (18kB) here:

	http://www.idesign.fl.net.au/why_systime/why_systime.tar.gz

	just 'make' (be sure 'gcc' and 'g++' point to a recent enough
version of gcc/g++ - preferably 3.0 or later) and then run 'a.out'.

	Lost,

	Duraid




* Re: [Linux-ia64] What's taking all the system time..?
  2002-01-30  5:48 [Linux-ia64] What's taking all the system time..? Duraid Madina
@ 2002-01-30  6:21 ` Stephane Eranian
  2002-01-30  6:26 ` Niels Christiansen
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Stephane Eranian @ 2002-01-30  6:21 UTC (permalink / raw)
  To: linux-ia64

Duraid,

> 
> 	I've got a big (production) code that depends on a simple 'graph'
> library in C++. To try it out on Itanium, I stripped it down and whipped up
> a tiny 'benchmark' of the graph library. When I tried this on an Itanium
> system here (2.4.9-SMP, gcc/g++ 2.96/3.0.3 and intel c++ 6.0) the code runs
> for about 30 seconds, 20 of which are reported as 'system' time. The thing
> is, this is (supposed to be) a CPU-bound job: on my home x86 system, it runs
> in about 3 seconds, and with negligible system time.
> 
> 	'top' and 'time' show the gobs of system time being used, but
> 'strace' picks up nothing out of the ordinary (0.014 seconds of system time,
> dominated by calls to brk(), which is expected).
> 
Just out of curiosity, have you looked at the system log in /var/log/messages?
I am guessing your program does some floating point operations.
Have you seen something about floating-point software assist? Another 
potential cause could be unaligned accesses. That you can also see 
in the system log.

-- 
-Stephane



* Re: [Linux-ia64] What's taking all the system time..?
  2002-01-30  5:48 [Linux-ia64] What's taking all the system time..? Duraid Madina
  2002-01-30  6:21 ` Stephane Eranian
@ 2002-01-30  6:26 ` Niels Christiansen
  2002-01-30  6:44 ` duraid
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Niels Christiansen @ 2002-01-30  6:26 UTC (permalink / raw)
  To: linux-ia64

> You can grab the source (18kB) here:
>
> http://www.idesign.fl.net.au/why_systime/why_systime.tar.gz
>
> just 'make' (be sure 'gcc' and 'g++' point to a recent enough
> version of gcc/g++ - preferably 3.0 or later) and then run 'a.out'.

I tried using gcc 3.0.2 on a RedHat 7.2 4-way with 2.4.16 kernel
but got errors as shown below.  I have a profiling tool (currently
internal use only) which I intended to use but I need the make to
succeed first...

You say you run on an SMP.  How many processors?

... Niels

-------------------------------------------------------------------

[nchris@tollway why_systime]$ make
gcc -c mt19937b-int.c
g++ Graph.cpp bench.cpp mt19937b-int.o
In file included from BlockTree.cpp:17,
                 from Graph.h:7,
                 from Graph.cpp:2:
BlockTree.h:121:8: warning: extra tokens at end of #endif directive
/tmp/ccC11dFe.s: Assembler messages:
/tmp/ccC11dFe.s:15538: Error: Unknown opcode `data16.ua @iplt
(_ZN5basic13ResourceCacheI4Edge4DCSOE10AddNewPageEv#)'
/tmp/ccC11dFe.s:15539: Error: Unknown opcode `data16.ua @iplt
(_ZN5basic13ResourceCacheI4Edge4DCSOE10RemovePageEPNS_17ResourceCachePageE#)'
/tmp/ccC11dFe.s:15548: Error: Unknown opcode `data16.ua @iplt
(_ZN5basic13ResourceCacheI6Vertex4DCSOE10AddNewPageEv#)'
/tmp/ccC11dFe.s:15549: Error: Unknown opcode `data16.ua @iplt
(_ZN5basic13ResourceCacheI6Vertex4DCSOE10RemovePageEPNS_17ResourceCachePageE#)'
/tmp/ccC11dFe.s:15558: Error: Unknown opcode `data16.ua @iplt
(_ZN5basic11MemoryCacheI4DCSOE10AddNewPageEv#)'
/tmp/ccC11dFe.s:15559: Error: Unknown opcode `data16.ua @iplt
(_ZN5basic11MemoryCacheI4DCSOE10RemovePageEPNS_17ResourceCachePageE#)'
In file included from BlockTree.cpp:17,
                 from Graph.h:7,
                 from bench.cpp:3:
BlockTree.h:121:8: warning: extra tokens at end of #endif directive
/tmp/ccC11dFe.s: Assembler messages:
/tmp/ccC11dFe.s:3233: Error: Unknown opcode `data16.ua @iplt
(_ZN5basic13ResourceCacheI4Edge4DCSOE10AddNewPageEv#)'
/tmp/ccC11dFe.s:3234: Error: Unknown opcode `data16.ua @iplt
(_ZN5basic13ResourceCacheI4Edge4DCSOE10RemovePageEPNS_17ResourceCachePageE#)'
/tmp/ccC11dFe.s:3243: Error: Unknown opcode `data16.ua @iplt
(_ZN5basic13ResourceCacheI6Vertex4DCSOE10AddNewPageEv#)'
/tmp/ccC11dFe.s:3244: Error: Unknown opcode `data16.ua @iplt
(_ZN5basic13ResourceCacheI6Vertex4DCSOE10RemovePageEPNS_17ResourceCachePageE#)'
make: *** [default] Error 1
[nchris@tollway why_systime]$ gcc -v
Reading specs from /usr/lib/gcc-lib/ia64-unknown-linux/3.0.2/specs
Configured with: /usr/src/gcc-3.0.2/configure --prefix /usr --mandir
/usr/share/man
Thread model: single
gcc version 3.0.2
[nchris@tollway why_systime]$ g++ -v
Reading specs from /usr/lib/gcc-lib/ia64-unknown-linux/3.0.2/specs
Configured with: /usr/src/gcc-3.0.2/configure --prefix /usr --mandir
/usr/share/man
Thread model: single
gcc version 3.0.2
[nchris@tollway why_systime]$




* Re: [Linux-ia64] What's taking all the system time..?
  2002-01-30  5:48 [Linux-ia64] What's taking all the system time..? Duraid Madina
  2002-01-30  6:21 ` Stephane Eranian
  2002-01-30  6:26 ` Niels Christiansen
@ 2002-01-30  6:44 ` duraid
  2002-01-30  6:46 ` duraid
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: duraid @ 2002-01-30  6:44 UTC (permalink / raw)
  To: linux-ia64

Quoting Niels Christiansen <nchr@us.ibm.com>:

> I tried using gcc 3.0.2 on a RedHat 7.2 4-way with 2.4.16 kernel
> but got errors as shown below.  I have a profiling tool (currently
> internal use only) which I intended to use but I need the make to
> succeed first...

Okay I'll come clean: these are the compilers I've tested:

gcc version 2.96 20000731 (Debian GNU/Linux IA64 experimental)

Reading specs from /usr/lib/gcc-lib/ia64-linux/3.0.3/specs
Configured with: ../src/configure -v
--enable-languages=c,c++,java,f77,proto,objc --prefix=/usr --infodir=/share/info
--mandir=/share/man --enable-shared --with-gnu-as --with-gnu-ld
--with-system-zlib --enable-long-long --enable-nls --without-included-gettext
--disable-checking --enable-threads=posix --enable-java-gc=boehm
--with-cpp-install-dir=bin --enable-objc-gc ia64-linux
Thread model: posix
gcc version 3.0.3

and finally:

Intel(R) C++ Itanium(TM) Compiler for Itanium(TM)-based applications
Version 6.0 Beta, Build 20011129 

The program builds OK with all three of those, but the source needs a bit of
twiddling to work with the Intel compiler.
 
> You say you run on an SMP.  How many processors?

4 processors in the box, but it's just a single-threaded code.

     Duraid

P.S. looks like your binutils is a bit out of date?




-------------------------------------------------
This mail sent through IMP: http://horde.org/imp/



* Re: [Linux-ia64] What's taking all the system time..?
  2002-01-30  5:48 [Linux-ia64] What's taking all the system time..? Duraid Madina
                   ` (2 preceding siblings ...)
  2002-01-30  6:44 ` duraid
@ 2002-01-30  6:46 ` duraid
  2002-01-30  6:54 ` Stephane Eranian
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: duraid @ 2002-01-30  6:46 UTC (permalink / raw)
  To: linux-ia64

Quoting Stephane Eranian <eranian@frankl.hpl.hp.com>:


> Just out of curiosity, have you looked at the system log in /var/log/messages?

permission denied ;)

> I am guessing your program does some floating point operations.
> Have you seen something about floating-point software assist?

No flops in this code!

> Another potential cause could be unaligned accesses. That you can also see 
> in the system log.

'dmesg' works and sure enough, I get unaligned accesses. But not enough to get
*that* sort of a performance hit, surely? To explain what I'm seeing, each
unaligned access would have to cost 0.05 seconds or so!

        Duraid







* Re: [Linux-ia64] What's taking all the system time..?
  2002-01-30  5:48 [Linux-ia64] What's taking all the system time..? Duraid Madina
                   ` (3 preceding siblings ...)
  2002-01-30  6:46 ` duraid
@ 2002-01-30  6:54 ` Stephane Eranian
  2002-01-30  6:58 ` Niels Christiansen
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Stephane Eranian @ 2002-01-30  6:54 UTC (permalink / raw)
  To: linux-ia64

On Wed, Jan 30, 2002 at 05:46:45PM +1100, duraid@fl.net.au wrote:
> No flops in this code!
> 
Sometimes the compiler uses floating-point registers for integer
operations for speed reasons.

> > Another potential cause could be unaligned accesses. That you can also see 
> > in the system log.
> 
> 'dmesg' works and sure enough, I get unaligned accesses. But not enough to get
> *that* sort of a performance hit, surely? To explain what I'm seeing, each
> unaligned access would have to cost 0.05 seconds or so!

The kernel DOES NOT generate a printk() for every unaligned access you
get; it's throttled. Get the address from the syslog and check the code.
You may be casting ints into pointers. Linux/ia64 uses the LP64 data
model: longs and pointers are 64 bits, but ints are only 32 bits.
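
A minimal sketch of the LP64 pitfall described above. It is not taken from the
benchmark source; the helper names are hypothetical and chosen only for
illustration. The first function models what an address looks like after it
has been stored in a 32-bit int, the second keeps the pointer in a uintptr_t,
which is wide enough on both ILP32 and LP64 systems.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: models `int i = (int)ptr;` on an LP64 system.
 * Only the low 32 bits of the address survive the narrowing. */
uint64_t through_32bit_int(uint64_t addr)
{
    uint32_t narrowed = (uint32_t)addr;   /* upper 32 bits are discarded */
    return (uint64_t)narrowed;
}

/* Hypothetical helper: keep a pointer's bits in an integer type that is
 * wide enough for it, instead of in an int. */
uintptr_t pointer_bits(const void *p)
{
    assert(sizeof(uintptr_t) >= sizeof(void *));
    return (uintptr_t)p;
}

/* Nonzero if a pointer survives the round trip through uintptr_t. */
int survives_round_trip(void *p)
{
    return (void *)pointer_bits(p) == p;
}
```

Fed one of the ia64 data addresses from a log such as 0x600000000000ba4c,
through_32bit_int() returns 0xba4c: the region bits in the top of the address
are gone, so the value no longer points at the original object.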

-- 
-Stephane



* Re: [Linux-ia64] What's taking all the system time..?
  2002-01-30  5:48 [Linux-ia64] What's taking all the system time..? Duraid Madina
                   ` (4 preceding siblings ...)
  2002-01-30  6:54 ` Stephane Eranian
@ 2002-01-30  6:58 ` Niels Christiansen
  2002-01-30  7:02 ` Niels Christiansen
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Niels Christiansen @ 2002-01-30  6:58 UTC (permalink / raw)
  To: linux-ia64

This is the output I got on a different IA64 system (with gcc 3.0.3),
where the program compiled.  You definitely have an alignment problem.

edges           poptime         walktime                edges found
a.out(14573): unaligned access to 0x600000000000ba4c, ip=0x40000000000074a1
a.out(14573): unaligned access to 0x600000000000ba3c, ip=0x40000000000074c0
a.out(14573): unaligned access to 0x600000000000ba4c, ip=0x4000000000006df1
a.out(14573): unaligned access to 0x600000000000ba4c, ip=0x4000000000006f01
a.out(14573): unaligned access to 0x600000000091c15c, ip=0x4000000000006fc0
a.out(14573): unaligned access to 0x600000000091c15c, ip=0x4000000000007020
a.out(14573): unaligned access to 0x60000000011528ac, ip=0x4000000000007301
a.out(14573): unaligned access to 0x600000000054dbfc, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x600000000054db3c, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x600000000054da7c, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x600000000054d2fc, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x600000000071307c, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x6000000000712e9c, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x6000000001ccdf4c, ip=0x40000000000025d1
a.out(14573): unaligned access to 0x6000000001ccdf5c, ip=0x40000000000026a1
a.out(14573): unaligned access to 0x60000000001ec9cc, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x60000000001dd93c, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x60000000001dd87c, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x60000000001d9bec, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x6000000000db02ac, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x6000000000d7406c, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x6000000000d73fac, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x6000000000d73eec, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x60000000007915bc, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x60000000007914fc, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x60000000006a08bc, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x60000000006a07fc, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x60000000013b54fc, ip=0x4000000000001d50
a.out(14573): unaligned access to 0x60000000013b550c, ip=0x4000000000001c60
a.out(14573): unaligned access to 0x60000000013b550c, ip=0x4000000000001d50
a.out(14573): unaligned access to 0x600000000065bb9c, ip=0x4000000000000f80
a.out(14573): unaligned access to 0x6000000000ddd24c, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x6000000000dd5a4c, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x6000000000dd598c, ip=0x4000000000005dd0
a.out(14573): unaligned access to 0x6000000000dd58cc, ip=0x4000000000005dd0
200000          36000000                12000000                1468642 -
Done!
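
Every data address in the log above ends in hex c, which means it is a
multiple of 4 but not of 8. That would be consistent with 8-byte loads and
stores hitting 4-byte-aligned locations, although the log itself does not
state the access size. A small sketch of the check, with hypothetical helper
names, plus the usual round-up that a custom allocator could apply:

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if addr is a multiple of align. align must be a power of two. */
int is_aligned(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Round addr up to the next multiple of align (a power of two). */
uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}
```

For the first logged address, is_aligned(0x600000000000ba4c, 4) is true,
is_aligned(0x600000000000ba4c, 8) is false, and align_up() with an alignment
of 8 gives 0x600000000000ba50.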

This is the profile for your process:

LTC Linprof Version 1.0.3 using Symlib/Linux Version 1.0.3 -- Jan 24 2002 15:06:09

Details for Process 14573 Task 14573 /home/nchris/test/why_systime/a.out

                  ---- Listed by function ----
  Ticks %of Tid %of Tot  Mode   Function [Source] {binary}
------- ------- -------  ------ --------------------------------------
  10283   21.72    5.21  user   __do_global_ctors_aux
[/usr/src/build/45425-ia64/BUILD/glibc-2.2.4/build-ia64-linux/csu/crti.S]
{/home/nchris/test/why_systime/a.out}
   5908   12.48    3.00  kernel __copy_user []
   4791   10.12    2.43  kernel ia64_handle_unaligned []
   4696    9.92    2.38  kernel __cyg_profile_func_enter []
   4471    9.44    2.27  kernel dispatch_unaligned_handler [ivt.S]
   4416    9.33    2.24  kernel __cyg_profile_func_exit []
   3065    6.47    1.55  kernel save_switch_stack []
   1576    3.33    0.80  kernel emulate_load_int [unaligned.c]
   1238    2.62    0.63  kernel setreg [unaligned.c]
   1123    2.37    0.57  kernel load_switch_stack [entry.S]
    925    1.95    0.47  kernel within_logging_rate_limit [unaligned.c]
    640    1.35    0.32  kernel ia64_prepare_handle_unaligned []
    546    1.15    0.28  user   _ZN14VertexIterator7GetNextEv []
{/home/nchris/test/why_systime/a.out}
    468    0.99    0.24  user   _ZN4Edge16AddVertexMappingEP6VertexP5Graphb
[] {/home/nchris/test/why_systime/a.out}
    460    0.97    0.23  user   _ZN6Vertex14AddEdgeMappingEP4EdgeP5Graphb
[] {/home/nchris/test/why_systime/a.out}
    458    0.97    0.23  user   ibase []
{/home/nchris/test/why_systime/a.out}
    456    0.96    0.23  user   _ZNK14VertexIterator5GetAtEv []
{/home/nchris/test/why_systime/a.out}
    364    0.77    0.18  kernel ia64_leave_kernel []
    224    0.47    0.11  user   _ZNK4Edge17GetVertexIteratorEv []
{/home/nchris/test/why_systime/a.out}
    196    0.41    0.10  user   _Z21PrintRelatedVertexIDsP5GraphP6Vertex []
{/home/nchris/test/why_systime/a.out}
    148    0.31    0.08  user   _Z8PopulateP5Graph []
{/home/nchris/test/why_systime/a.out}
    126    0.27    0.06  user   _ZNK12EdgeIterator5GetAtEv []
{/home/nchris/test/why_systime/a.out}
    113    0.24    0.06  kernel emulate_store_int [unaligned.c]
    110    0.23    0.06  user   _ZN12EdgeIterator7GetNextEv []
{/home/nchris/test/why_systime/a.out}
     76    0.16    0.04  user   __umoddi3 [] {/usr/lib/libgcc_s.so.1}
     74    0.16    0.04  user   _ZN4EdgeC1Ev []
{/home/nchris/test/why_systime/a.out}
     71    0.15    0.04  kernel getreg [unaligned.c]
     46    0.10    0.02  kernel clear_page []
     46    0.10    0.02  user   _ZN5Graph16AddVertexMappingEP4EdgeP6Vertexb
[] {/home/nchris/test/why_systime/a.out}
     27    0.06    0.01  user   main []
{/home/nchris/test/why_systime/a.out}
     21    0.04    0.01  user   _ZN5Graph7AddEdgeEjj []
{/home/nchris/test/why_systime/a.out}
     21    0.04    0.01  kernel _ltr_64 []
     20    0.04    0.01  user   __divdi3 [] {/usr/lib/libgcc_s.so.1}
     20    0.04    0.01  user   _ZN5Graph9GetVertexEj []
{/home/nchris/test/why_systime/a.out}
     17    0.04    0.01  user   _ZN5Graph9AddVertexEj []
{/home/nchris/test/why_systime/a.out}
     13    0.03    0.01  kernel rmqueue [page_alloc.c]
     11    0.02    0.01  user   _ZN14VertexIteratorC1EPK4Edge []
{/home/nchris/test/why_systime/a.out}
     10    0.02    0.01  user   __gmon_start__ []
{/home/nchris/test/why_systime/a.out}
      6    0.01    0.00  kernel __free_pages_ok [page_alloc.c]
      6    0.01    0.00  user   memmove [] {/lib/libc-2.2.4.so}
      4    0.01    0.00  kernel do_wp_page [memory.c]
      4    0.01    0.00  user   _ZN6VertexC1Ev []
{/home/nchris/test/why_systime/a.out}
      3    0.01    0.00  kernel handle_mm_fault []
      3    0.01    0.00  user   _ZN12EdgeIteratorC1EPK6Vertex []
{/home/nchris/test/why_systime/a.out}
      3    0.01    0.00  kernel sys_ioctl []
      3    0.01    0.00  kernel find_vma_prev []
      3    0.01    0.00  user   _dl_lookup_symbol [] {/lib/ld-2.2.4.so}
      3    0.01    0.00  kernel ltr_spin_unlock []
      2    0.00    0.00  user   _ZNK6Vertex15GetEdgeIteratorEv []
{/home/nchris/test/why_systime/a.out}
      2    0.00    0.00  kernel __scsi_end_request [scsi_lib.c]
      2    0.00    0.00  kernel __lru_cache_del []
      1    0.00    0.00  kernel unlock_page []
      1    0.00    0.00  user   chunk_alloc [malloc.c] {/lib/libc-2.2.4.so}
      1    0.00    0.00  kernel ltr_spin_lock_entry []
      1    0.00    0.00  user   _dl_relocate_object [] {/lib/ld-2.2.4.so}
      1    0.00    0.00  kernel scsi_old_done []
      1    0.00    0.00  user   _dl_lookup_versioned_symbol []
{/lib/ld-2.2.4.so}
      1    0.00    0.00  kernel memset []
      1    0.00    0.00  kernel __alloc_pages []
      1    0.00    0.00  kernel batch_entropy_store []
      1    0.00    0.00  kernel ia64_do_page_fault []
      1    0.00    0.00  user   chunk_free [malloc.c] {/lib/libc-2.2.4.so}
      1    0.00    0.00  kernel page_fault [ivt.S]
      1    0.00    0.00  user   _dl_fini [libgcc2.c] {/lib/ld-2.2.4.so}
      1    0.00    0.00  kernel demine_args [ivt.S]
      1    0.00    0.00  user   __malloc [fde-glibc.c] {/lib/libc-2.2.4.so}
      1    0.00    0.00  user   __libc_malloc [] {/lib/libc-2.2.4.so}
      1    0.00    0.00  user   __umoddi3 [libgcc2.c] {/lib/ld-2.2.4.so}
      1    0.00    0.00  user   __sbrk [] {/lib/libc-2.2.4.so}
      1    0.00    0.00  kernel do_anonymous_page [memory.c]
      1    0.00    0.00  kernel __up_write []
      1    0.00    0.00  kernel do_no_page [memory.c]
------- ------- -------
  47339  100.00   24.00

                  ---- Listed by binary ----
  Ticks %of Tid %of Tot  Binary
------- ------- -------  --------------------------------------
  33515   70.80   16.99  Kernel
  13710   28.96    6.95  /home/nchris/test/why_systime/a.out
     96    0.20    0.05  /usr/lib/libgcc_s.so.1
     11    0.02    0.01  /lib/libc-2.2.4.so
      7    0.01    0.00  /lib/ld-2.2.4.so


          --------------------------------------------------------
          LTC Linprof Version 1.0.3 - Totals by Process and TaskID
          --------------------------------------------------------

                ----Kernel----- -----User------ -----Total-----
Process    Task   Ticks %of Tot   Ticks %of Tot   Ticks %of Tot  Executing...
------- ------- ------- ------- ------- ------- ------- -------  -------------------
      0       0   82288   41.72       0    0.00   82288   41.72
      0       9       3    0.00       0    0.00       3    0.00
      1       1       0    0.00       1    0.00       1    0.00  /lib/ld-2.2.4.so
     10      10   49722   25.21       0    0.00   49722   25.21
     12      12      27    0.01       0    0.00      27    0.01
    543     543     163    0.08       3    0.00     166    0.08  /lib/ld-2.2.4.so
    548     548       4    0.00       1    0.00       5    0.00  /lib/ld-2.2.4.so
    814     814       1    0.00       2    0.00       3    0.00  /lib/ld-2.2.4.so
  14452   14452       6    0.00       0    0.00       6    0.00  /lib/ld-2.2.4.so
    138     138    1154    0.59       0    0.00    1154    0.59
    139     139      30    0.02       0    0.00      30    0.02
    140     140      21    0.01       0    0.00      21    0.01
  14515   14515       5    0.00       1    0.00       6    0.00  /lib/ld-2.2.4.so
  14572   14572       9    0.00       0    0.00       9    0.00  /lib/ld-2.2.4.so
  14573   14573   33515   16.99   13824    7.01   47339   24.00  /home/nchris/test/why_systime/a.out
  14574   14574       2    0.00       0    0.00       2    0.00  /lib/ld-2.2.4.so
  14575   14575    4925    2.50       0    0.00    4925    2.50  /lib/ld-2.2.4.so
  14576   14576    3582    1.82       0    0.00    3582    1.82  /lib/ld-2.2.4.so
  14577   14577    3850    1.95       0    0.00    3850    1.95  /lib/ld-2.2.4.so
  14578   14578    3449    1.75       0    0.00    3449    1.75  /lib/ld-2.2.4.so
  14579   14579       1    0.00       0    0.00       1    0.00  /lib/ld-2.2.4.so
  14580   14580      56    0.03     584    0.30     640    0.32  /usr/bin/perl
  14581   14581       8    0.00       2    0.00      10    0.01  /usr/sbin/lintrace




* Re: [Linux-ia64] What's taking all the system time..?
  2002-01-30  5:48 [Linux-ia64] What's taking all the system time..? Duraid Madina
                   ` (5 preceding siblings ...)
  2002-01-30  6:58 ` Niels Christiansen
@ 2002-01-30  7:02 ` Niels Christiansen
  2002-01-30  7:27 ` Niels Christiansen
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Niels Christiansen @ 2002-01-30  7:02 UTC (permalink / raw)
  To: linux-ia64

> P.S. looks like your binutils is a bit out of date?

Could well be.  I had to grab the source and recompile
because of a bug (which turned out to be xmalloc but I
didn't know at the time).  So I'm not really sure what
I'm using -- just that it works for me :-)

The system that compiled okay has the binutils it came with.

... Niels




* Re: [Linux-ia64] What's taking all the system time..?
  2002-01-30  5:48 [Linux-ia64] What's taking all the system time..? Duraid Madina
                   ` (6 preceding siblings ...)
  2002-01-30  7:02 ` Niels Christiansen
@ 2002-01-30  7:27 ` Niels Christiansen
  2002-01-30  9:36 ` duraid
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Niels Christiansen @ 2002-01-30  7:27 UTC (permalink / raw)
  To: linux-ia64

Oops, forgot to mention these two lines.  They represent trace
overhead because my kernel is instrumented to trace all function
entry and exit points.  All it means is that 9.92 + 9.33 percent
of the time spent in your process is trace overhead.  It should
not skew the relative weight of the other profile data.

   4696    9.92    2.38  kernel __cyg_profile_func_enter []
   4416    9.33    2.24  kernel __cyg_profile_func_exit []
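
As a rough sketch of how the trace overhead can be taken out of the picture,
the tick counts below are copied from the by-function profile of process
14573. Which functions count as "unaligned handling" is an assumption made
here (those with "unaligned" in the name or unaligned.c as the source), and
the function names in the code are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Ticks for functions assumed to belong to unaligned-access handling. */
static const long unaligned_ticks[] = {
    4791,  /* ia64_handle_unaligned */
    4471,  /* dispatch_unaligned_handler */
    1576,  /* emulate_load_int */
    1238,  /* setreg */
     925,  /* within_logging_rate_limit */
     640,  /* ia64_prepare_handle_unaligned */
     113,  /* emulate_store_int */
      71   /* getreg */
};

/* Sum an array of tick counts. */
long sum_ticks(const long *ticks, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += ticks[i];
    return total;
}

/* Percentage that `part` makes up of the ticks left after removing
 * `overhead` from `total`. */
double percent_excluding_overhead(long part, long total, long overhead)
{
    assert(total > overhead);
    return 100.0 * (double)part / (double)(total - overhead);
}
```

With the process total of 47339 ticks and a trace overhead of 4696 + 4416
ticks, these eight functions account for 13825 ticks, a little over 36 percent
of what remains. Related entries such as __copy_user and save_switch_stack are
left out because the profile does not say how much of them is due to the
unaligned handler.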

I also checked lock stats.  You wait for and hold locks less
than 150 msecs.  What I can't explain is why the system is 60%
busy (although I know that ACPI handlers keep one processor
100% busy).

-----------------------------------------------------------------
    LTC Whydle Version 1.0.3 *** PROCESSOR TIME DISTRIBUTION
-----------------------------------------------------------------

             Total   TID-0   Other  non-IRQ
CPU  Idle %  Busy %  IRQ %   IRQ %   Busy %
--- ------- ------- ------- ------- -------
  0  62.649  37.351   0.000   0.000  37.351
  1  43.502  56.498   0.000   0.000  56.498
  2   0.000 100.000   0.000   0.000 100.000
  3  55.906  44.094   0.000   0.000  44.094
    ------- ------- ------- ------- -------
     40.514  59.486   0.000   0.000  59.486

... Niels




* Re: [Linux-ia64] What's taking all the system time..?
  2002-01-30  5:48 [Linux-ia64] What's taking all the system time..? Duraid Madina
                   ` (7 preceding siblings ...)
  2002-01-30  7:27 ` Niels Christiansen
@ 2002-01-30  9:36 ` duraid
  2002-01-30 10:04 ` [Linux-ia64] What's taking all the system time..? The unaligned accesses are! Duraid Madina
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: duraid @ 2002-01-30  9:36 UTC (permalink / raw)
  To: linux-ia64

Quoting Stephane Eranian <eranian@frankl.hpl.hp.com>:

> On Wed, Jan 30, 2002 at 05:46:45PM +1100, duraid@fl.net.au wrote:
> > No flops in this code!
> > 
> Sometimes the compiler uses floating-point registers for integer
> operations for speed reasons.

Granted, but (God willing) nobody will ever encounter a compiler crazy enough 
to emit FP code that needs software assistance!

> The kernel DOES NOT generate a printk() for every unaligned access you
> get; it's throttled.

That's what my intuition told me. But the sysadmin assured me that the opposite 
was true. Bad sysadmin.

> You may be casting ints into pointers. Linux/ia64 uses the LP64 data
> model: longs and pointers are 64 bits, but ints are only 32 bits.

#@!%#@!% (I knew that already, but !#%#@%#%) Is it just an urban myth, or was 
it once declared upon high that longs shall always be 32 bits, long longs shall 
always be 64 bits, and ints shall be Whatever Length Is Most Efficiently Dealt 
With By The Hardware(tm)?

Whoever chose LP64 should be sent to Guantanamo Bay for some "those blindfolds 
are for their own safety" interrogation!

     Duraid




* RE: [Linux-ia64] What's taking all the system time..? The unaligned accesses are!
  2002-01-30  5:48 [Linux-ia64] What's taking all the system time..? Duraid Madina
                   ` (8 preceding siblings ...)
  2002-01-30  9:36 ` duraid
@ 2002-01-30 10:04 ` Duraid Madina
  2002-01-30 17:43 ` [Linux-ia64] What's taking all the system time..? Boehm, Hans
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Duraid Madina @ 2002-01-30 10:04 UTC (permalink / raw)
  To: linux-ia64

Okay, sorry for not figuring this one out *before* spamming the list:
the unaligned accesses were really killing things. Now everything's
running five times faster. Yay!

	Duraid





* RE: [Linux-ia64] What's taking all the system time..?
  2002-01-30  5:48 [Linux-ia64] What's taking all the system time..? Duraid Madina
                   ` (9 preceding siblings ...)
  2002-01-30 10:04 ` [Linux-ia64] What's taking all the system time..? The unaligned accesses are! Duraid Madina
@ 2002-01-30 17:43 ` Boehm, Hans
  2002-01-30 17:49 ` Grant Grundler
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Boehm, Hans @ 2002-01-30 17:43 UTC (permalink / raw)
  To: linux-ia64

> Is it just an 
> urban myth, or was 
> it once declared upon high that longs shall always be 32 
> bits, ...
I'm not sure it's even an urban myth.  The first (1978) edition of K&R lists
sizes of integral types on some sample hardware.  Longs were 36 bits on a
Honeywell 6000.  Longs have never uniformly been 32 bits.

Hans



* Re: [Linux-ia64] What's taking all the system time..?
  2002-01-30  5:48 [Linux-ia64] What's taking all the system time..? Duraid Madina
                   ` (10 preceding siblings ...)
  2002-01-30 17:43 ` [Linux-ia64] What's taking all the system time..? Boehm, Hans
@ 2002-01-30 17:49 ` Grant Grundler
  2002-01-30 18:04 ` Niels Christiansen
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Grant Grundler @ 2002-01-30 17:49 UTC (permalink / raw)
  To: linux-ia64

duraid@fl.net.au wrote:
> it once declared upon high that longs shall always be 32 bits, long longs
> shall always be 64 bits, and ints shall be Whatever Length Is Most
> Efficiently Dealt With By The Hardware(tm)?

I believe the "long long" was a non-standard implementation for
32-bit compilers to declare/access 64-bit data structures.

And for LP64, ints are guaranteed to be 32 bits, or it would have been
called ILP64.

grant



* RE: [Linux-ia64] What's taking all the system time..?
  2002-01-30  5:48 [Linux-ia64] What's taking all the system time..? Duraid Madina
                   ` (11 preceding siblings ...)
  2002-01-30 17:49 ` Grant Grundler
@ 2002-01-30 18:04 ` Niels Christiansen
  2002-01-30 21:57 ` Jim Wilson
  2002-02-01 21:28 ` David Mosberger
  14 siblings, 0 replies; 16+ messages in thread
From: Niels Christiansen @ 2002-01-30 18:04 UTC (permalink / raw)
  To: linux-ia64

> > Is it just an urban myth, or was it once declared upon
> > high that longs shall always be 32 bits, ...
>
> I'm not sure it's even an urban myth.  The first (1978)
> edition of K&R lists sizes of integral types on some
> sample hardware.  Longs were 36 bits on a Honeywell 6000.
> Longs have never uniformly been 32 bits.

Could be an urban myth, but that neither makes it right
nor makes sense.  As for Guantanamo Bay, don't wish it upon
anybody.

... Niels




* Re: [Linux-ia64] What's taking all the system time..?
  2002-01-30  5:48 [Linux-ia64] What's taking all the system time..? Duraid Madina
                   ` (12 preceding siblings ...)
  2002-01-30 18:04 ` Niels Christiansen
@ 2002-01-30 21:57 ` Jim Wilson
  2002-02-01 21:28 ` David Mosberger
  14 siblings, 0 replies; 16+ messages in thread
From: Jim Wilson @ 2002-01-30 21:57 UTC (permalink / raw)
  To: linux-ia64

>/tmp/ccC11dFe.s:15538: Error: Unknown opcode `data16.ua @iplt

You need a newer binutils if you want to use gcc 3.0.[23] to compile C++ code.

New IA-64 assembler syntax was introduced last summer, because it was not
possible to fully implement the C++ ABI with the original assembler syntax.
This new assembler syntax is used starting with gcc 3.0.2, and is not supported
by any binutils older than last summer.  That unfortunately means that binutils
2.11 is not new enough; you have to use an unreleased binutils from the FSF
development source tree.

Jim



* Re: [Linux-ia64] What's taking all the system time..?
  2002-01-30  5:48 [Linux-ia64] What's taking all the system time..? Duraid Madina
                   ` (13 preceding siblings ...)
  2002-01-30 21:57 ` Jim Wilson
@ 2002-02-01 21:28 ` David Mosberger
  14 siblings, 0 replies; 16+ messages in thread
From: David Mosberger @ 2002-02-01 21:28 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Wed, 30 Jan 2002 20:36:48 +1100, duraid@fl.net.au said:

  >> The kernel DOES NOT generate a printk() for every unaligned
  >> access you get; it's throttled.

  Duraid> That's what my intuition told me. But the sysadmin assured
  Duraid> me that the opposite was true. Bad sysadmin.

If you want to catch all unaligned faults, you can do:

	prctl --unaligned=signal gdb PROGNAME

This way, gdb will stop (with a SIGBUS signal) when an unaligned
access occurs.
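
The same behaviour can also be requested from inside a program. The sketch
below uses the Linux prctl(2) call with PR_SET_UNALIGN and PR_UNALIGN_SIGBUS;
the wrapper name is hypothetical, and the call is only supported on some
architectures, so the failure path should be expected elsewhere.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/prctl.h>

/* Ask the kernel to deliver SIGBUS on unaligned user-space accesses in
 * this process instead of fixing them up in the kernel.
 * Returns 0 on success, or -1 if the request is rejected (for example
 * with EINVAL on architectures that do not implement PR_SET_UNALIGN). */
int trap_unaligned_accesses(void)
{
    if (prctl(PR_SET_UNALIGN, PR_UNALIGN_SIGBUS, 0, 0, 0) == -1) {
        fprintf(stderr, "PR_SET_UNALIGN: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}
```

Calling trap_unaligned_accesses() early in main() and then running the
program under gdb should stop at the first offending instruction, without
needing the prctl wrapper on the command line.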

	--david


