Re: optimize __gp location - Christian Hildner

From: Christian Hildner <christian.hildner@hob.de>
To: linux-ia64@vger.kernel.org
Subject: Re: optimize __gp location
Date: Tue, 25 Jan 2005 07:30:26 +0000
Message-ID: <41F5F592.3090806@hob.de>
In-Reply-To: <B05667366EE6204181EABE9C1B1C0EB50589FCE9@scsmsx401.amr.corp.intel.com>

Keith Owens wrote:

>On Mon, 24 Jan 2005 14:44:22 +0100, 
>Christian Hildner <christian.hildner@hob.de> wrote:
>  
>
>>Keith Owens wrote:
>>    
>>
>>>When jiffies is within 22 bit range of __gp, the linker writes the
>>>sequence as
>>>
>>>   addl r20=offset_of(jiffies,__gp),r1;;
>>>   mov r16=r20;;
>>>   ld8.acq r23=[r16]	// value of jiffies
>>>
>>>      
>>>
>>Is there a restriction to not rewrite to
>>
>>   addl r16=offset_of(jiffies,__gp),r1;;
>>   ld8.acq r23=[r16]	// value of jiffies
>>   nop.i 0
>>
>>because that would save at least one cycle and would make bundling easier (depending on additional instructions, of course).
>>    
>>
>
>The code snippet was a simplification of what gcc actually does.  If
>you look at some object code, you will find that the 3 instructions are
>already spread over multiple bundles.  Moving the final ld8 upwards
>cannot save any cycles, you still have to execute the same number of
>bundles.  
>
But it is one instruction group fewer, and that translates to at least
(here exactly) one cycle.
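
As a rough illustration of the instruction-group argument, here is a minimal
Python sketch that counts the stop-delimited groups in the two simplified
sequences quoted above. It assumes a toy cost model in which every
instruction group costs at least one cycle, and it ignores load latency and
whatever surrounds the snippet. The helper name count_instruction_groups is
made up for this example.

```python
def count_instruction_groups(asm: str) -> int:
    """Count instruction groups, i.e. runs of instructions ended by a ';;' stop."""
    groups = 0
    open_group = False
    for line in asm.strip().splitlines():
        text = line.split("//")[0].strip()  # drop trailing comments
        if not text:
            continue
        open_group = True
        if text.endswith(";;"):
            groups += 1
            open_group = False
    # A trailing run without a stop still forms one (unfinished) group.
    return groups + (1 if open_group else 0)


# Simplified sequence as written by the linker (from the quoted mail).
ORIGINAL = """
addl r20=offset_of(jiffies,__gp),r1;;
mov r16=r20;;
ld8.acq r23=[r16]   // value of jiffies
"""

# Proposed rewrite without the intermediate mov.
REWRITTEN = """
addl r16=offset_of(jiffies,__gp),r1;;
ld8.acq r23=[r16]   // value of jiffies
nop.i 0
"""

saved = count_instruction_groups(ORIGINAL) - count_instruction_groups(REWRITTEN)
print(count_instruction_groups(ORIGINAL), count_instruction_groups(REWRITTEN), saved)
```

Under that assumption the rewrite is one group shorter. Whether the saving
survives in real object code depends on how the instructions are scheduled
around it.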

>A real example from kernel/sched.o
>
>    4830:       09 50 20 42 00 21       [MMI]       adds r10=8,r33
>                        4832: LTOFF22X  jiffies
>    4836:       20 81 84 00 42 c0                   adds r18=16,r33
>    483c:       01 08 00 90                         addl r14=0,r1;;
>    4840:       08 00 08 1e d8 19       [MMI]       stf.spill [r15]=f2
>                        4841: LDXMOV    jiffies
>                        4842: LTOFF22X  __per_cpu_offset
>    4846:       b0 00 38 30 20 40                   ld8 r11=[r14]
>    484c:       03 08 00 90                         addl r26=0,r1
>    4850:       08 a0 00 02 00 24       [MMI]       addl r20=0,r1
>                        4850: LTOFF22X  .data.percpu+0x440
>    4856:       90 00 01 20 40 e0                   shladd r9=r32,1,r0
>    485c:       02 00 59 00                         sxt4 r23=r32
>    4860:       08 40 00 14 18 10       [MMI]       ld8 r8=[r10]
>    4866:       10 01 48 30 20 e0                   ld8 r17=[r18]
>    486c:       04 00 c4 00                         mov r39=b0
>    4870:       05 00 00 00 01 40       [MLX]       nop.m 0x0
>    4876:       10 00 00 00 00 60                   movl r27=0x10624dd3;;
>    487c:       33 55 6c 62 
>    4880:       10 00 00 00 01 00       [MIB]       nop.m 0x0
>    4886:       f0 40 e0 f0 29 00                   shl r15=r8,7
>    488c:       00 00 00 20                         nop.b 0x0
>    4890:       09 c0 00 34 18 10       [MMI]       ld8 r24=[r26]
>                        4890: LDXMOV    __per_cpu_offset
>    4896:       30 00 2c 70 21 40                   ld8.acq r3=[r11]
>
>The LDXMOV relocation is designed to make it simple to convert the
>instruction from ld8 r11=[r14] to mov r11=r14, it is easy to do in
>place.
>
Ok, simplicity is an argument.

>  Moving an entire slot around is a lot messier, for no
>performance gain.
>
You still have one memory unit wasted on the mov, which is logically a nop.
So, depending on the CPU implementation, there is a possible loss of one
cycle, especially for memory-intensive code fragments/instruction groups.
In the example, the instruction group containing the LDXMOV uses seven
memory units. And if the CPU has only six of them implemented? But I see the
complexity of changing that. It would require another optimizer step.
A linker optimizer?
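
To make the count concrete, here is a small Python sketch that tallies the
M slots of the instruction group containing the LDXMOV slot, using the
bundle templates from the kernel/sched.o listing (offsets 4840 to 4870, up
to the stop after movl). The number of memory units per cycle is a purely
hypothetical parameter, and the result is only a lower bound derived from
M-slot pressure, not a model of any real CPU.

```python
# Bundle templates of the group, transcribed from the quoted listing.
GROUP_TEMPLATES = ["MMI", "MMI", "MMI", "MLX"]


def m_slots(templates):
    """Number of M slots occupied by the given bundle templates."""
    return sum(t.count("M") for t in templates)


def min_cycles(m_used, m_units):
    """Lower bound on cycles if at most m_units M slots issue per cycle."""
    return -(-m_used // m_units)  # ceiling division


used = m_slots(GROUP_TEMPLATES)
HYPOTHETICAL_M_UNITS = 6  # assumption for illustration only

print(used)                                        # M slots in the group
print(min_cycles(used, HYPOTHETICAL_M_UNITS))      # with the mov in place
print(min_cycles(used - 1, HYPOTHETICAL_M_UNITS))  # if that slot were freed
```

With six hypothetical units, the seventh M slot pushes the bound from one
cycle to two, which is the loss I mean.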

Christian
