From: Keith Owens <kaos@sgi.com>
To: linux-ia64@vger.kernel.org
Subject: Re: optimize __gp location
Date: Mon, 24 Jan 2005 15:32:08 +0000
Message-ID: <13890.1106580728@ocs3.ocs.com.au>
In-Reply-To: <B05667366EE6204181EABE9C1B1C0EB50589FCE9@scsmsx401.amr.corp.intel.com>
On Mon, 24 Jan 2005 14:44:22 +0100,
Christian Hildner <christian.hildner@hob.de> wrote:
>Keith Owens wrote:
>>When jiffies is within 22 bit range of __gp, the linker writes the
>>sequence as
>>
>> addl r20=offset_of(jiffies,__gp),r1;;
>> mov r16=r20;;
>> ld8.acq r23=[r16] // value of jiffies
>>
>Is there a restriction against rewriting this to
>
> addl r16=offset_of(jiffies,__gp),r1;;
> ld8.acq r23=[r16] // value of jiffies
> nop.i 0
>
>because that would save at least one cycle and would make bundling easier
>(depending on the additional instructions, of course).
The code snippet was a simplification of what gcc actually does. If
you look at some object code, you will find that the three instructions
are already spread over multiple bundles. Moving the final ld8 upwards
cannot save any cycles; you still have to execute the same number of
bundles. A real example from kernel/sched.o:
    4830:  09 50 20 42 00 21   [MMI]  adds r10=8,r33
                        4832: LTOFF22X  jiffies
    4836:  20 81 84 00 42 c0          adds r18=16,r33
    483c:  01 08 00 90                addl r14=0,r1;;
    4840:  08 00 08 1e d8 19   [MMI]  stf.spill [r15]=f2
                        4841: LDXMOV    jiffies
                        4842: LTOFF22X  __per_cpu_offset
    4846:  b0 00 38 30 20 40          ld8 r11=[r14]
    484c:  03 08 00 90                addl r26=0,r1
    4850:  08 a0 00 02 00 24   [MMI]  addl r20=0,r1
                        4850: LTOFF22X  .data.percpu+0x440
    4856:  90 00 01 20 40 e0          shladd r9=r32,1,r0
    485c:  02 00 59 00                sxt4 r23=r32
    4860:  08 40 00 14 18 10   [MMI]  ld8 r8=[r10]
    4866:  10 01 48 30 20 e0          ld8 r17=[r18]
    486c:  04 00 c4 00                mov r39=b0
    4870:  05 00 00 00 01 40   [MLX]  nop.m 0x0
    4876:  10 00 00 00 00 60          movl r27=0x10624dd3;;
    487c:  33 55 6c 62
    4880:  10 00 00 00 01 00   [MIB]  nop.m 0x0
    4886:  f0 40 e0 f0 29 00          shl r15=r8,7
    488c:  00 00 00 20                nop.b 0x0
    4890:  09 c0 00 34 18 10   [MMI]  ld8 r24=[r26]
                        4890: LDXMOV    __per_cpu_offset
    4896:  30 00 2c 70 21 40          ld8.acq r3=[r11]
The LDXMOV relocation is designed to make it simple to convert the
instruction from ld8 r11=[r14] to mov r11=r14; that is easy to do in
place. Moving an entire slot around is a lot messier, for no
performance gain.
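
To illustrate the in-place rewrite, here is a rough Python sketch of the
two relaxations. It is a toy model and not the linker's code: the
function names, the symbolic (opcode, dest, source) tuples and the
addresses are all made up for the example, and no real instruction
encoding is attempted. The only facts taken from the thread are the
22 bit range test against __gp and the ld8 to mov conversion at the
slot marked by LDXMOV.

```python
# Hypothetical model of the link-time relaxation discussed above.
# Instructions are symbolic tuples, not real IA-64 encodings.

IMM22_MIN = -(1 << 21)       # addl takes a 22-bit signed immediate
IMM22_MAX = (1 << 21) - 1


def in_gp_range(sym_addr, gp):
    """True if the symbol can be addressed as gp + imm22."""
    return IMM22_MIN <= sym_addr - gp <= IMM22_MAX


def relax(insns, relocs, symbols, gp):
    """Return a relaxed copy of insns; the slot count never changes.

    insns:   list of (opcode, dest, source) tuples, one per slot
    relocs:  list of (slot_index, reloc_type, symbol_name)
    symbols: dict mapping symbol name to address
    """
    out = list(insns)
    for slot, rtype, name in relocs:
        if not in_gp_range(symbols[name], gp):
            continue  # out of range: keep the load through the GOT
        op, dest, src = out[slot]
        if rtype == "LTOFF22X":
            # addl dest=<GOT offset>,r1  ->  addl dest=<sym - gp>,r1
            out[slot] = ("addl", dest, symbols[name] - gp)
        elif rtype == "LDXMOV":
            # ld8 dest=[src]  ->  mov dest=src, rewritten in the same slot
            out[slot] = ("mov", dest, src)
    return out


# The jiffies sequence from sched.o, reduced to its three instructions.
insns = [("addl", "r14", 0), ("ld8", "r11", "r14"), ("ld8.acq", "r3", "r11")]
relocs = [(0, "LTOFF22X", "jiffies"), (1, "LDXMOV", "jiffies")]

# Made-up addresses: jiffies sits 0x1000 bytes below gp, so it is in range.
near = relax(insns, relocs, {"jiffies": 0x1000}, gp=0x2000)
for insn in near:
    print(insn)
```

Each relocation touches exactly one slot and the list keeps its length,
which is why the rewrite needs no knowledge of the neighbouring
instructions or of the bundle templates.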