From: Russell King <rmk+lkml@arm.linux.org.uk>
To: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Nicolas Pitre <nico@cam.org>,
Andrew Morton <akpm@linux-foundation.org>,
torvalds@linux-foundation.org, dhowells@redhat.com,
mingo@elte.hu, a.p.zijlstra@chello.nl,
linux-kernel@vger.kernel.org, ralf@linux-mips.org,
benh@kernel.crashing.org, paulus@samba.org, davem@davemloft.net,
mingo@redhat.com, tglx@linutronix.de, rostedt@goodmis.org,
linux-arch@vger.kernel.org
Subject: Re: [PATCH] convert cnt32_to_63 to inline
Date: Tue, 11 Nov 2008 19:13:55 +0000 [thread overview]
Message-ID: <20081111191355.GA13724@flint.arm.linux.org.uk> (raw)
In-Reply-To: <20081111182759.GA8052@Krystal>
On Tue, Nov 11, 2008 at 01:28:00PM -0500, Mathieu Desnoyers wrote:
> Let's see what it gives once implemented. Only compile-tested. Assumes
> pxa, sa110 and mn10300 are all UP-only. Correct smp_rmb() are used for
> arm versatile.
Versatile is also UP only.
The following are results from PXA built with gcc 3.4.3:
1. two additional registers used in sched_clock()
2. 8 additional bytes of code (which are needless if gcc was more inteligent)
both of these I put down to inefficiencies in gcc's register allocation.
3. worse instruction scheduling - two inter-dependent loads next to each
other causing a pipeline stall
Actual reading of variables/hardware is unaffected by this patch.
Old code:
c: e59f3050 ldr r3, [pc, #80] ; load address of oscr2ns_scale
10: e59fc050 ldr ip, [pc, #80] ; load address of __m_cnt_hi
14: e5932000 ldr r2, [r3] ; read oscr2ns_scale
18: e59f304c ldr r3, [pc, #76] ; load address of OSCR
1c: e59c1000 ldr r1, [ip] ; read __m_cnt_hi
20: e1a07002 mov r7, r2
24: e3a08000 mov r8, #0 ; 0x0
28: e5933000 ldr r3, [r3] ; read OSCR register
...
58: e1820b04 orr r0, r2, r4, lsl #22
5c: e1a01524 lsr r1, r4, #10
60: e89da9f0 ldm sp, {r4, r5, r6, r7, r8, fp, sp, pc}
New code:
c: e59f0058 ldr r0, [pc, #88] ; load address of oscr2ns_scale
10: e5901000 ldr r1, [r0] ; read oscr2ns_scale <= pipeline stall
14: e59f3054 ldr r3, [pc, #84] ; load address of __m_cnt_hi
18: e1a08001 mov r8, r1
1c: e5932000 ldr r2, [r3] ; read __m_cnt_hi
20: e59f304c ldr r3, [pc, #76] ; load address of OSCR
24: e1a09002 mov r9, r2
28: e3a0a000 mov sl, #0 ; 0x0
2c: e5933000 ldr r3, [r3] ; read OSCR
...
58: e1825b04 orr r5, r2, r4, lsl #22
5c: e1a06524 lsr r6, r4, #10
60: e1a01006 mov r1, r6
64: e1a00005 mov r0, r5
68: e89daff0 ldm sp, {r4, r5, r6, r7, r8, r9, sl, fp, sp, pc}
Versatile:
1. 12 additional bytes of code
2. same number of registers
3. worse instruction scheduling causing pipeline stall
Actual reading of variables/hardware is unaffected by this patch.
So, we have two platforms where this patch makes things visibly worse
with no material benefit.
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of:
next prev parent reply other threads:[~2008-11-11 19:16 UTC|newest]
Thread overview: 118+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-11-07 5:23 [RFC patch 00/18] Trace Clock v2 Mathieu Desnoyers
2008-11-07 5:23 ` [RFC patch 01/18] get_cycles() : kconfig HAVE_GET_CYCLES Mathieu Desnoyers
2008-11-07 5:23 ` [RFC patch 02/18] get_cycles() : x86 HAVE_GET_CYCLES Mathieu Desnoyers
2008-11-07 5:23 ` [RFC patch 03/18] get_cycles() : sparc64 HAVE_GET_CYCLES Mathieu Desnoyers
2008-11-07 5:23 ` [RFC patch 04/18] get_cycles() : powerpc64 HAVE_GET_CYCLES Mathieu Desnoyers
2008-11-07 14:56 ` Josh Boyer
2008-11-07 18:14 ` Mathieu Desnoyers
2008-11-07 5:23 ` [RFC patch 05/18] get_cycles() : MIPS HAVE_GET_CYCLES_32 Mathieu Desnoyers
2008-11-07 5:23 ` [RFC patch 06/18] Trace clock generic Mathieu Desnoyers
2008-11-07 5:23 ` [RFC patch 07/18] Trace clock core Mathieu Desnoyers
2008-11-07 5:52 ` Andrew Morton
2008-11-07 6:16 ` Mathieu Desnoyers
2008-11-07 6:26 ` Andrew Morton
2008-11-07 16:12 ` Mathieu Desnoyers
2008-11-07 16:19 ` Andrew Morton
2008-11-07 18:16 ` Mathieu Desnoyers
2008-11-07 5:23 ` [RFC patch 08/18] cnt32_to_63 should use smp_rmb() Mathieu Desnoyers
2008-11-07 6:05 ` Andrew Morton
2008-11-07 8:12 ` Nicolas Pitre
2008-11-07 8:38 ` Andrew Morton
2008-11-07 11:20 ` David Howells
2008-11-07 15:01 ` Nicolas Pitre
2008-11-07 15:50 ` Andrew Morton
2008-11-07 16:21 ` David Howells
2008-11-07 16:29 ` Andrew Morton
2008-11-07 17:10 ` David Howells
2008-11-07 17:26 ` Andrew Morton
2008-11-07 18:00 ` Mathieu Desnoyers
2008-11-07 18:21 ` Andrew Morton
2008-11-07 18:30 ` Harvey Harrison
2008-11-07 18:42 ` Mathieu Desnoyers
2008-11-07 18:33 ` Mathieu Desnoyers
2008-11-07 18:36 ` Linus Torvalds
2008-11-07 18:45 ` Andrew Morton
2008-11-07 16:47 ` Nicolas Pitre
2008-11-07 16:55 ` David Howells
2008-11-07 17:21 ` Andrew Morton
2008-11-07 20:03 ` Nicolas Pitre
2008-11-07 16:07 ` David Howells
2008-11-07 16:47 ` Mathieu Desnoyers
2008-11-07 17:04 ` David Howells
2008-11-07 17:17 ` Mathieu Desnoyers
2008-11-07 23:27 ` David Howells
2008-11-07 20:11 ` Russell King
2008-11-07 21:36 ` Mathieu Desnoyers
2008-11-07 22:18 ` Russell King
2008-11-07 22:36 ` Mathieu Desnoyers
2008-11-07 23:41 ` David Howells
2008-11-08 0:15 ` Russell King
2008-11-08 0:45 ` David Howells
2008-11-08 15:24 ` Nicolas Pitre
2008-11-08 23:20 ` [PATCH] clarify usage expectations for cnt32_to_63() Nicolas Pitre
2008-11-09 2:25 ` Mathieu Desnoyers
2008-11-09 2:54 ` Nicolas Pitre
2008-11-09 5:06 ` Nicolas Pitre
2008-11-09 5:27 ` [PATCH v2] " Nicolas Pitre
2008-11-09 6:48 ` Mathieu Desnoyers
2008-11-09 13:34 ` Nicolas Pitre
2008-11-09 13:43 ` Russell King
2008-11-09 16:22 ` Mathieu Desnoyers
2008-11-10 4:20 ` Nicolas Pitre
2008-11-10 4:42 ` Andrew Morton
2008-11-10 21:34 ` Nicolas Pitre
2008-11-10 21:58 ` Andrew Morton
2008-11-10 23:15 ` Nicolas Pitre
2008-11-10 23:22 ` Andrew Morton
2008-11-10 23:38 ` Steven Rostedt
2008-11-11 0:26 ` Nicolas Pitre
2008-11-11 18:28 ` [PATCH] convert cnt32_to_63 to inline Mathieu Desnoyers
2008-11-11 19:13 ` Russell King [this message]
2008-11-11 20:11 ` Mathieu Desnoyers
2008-11-11 21:51 ` Russell King
2008-11-12 3:48 ` Mathieu Desnoyers
2008-11-11 21:00 ` Nicolas Pitre
2008-11-11 21:13 ` Russell King
2008-11-11 22:31 ` David Howells
2008-11-11 22:37 ` Peter Zijlstra
2008-11-12 1:13 ` Steven Rostedt
2008-11-07 11:03 ` [RFC patch 08/18] cnt32_to_63 should use smp_rmb() David Howells
2008-11-07 16:51 ` Mathieu Desnoyers
2008-11-07 20:18 ` Nicolas Pitre
2008-11-07 23:55 ` David Howells
2008-11-07 10:59 ` David Howells
2008-11-07 10:55 ` David Howells
2008-11-07 17:09 ` Mathieu Desnoyers
2008-11-07 17:33 ` Steven Rostedt
2008-11-07 19:18 ` Mathieu Desnoyers
2008-11-07 19:32 ` Peter Zijlstra
2008-11-07 20:02 ` Mathieu Desnoyers
2008-11-07 20:45 ` Mathieu Desnoyers
2008-11-07 20:54 ` Paul E. McKenney
2008-11-07 21:04 ` Steven Rostedt
2008-11-08 0:34 ` Paul E. McKenney
2008-11-07 21:16 ` Mathieu Desnoyers
2008-11-07 23:50 ` David Howells
2008-11-08 0:55 ` Steven Rostedt
2008-11-09 11:51 ` David Howells
2008-11-09 14:31 ` Steven Rostedt
2008-11-09 16:18 ` Mathieu Desnoyers
2008-11-07 20:08 ` Steven Rostedt
2008-11-07 20:55 ` Paul E. McKenney
2008-11-07 21:27 ` Mathieu Desnoyers
2008-11-07 20:36 ` Nicolas Pitre
2008-11-07 20:55 ` Mathieu Desnoyers
2008-11-07 21:22 ` Nicolas Pitre
2008-11-07 5:23 ` [RFC patch 09/18] Powerpc : Trace clock Mathieu Desnoyers
2008-11-07 5:23 ` [RFC patch 10/18] Sparc64 " Mathieu Desnoyers
2008-11-07 5:45 ` David Miller
2008-11-07 5:23 ` [RFC patch 11/18] LTTng timestamp sh Mathieu Desnoyers
2008-11-07 5:23 ` [RFC patch 12/18] LTTng - TSC synchronicity test Mathieu Desnoyers
2008-11-07 5:23 ` [RFC patch 13/18] x86 : remove arch-specific tsc_sync.c Mathieu Desnoyers
2008-11-07 5:23 ` [RFC patch 14/18] MIPS use tsc_sync.c Mathieu Desnoyers
2008-11-07 5:23 ` [RFC patch 15/18] MIPS : export hpt frequency for trace_clock Mathieu Desnoyers
2008-11-07 5:23 ` [RFC patch 16/18] MIPS create empty sync_core() Mathieu Desnoyers
2008-11-07 5:23 ` [RFC patch 17/18] MIPS : Trace clock Mathieu Desnoyers
2008-11-07 11:53 ` Peter Zijlstra
2008-11-07 17:44 ` Mathieu Desnoyers
2008-11-07 5:23 ` [RFC patch 18/18] x86 trace clock Mathieu Desnoyers
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20081111191355.GA13724@flint.arm.linux.org.uk \
--to=rmk+lkml@arm.linux.org.uk \
--cc=a.p.zijlstra@chello.nl \
--cc=akpm@linux-foundation.org \
--cc=benh@kernel.crashing.org \
--cc=davem@davemloft.net \
--cc=dhowells@redhat.com \
--cc=linux-arch@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mathieu.desnoyers@polymtl.ca \
--cc=mingo@elte.hu \
--cc=mingo@redhat.com \
--cc=nico@cam.org \
--cc=paulus@samba.org \
--cc=ralf@linux-mips.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox