All of lore.kernel.org
 help / color / mirror / Atom feed
From: Russell King <rmk+lkml@arm.linux.org.uk>
To: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Nicolas Pitre <nico@cam.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	torvalds@linux-foundation.org, dhowells@redhat.com,
	mingo@elte.hu, a.p.zijlstra@chello.nl,
	linux-kernel@vger.kernel.org, ralf@linux-mips.org,
	benh@kernel.crashing.org, paulus@samba.org, davem@davemloft.net,
	mingo@redhat.com, tglx@linutronix.de, rostedt@goodmis.org,
	linux-arch@vger.kernel.org
Subject: Re: [PATCH] convert cnt32_to_63 to inline
Date: Tue, 11 Nov 2008 19:13:55 +0000	[thread overview]
Message-ID: <20081111191355.GA13724@flint.arm.linux.org.uk> (raw)
In-Reply-To: <20081111182759.GA8052@Krystal>

On Tue, Nov 11, 2008 at 01:28:00PM -0500, Mathieu Desnoyers wrote:
> Let's see what it gives once implemented. Only compile-tested. Assumes
> pxa, sa110 and mn10300 are all UP-only. Correct smp_rmb() are used for
> arm versatile.

Versatile is also UP only.

The following are results from PXA built with gcc 3.4.3:

1. two additional registers used in sched_clock()
2. 8 additional bytes of code (which are needless if gcc was more inteligent)

both of these I put down to inefficiencies in gcc's register allocation.

3. worse instruction scheduling - two inter-dependent loads next to each
   other causing a pipeline stall

Actual reading of variables/hardware is unaffected by this patch.

Old code:

   c:   e59f3050        ldr     r3, [pc, #80]   ; load address of oscr2ns_scale
  10:   e59fc050        ldr     ip, [pc, #80]   ; load address of __m_cnt_hi
  14:   e5932000        ldr     r2, [r3]	; read oscr2ns_scale
  18:   e59f304c        ldr     r3, [pc, #76]   ; load address of OSCR
  1c:   e59c1000        ldr     r1, [ip]	; read __m_cnt_hi
  20:   e1a07002        mov     r7, r2
  24:   e3a08000        mov     r8, #0  ; 0x0
  28:   e5933000        ldr     r3, [r3]	; read OSCR register
...
  58:   e1820b04        orr     r0, r2, r4, lsl #22
  5c:   e1a01524        lsr     r1, r4, #10
  60:   e89da9f0        ldm     sp, {r4, r5, r6, r7, r8, fp, sp, pc}


New code:

   c:   e59f0058        ldr     r0, [pc, #88]   ; load address of oscr2ns_scale
  10:   e5901000        ldr     r1, [r0]	; read oscr2ns_scale <= pipeline stall
  14:   e59f3054        ldr     r3, [pc, #84]   ; load address of __m_cnt_hi
  18:   e1a08001        mov     r8, r1
  1c:   e5932000        ldr     r2, [r3]	; read __m_cnt_hi
  20:   e59f304c        ldr     r3, [pc, #76]   ; load address of OSCR
  24:   e1a09002        mov     r9, r2
  28:   e3a0a000        mov     sl, #0  ; 0x0
  2c:   e5933000        ldr     r3, [r3]	; read OSCR
...
  58:   e1825b04        orr     r5, r2, r4, lsl #22
  5c:   e1a06524        lsr     r6, r4, #10
  60:   e1a01006        mov     r1, r6
  64:   e1a00005        mov     r0, r5
  68:   e89daff0        ldm     sp, {r4, r5, r6, r7, r8, r9, sl, fp, sp, pc}

Versatile:

1. 12 additional bytes of code
2. same number of registers
3. worse instruction scheduling causing pipeline stall

Actual reading of variables/hardware is unaffected by this patch.

So, we have two platforms where this patch makes things visibly worse
with no material benefit.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:

  reply	other threads:[~2008-11-11 19:13 UTC|newest]

Thread overview: 123+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-11-07  5:23 [RFC patch 00/18] Trace Clock v2 Mathieu Desnoyers
2008-11-07  5:23 ` [RFC patch 01/18] get_cycles() : kconfig HAVE_GET_CYCLES Mathieu Desnoyers
2008-11-07  5:23 ` [RFC patch 02/18] get_cycles() : x86 HAVE_GET_CYCLES Mathieu Desnoyers
2008-11-07  5:23 ` [RFC patch 03/18] get_cycles() : sparc64 HAVE_GET_CYCLES Mathieu Desnoyers
2008-11-07  5:23 ` [RFC patch 04/18] get_cycles() : powerpc64 HAVE_GET_CYCLES Mathieu Desnoyers
2008-11-07 14:56   ` Josh Boyer
2008-11-07 18:14     ` Mathieu Desnoyers
2008-11-07  5:23 ` [RFC patch 05/18] get_cycles() : MIPS HAVE_GET_CYCLES_32 Mathieu Desnoyers
2008-11-07  5:23 ` [RFC patch 06/18] Trace clock generic Mathieu Desnoyers
2008-11-07  5:23 ` [RFC patch 07/18] Trace clock core Mathieu Desnoyers
2008-11-07  5:52   ` Andrew Morton
2008-11-07  6:16     ` Mathieu Desnoyers
2008-11-07  6:26       ` Andrew Morton
2008-11-07 16:12         ` Mathieu Desnoyers
2008-11-07 16:19           ` Andrew Morton
2008-11-07 18:16             ` Mathieu Desnoyers
2008-11-07  5:23 ` [RFC patch 08/18] cnt32_to_63 should use smp_rmb() Mathieu Desnoyers
2008-11-07  6:05   ` Andrew Morton
2008-11-07  8:12     ` Nicolas Pitre
2008-11-07  8:38       ` Andrew Morton
2008-11-07 11:20         ` David Howells
2008-11-07 15:01           ` Nicolas Pitre
2008-11-07 15:50             ` Andrew Morton
2008-11-07 16:21               ` David Howells
2008-11-07 16:29                 ` Andrew Morton
2008-11-07 17:10                   ` David Howells
2008-11-07 17:26                     ` Andrew Morton
2008-11-07 18:00                       ` Mathieu Desnoyers
2008-11-07 18:21                         ` Andrew Morton
2008-11-07 18:30                           ` Harvey Harrison
2008-11-07 18:42                             ` Mathieu Desnoyers
2008-11-07 18:33                           ` Mathieu Desnoyers
2008-11-07 18:36                           ` Linus Torvalds
2008-11-07 18:45                             ` Andrew Morton
2008-11-07 16:47               ` Nicolas Pitre
2008-11-07 16:55                 ` David Howells
2008-11-07 17:21                 ` Andrew Morton
2008-11-07 20:03                   ` Nicolas Pitre
2008-11-07 16:07             ` David Howells
2008-11-07 16:47           ` Mathieu Desnoyers
2008-11-07 17:04             ` David Howells
2008-11-07 17:17               ` Mathieu Desnoyers
2008-11-07 23:27                 ` David Howells
2008-11-07 20:11             ` Russell King
2008-11-07 21:36               ` Mathieu Desnoyers
2008-11-07 21:36                 ` Mathieu Desnoyers
2008-11-07 22:18                 ` Russell King
2008-11-07 22:36                   ` Mathieu Desnoyers
2008-11-07 22:36                     ` Mathieu Desnoyers
2008-11-07 23:41                   ` David Howells
2008-11-08  0:15                     ` Russell King
2008-11-08  0:45                       ` David Howells
2008-11-08 15:24                       ` Nicolas Pitre
2008-11-08 23:20                         ` [PATCH] clarify usage expectations for cnt32_to_63() Nicolas Pitre
2008-11-09  2:25                           ` Mathieu Desnoyers
2008-11-09  2:54                             ` Nicolas Pitre
2008-11-09  5:06                               ` Nicolas Pitre
2008-11-09  5:27                                 ` [PATCH v2] " Nicolas Pitre
2008-11-09  6:48                                   ` Mathieu Desnoyers
2008-11-09 13:34                                     ` Nicolas Pitre
2008-11-09 13:43                                       ` Russell King
2008-11-09 16:22                                       ` Mathieu Desnoyers
2008-11-10  4:20                                         ` Nicolas Pitre
2008-11-10  4:42                                           ` Andrew Morton
2008-11-10 21:34                                             ` Nicolas Pitre
2008-11-10 21:58                                               ` Andrew Morton
2008-11-10 23:15                                                 ` Nicolas Pitre
2008-11-10 23:22                                                   ` Andrew Morton
2008-11-10 23:38                                                     ` Steven Rostedt
2008-11-11  0:26                                                     ` Nicolas Pitre
2008-11-11 18:28                                                       ` [PATCH] convert cnt32_to_63 to inline Mathieu Desnoyers
2008-11-11 19:13                                                         ` Russell King [this message]
2008-11-11 20:11                                                           ` Mathieu Desnoyers
2008-11-11 20:11                                                             ` Mathieu Desnoyers
2008-11-11 21:51                                                             ` Russell King
2008-11-12  3:48                                                               ` Mathieu Desnoyers
2008-11-12  3:48                                                                 ` Mathieu Desnoyers
2008-11-11 21:00                                                         ` Nicolas Pitre
2008-11-11 21:13                                                           ` Russell King
2008-11-11 22:31                                                         ` David Howells
2008-11-11 22:37                                                           ` Peter Zijlstra
2008-11-12  1:13                                                             ` Steven Rostedt
2008-11-07 11:03       ` [RFC patch 08/18] cnt32_to_63 should use smp_rmb() David Howells
2008-11-07 16:51         ` Mathieu Desnoyers
2008-11-07 20:18           ` Nicolas Pitre
2008-11-07 23:55             ` David Howells
2008-11-07 10:59     ` David Howells
2008-11-07 10:55   ` David Howells
2008-11-07 17:09     ` Mathieu Desnoyers
2008-11-07 17:33       ` Steven Rostedt
2008-11-07 19:18         ` Mathieu Desnoyers
2008-11-07 19:32           ` Peter Zijlstra
2008-11-07 20:02             ` Mathieu Desnoyers
2008-11-07 20:45               ` Mathieu Desnoyers
2008-11-07 20:54                 ` Paul E. McKenney
2008-11-07 21:04                   ` Steven Rostedt
2008-11-08  0:34                     ` Paul E. McKenney
2008-11-07 21:16                   ` Mathieu Desnoyers
2008-11-07 23:50         ` David Howells
2008-11-08  0:55           ` Steven Rostedt
2008-11-09 11:51             ` David Howells
2008-11-09 14:31               ` Steven Rostedt
2008-11-09 16:18               ` Mathieu Desnoyers
2008-11-07 20:08       ` Steven Rostedt
2008-11-07 20:55         ` Paul E. McKenney
2008-11-07 21:27         ` Mathieu Desnoyers
2008-11-07 20:36       ` Nicolas Pitre
2008-11-07 20:55         ` Mathieu Desnoyers
2008-11-07 21:22           ` Nicolas Pitre
2008-11-07  5:23 ` [RFC patch 09/18] Powerpc : Trace clock Mathieu Desnoyers
2008-11-07  5:23 ` [RFC patch 10/18] Sparc64 " Mathieu Desnoyers
2008-11-07  5:45   ` David Miller
2008-11-07  5:23 ` [RFC patch 11/18] LTTng timestamp sh Mathieu Desnoyers
2008-11-07  5:23   ` Mathieu Desnoyers
2008-11-07  5:23 ` [RFC patch 12/18] LTTng - TSC synchronicity test Mathieu Desnoyers
2008-11-07  5:23 ` [RFC patch 13/18] x86 : remove arch-specific tsc_sync.c Mathieu Desnoyers
2008-11-07  5:23 ` [RFC patch 14/18] MIPS use tsc_sync.c Mathieu Desnoyers
2008-11-07  5:23 ` [RFC patch 15/18] MIPS : export hpt frequency for trace_clock Mathieu Desnoyers
2008-11-07  5:23 ` [RFC patch 16/18] MIPS create empty sync_core() Mathieu Desnoyers
2008-11-07  5:23 ` [RFC patch 17/18] MIPS : Trace clock Mathieu Desnoyers
2008-11-07 11:53   ` Peter Zijlstra
2008-11-07 17:44     ` Mathieu Desnoyers
2008-11-07  5:23 ` [RFC patch 18/18] x86 trace clock Mathieu Desnoyers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20081111191355.GA13724@flint.arm.linux.org.uk \
    --to=rmk+lkml@arm.linux.org.uk \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=benh@kernel.crashing.org \
    --cc=davem@davemloft.net \
    --cc=dhowells@redhat.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@polymtl.ca \
    --cc=mingo@elte.hu \
    --cc=mingo@redhat.com \
    --cc=nico@cam.org \
    --cc=paulus@samba.org \
    --cc=ralf@linux-mips.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.