From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Nicolas Pitre <nico@cam.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Ingo Molnar <mingo@elte.hu>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
linux-kernel@vger.kernel.org, Ralf Baechle <ralf@linux-mips.org>,
benh@kernel.crashing.org, paulus@samba.org,
David Miller <davem@davemloft.net>,
Ingo Molnar <mingo@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>,
Steven Rostedt <rostedt@goodmis.org>,
linux-arch@vger.kernel.org
Subject: Re: [RFC patch 08/18] cnt32_to_63 should use smp_rmb()
Date: Fri, 7 Nov 2008 11:47:58 -0500 [thread overview]
Message-ID: <20081107164758.GB22134@Krystal> (raw)
In-Reply-To: <25363.1226056819@redhat.com>
* David Howells (dhowells@redhat.com) wrote:
> Andrew Morton <akpm@linux-foundation.org> wrote:
>
> > We have a macro which must only have a single usage in any particular
> > kernel build (and nothing to detect a violation of that).
>
> That's not true. It's a macro containing a _static_ local variable, therefore
> the macro may be used multiple times, and each time it's used the compiler
> will allocate a new piece of storage.
>
> > It apparently tries to avoid races via ordering tricks, as long
> > as it is called with sufficient frequency. But nothing guarantees
> > that it _is_ called sufficiently frequently?
>
> The comment attached to it clearly states this restriction. Therefore the
> caller must guarantee it. That is something Mathieu's code and my code must
> deal with, not Nicolas's.
>
> > There is absolutely no reason why the first two of these quite bad things
> > needed to be done. In fact there is no reason why it needed to be
> > implemented as a macro at all.
>
> There's a very good reason to implement it as either a macro or an inline
> function: it's faster. Moving the whole thing out of line would impose an
> additional function call overhead - with a 64-bit return value on 32-bit
> platforms. For my case - sched_clock() - I'm willing to burn a bit of extra
> space to get the extra speed.
>
> > As I said in the text which you deleted and ignored, this would be
> > better if it was implemented as a C function which requires that the
> > caller explicitly pass in a reference to the state storage.
>
> I'd be quite happy if it was:
>
> static inline u64 cnt32_to_63(u32 cnt_lo, u32 *__m_cnt_hi)
> {
> union cnt32_to_63 __x;
> __x.hi = *__m_cnt_hi;
> __x.lo = cnt_lo;
> if (unlikely((s32)(__x.hi ^ __x.lo) < 0))
> *__m_cnt_hi =
> __x.hi = (__x.hi ^ 0x80000000) + (__x.hi >> 31);
> return __x.val;
> }
>
Almost there. At least, with this kind of implementation, we would not
have to resort to various tricks to make sure a single code path is
called at a certain frequency. We would simply have to make sure the
__m_cnt_hi value is updated at a certain frequency. Thinking about
"data" rather than "code" makes much more sense.
The only missing thing here is the correct ordering. The problem is, as
I presented in more depth in my previous discussion with Nicolas, that
the __m_cnt_hi value has to be read before cnt_lo. First off, using this
macro with get_cycles() is simply buggy, because the macro expects
_perfect_ order of timestamps, no skew whatsoever, or otherwise time
could jump. This macro is therefore good only for mmio reads. One should
use per-cpu variables to keep the state of get_cycles() reads (as I did
in my other patch).
The following approach should work :
static inline u64 cnt32_to_63(u32 io_addr, u32 *__m_cnt_hi)
{
union cnt32_to_63 __x;
__x.hi = *__m_cnt_hi; /* memory read for high bits internal state */
smp_rmb(); /*
* read high bits before low bits insures time
* does not go backward.
*/
__x.lo = readl(cnt_lo); /* mmio read */
if (unlikely((s32)(__x.hi ^ __x.lo) < 0))
*__m_cnt_hi =
__x.hi = (__x.hi ^ 0x80000000) + (__x.hi >> 31);
return __x.val;
}
But any get_cycles() user of cnt32_to_63() should be shot down. The
bright side is : there is no way get_cycles() can be used with this
new code. :)
e.g. of incorrect users for arm (unless they are UP only, but that seems
like a weird design argument) :
mach-sa1100/include/mach/SA-1100.h:#define OSCR __REG(0x90000010)
/* OS timer Counter Reg. */
mach-sa1100/generic.c: unsigned long long v = cnt32_to_63(OSCR);
mach-pxa/include/mach/pxa-regs.h:#define OSCR __REG(0x40A00010) /* OS
Timer Counter Register */
mach-pxa/time.c: unsigned long long v = cnt32_to_63(OSCR);
Correct user :
mach-versatile/core.c: unsigned long long v =
cnt32_to_63(readl(VERSATILE_REFCOUNTER));
The new call would look like :
/* Hi 32-bits of versatile refcounter state, kept for cnt32_to_64. */
static u32 versatile_refcounter_hi;
unsigned long long v = cnt32_to_64(VERSATILE_REFCOUNTER, refcounter_hi);
Mathieu
> I imagine this would compile pretty much the same as the macro. I think it
> would make it more obvious about the independence of the storage.
>
> Alternatively, perhaps Nicolas just needs to mention this in the comment more
> clearly.
>
> David
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
next prev parent reply other threads:[~2008-11-07 16:48 UTC|newest]
Thread overview: 124+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20081107052336.652868737@polymtl.ca>
2008-11-07 5:23 ` [RFC patch 01/18] get_cycles() : kconfig HAVE_GET_CYCLES Mathieu Desnoyers
2008-11-07 5:23 ` [RFC patch 02/18] get_cycles() : x86 HAVE_GET_CYCLES Mathieu Desnoyers
2008-11-07 5:23 ` [RFC patch 03/18] get_cycles() : sparc64 HAVE_GET_CYCLES Mathieu Desnoyers
2008-11-07 5:23 ` [RFC patch 04/18] get_cycles() : powerpc64 HAVE_GET_CYCLES Mathieu Desnoyers
2008-11-07 14:56 ` Josh Boyer
2008-11-07 18:14 ` Mathieu Desnoyers
2008-11-07 5:23 ` [RFC patch 05/18] get_cycles() : MIPS HAVE_GET_CYCLES_32 Mathieu Desnoyers
2008-11-07 5:23 ` [RFC patch 06/18] Trace clock generic Mathieu Desnoyers
2008-11-07 5:23 ` [RFC patch 07/18] Trace clock core Mathieu Desnoyers
2008-11-07 5:52 ` Andrew Morton
2008-11-07 6:16 ` Mathieu Desnoyers
2008-11-07 6:26 ` Andrew Morton
2008-11-07 16:12 ` Mathieu Desnoyers
2008-11-07 16:19 ` Andrew Morton
2008-11-07 18:16 ` Mathieu Desnoyers
2008-11-07 5:23 ` [RFC patch 08/18] cnt32_to_63 should use smp_rmb() Mathieu Desnoyers
2008-11-07 6:05 ` Andrew Morton
2008-11-07 8:12 ` Nicolas Pitre
2008-11-07 8:38 ` Andrew Morton
2008-11-07 11:20 ` David Howells
2008-11-07 15:01 ` Nicolas Pitre
2008-11-07 15:01 ` Nicolas Pitre
2008-11-07 15:50 ` Andrew Morton
2008-11-07 16:21 ` David Howells
2008-11-07 16:29 ` Andrew Morton
2008-11-07 17:10 ` David Howells
2008-11-07 17:26 ` Andrew Morton
2008-11-07 18:00 ` Mathieu Desnoyers
2008-11-07 18:21 ` Andrew Morton
2008-11-07 18:30 ` Harvey Harrison
2008-11-07 18:42 ` Mathieu Desnoyers
2008-11-07 18:33 ` Mathieu Desnoyers
2008-11-07 18:36 ` Linus Torvalds
2008-11-07 18:45 ` Andrew Morton
2008-11-07 16:47 ` Nicolas Pitre
2008-11-07 16:55 ` David Howells
2008-11-07 17:21 ` Andrew Morton
2008-11-07 20:03 ` Nicolas Pitre
2008-11-07 20:03 ` Nicolas Pitre
2008-11-07 16:07 ` David Howells
2008-11-07 16:47 ` Mathieu Desnoyers [this message]
2008-11-07 17:04 ` David Howells
2008-11-07 17:17 ` Mathieu Desnoyers
2008-11-07 23:27 ` David Howells
2008-11-07 20:11 ` Russell King
2008-11-07 20:11 ` Russell King
2008-11-07 21:36 ` Mathieu Desnoyers
2008-11-07 21:36 ` Mathieu Desnoyers
2008-11-07 22:18 ` Russell King
2008-11-07 22:18 ` Russell King
2008-11-07 22:36 ` Mathieu Desnoyers
2008-11-07 22:36 ` Mathieu Desnoyers
2008-11-07 23:41 ` David Howells
2008-11-08 0:15 ` Russell King
2008-11-08 0:15 ` Russell King
2008-11-08 0:45 ` David Howells
2008-11-08 15:24 ` Nicolas Pitre
2008-11-08 23:20 ` [PATCH] clarify usage expectations for cnt32_to_63() Nicolas Pitre
2008-11-09 2:25 ` Mathieu Desnoyers
2008-11-09 2:54 ` Nicolas Pitre
2008-11-09 5:06 ` Nicolas Pitre
2008-11-09 5:27 ` [PATCH v2] " Nicolas Pitre
2008-11-09 6:48 ` Mathieu Desnoyers
2008-11-09 13:34 ` Nicolas Pitre
2008-11-09 13:43 ` Russell King
2008-11-09 13:43 ` Russell King
2008-11-09 16:22 ` Mathieu Desnoyers
2008-11-10 4:20 ` Nicolas Pitre
2008-11-10 4:42 ` Andrew Morton
2008-11-10 21:34 ` Nicolas Pitre
2008-11-10 21:58 ` Andrew Morton
2008-11-10 23:15 ` Nicolas Pitre
2008-11-10 23:15 ` Nicolas Pitre
2008-11-10 23:22 ` Andrew Morton
2008-11-10 23:38 ` Steven Rostedt
2008-11-11 0:26 ` Nicolas Pitre
2008-11-11 18:28 ` [PATCH] convert cnt32_to_63 to inline Mathieu Desnoyers
2008-11-11 19:13 ` Russell King
2008-11-11 19:13 ` Russell King
2008-11-11 20:11 ` Mathieu Desnoyers
2008-11-11 20:11 ` Mathieu Desnoyers
2008-11-11 21:51 ` Russell King
2008-11-11 21:51 ` Russell King
2008-11-12 3:48 ` Mathieu Desnoyers
2008-11-12 3:48 ` Mathieu Desnoyers
2008-11-11 21:00 ` Nicolas Pitre
2008-11-11 21:13 ` Russell King
2008-11-11 21:13 ` Russell King
2008-11-11 22:31 ` David Howells
2008-11-11 22:37 ` Peter Zijlstra
2008-11-12 1:13 ` Steven Rostedt
2008-11-07 11:03 ` [RFC patch 08/18] cnt32_to_63 should use smp_rmb() David Howells
2008-11-07 11:03 ` David Howells
2008-11-07 16:51 ` Mathieu Desnoyers
2008-11-07 20:18 ` Nicolas Pitre
2008-11-07 23:55 ` David Howells
2008-11-07 10:59 ` David Howells
2008-11-07 10:55 ` David Howells
2008-11-07 17:09 ` Mathieu Desnoyers
2008-11-07 17:33 ` Steven Rostedt
2008-11-07 17:33 ` Steven Rostedt
2008-11-07 19:18 ` Mathieu Desnoyers
2008-11-07 19:32 ` Peter Zijlstra
2008-11-07 20:02 ` Mathieu Desnoyers
2008-11-07 20:45 ` Mathieu Desnoyers
2008-11-07 20:54 ` Paul E. McKenney
2008-11-07 21:04 ` Steven Rostedt
2008-11-08 0:34 ` Paul E. McKenney
2008-11-07 21:16 ` Mathieu Desnoyers
2008-11-07 23:50 ` David Howells
2008-11-08 0:55 ` Steven Rostedt
2008-11-09 11:51 ` David Howells
2008-11-09 14:31 ` Steven Rostedt
2008-11-09 14:31 ` Steven Rostedt
2008-11-09 16:18 ` Mathieu Desnoyers
2008-11-07 20:08 ` Steven Rostedt
2008-11-07 20:55 ` Paul E. McKenney
2008-11-07 21:27 ` Mathieu Desnoyers
2008-11-07 20:36 ` Nicolas Pitre
2008-11-07 20:55 ` Mathieu Desnoyers
2008-11-07 21:22 ` Nicolas Pitre
2008-11-07 5:23 ` [RFC patch 09/18] Powerpc : Trace clock Mathieu Desnoyers
2008-11-07 5:23 ` [RFC patch 10/18] Sparc64 " Mathieu Desnoyers
2008-11-07 5:45 ` David Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20081107164758.GB22134@Krystal \
--to=mathieu.desnoyers@polymtl.ca \
--cc=a.p.zijlstra@chello.nl \
--cc=akpm@linux-foundation.org \
--cc=benh@kernel.crashing.org \
--cc=davem@davemloft.net \
--cc=dhowells@redhat.com \
--cc=linux-arch@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=mingo@redhat.com \
--cc=nico@cam.org \
--cc=paulus@samba.org \
--cc=ralf@linux-mips.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox