From: Zoltan Menyhart <Zoltan.Menyhart@bull.net>
To: Christoph Lameter <clameter@sgi.com>
Cc: akpm@osdl.org, ak@suse.de, linux-kernel@vger.kernel.org,
linux-ia64@vger.kernel.org
Subject: Re: light weight counters: race free through local_t?
Date: Thu, 15 Jun 2006 14:22:40 +0200
Message-ID: <44915110.2050100@bull.net>
In-Reply-To: <Pine.LNX.4.64.0606140928500.4030@schroedinger.engr.sgi.com>
Christoph Lameter wrote:
> Could you do a clock cycle comparison of an
>
> atomic_inc(__get_per_cpu(var))
> (the fallback of local_t on ia64)
>
> vs.
>
> local_irq_save(flags)
> __get_per_cpu(var)++
> local_irq_restore(flags)
> (ZVC like implementation)
>
> vs.
>
> get_per_cpu(var)++
> put_cpu()
> (current light weight counters)
The only thing I have at hand is a small test for the 1st case:
#include <stdio.h>
#include <asm/atomic.h>

/* Read the ia64 interval time counter (one tick per clock cycle). */
#define GET_ITC() \
({ \
	unsigned long ia64_intri_res; \
	\
	asm volatile ("mov %0=ar.itc" : "=r"(ia64_intri_res)); \
	ia64_intri_res; \
})

#define N (1000 * 1000 * 100L)

atomic_t data;

int main(int c, char *v[])
{
	unsigned long cycles;
	int i;

	cycles = GET_ITC();
	for (i = 0; i < N; i++)
		ia64_fetchadd4_rel(&data, 1);
	cycles = GET_ITC() - cycles;
	/* Average cycles per iteration, then the final counter value */
	printf("%lu %d\n", cycles / N, atomic_read(&data));
	return 0;
}
It gives 11 clock cycles per iteration.
(The loop-control instructions are "absorbed", i.e. they do not show up
in the measured cost.)
"atomic_inc(__get_per_cpu(var))" compiles into:
mov rx = 0xffffffffffffxxxx // &__get_per_cpu(var)
;;
fetchadd4.rel ry = [rx], 1
It _should_ take 11 clock cycles, too. (Assuming it is in L2.)
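For readers without an Itanium at hand, here is a rough portable analogue of the first test, as a sketch only. It swaps ar.itc for clock_gettime() and ia64_fetchadd4_rel() for a C11 release-ordered fetch-add, so it reports nanoseconds rather than clock cycles and its results are not comparable to the 11 cycles above. The names now_ns, counter and time_atomic_incs are made up for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Monotonic time in nanoseconds; an assumed stand-in for reading ar.itc. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical counter, playing the role of "atomic_t data". */
static atomic_int counter;

/* Do n release-ordered atomic increments and return the elapsed time in ns. */
static uint64_t time_atomic_incs(long n)
{
	uint64_t start = now_ns();
	long i;

	for (i = 0; i < n; i++)
		atomic_fetch_add_explicit(&counter, 1, memory_order_release);
	return now_ns() - start;
}
```

Dividing time_atomic_incs(n) by n gives the average time per increment on the machine at hand.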
For the 2nd case:
With a bit of modification, I can measure what
"__get_per_cpu(var)++" costs: 7 clock cycles if the counter is always
found in L1, 10 clock cycles if it never is:
int data;

static inline void store(int *addr, int data)
{
	asm volatile ("st4 [%1] = %0" :: "r"(data), "r"(addr) : "memory");
}

static inline int load_nt1(int *addr)
{
	int tmp;

	asm volatile ("ld4.nt1 %0=[%1]" : "=r"(tmp) : "r" (addr));
	return tmp;
}

int main(int c, char *v[])
{
	unsigned long cycles;
	int i, d;

	/* First loop: the counter is always found in L1 */
	cycles = GET_ITC();
	for (i = 0; i < N; i++)
		// Avoid optimizing out the "st4"
		store(&data, data + 1);
	cycles = GET_ITC() - cycles;
	printf("%lu %d\n", cycles / N, data);

	/* Second loop: the load bypasses L1 */
	cycles = GET_ITC();
	for (i = 0; i < N; i++) {
		// Do not use L1
		d = load_nt1(&data);
		store(&data, d + 1);
	}
	cycles = GET_ITC() - cycles;
	printf("%lu %d\n", cycles / N, data);
	return 0;
}
"local_irq_save(flags)" compiles into:
mov rx = psr ;; // 13 clock cycles
rsm 0x4000 ;; // 5 clock cycles
"local_irq_restore(flags)" compiles into (at least):
ssm 0x4000 // 5 clock cycles
For the 3rd case:
If CONFIG_PREEMPT, then you need to add 2 * 7 clock cycles
for inc_preempt_count() / dec_preempt_count() + some more
for preempt_check_resched().
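The figures quoted so far can be added up as a rough tally. The sketch below only sums the cycle counts given in this mail; the constant and function names are made up for the illustration, and the totals for the 2nd and 3rd cases are lower bounds, since local_irq_restore() and preempt_check_resched() may cost more than what is counted here.

```c
/* Cycle counts quoted in this mail (ia64 measurements). */
enum {
	FETCHADD      = 11,	/* atomic_inc() via fetchadd4.rel */
	INC_L1_HIT    = 7,	/* plain increment, counter in L1 */
	INC_L1_MISS   = 10,	/* plain increment, counter not in L1 */
	MOV_FROM_PSR  = 13,	/* "mov rx = psr" in local_irq_save() */
	RSM           = 5,	/* "rsm 0x4000" in local_irq_save() */
	SSM           = 5,	/* "ssm 0x4000" in local_irq_restore() */
	PREEMPT_COUNT = 7	/* inc_preempt_count() or dec_preempt_count() */
};

/* 1st case: one atomic increment. */
static int cost_atomic(void)
{
	return FETCHADD;
}

/* 2nd case: irq save + plain increment + irq restore (lower bound). */
static int cost_irq_protected(int inc)
{
	return MOV_FROM_PSR + RSM + inc + SSM;
}

/* 3rd case with CONFIG_PREEMPT (lower bound, resched check not counted). */
static int cost_preempt_protected(int inc)
{
	return PREEMPT_COUNT + inc + PREEMPT_COUNT;
}
```

With these inputs the 1st case costs 11 cycles, the 2nd at least 30 (L1 hit) or 33 (L1 miss), and the 3rd with CONFIG_PREEMPT at least 21 or 24.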
My conclusion: let's stick to atomic counters.
Regards,
Zoltan