From: Eric Dumazet <dada1@cosmosbay.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
Theodore Tso <tytso@mit.edu>,
linux kernel <linux-kernel@vger.kernel.org>,
"David S. Miller" <davem@davemloft.net>,
Mingming Cao <cmm@us.ibm.com>,
linux-ext4@vger.kernel.org, Christoph Lameter <clameter@sgi.com>,
Rusty Russell <rusty@rustcorp.com.au>
Subject: Re: [PATCH] percpu_counter: Fix __percpu_counter_sum()
Date: Wed, 10 Dec 2008 23:56:37 +0100
Message-ID: <49404925.7090902@cosmosbay.com>
In-Reply-To: <20081209214921.b3944687.akpm@linux-foundation.org>
Andrew Morton wrote:
> On Wed, 10 Dec 2008 06:09:08 +0100 Eric Dumazet <dada1@cosmosbay.com> wrote:
>
>> Now percpu_counter_sum() is 'fixed', what about "percpu_counter_add()" ?
>>
>> void __percpu_counter_add(struct percpu_counter *fbc, s64 amount, s32 batch)
>> {
>> 	s64 count;
>> 	s32 *pcount;
>> 	int cpu = get_cpu();
>>
>> 	pcount = per_cpu_ptr(fbc->counters, cpu);
>> 	count = *pcount + amount;
>> 	if (count >= batch || count <= -batch) {
>> 		spin_lock(&fbc->lock);
>> 		fbc->count += count;
>> 		*pcount = 0;
>> 		spin_unlock(&fbc->lock);
>> 	} else {
>> 		*pcount = count;
>> 	}
>> 	put_cpu();
>> }
>>
>>
>> If I read this well, this is not IRQ safe.
>
> Sure. It's racy against interrupts on this cpu, it'll deadlock over
> the non-irq-safe spinlock and lockdep will have a coronary over it.
>
>> get_cpu() only disables preemption IMHO
>
> yes
>
>> For nr_files, nr_dentry, nr_inodes, it should not be a problem.
>
> yes
>
>> But for network counters (only in net-next-2.6)
>> and lib/proportions.c, don't we have a problem?
>
> yes
>
>> Using a local_t instead of an s32 for the per-cpu
>> local counter is possible here, so that the fast path doesn't have
>> to disable interrupts
>>
>> (use a local_t instead of s32 for fbc->counters)
>>
>> void __percpu_counter_add_irqsafe(struct percpu_counter *fbc, s64 amount, s32 batch)
>> {
>> 	long count;
>> 	local_t *pcount;
>> 	unsigned long flags;
>>
>> 	/* following code only matters on 32bit arches */
>> 	if (sizeof(amount) != sizeof(local_t)) {
>> 		if (unlikely(amount >= batch || amount <= -batch)) {
>> 			spin_lock_irqsave(&fbc->lock, flags);
>> 			fbc->count += amount;
>> 			spin_unlock_irqrestore(&fbc->lock, flags);
>> 			return;
>> 		}
>> 	}
>> 	pcount = per_cpu_ptr(fbc->counters, get_cpu());
>> 	count = local_add_return((long)amount, pcount);
>> 	if (unlikely(count >= batch || count <= -batch)) {
>> 		local_sub(count, pcount);
>> 		spin_lock_irqsave(&fbc->lock, flags);
>> 		fbc->count += count;
>> 		spin_unlock_irqrestore(&fbc->lock, flags);
>> 	}
>> 	put_cpu();
>> }
>
>
> I think it's reasonable. If the batching is working as intended, the
> increased cost of s/spin_lock/spin_lock_irqsave/ should be
> insignificant.
>
> In fact, if *at all* possible it would be best to make percpu_counters
> irq-safe under all circumstances and avoid fattening and complicating the
> interface.
>
>
>
> But before adding more dependencies on local_t I do think we should
> refresh ourselves on Christoph's objections to them - I remember
> finding them fairly convincing at the time, but I don't recall the
> details.
>
> <searches for a long time>
>
> Here, I think:
> http://lkml.indiana.edu/hypermail/linux/kernel/0805.3/2482.html
>
> Rusty, Christoph: talk to me. If we add a new user of local_t in core
> kernel, will we regret it?
>
__percpu_counter_add() already disables preemption (via get_cpu()).

But then, some (all but x86 ;)) arches don't have a true local_t, and we fall
back to plain atomic_long_t, which is wrong because it would add a LOCKed
instruction to the fast path.

I remember Christoph added FAST_CMPXCHG_LOCAL, but there are no remaining
users of it in the current tree.

I.e.: use local_t only if CONFIG_FAST_CMPXCHG_LOCAL, else something like:
void __percpu_counter_add_irqsafe(struct percpu_counter *fbc, s64 amount, s32 batch)
{
	s64 count;
	s32 *pcount = per_cpu_ptr(fbc->counters, get_cpu());
	unsigned long flags;

	local_irq_save(flags);
	count = *pcount + amount;
	if (unlikely(count >= batch || count <= -batch)) {
		spin_lock(&fbc->lock);
		fbc->count += count;
		spin_unlock(&fbc->lock);
		count = 0;
	}
	*pcount = count;
	local_irq_restore(flags);
	put_cpu();
}
EXPORT_SYMBOL(__percpu_counter_add_irqsafe);