From: Oleg Nesterov <oleg@redhat.com>
To: Kent Overstreet <koverstreet@google.com>
Cc: tj@kernel.org, srivatsa.bhat@linux.vnet.ibm.com,
rusty@rustcorp.com.au, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] generic dynamic per cpu refcounting
Date: Fri, 25 Jan 2013 19:09:41 +0100 [thread overview]
Message-ID: <20130125180941.GA16896@redhat.com> (raw)
In-Reply-To: <20130124232024.GA584@google.com>
(add lkml)
On 01/24, Kent Overstreet wrote:
>
> This has already been on lkml and is in Andrew's tree, Tejun just asked
> me to send it out again:
I'll try to read this code later, just a couple of questions after a quick
glance. Sorry this was already discussed...
> +struct percpu_ref {
> + atomic64_t count;
> + unsigned long pcpu_count;
> +};
The code looks a bit tricky mostly because you pack state/pointer/jiffies
into ->pcpu_count. The same for ->count.
I assume that you have a good reason to shrink the sizeof(percpu_ref), but
I am curious: who is the user of this thing?
> + * percpu_ref_get - increment a dynamic percpu refcount
> + *
> + * Increments @ref and possibly converts it to percpu counters. Must be called
> + * with rcu_read_lock() held, and may potentially drop/reacquire rcu_read_lock()
> + * to allocate percpu counters - if sleeping/allocation isn't safe for some
> + * other reason (e.g. a spinlock), see percpu_ref_get_noalloc().
And this looks strange. It must be called under rcu_read_lock(), but
->rcu_read_lock_nesting must be == 1. Otherwise rcu_read_unlock() in
percpu_ref_alloc() won't work.
Again, I think you have a reason, but could you explain? IOW, why we
can't make it might_sleep() instead? The fast path can do rcu_read_lock()
itself.
> +static inline void percpu_ref_get_noalloc(struct percpu_ref *ref)
> +{
> + __percpu_ref_get(ref, false);
> +}
and this could be percpu_ref_get_atomic().
Once again, I am not arguing, just can't understand.
> +void __percpu_ref_get(struct percpu_ref *ref, bool alloc)
> +{
> + unsigned long pcpu_count;
> + uint64_t v;
> +
> + pcpu_count = ACCESS_ONCE(ref->pcpu_count);
> +
> + if (REF_STATUS(pcpu_count) == PCPU_REF_PTR) {
> + /* for rcu - we're not using rcu_dereference() */
> + smp_read_barrier_depends();
> + __this_cpu_inc(*((unsigned __percpu *) pcpu_count));
The comment looks confusing a bit... smp_read_barrier_depends() is not
for rcu, we obviously need it to access (unsigned __percpu *) pcpu_count.
But yes, since we didn't use rcu_dereference() we have to add it by hand.
> +int percpu_ref_kill(struct percpu_ref *ref)
> +{
> ...
> + if (status == PCPU_REF_PTR) {
> + unsigned count = 0, cpu;
> +
> + synchronize_rcu();
> +
> + for_each_possible_cpu(cpu)
> + count += *per_cpu_ptr((unsigned __percpu *) pcpu_count, cpu);
> +
> + pr_debug("global %lli pcpu %i",
> + atomic64_read(&ref->count) & PCPU_COUNT_MASK,
> + (int) count);
> +
> + atomic64_add((int) count, &ref->count);
> + smp_wmb();
> + /* Between setting global count and setting PCPU_REF_DEAD */
> + ref->pcpu_count = PCPU_REF_DEAD;
The coment explains what the code does, but not why ;)
I guess this is for percpu_ref_put(), and this wmb() pairs with implicit
mb() implied by atomic64_dec_return().
> + free_percpu((unsigned __percpu *) pcpu_count);
I guess it could be freed right after for_each_possible_cpu() above, but
this doesn't matter.
Oleg.
next parent reply other threads:[~2013-01-25 18:10 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20130124232024.GA584@google.com>
2013-01-25 18:09 ` Oleg Nesterov [this message]
2013-01-25 18:29 ` [PATCH] generic dynamic per cpu refcounting Oleg Nesterov
2013-01-28 18:10 ` Kent Overstreet
2013-01-28 18:50 ` Oleg Nesterov
2013-01-25 19:11 ` Oleg Nesterov
2013-01-28 18:15 ` Kent Overstreet
2013-01-28 18:27 ` Tejun Heo
2013-01-28 18:49 ` Kent Overstreet
2013-01-28 18:55 ` Tejun Heo
2013-01-28 20:22 ` Kent Overstreet
2013-01-28 20:27 ` Tejun Heo
2013-01-28 20:55 ` Kent Overstreet
2013-01-28 21:18 ` Tejun Heo
2013-01-28 21:24 ` Kent Overstreet
2013-01-28 21:28 ` Tejun Heo
2013-01-28 21:36 ` Tejun Heo
2013-01-28 21:48 ` Kent Overstreet
2013-01-28 21:45 ` Kent Overstreet
2013-01-28 21:50 ` Tejun Heo
2013-01-29 16:39 ` Kent Overstreet
2013-01-29 19:29 ` Tejun Heo
2013-01-29 19:51 ` Kent Overstreet
2013-01-29 20:02 ` Tejun Heo
2013-01-29 21:45 ` Kent Overstreet
2013-01-29 22:06 ` Tejun Heo
2013-01-29 18:04 ` [PATCH] module: Convert to generic percpu refcounts Kent Overstreet
2013-01-28 18:07 ` [PATCH] generic dynamic per cpu refcounting Kent Overstreet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130125180941.GA16896@redhat.com \
--to=oleg@redhat.com \
--cc=koverstreet@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=rusty@rustcorp.com.au \
--cc=srivatsa.bhat@linux.vnet.ibm.com \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.