public inbox for linux-kernel@vger.kernel.org
From: Dipankar Sarma <dipankar@in.ibm.com>
To: Greg KH <greg@kroah.com>
Cc: Ravikiran G Thirumalai <kiran@in.ibm.com>, linux-kernel@vger.kernel.org
Subject: Re: [RFC] Refcounting of objects part of a lockfree collection
Date: Wed, 14 Jul 2004 20:52:35 +0530
Message-ID: <20040714152235.GA5956@in.ibm.com>
In-Reply-To: <20040714142614.GA15742@kroah.com>

On Wed, Jul 14, 2004 at 07:26:14AM -0700, Greg KH wrote:
> On Wed, Jul 14, 2004 at 01:56:22PM +0530, Dipankar Sarma wrote:
> > Well, the kref has the same get/put race if used in a lock-free
> > look-up. When you do a kref_get() it is assumed that another
> > cpu will not see a 1-to-0 transition of the reference count.
> 
> You mean kref_put(), right?

No, I meant kref_get(). See below.

> > If that indeed happens, ->release() will get invoked more
> > than once for that object which is bad.
> 
> As kref_put() uses an atomic_t, how can that transition happen twice?
> 
> What can happen is kref_get() and kref_put() can race if the last
> kref_put() happens at the same time that kref_get().  But that is solved
> by having the caller guarantee that this can not happen (see my 2004 OLS
> paper for more info about this.)

Yes, and how do the callers guarantee that? Using a lock, right?
What Kiran's patch does is to allow those callers to use lock-free
algorithms. Let's look at the race:

---------------------------------------------------------------

CPU #0                           CPU #1
------                           ------

                                 my_put() from a user who did my_lookup() ...

                                 [ ->count is 1 ]

In my_lookup() ...               atomic_dec_and_test(&m->count)

                                 [ ->count is 0 ]
m = my_get(my_list[i]);

[ ->count is 1 ]                 call_rcu(&m->head, free, m);

return m;

[This CPU can now context
 switch and allow RCU to
 proceed]

                                 free(m);

Somebody dereferences m:
invalid memory reference
---------------------------------------------------------------

This can happen if my_lookup() is lock-free.
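
A minimal sketch of one way to close this window, written in portable
C11 atomics and not taken from Kiran's patch: the lookup takes a
reference only if the count is still non-zero, so a 0-to-1 transition
can never resurrect an object that is already queued for freeing. The
names struct my_obj, my_get_unless_zero() and my_put() are hypothetical
and chosen only for illustration. The sketch also assumes the lookup
runs inside an RCU read-side critical section (not shown), so the
memory stays valid while the get is attempted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object; only the counter matters for this sketch. */
struct my_obj {
	atomic_int count;
};

/*
 * Lock-free "get": take a reference only if the count is still non-zero.
 * Returns false when the last reference is already gone, in which case
 * the lookup must treat the object as not found.
 */
static bool my_get_unless_zero(struct my_obj *m)
{
	int old = atomic_load(&m->count);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&m->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when the caller dropped the last one. */
static bool my_put(struct my_obj *m)
{
	int old = atomic_fetch_sub(&m->count, 1);

	assert(old > 0);
	return old == 1;
}
```

With this, CPU #0 in the diagram above would see my_get_unless_zero()
fail once CPU #1 has dropped the count to 0, and my_lookup() would skip
the object instead of returning it.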

> > The other issue is that there are many refcounted data structures
> > like dentry, dst_entry, file etc. that do not use kref.
> 
> At this time, sure.  But you could always change that :)
> (and yes, to do so, we can always shrink the size of struct kref if
> really needed...)

How are you going to shrink it? You need the ->release() method
and that is a nice way for drivers to get rid of objects.

> 
> > If everybody were to use kref, we could possibly apply Kiran's
> > lock-free extensions to kref itself and be done with it.
> 
> Ok, sounds like a plan to me.  Having 2 refcount implementations in the
> kernel that work alike, yet a bit different, is not acceptable.  Please
> rework struct kref to do this.

And I suspect that Andrew will thwack me for trying to increase dentry size :)
Anyway, the summary is this: Kiran is not trying to introduce
a new refcounting API. He is just adding lock-free support to
an existing refcounting mechanism that is used in VFS. If kref users need to do
lock-free lookup, sure, we should add it to the kref_xxx APIs as well.

> > Until then, we need the lock-free refcounting support from non-kref
> > refcounting objects.
>
> We've lived without it until now somehow :)

Actually, we already use lock-free refcounting in the route cache and the
dcache. In those cases, we work around this race using a different algorithm.

Thanks
Dipankar
