Re: [PATCH 0/3] refcounting improvements in sysfs.

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Neil Brown <neilb@suse.de>
To: Tejun Heo <teheo@suse.de>
Cc: Greg Kroah-Hartman <gregkh@suse.de>, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/3] refcounting improvements in sysfs.
Date: Mon, 29 Mar 2010 16:10:54 +1100	[thread overview]
Message-ID: <20100329161054.2dfcdfa8@notabene.brown> (raw)
In-Reply-To: <4BAC5513.8060608@suse.de>

On Fri, 26 Mar 2010 15:32:51 +0900
Tejun Heo <teheo@suse.de> wrote:

> Hello, Neil.
> 
> On 03/26/2010 03:02 PM, Neil Brown wrote:
> >> Nice article.  In general, yeap, I agree it would be nice to have a
> >> working reference count abstraction.  However, kref along with kobject
> >> is a good example of obscurity by abstraction anti pattern.  :-)
> > 
> > I'm not at all sure that opinion would be universal....
> > 
> > refcounting is something that it is quite easy to get wrong.  There are
> > several slightly different models for refcounting and if you don't have a
> > clear understanding of the different use cases it is easy to get confused
> > about exactly what model is being used and so use a refcount wrongly.
> > kref certainly doesn't cover all models for refcounting but it does cover one
> > fairly common one very well and I think that it's use bring clarity rather
> > than obscurity.
> > Of course if it is used for a refcount which should really follow a different
> > model then that can cause confusion...
> 
> I don't know.  After spending some time with k* and device model, I
> grew a pretty strong dislike for abstractions without clear functional
> necessities.  kref being much simpler than kobject, the abuse is much
> less severe but there have been kref misuses (in kobject and SCSI
> midlayer) which could have been avoided or at least easily located if
> they simply had used atomic_t instead of dreaming up some mystical
> properties of kref.

Hi Tejun,

this strikes me as really valuable experience that it would be great to
share.   While I generally like kref and see value in at least some of
kobject I don't for a moment suppose they are perfect.  Fixing them requires
a good understanding of the problems they cause.  If you have that knowledge,
it would be a great resource for anyone wanting to 'fix' kobject.
Are you interested in writing anything (more) up at all???


> 
> >> kobject API incorrectly suggests that it deals with the last put
> >> problem.  There still are large number of code paths which do the
> >> following,
> >>
> >>   if (!(kob = kobject_get(kobj)))
> >> 	return;
> > 
> > kobject_get *always* returns exactly the argument that was passed to it.
> > (kref_get doesn't have a return value.)
> > 
> > I don't see how the code above has any bearing on the last-put
> > problem, which I think kref and thus kobject do handle exactly
> > correctly.
> 
> Oh, I should have been more explicit.  It's not directly related to
> kref but just something that always comes to my mind when thinking
> about k* abstractions.  The above bogus condition checks used to be
> used quite widely.  The programmer for some reason believed the last
> kobject_put() somehow will magically make future kobject_get()s return
> NULL, which of course doesn't make any sense but hey the
> implementation is buried kilometers deep, the API and other usages
> seem to suggest that and it's easy to imagine something up when you're
> tired.

This would argue that having a return value from kobject_get violates Rusty's
law that interfaces should be hard to misuse.
We could probably change that - it is only used 19 times in the current
kernel.
It probably doesn't help that Documentation/kobject.txt includes the text:

    A successful call to kobject_get() will increment the kobject's reference
    counter and return the pointer to the kobject.

which seems to suggest that an unsuccessful call is possible.


> 
> As an another example, please take a look at the kref API.
> 
>   int kref_put(struct kref *kref, void (*release) (struct kref *kref));
> 
> The function takes @release callback but under which context is it
> called?  If you look at the source code, it's called in-line which
> isn't clear from the API at all (why the hell take a callback if
> you're gonna call it in-line?) and there have been *several* subtle
> bugs which could have been avoided or at least would have been caught
> much easier if it were not for that meaningless separate release
> callback.  It's just too easy to forget about the executing context
> when people write and review stuff over function boundaries.
> 
>  void put_something(something)
>  {
> 	 if (kref_put(&something->kref))
> 		 do something which might dead lock;
>  }
> 
> is way easier to avoid bugs and review than
> 
>  void really_kill_something(struct kref *kref)
>  {
> 	 struct my_something *something = container_of(...);
> 
> 	 do something which might dead lock;
>  }
> 
>  void put_something(something)
>  {
> 	 kref_put(&someting->kref);
>  }
> 
> This is *way* worse than atomic_t not better and the confusion is
> caused exactly by superflous abstraction which leads the users of the
> API to imagine some non-existing function of the abstraction and
> hinders the flow of review.

I'm not immediately convinced by this, though maybe I haven't seen enough
examples yet.

The deadlocks that I have come across would not have been any more obvious in
either of the above - they were caused because sysfs_remove deadlocks when
called from inside a sysfs attribute action...

Also, while this re-write is possible for kref uses it isn't really possible
in kobject as the 'final_put' function must be included in the ktype (though
maybe you don't like that either).

What would be really helpful here is a survey of what sorts of things are
actually done in final_put functions so would could create some guidelines
about how to write the release functions.

Thanks,
NeilBrown


> 
> >> I believe (or at least hope) the actual problem cases are mostly fixed
> >> now but there still are a lot of misconceptions around how stuff built
> >> on kref/kobject is synchronized and they sometimes lead to race
> >> conditions buried deep under several layers of abstractions and it
> >> becomes very hard to see those race conditions when they are buried
> >> deep.
> > 
> > I agree that there probably misconceptions about how kref works and they are
> > probably based on a lack of appreciation of the subtle differences in
> > flavours of refcounts.  Hence my desire to create and document different
> > k*ref types which clarify the different use cases.
> 
> Oh, I'm not objecting to cleaning up how reference counts are done
> per-se but *PLEASE* refrain from introducing abstractions for
> abstraction's sake.  The k* stuff, device model and sysfs already
> walked down that road and got burned badly.
> 
> > BTW I'd be perfectly happy if the first patch was taken and
> > subsequent ones not.  I think they are a good idea, but I'm happy to
> > forgo them (for now:-).
> 
> If it can be done in a way that it doesn't substitute pure logical
> complexity with one which involves memory ordering issues, which is
> almost always way worse, I have no objection at all.
> 
> Thanks.
>

next prev parent reply	other threads:[~2010-03-29  5:11 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-03-24  3:20 [PATCH 0/3] refcounting improvements in sysfs NeilBrown
2010-03-24  3:20 ` [PATCH 2/3] sysfs: make s_count a kref NeilBrown
2010-03-26  4:29   ` Eric W. Biederman
2010-03-24  3:20 ` [PATCH 1/3] sysfs: simplify handling for s_active refcount NeilBrown
2010-03-26  4:24   ` Eric W. Biederman
2010-03-26  5:32     ` Neil Brown
2010-03-26  5:42       ` Tejun Heo
2010-03-26  7:53       ` Eric W. Biederman
2010-03-29  4:43         ` Neil Brown
2010-03-29  7:47           ` Neil Brown
2010-03-24  3:20 ` [PATCH 3/3] kref: create karef and use for sysfs_dirent->s_active NeilBrown
2010-03-26  4:50   ` Eric W. Biederman
2010-03-26  3:10 ` [PATCH 0/3] refcounting improvements in sysfs Eric W. Biederman
2010-03-26  3:28   ` Neil Brown
2010-03-26  4:49 ` Tejun Heo
2010-03-26  5:10   ` Tejun Heo
2010-03-26  6:02   ` Neil Brown
2010-03-26  6:32     ` Tejun Heo
2010-03-29  5:10       ` Neil Brown [this message]
2010-03-31  3:20         ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100329161054.2dfcdfa8@notabene.brown \
    --to=neilb@suse.de \
    --cc=gregkh@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=teheo@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.