Re: [PATCH 0/3] refcounting improvements in sysfs.

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Neil Brown <neilb@suse.de>
To: Tejun Heo <teheo@suse.de>
Cc: Greg Kroah-Hartman <gregkh@suse.de>, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/3] refcounting improvements in sysfs.
Date: Mon, 29 Mar 2010 16:10:54 +1100	[thread overview]
Message-ID: <20100329161054.2dfcdfa8@notabene.brown> (raw)
In-Reply-To: <4BAC5513.8060608@suse.de>

On Fri, 26 Mar 2010 15:32:51 +0900
Tejun Heo <teheo@suse.de> wrote:

> Hello, Neil.
> 
> On 03/26/2010 03:02 PM, Neil Brown wrote:
> >> Nice article.  In general, yeap, I agree it would be nice to have a
> >> working reference count abstraction.  However, kref along with kobject
> >> is a good example of obscurity by abstraction anti pattern.  :-)
> > 
> > I'm not at all sure that opinion would be universal....
> > 
> > refcounting is something that it is quite easy to get wrong.  There are
> > several slightly different models for refcounting and if you don't have a
> > clear understanding of the different use cases it is easy to get confused
> > about exactly what model is being used and so use a refcount wrongly.
> > kref certainly doesn't cover all models for refcounting but it does cover one
> > fairly common one very well and I think that it's use bring clarity rather
> > than obscurity.
> > Of course if it is used for a refcount which should really follow a different
> > model then that can cause confusion...
> 
> I don't know.  After spending some time with k* and device model, I
> grew a pretty strong dislike for abstractions without clear functional
> necessities.  kref being much simpler than kobject, the abuse is much
> less severe but there have been kref misuses (in kobject and SCSI
> midlayer) which could have been avoided or at least easily located if
> they simply had used atomic_t instead of dreaming up some mystical
> properties of kref.

Hi Tejun,

this strikes me as really valuable experience that it would be great to
share.   While I generally like kref and see value in at least some of
kobject I don't for a moment suppose they are perfect.  Fixing them requires
a good understanding of the problems they cause.  If you have that knowledge,
it would be a great resource for anyone wanting to 'fix' kobject.
Are you interested in writing anything (more) up at all???


> 
> >> kobject API incorrectly suggests that it deals with the last put
> >> problem.  There still are large number of code paths which do the
> >> following,
> >>
> >>   if (!(kob = kobject_get(kobj)))
> >> 	return;
> > 
> > kobject_get *always* returns exactly the argument that was passed to it.
> > (kref_get doesn't have a return value.)
> > 
> > I don't see how the code above has any bearing on the last-put
> > problem, which I think kref and thus kobject do handle exactly
> > correctly.
> 
> Oh, I should have been more explicit.  It's not directly related to
> kref but just something that always comes to my mind when thinking
> about k* abstractions.  The above bogus condition checks used to be
> used quite widely.  The programmer for some reason believed the last
> kobject_put() somehow will magically make future kobject_get()s return
> NULL, which of course doesn't make any sense but hey the
> implementation is buried kilometers deep, the API and other usages
> seem to suggest that and it's easy to imagine something up when you're
> tired.

This would argue that having a return value from kobject_get violates Rusty's
law that interfaces should be hard to misuse.
We could probably change that - it is only used 19 times in the current
kernel.
It probably doesn't help that Documentation/kobject.txt includes the text:

    A successful call to kobject_get() will increment the kobject's reference
    counter and return the pointer to the kobject.

which seems to suggest that an unsuccessful call is possible.


> 
> As an another example, please take a look at the kref API.
> 
>   int kref_put(struct kref *kref, void (*release) (struct kref *kref));
> 
> The function takes @release callback but under which context is it
> called?  If you look at the source code, it's called in-line which
> isn't clear from the API at all (why the hell take a callback if
> you're gonna call it in-line?) and there have been *several* subtle
> bugs which could have been avoided or at least would have been caught
> much easier if it were not for that meaningless separate release
> callback.  It's just too easy to forget about the executing context
> when people write and review stuff over function boundaries.
> 
>  void put_something(something)
>  {
> 	 if (kref_put(&something->kref))
> 		 do something which might dead lock;
>  }
> 
> is way easier to avoid bugs and review than
> 
>  void really_kill_something(struct kref *kref)
>  {
> 	 struct my_something *something = container_of(...);
> 
> 	 do something which might dead lock;
>  }
> 
>  void put_something(something)
>  {
> 	 kref_put(&someting->kref);
>  }
> 
> This is *way* worse than atomic_t not better and the confusion is
> caused exactly by superflous abstraction which leads the users of the
> API to imagine some non-existing function of the abstraction and
> hinders the flow of review.

I'm not immediately convinced by this, though maybe I haven't seen enough
examples yet.

The deadlocks that I have come across would not have been any more obvious in
either of the above - they were caused because sysfs_remove deadlocks when
called from inside a sysfs attribute action...

Also, while this re-write is possible for kref uses it isn't really possible
in kobject as the 'final_put' function must be included in the ktype (though
maybe you don't like that either).

What would be really helpful here is a survey of what sorts of things are
actually done in final_put functions so would could create some guidelines
about how to write the release functions.

Thanks,
NeilBrown


> 
> >> I believe (or at least hope) the actual problem cases are mostly fixed
> >> now but there still are a lot of misconceptions around how stuff built
> >> on kref/kobject is synchronized and they sometimes lead to race
> >> conditions buried deep under several layers of abstractions and it
> >> becomes very hard to see those race conditions when they are buried
> >> deep.
> > 
> > I agree that there probably misconceptions about how kref works and they are
> > probably based on a lack of appreciation of the subtle differences in
> > flavours of refcounts.  Hence my desire to create and document different
> > k*ref types which clarify the different use cases.
> 
> Oh, I'm not objecting to cleaning up how reference counts are done
> per-se but *PLEASE* refrain from introducing abstractions for
> abstraction's sake.  The k* stuff, device model and sysfs already
> walked down that road and got burned badly.
> 
> > BTW I'd be perfectly happy if the first patch was taken and
> > subsequent ones not.  I think they are a good idea, but I'm happy to
> > forgo them (for now:-).
> 
> If it can be done in a way that it doesn't substitute pure logical
> complexity with one which involves memory ordering issues, which is
> almost always way worse, I have no objection at all.
> 
> Thanks.
>

next prev parent reply	other threads:[~2010-03-29  5:11 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-03-24  3:20 [PATCH 0/3] refcounting improvements in sysfs NeilBrown
2010-03-24  3:20 ` [PATCH 2/3] sysfs: make s_count a kref NeilBrown
2010-03-26  4:29   ` Eric W. Biederman
2010-03-24  3:20 ` [PATCH 3/3] kref: create karef and use for sysfs_dirent->s_active NeilBrown
2010-03-26  4:50   ` Eric W. Biederman
2010-03-24  3:20 ` [PATCH 1/3] sysfs: simplify handling for s_active refcount NeilBrown
2010-03-26  4:24   ` Eric W. Biederman
2010-03-26  5:32     ` Neil Brown
2010-03-26  5:42       ` Tejun Heo
2010-03-26  7:53       ` Eric W. Biederman
2010-03-29  4:43         ` Neil Brown
2010-03-29  7:47           ` Neil Brown
2010-03-26  3:10 ` [PATCH 0/3] refcounting improvements in sysfs Eric W. Biederman
2010-03-26  3:28   ` Neil Brown
2010-03-26  4:49 ` Tejun Heo
2010-03-26  5:10   ` Tejun Heo
2010-03-26  6:02   ` Neil Brown
2010-03-26  6:32     ` Tejun Heo
2010-03-29  5:10       ` Neil Brown [this message]
2010-03-31  3:20         ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100329161054.2dfcdfa8@notabene.brown \
    --to=neilb@suse.de \
    --cc=gregkh@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=teheo@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox