From: Greg KH <gregkh@suse.de>
To: Alex Chiang <achiang@hp.com>,
Vegard Nossum <vegard.nossum@gmail.com>,
Pekka Enberg <penberg@cs.helsinki.fi>,
Ingo Molnar <mingo@elte.hu>,
jbarnes@virtuousgeek.org, tj@kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH, RFC] sysfs: only allow one scheduled removal callback per kobj
Date: Tue, 10 Mar 2009 21:41:51 -0700 [thread overview]
Message-ID: <20090311044151.GB25840@suse.de> (raw)
In-Reply-To: <20090310232027.GC25665@ldl.fc.hp.com>
On Tue, Mar 10, 2009 at 05:20:27PM -0600, Alex Chiang wrote:
> Hi Vegard, sysfs folks,
>
> Vegard was nice enough to test my PCI remove/rescan patches under
> kmemcheck. Maybe "torture" is a more appropriate term. ;)
>
> My patch series introduces a sysfs "remove" attribute for PCI
> devices, which will remove that device (and child devices).
>
> http://thread.gmane.org/gmane.linux.kernel.pci/3495
>
> Vegard decided that he wanted to do something like:
>
> # while true ; do echo 1 > /sys/bus/pci/devices/.../remove ; done
>
> which caused a nasty oops in my code. You can see the results of
> his testing in the thread I referenced above.
>
> After looking at my code for a bit, I decided that maybe it
> wasn't completely my fault. ;) See, I'm using device_schedule_callback()
why? Are you really in interrupt context here to need to do the remove
at a later time?
> which really is a wrapper around sysfs_schedule_callback() which
> is the way that a sysfs attribute is supposed to remove itself to
> prevent deadlock.
Yeah, it's the scsi code that needed this mess :(
If at all possible, I would recommend not using it.
> The problem that Vegard's test exposed is that if you repeatedly
> call a sysfs attribute that's supposed to remove itself using
> device_schedule_callback, we'll keep scheduling work queue tasks
> with a kobj that we really want to release.
>
> [nb, I bet that /sys/bus/scsi/devices/.../delete will exhibit the
> same problems]
I think it's harder to remove devices multiple times, as it isn't done
through sysfs, but from another external request. But I could be wrong.
> This is very racy, and at some point, whatever remove handler
> we've scheduled with device_schedule_callback will end up
> referencing a freed kobj.
>
> I came up with the below patch which changes the semantics of
> device/sysfs_schedule_callback. We now only allow one in-flight
> callback per kobj, and return -EBUSY if that kobj already has a
> callback scheduled for it.
>
> This patch, along with my updated 07/11 patch in my series,
> prevents at least the first oops that Vegard reported, and I
> suspect it prevents the second kmemcheck error too, although I
> haven't tested under kmemcheck (yet*).
>
> I'm looking for comments on the approach I took, specifically:
>
> - are we ok with the new restriction I imposed?
> - is it ok to return -EBUSY to our callers?
> - is the simple linked list proof of concept
> implementation going to scale too poorly?
>
> To answer my own first two questions, I checked for callers of
> both device_ and sysfs_schedule_callback, and it looks like
> everyone is using it the same way: to schedule themselves for
> removal. That is, although the interface could be used to
> schedule any sort of callback, the only use case is the removal
> use case. I don't think it will be a problem to limit ourselves
> to one remove callback per kobj.
I agree.
> Maybe this patch really wants to be a new interface called
> sysfs_schedule_callback_once or _single, where we check for an
> already-scheduled callback for a kobj, and if we pass, then we
> simply continue on to the existing, unchanged
> sysfs_schedule_callback. I don't feel too strongly about creating
> a new interface, but my belief is that changing the semantics of
> the existing interface is probably the better solution.
>
> My opinion on my own third question is that removing a device is
> not in the performance path, so a simple linked list is
> sufficient.
>
> Depending on the feedback here, I'll resend this patch with a
> full changelog (and giving credit to Vegard/kmemcheck as Ingo
> requested I do) or I can rework it.
I have no objection to this change. But I really would recommend not
using this interface at all if possible.
thanks,
greg k-h
next prev parent reply other threads:[~2009-03-11 4:51 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-03-10 23:20 [PATCH, RFC] sysfs: only allow one scheduled removal callback per kobj Alex Chiang
2009-03-11 4:41 ` Greg KH [this message]
2009-03-11 7:03 ` Alex Chiang
2009-03-11 7:20 ` Tejun Heo
2009-03-12 0:27 ` Alex Chiang
2009-03-12 3:22 ` Greg KH
2009-03-12 22:02 ` Alex Chiang
2009-03-13 12:03 ` Cornelia Huck
2009-03-13 18:08 ` Alex Chiang
2009-03-11 15:32 ` Greg KH
2009-03-11 17:47 ` Cornelia Huck
2009-03-11 18:14 ` Alex Chiang
2009-03-11 18:19 ` Greg KH
2009-03-11 18:42 ` Alex Chiang
2009-03-12 10:25 ` Cornelia Huck
2009-03-12 21:33 ` Alex Chiang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090311044151.GB25840@suse.de \
--to=gregkh@suse.de \
--cc=achiang@hp.com \
--cc=jbarnes@virtuousgeek.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=penberg@cs.helsinki.fi \
--cc=tj@kernel.org \
--cc=vegard.nossum@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.