* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
@ 2007-10-29 15:18 Alan Stern
2007-10-29 15:35 ` James Bottomley
0 siblings, 1 reply; 28+ messages in thread
From: Alan Stern @ 2007-10-29 15:18 UTC (permalink / raw)
To: James Bottomley
Cc: Kay Sievers, Hannes Reinecke, Greg KH, SCSI development list
[-- Attachment #1: Type: TEXT/PLAIN, Size: 3133 bytes --]
James:
With regard to the discussion below, is there any objection to the
attached patch? It moves the call to scsi_release_queue() from
scsi_device_dev_release_usercontext() to __scsi_remove_device(). In
fact, it might turn out that with this change, the extra _usercontext
routine isn't needed at all.
As far as I can see, the only reason for not adopting this patch would
be if a scsi_device's queue structure was needed even after the
scsi_device was unregistered. Does this ever happen? If it does, it
would be indicative of a nasty reference-loop problem (scsi_device
needs reference to request_queue, request_queue holds reference to
gendisk, gendisk holds reference to scsi_device).
Alan Stern
---------- Forwarded message ----------
Date: Tue, 23 Oct 2007 13:27:49 +0200
From: Kay Sievers <kay.sievers@vrfy.org>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: Greg KH <greg@kroah.com>,
Kernel development list <linux-kernel@vger.kernel.org>,
Hannes Reinecke <hare@suse.de>
Subject: Re: BUG in: Driver core: convert block from raw kobjects to core
devices
On Tue, 2007-10-23 at 00:14 -0400, Alan Stern wrote:
> On Tue, 23 Oct 2007, Kay Sievers wrote:
>
> > There must be something going wrong with the block patch in conjunction
> > with the crazy SCSI release logic.
>
> I don't know -- there's nothing obviously wrong with the block patch
> except the extra put_device. But you're right that the SCSI logic is
> crazy. The scsi_device is the parent of the gendisk, which is the
> parent of the request_queue. But the scsi_device holds a reference to
> the request_queue, which is dropped in the scsi_device's release
> routine!
That's the thing. We see a circular reference when we use the SCSI LUN
as the parent.
The sysfs relationship is: scsi_device -> genhd -> queue, while the
queue holds a ref to to the blockdev (parent), the blockdev to the
scsi_device (parent) and the scsi_devices to the queue (SCSI). All
waiting for their release functions to be called, to release the other
refs.
The scsi_device needs to drop the queue reference while the device is
removed, not when it's data is released. Hannes came up with the
attached patch, which seems to work fine here.
> That doesn't make much sense to me, and it is complicated by
> the fact that deleting a kobject doesn't drop the kobject's reference
> to its parent -- only releasing the kobject does.
Right, that makes things very complicated.
> In fact, for my development work I normally use a patch which makes
> kobjects behave better: They do drop the reference to their parent when
> they are deleted. This actually is nothing more than a reversion of a
> patch added several years ago to try and cover up some of the
> weaknesses of the SCSI stack! Now that the SCSI stack is in better
> shape, maybe my patch should be included in the mainstream kernel. The
> patch is below; see what you think.
Sounds good to me, to disconnect a dead object from its parent when it's
deleted. We only need to protect for "use after free" but not lock the
parent, I guess. We should give it a try.
Thanks,
Kay
[-- Attachment #2: Type: TEXT/X-PATCH, Size: 1258 bytes --]
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index daed37d..577bbf3 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -280,15 +280,6 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
list_del(&sdev->starved_entry);
spin_unlock_irqrestore(sdev->host->host_lock, flags);
- if (sdev->request_queue) {
- sdev->request_queue->queuedata = NULL;
- /* user context needed to free queue */
- scsi_free_queue(sdev->request_queue);
- /* temporary expedient, try to catch use of queue lock
- * after free of sdev */
- sdev->request_queue = NULL;
- }
-
scsi_target_reap(scsi_target(sdev));
kfree(sdev->inquiry);
@@ -799,6 +790,16 @@ void __scsi_remove_device(struct scsi_device *sdev)
if (sdev->host->hostt->slave_destroy)
sdev->host->hostt->slave_destroy(sdev);
transport_destroy_device(dev);
+
+ if (sdev->request_queue) {
+ sdev->request_queue->queuedata = NULL;
+ /* user context needed to free queue */
+ scsi_free_queue(sdev->request_queue);
+ /* temporary expedient, try to catch use of queue lock
+ * after free of sdev */
+ sdev->request_queue = NULL;
+ }
+
put_device(dev);
}
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-29 15:18 BUG in: Driver core: convert block from raw kobjects to core devices (fwd) Alan Stern
@ 2007-10-29 15:35 ` James Bottomley
2007-10-29 16:38 ` Alan Stern
2007-10-29 18:47 ` Alan Stern
0 siblings, 2 replies; 28+ messages in thread
From: James Bottomley @ 2007-10-29 15:35 UTC (permalink / raw)
To: Alan Stern; +Cc: Kay Sievers, Hannes Reinecke, Greg KH, SCSI development list
On Mon, 2007-10-29 at 11:18 -0400, Alan Stern wrote:
> With regard to the discussion below, is there any objection to the
> attached patch? It moves the call to scsi_release_queue() from
> scsi_device_dev_release_usercontext() to __scsi_remove_device(). In
> fact, it might turn out that with this change, the extra _usercontext
> routine isn't needed at all.
There's a flaw in the discussion.
> As far as I can see, the only reason for not adopting this patch would
> be if a scsi_device's queue structure was needed even after the
> scsi_device was unregistered. Does this ever happen? If it does, it
> would be indicative of a nasty reference-loop problem (scsi_device
> needs reference to request_queue, request_queue holds reference to
> gendisk, gendisk holds reference to scsi_device).
I'm not sure if we can do this patch. If you kill a device, you see
there's a processing state in scsi_prep_fn() for SDEV_DEL, which has a
printk message I see quite often when I unplug devices while they're
operating.
If you NULL out and free the request queue immediately after setting
SDEV_DEL, without allowing the devices and filesystems to finish, are
you sure we're not going to get a null deref on sdev->request_queue?
> Alan Stern
>
>
> ---------- Forwarded message ----------
> Date: Tue, 23 Oct 2007 13:27:49 +0200
> From: Kay Sievers <kay.sievers@vrfy.org>
> To: Alan Stern <stern@rowland.harvard.edu>
> Cc: Greg KH <greg@kroah.com>,
> Kernel development list <linux-kernel@vger.kernel.org>,
> Hannes Reinecke <hare@suse.de>
> Subject: Re: BUG in: Driver core: convert block from raw kobjects to core
> devices
>
> On Tue, 2007-10-23 at 00:14 -0400, Alan Stern wrote:
> > On Tue, 23 Oct 2007, Kay Sievers wrote:
> >
> > > There must be something going wrong with the block patch in conjunction
> > > with the crazy SCSI release logic.
> >
> > I don't know -- there's nothing obviously wrong with the block patch
> > except the extra put_device. But you're right that the SCSI logic is
> > crazy. The scsi_device is the parent of the gendisk, which is the
> > parent of the request_queue. But the scsi_device holds a reference to
> > the request_queue, which is dropped in the scsi_device's release
> > routine!
>
> That's the thing. We see a circular reference when we use the SCSI LUN
> as the parent.
> The sysfs relationship is: scsi_device -> genhd -> queue, while the
That's not true.
The relationship is scsi_device -> queue
However, genhd is conditioned on the ULD attachment (we can have queues
with no gendisk), so for instance, for sd it's
scsi_disk -> genhd -> queue
-> scsi_device
However, when we get to the device destruction part, the first thing
that should happen is that the driver should unbind, so we release the
driver specific structure and the genhd (and it's ref to the queue and
the scsi_device). Then we can let the mid-layer remove functions
release the queue.
> queue holds a ref to to the blockdev (parent), the blockdev to the
> scsi_device (parent) and the scsi_devices to the queue (SCSI). All
> waiting for their release functions to be called, to release the other
> refs.
This isn't how it's supposed to be ... the refs are supposed to be the
other way around (blockdev/genhd has a ref to the queue, but scsi_device
has a separate ref to the queue; the genhd should be releaseable even if
the device hasn't given up the queue).
I probably need more thread context to make a more informed statement,
though.
> The scsi_device needs to drop the queue reference while the device is
> removed, not when it's data is released. Hannes came up with the
> attached patch, which seems to work fine here.
>
> > That doesn't make much sense to me, and it is complicated by
> > the fact that deleting a kobject doesn't drop the kobject's reference
> > to its parent -- only releasing the kobject does.
>
> Right, that makes things very complicated.
>
> > In fact, for my development work I normally use a patch which makes
> > kobjects behave better: They do drop the reference to their parent when
> > they are deleted. This actually is nothing more than a reversion of a
> > patch added several years ago to try and cover up some of the
> > weaknesses of the SCSI stack! Now that the SCSI stack is in better
> > shape, maybe my patch should be included in the mainstream kernel. The
> > patch is below; see what you think.
>
> Sounds good to me, to disconnect a dead object from its parent when it's
> deleted. We only need to protect for "use after free" but not lock the
> parent, I guess. We should give it a try.
James
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-29 15:35 ` James Bottomley
@ 2007-10-29 16:38 ` Alan Stern
2007-10-29 16:45 ` James Bottomley
2007-10-29 18:47 ` Alan Stern
1 sibling, 1 reply; 28+ messages in thread
From: Alan Stern @ 2007-10-29 16:38 UTC (permalink / raw)
To: James Bottomley
Cc: Kay Sievers, Hannes Reinecke, Greg KH, Jens Axboe,
SCSI development list
On Mon, 29 Oct 2007, James Bottomley wrote:
> On Mon, 2007-10-29 at 11:18 -0400, Alan Stern wrote:
> > With regard to the discussion below, is there any objection to the
> > attached patch? It moves the call to scsi_release_queue() from
> > scsi_device_dev_release_usercontext() to __scsi_remove_device(). In
> > fact, it might turn out that with this change, the extra _usercontext
> > routine isn't needed at all.
>
> There's a flaw in the discussion.
>
> > As far as I can see, the only reason for not adopting this patch would
> > be if a scsi_device's queue structure was needed even after the
> > scsi_device was unregistered. Does this ever happen? If it does, it
> > would be indicative of a nasty reference-loop problem (scsi_device
> > needs reference to request_queue, request_queue holds reference to
> > gendisk, gendisk holds reference to scsi_device).
>
> I'm not sure if we can do this patch. If you kill a device, you see
> there's a processing state in scsi_prep_fn() for SDEV_DEL, which has a
> printk message I see quite often when I unplug devices while they're
> operating.
Me too.
> If you NULL out and free the request queue immediately after setting
> SDEV_DEL, without allowing the devices and filesystems to finish, are
> you sure we're not going to get a null deref on sdev->request_queue?
That's basically what I was asking you! :-) What happens if you try
to clean up a block queue when there are still outstanding requests?
> > On Tue, 2007-10-23 at 00:14 -0400, Alan Stern wrote:
> > > On Tue, 23 Oct 2007, Kay Sievers wrote:
> > >
> > > > There must be something going wrong with the block patch in conjunction
> > > > with the crazy SCSI release logic.
> > >
> > > I don't know -- there's nothing obviously wrong with the block patch
> > > except the extra put_device. But you're right that the SCSI logic is
> > > crazy. The scsi_device is the parent of the gendisk, which is the
> > > parent of the request_queue. But the scsi_device holds a reference to
> > > the request_queue, which is dropped in the scsi_device's release
> > > routine!
> >
> > That's the thing. We see a circular reference when we use the SCSI LUN
> > as the parent.
> > The sysfs relationship is: scsi_device -> genhd -> queue, while the
>
> That's not true.
>
> The relationship is scsi_device -> queue
I think Kay wrote his arrows backwards. Or maybe he just made a simple
mistake.
> However, genhd is conditioned on the ULD attachment (we can have queues
> with no gendisk), so for instance, for sd it's
>
> scsi_disk -> genhd -> queue
> -> scsi_device
>
> However, when we get to the device destruction part, the first thing
> that should happen is that the driver should unbind, so we release the
> driver specific structure and the genhd (and it's ref to the queue and
> the scsi_device). Then we can let the mid-layer remove functions
> release the queue.
That's what should happen. And in fact the scsi_disk structure doesn't
cause any difficulties; we're only worried about the scsi_device
structure.
> > queue holds a ref to to the blockdev (parent), the blockdev to the
> > scsi_device (parent) and the scsi_devices to the queue (SCSI). All
> > waiting for their release functions to be called, to release the other
> > refs.
>
> This isn't how it's supposed to be ... the refs are supposed to be the
> other way around (blockdev/genhd has a ref to the queue, but scsi_device
> has a separate ref to the queue; the genhd should be releaseable even if
> the device hasn't given up the queue).
No, that's not what happens. The genhd's parent is the scsi_device and
the queue's parent is the genhd. You can see this immediately from the
directory layout in sysfs. Thus:
the scsi_device holds a reference to the queue,
which holds a reference to the genhd,
which holds a reference to the scsi_device.
In fact it's not the struct device's reference to the parent which
causes the problem -- it's the embedded kobject's reference to the
kobject embedded in the parent device. This reference doesn't get
dropped when the device (and its embedded kobject) are removed; it
persists until they are released. But with the reference loop they
never do get released.
> I probably need more thread context to make a more informed statement,
> though.
If you want to wade through the email messages, the thread starts here:
http://marc.info/?t=119273573500006&r=1&w=2
Here's a brief summary. Kay wrote a patch modifying the block-device
core, the "Driver core: convert block from raw kobjects to core
devices" patch in $Subject. The patch itself is located here:
http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/gregkh-01-driver/block-device.patch
although this version has been updated; in its original form the patch
included an extra call to put_device(&disk_dev) right at the end of
del_gendisk() in fs/partitions/check.c.
In my testing I found the extra put_device() caused problems and I
complained to Kay about it. He explained that on his system the device
structures would never get released without it. In the end he agreed
the call was wrong (which is presumably why the patch was modified),
and we determined that the structures never got released because of the
circular references.
The best way to fix the problem is to drop the kobject references at
remove time instead of release time. Hence my question to you about
whether Hannes's patch to the SCSI core would be acceptable.
Alan Stern
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-29 16:38 ` Alan Stern
@ 2007-10-29 16:45 ` James Bottomley
2007-10-29 17:04 ` Alan Stern
0 siblings, 1 reply; 28+ messages in thread
From: James Bottomley @ 2007-10-29 16:45 UTC (permalink / raw)
To: Alan Stern
Cc: Kay Sievers, Hannes Reinecke, Greg KH, Jens Axboe,
SCSI development list
On Mon, 2007-10-29 at 12:38 -0400, Alan Stern wrote:
> No, that's not what happens. The genhd's parent is the scsi_device
> and
> the queue's parent is the genhd. You can see this immediately from
> the
> directory layout in sysfs. Thus:
How has this happened? The scsi_device doesn't know anything about the
genhd; how can it hold a reference to it? If you grep the code, you'll
see that only the ULD's (that's sd, sr, st, osst et al) know what a
gendisk is.
James
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-29 16:45 ` James Bottomley
@ 2007-10-29 17:04 ` Alan Stern
0 siblings, 0 replies; 28+ messages in thread
From: Alan Stern @ 2007-10-29 17:04 UTC (permalink / raw)
To: James Bottomley
Cc: Kay Sievers, Hannes Reinecke, Greg KH, Jens Axboe,
SCSI development list
On Mon, 29 Oct 2007, James Bottomley wrote:
> On Mon, 2007-10-29 at 12:38 -0400, Alan Stern wrote:
> > No, that's not what happens. The genhd's parent is the scsi_device
> > and
> > the queue's parent is the genhd. You can see this immediately from
> > the
> > directory layout in sysfs. Thus:
>
> How has this happened? The scsi_device doesn't know anything about the
> genhd; how can it hold a reference to it?
No, no -- it's the other way around. The scsi_device is the parent of
the genhd, so the genhd holds a reference to the scsi_device, not vice
versa.
Alan Stern
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-29 15:35 ` James Bottomley
2007-10-29 16:38 ` Alan Stern
@ 2007-10-29 18:47 ` Alan Stern
2007-10-29 19:13 ` Kay Sievers
2007-10-31 4:24 ` Greg KH
1 sibling, 2 replies; 28+ messages in thread
From: Alan Stern @ 2007-10-29 18:47 UTC (permalink / raw)
To: James Bottomley
Cc: Kay Sievers, Hannes Reinecke, Greg KH, SCSI development list
On Mon, 29 Oct 2007, James Bottomley wrote:
> I'm not sure if we can do this patch. If you kill a device, you see
> there's a processing state in scsi_prep_fn() for SDEV_DEL, which has a
> printk message I see quite often when I unplug devices while they're
> operating.
>
> If you NULL out and free the request queue immediately after setting
> SDEV_DEL, without allowing the devices and filesystems to finish, are
> you sure we're not going to get a null deref on sdev->request_queue?
Let's get at this another way. The patch below probably should be
applied in any case. It's essentially a reversion of this patch from
2003:
http://git.kernel.org/?p=linux/kernel/git/tglx/history.git;a=commit;h=10921a8f1305b8ec97794941db78b825db5839bc
which was written to compensate for problems in the SCSI stack. The
idea was to avoid releasing a kobject before all its children were
released. However as far as I can see, in general we should be able to
release a kobject as soon as all its children are removed and its
refcount drops to 0, without waiting for the children to be released.
To put it another way, once a child is removed from visibility it
should not try (or need) to access its parent at all. If it has to
then it should take an explicit reference to the parent.
Since the SCSI stack is now in much better shape, there doesn't seem to
be any reason to keep the old code. Do you agree that the patch below
is worth merging? I submitted it to Greg some time ago, but he didn't
want to accept it without some assurance that it is now safe.
It would fix the problem Kay and saw with the circular references,
because the references would all get dropped when the scsi_device is
removed instead of waiting for a release that will never come.
Alan Stern
Index: usb-2.6/lib/kobject.c
===================================================================
--- usb-2.6.orig/lib/kobject.c
+++ usb-2.6/lib/kobject.c
@@ -206,12 +206,16 @@ void kobject_init(struct kobject * kobj)
static void unlink(struct kobject * kobj)
{
+ struct kobject *parent = kobj->parent;
+
if (kobj->kset) {
spin_lock(&kobj->kset->list_lock);
list_del_init(&kobj->entry);
spin_unlock(&kobj->kset->list_lock);
}
+ kobj->parent = NULL;
kobject_put(kobj);
+ kobject_put(parent);
}
/**
@@ -255,7 +259,6 @@ int kobject_add(struct kobject * kobj)
if (error) {
/* unlink does the kobject_put() for us */
unlink(kobj);
- kobject_put(parent);
/* be noisy on error issues */
if (error == -EEXIST)
@@ -498,7 +501,6 @@ void kobject_cleanup(struct kobject * ko
{
struct kobj_type * t = get_ktype(kobj);
struct kset * s = kobj->kset;
- struct kobject * parent = kobj->parent;
const char *name = kobj->k_name;
pr_debug("kobject %s: cleaning up\n",kobject_name(kobj));
@@ -515,7 +517,6 @@ void kobject_cleanup(struct kobject * ko
}
if (s)
kset_put(s);
- kobject_put(parent);
}
static void kobject_release(struct kref *kref)
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-29 18:47 ` Alan Stern
@ 2007-10-29 19:13 ` Kay Sievers
2007-10-31 4:25 ` Greg KH
2007-10-31 4:24 ` Greg KH
1 sibling, 1 reply; 28+ messages in thread
From: Kay Sievers @ 2007-10-29 19:13 UTC (permalink / raw)
To: Alan Stern
Cc: James Bottomley, Hannes Reinecke, Greg KH, SCSI development list
On Mon, 2007-10-29 at 14:47 -0400, Alan Stern wrote:
> On Mon, 29 Oct 2007, James Bottomley wrote:
>
> > I'm not sure if we can do this patch. If you kill a device, you see
> > there's a processing state in scsi_prep_fn() for SDEV_DEL, which has a
> > printk message I see quite often when I unplug devices while they're
> > operating.
> >
> > If you NULL out and free the request queue immediately after setting
> > SDEV_DEL, without allowing the devices and filesystems to finish, are
> > you sure we're not going to get a null deref on sdev->request_queue?
>
> Let's get at this another way. The patch below probably should be
> applied in any case. It's essentially a reversion of this patch from
> 2003:
>
> http://git.kernel.org/?p=linux/kernel/git/tglx/history.git;a=commit;h=10921a8f1305b8ec97794941db78b825db5839bc
>
> which was written to compensate for problems in the SCSI stack. The
> idea was to avoid releasing a kobject before all its children were
> released. However as far as I can see, in general we should be able to
> release a kobject as soon as all its children are removed and its
> refcount drops to 0, without waiting for the children to be released.
> To put it another way, once a child is removed from visibility it
> should not try (or need) to access its parent at all. If it has to
> then it should take an explicit reference to the parent.
>
> Since the SCSI stack is now in much better shape, there doesn't seem to
> be any reason to keep the old code. Do you agree that the patch below
> is worth merging? I submitted it to Greg some time ago, but he didn't
> want to accept it without some assurance that it is now safe.
>
> It would fix the problem Kay and saw with the circular references,
> because the references would all get dropped when the scsi_device is
> removed instead of waiting for a release that will never come.
It indeed fixes my problem. And it sounds you are right, the "fix" from
2003 is probably just a paper-over for a missing explicit reference that
time.
I'm all for giving the reversion a try, and add explicit parent get/put
if needed somewhere. If, for some reason, that will not happen, I'll
need to do something like this in the block patch, which will then be a
"fix for the paper-over solution for an unknown bug". :)
--- a/fs/partitions/check.c
+++ b/fs/partitions/check.c
...
+ device_del(&disk->dev);
+
+ /*
+ * disconnect from parent device, so the parent can clean up
+ * without waiting for us to clean up
+ *
+ * the driver core took this reference while we added ourself as
+ * a child of the parent device
+ */
+ parent = disk->dev.kobj.parent;
+ disk->dev.kobj.parent = NULL;
+ disk->dev.parent = NULL;
+ kobject_put(parent);
}
Thanks,
Kay
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-29 18:47 ` Alan Stern
2007-10-29 19:13 ` Kay Sievers
@ 2007-10-31 4:24 ` Greg KH
2007-10-31 15:51 ` Alan Stern
1 sibling, 1 reply; 28+ messages in thread
From: Greg KH @ 2007-10-31 4:24 UTC (permalink / raw)
To: Alan Stern
Cc: James Bottomley, Kay Sievers, Hannes Reinecke,
SCSI development list
On Mon, Oct 29, 2007 at 02:47:59PM -0400, Alan Stern wrote:
> On Mon, 29 Oct 2007, James Bottomley wrote:
>
> > I'm not sure if we can do this patch. If you kill a device, you see
> > there's a processing state in scsi_prep_fn() for SDEV_DEL, which has a
> > printk message I see quite often when I unplug devices while they're
> > operating.
> >
> > If you NULL out and free the request queue immediately after setting
> > SDEV_DEL, without allowing the devices and filesystems to finish, are
> > you sure we're not going to get a null deref on sdev->request_queue?
>
> Let's get at this another way. The patch below probably should be
> applied in any case. It's essentially a reversion of this patch from
> 2003:
>
> http://git.kernel.org/?p=linux/kernel/git/tglx/history.git;a=commit;h=10921a8f1305b8ec97794941db78b825db5839bc
>
> which was written to compensate for problems in the SCSI stack. The
> idea was to avoid releasing a kobject before all its children were
> released. However as far as I can see, in general we should be able to
> release a kobject as soon as all its children are removed and its
> refcount drops to 0, without waiting for the children to be released.
> To put it another way, once a child is removed from visibility it
> should not try (or need) to access its parent at all. If it has to
> then it should take an explicit reference to the parent.
>
> Since the SCSI stack is now in much better shape, there doesn't seem to
> be any reason to keep the old code. Do you agree that the patch below
> is worth merging? I submitted it to Greg some time ago, but he didn't
> want to accept it without some assurance that it is now safe.
>
> It would fix the problem Kay and saw with the circular references,
> because the references would all get dropped when the scsi_device is
> removed instead of waiting for a release that will never come.
Hm, will this really solve the problem properly? I kept that above
patch because of the very real problem where a device in the middle of
the device tree can go away without the lower devices being gone.
The example at the time was usb-serial devices. The physical USB device
goes away, but the tty device that userspace has open still sticks
around until userspace closes the char device.
With your patch, will things break for this usage model?
This kobject cleanup path is so fickle, I hate having to touch it :(
thanks,
greg k-h
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-29 19:13 ` Kay Sievers
@ 2007-10-31 4:25 ` Greg KH
2007-10-31 10:46 ` Kay Sievers
0 siblings, 1 reply; 28+ messages in thread
From: Greg KH @ 2007-10-31 4:25 UTC (permalink / raw)
To: Kay Sievers
Cc: Alan Stern, James Bottomley, Hannes Reinecke,
SCSI development list
On Mon, Oct 29, 2007 at 08:13:17PM +0100, Kay Sievers wrote:
> On Mon, 2007-10-29 at 14:47 -0400, Alan Stern wrote:
> > On Mon, 29 Oct 2007, James Bottomley wrote:
> >
> > > I'm not sure if we can do this patch. If you kill a device, you see
> > > there's a processing state in scsi_prep_fn() for SDEV_DEL, which has a
> > > printk message I see quite often when I unplug devices while they're
> > > operating.
> > >
> > > If you NULL out and free the request queue immediately after setting
> > > SDEV_DEL, without allowing the devices and filesystems to finish, are
> > > you sure we're not going to get a null deref on sdev->request_queue?
> >
> > Let's get at this another way. The patch below probably should be
> > applied in any case. It's essentially a reversion of this patch from
> > 2003:
> >
> > http://git.kernel.org/?p=linux/kernel/git/tglx/history.git;a=commit;h=10921a8f1305b8ec97794941db78b825db5839bc
> >
> > which was written to compensate for problems in the SCSI stack. The
> > idea was to avoid releasing a kobject before all its children were
> > released. However as far as I can see, in general we should be able to
> > release a kobject as soon as all its children are removed and its
> > refcount drops to 0, without waiting for the children to be released.
> > To put it another way, once a child is removed from visibility it
> > should not try (or need) to access its parent at all. If it has to
> > then it should take an explicit reference to the parent.
> >
> > Since the SCSI stack is now in much better shape, there doesn't seem to
> > be any reason to keep the old code. Do you agree that the patch below
> > is worth merging? I submitted it to Greg some time ago, but he didn't
> > want to accept it without some assurance that it is now safe.
> >
> > It would fix the problem Kay and saw with the circular references,
> > because the references would all get dropped when the scsi_device is
> > removed instead of waiting for a release that will never come.
>
> It indeed fixes my problem. And it sounds you are right, the "fix" from
> 2003 is probably just a paper-over for a missing explicit reference that
> time.
>
> I'm all for giving the reversion a try, and add explicit parent get/put
> if needed somewhere. If, for some reason, that will not happen, I'll
> need to do something like this in the block patch, which will then be a
> "fix for the paper-over solution for an unknown bug". :)
>
>
> --- a/fs/partitions/check.c
> +++ b/fs/partitions/check.c
> ...
> + device_del(&disk->dev);
> +
> + /*
> + * disconnect from parent device, so the parent can clean up
> + * without waiting for us to clean up
> + *
> + * the driver core took this reference while we added ourself as
> + * a child of the parent device
> + */
> + parent = disk->dev.kobj.parent;
Save off the parent before calling device_del() otherwise it could be a
stale pointer, right?
> + disk->dev.kobj.parent = NULL;
> + disk->dev.parent = NULL;
> + kobject_put(parent);
No, that's just wrong, if this is needed, something else really is
broken, and I don't think it's the driver core...
thanks,
greg k-h
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-31 4:25 ` Greg KH
@ 2007-10-31 10:46 ` Kay Sievers
2007-10-31 14:32 ` Greg KH
0 siblings, 1 reply; 28+ messages in thread
From: Kay Sievers @ 2007-10-31 10:46 UTC (permalink / raw)
To: Greg KH; +Cc: Alan Stern, James Bottomley, Hannes Reinecke,
SCSI development list
On Tue, 2007-10-30 at 21:25 -0700, Greg KH wrote:
> On Mon, Oct 29, 2007 at 08:13:17PM +0100, Kay Sievers wrote:
> > On Mon, 2007-10-29 at 14:47 -0400, Alan Stern wrote:
> > > On Mon, 29 Oct 2007, James Bottomley wrote:
> > >
> > > > I'm not sure if we can do this patch. If you kill a device, you see
> > > > there's a processing state in scsi_prep_fn() for SDEV_DEL, which has a
> > > > printk message I see quite often when I unplug devices while they're
> > > > operating.
> > > >
> > > > If you NULL out and free the request queue immediately after setting
> > > > SDEV_DEL, without allowing the devices and filesystems to finish, are
> > > > you sure we're not going to get a null deref on sdev->request_queue?
> > >
> > > Let's get at this another way. The patch below probably should be
> > > applied in any case. It's essentially a reversion of this patch from
> > > 2003:
> > >
> > > http://git.kernel.org/?p=linux/kernel/git/tglx/history.git;a=commit;h=10921a8f1305b8ec97794941db78b825db5839bc
> > >
> > > which was written to compensate for problems in the SCSI stack. The
> > > idea was to avoid releasing a kobject before all its children were
> > > released. However as far as I can see, in general we should be able to
> > > release a kobject as soon as all its children are removed and its
> > > refcount drops to 0, without waiting for the children to be released.
> > > To put it another way, once a child is removed from visibility it
> > > should not try (or need) to access its parent at all. If it has to
> > > then it should take an explicit reference to the parent.
> > >
> > > Since the SCSI stack is now in much better shape, there doesn't seem to
> > > be any reason to keep the old code. Do you agree that the patch below
> > > is worth merging? I submitted it to Greg some time ago, but he didn't
> > > want to accept it without some assurance that it is now safe.
> > >
> > > It would fix the problem Kay and saw with the circular references,
> > > because the references would all get dropped when the scsi_device is
> > > removed instead of waiting for a release that will never come.
> >
> > It indeed fixes my problem. And it sounds you are right, the "fix" from
> > 2003 is probably just a paper-over for a missing explicit reference that
> > time.
> >
> > I'm all for giving the reversion a try, and add explicit parent get/put
> > if needed somewhere. If, for some reason, that will not happen, I'll
> > need to do something like this in the block patch, which will then be a
> > "fix for the paper-over solution for an unknown bug". :)
> >
> >
> > --- a/fs/partitions/check.c
> > +++ b/fs/partitions/check.c
> > ...
> > + device_del(&disk->dev);
> > +
> > + /*
> > + * disconnect from parent device, so the parent can clean up
> > + * without waiting for us to clean up
> > + *
> > + * the driver core took this reference while we added ourself as
> > + * a child of the parent device
> > + */
> > + parent = disk->dev.kobj.parent;
>
> Save off the parent before calling device_del() otherwise it could be a
> stale pointer, right?
I still hold a ref, the one I drop a few lines later. It should be safe.
> > + disk->dev.kobj.parent = NULL;
> > + disk->dev.parent = NULL;
> > + kobject_put(parent);
>
> No, that's just wrong, if this is needed, something else really is
> broken, and I don't think it's the driver core...
It's just a kobject_orphan() function. We need to break the circiular
reference somewhere, one of the objects in the circle has to clean up,
to allow the others to clean up. The objects are properly deleted, but
all final references are only dropped on cleanup. The kobject_orphan()
here is just what Alan's patch is doing for the whole core.
With the current logic, if you have any parent referencing a child, the
core will deadlock, and never reach any cleanup funtion, right? That's
the loop I need to break here.
Thanks,
Kay
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-31 10:46 ` Kay Sievers
@ 2007-10-31 14:32 ` Greg KH
2007-10-31 15:15 ` James Bottomley
0 siblings, 1 reply; 28+ messages in thread
From: Greg KH @ 2007-10-31 14:32 UTC (permalink / raw)
To: Kay Sievers
Cc: Alan Stern, James Bottomley, Hannes Reinecke,
SCSI development list
On Wed, Oct 31, 2007 at 11:46:30AM +0100, Kay Sievers wrote:
> On Tue, 2007-10-30 at 21:25 -0700, Greg KH wrote:
> > On Mon, Oct 29, 2007 at 08:13:17PM +0100, Kay Sievers wrote:
> > > On Mon, 2007-10-29 at 14:47 -0400, Alan Stern wrote:
> > > > On Mon, 29 Oct 2007, James Bottomley wrote:
> > > >
> > > > > I'm not sure if we can do this patch. If you kill a device, you see
> > > > > there's a processing state in scsi_prep_fn() for SDEV_DEL, which has a
> > > > > printk message I see quite often when I unplug devices while they're
> > > > > operating.
> > > > >
> > > > > If you NULL out and free the request queue immediately after setting
> > > > > SDEV_DEL, without allowing the devices and filesystems to finish, are
> > > > > you sure we're not going to get a null deref on sdev->request_queue?
> > > >
> > > > Let's get at this another way. The patch below probably should be
> > > > applied in any case. It's essentially a reversion of this patch from
> > > > 2003:
> > > >
> > > > http://git.kernel.org/?p=linux/kernel/git/tglx/history.git;a=commit;h=10921a8f1305b8ec97794941db78b825db5839bc
> > > >
> > > > which was written to compensate for problems in the SCSI stack. The
> > > > idea was to avoid releasing a kobject before all its children were
> > > > released. However as far as I can see, in general we should be able to
> > > > release a kobject as soon as all its children are removed and its
> > > > refcount drops to 0, without waiting for the children to be released.
> > > > To put it another way, once a child is removed from visibility it
> > > > should not try (or need) to access its parent at all. If it has to
> > > > then it should take an explicit reference to the parent.
> > > >
> > > > Since the SCSI stack is now in much better shape, there doesn't seem to
> > > > be any reason to keep the old code. Do you agree that the patch below
> > > > is worth merging? I submitted it to Greg some time ago, but he didn't
> > > > want to accept it without some assurance that it is now safe.
> > > >
> > > > It would fix the problem Kay and saw with the circular references,
> > > > because the references would all get dropped when the scsi_device is
> > > > removed instead of waiting for a release that will never come.
> > >
> > > It indeed fixes my problem. And it sounds you are right, the "fix" from
> > > 2003 is probably just a paper-over for a missing explicit reference that
> > > time.
> > >
> > > I'm all for giving the reversion a try, and add explicit parent get/put
> > > if needed somewhere. If, for some reason, that will not happen, I'll
> > > need to do something like this in the block patch, which will then be a
> > > "fix for the paper-over solution for an unknown bug". :)
> > >
> > >
> > > --- a/fs/partitions/check.c
> > > +++ b/fs/partitions/check.c
> > > ...
> > > + device_del(&disk->dev);
> > > +
> > > + /*
> > > + * disconnect from parent device, so the parent can clean up
> > > + * without waiting for us to clean up
> > > + *
> > > + * the driver core took this reference while we added ourself as
> > > + * a child of the parent device
> > > + */
> > > + parent = disk->dev.kobj.parent;
> >
> > Save off the parent before calling device_del() otherwise it could be a
> > stale pointer, right?
>
> I still hold a ref, the one I drop a few lines later. It should be safe.
Ok, but it's still wrong that this is needed :)
> > > + disk->dev.kobj.parent = NULL;
> > > + disk->dev.parent = NULL;
> > > + kobject_put(parent);
> >
> > No, that's just wrong, if this is needed, something else really is
> > broken, and I don't think it's the driver core...
>
> It's just a kobject_orphan() function. We need to break the circiular
> reference somewhere, one of the objects in the circle has to clean up,
> to allow the others to clean up. The objects are properly deleted, but
> all final references are only dropped on cleanup. The kobject_orphan()
> here is just what Alan's patch is doing for the whole core.
>
> With the current logic, if you have any parent referencing a child, the
> core will deadlock, and never reach any cleanup funtion, right? That's
> the loop I need to break here.
Hm, I seem to have missed the part in this thread where someone said
that it was valid to have a parent reference a child device. That's
just wrong and needs to be fixed. Is that in the scsi layer somewhere?
The block layer? It sure isn't in the driver core...
thanks,
greg k-h
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-31 14:32 ` Greg KH
@ 2007-10-31 15:15 ` James Bottomley
2007-10-31 15:40 ` Kay Sievers
2007-10-31 15:58 ` Alan Stern
0 siblings, 2 replies; 28+ messages in thread
From: James Bottomley @ 2007-10-31 15:15 UTC (permalink / raw)
To: Greg KH; +Cc: Kay Sievers, Alan Stern, Hannes Reinecke, SCSI development list
On Wed, 2007-10-31 at 07:32 -0700, Greg KH wrote:
> Hm, I seem to have missed the part in this thread where someone said
> that it was valid to have a parent reference a child device. That's
> just wrong and needs to be fixed. Is that in the scsi layer somewhere?
> The block layer? It sure isn't in the driver core...
This is the piece I'm still not clear on. It's something to do with the
gendisk. I'd have to look in block, but I believe the queue takes a ref
to the gendisk.
The scsi_device has a ref to the queue and the scsi_disk (in sd) has a
ref to both the scsi_device and the gendisk. That means, until sd is
unbound and the scsi_disk released, there's an implied unbreakable
reference chain.
at least, I think that's what the problem is.
James
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-31 15:15 ` James Bottomley
@ 2007-10-31 15:40 ` Kay Sievers
2007-10-31 15:47 ` James Bottomley
2007-10-31 15:58 ` Alan Stern
1 sibling, 1 reply; 28+ messages in thread
From: Kay Sievers @ 2007-10-31 15:40 UTC (permalink / raw)
To: James Bottomley
Cc: Greg KH, Alan Stern, Hannes Reinecke, SCSI development list
On Wed, 2007-10-31 at 10:15 -0500, James Bottomley wrote:
> On Wed, 2007-10-31 at 07:32 -0700, Greg KH wrote:
> > Hm, I seem to have missed the part in this thread where someone said
> > that it was valid to have a parent reference a child device. That's
> > just wrong and needs to be fixed. Is that in the scsi layer somewhere?
> > The block layer? It sure isn't in the driver core...
>
> This is the piece I'm still not clear on. It's something to do with the
> gendisk. I'd have to look in block, but I believe the queue takes a ref
> to the gendisk.
Yes, the queue is a child of the disk.
> The scsi_device has a ref to the queue
Yeah, while the queue is a grandchild of the scsi_device with the
unified sysfs layout.
> and the scsi_disk (in sd) has a
> ref to both the scsi_device and the gendisk. That means, until sd is
> unbound and the scsi_disk released, there's an implied unbreakable
> reference chain.
>
> at least, I think that's what the problem is.
Yes, sounds right. We need to break that deleted-but-wait-for-cleanup at
least at one of the devices involved.
Thanks,
Kay
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-31 15:40 ` Kay Sievers
@ 2007-10-31 15:47 ` James Bottomley
2007-10-31 16:04 ` Alan Stern
0 siblings, 1 reply; 28+ messages in thread
From: James Bottomley @ 2007-10-31 15:47 UTC (permalink / raw)
To: Kay Sievers; +Cc: Greg KH, Alan Stern, Hannes Reinecke, SCSI development list
On Wed, 2007-10-31 at 16:40 +0100, Kay Sievers wrote:
> On Wed, 2007-10-31 at 10:15 -0500, James Bottomley wrote:
> > On Wed, 2007-10-31 at 07:32 -0700, Greg KH wrote:
> > > Hm, I seem to have missed the part in this thread where someone said
> > > that it was valid to have a parent reference a child device. That's
> > > just wrong and needs to be fixed. Is that in the scsi layer somewhere?
> > > The block layer? It sure isn't in the driver core...
> >
> > This is the piece I'm still not clear on. It's something to do with the
> > gendisk. I'd have to look in block, but I believe the queue takes a ref
> > to the gendisk.
>
> Yes, the queue is a child of the disk.
Right, so this goes gendisk->queue (-> meaning parent of, or takes
reference to)
> > The scsi_device has a ref to the queue
>
> Yeah, while the queue is a grandchild of the scsi_device with the
> unified sysfs layout.
No, the scsi_device is a direct parent of the queue, so we have
scsi_device->queue
> > and the scsi_disk (in sd) has a
> > ref to both the scsi_device and the gendisk. That means, until sd is
> > unbound and the scsi_disk released, there's an implied unbreakable
> > reference chain.
> >
> > at least, I think that's what the problem is.
>
> Yes, sounds right. We need to break that deleted-but-wait-for-cleanup at
> least at one of the devices involved.
But it's broken when the driver is unbound. Diagrammatically it's:
scsi_disk -> scsi_device -> queue
-> gendisk ->
It's not circular, it's released when scsi_disk is released. It can
become circular if there's some hidden dependency between any of the
components ... but I don't think there is.
James
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-31 4:24 ` Greg KH
@ 2007-10-31 15:51 ` Alan Stern
0 siblings, 0 replies; 28+ messages in thread
From: Alan Stern @ 2007-10-31 15:51 UTC (permalink / raw)
To: Greg KH
Cc: James Bottomley, Kay Sievers, Hannes Reinecke,
SCSI development list
On Tue, 30 Oct 2007, Greg KH wrote:
> > Since the SCSI stack is now in much better shape, there doesn't seem to
> > be any reason to keep the old code. Do you agree that the patch below
> > is worth merging? I submitted it to Greg some time ago, but he didn't
> > want to accept it without some assurance that it is now safe.
> >
> > It would fix the problem Kay and saw with the circular references,
> > because the references would all get dropped when the scsi_device is
> > removed instead of waiting for a release that will never come.
>
> Hm, will this really solve the problem properly? I kept that above
> patch because of the very real problem where a device in the middle of
> the device tree can go away without the lower devices being gone.
When you say "go away", do you mean "be deleted (deallocated)"? In
general that should not pose a problem. Once a child has been
unregistered it should not try to access its parent's structure any
more. Hence, if the parent structure gets deallocated before the child
structure, nothing bad will happen.
Or when you say "go away", do you mean "be removed (unregistered)"? If
a parent is removed while it still has children registered, that's a
genuine bug. (In fact, driver_del() should check for this and issue a
WARN_ON if it ever happens.) The driver doing the remove needs to be
fixed.
> The example at the time was usb-serial devices. The physical USB device
> goes away, but the tty device that userspace has open still sticks
> around until userspace closes the char device.
Now you are using "goes away" to mean "is unplugged from the computer"! :-)
There's nothing wrong with the physical device being unplugged, so long
as the logical usb_device structure remains allocated. And since the
usb_device is the parent of the tty device (it is the parent, isn't
it?), it _will_ remain pinned while the tty device is registered.
So the only problem that could arise would be if the tty device and the
usb_device are both unregistered while userspace continues to hold the
char device file open. There are two possible ways to handle this:
1. Ideally, the usb-serial driver would realize that the USB
device was disconnected (by checking a private flag) and would
therefore avoid trying to access the usb_device structure.
2. If that's impractical, the usb-serial driver could take an
explicit reference to the usb_device structure each time
the char device file is opened, and drop the reference when
the file is closed.
Either approach would work.
> With your patch, will things break for this usage model?
Not if the usb-serial drivers are written properly.
> This kobject cleanup path is so fickle, I hate having to touch it :(
Is usb-serial the worst case? Then if it turns out to be compatible
with this change, all other problems will be equally fixable.
Alan Stern
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-31 15:15 ` James Bottomley
2007-10-31 15:40 ` Kay Sievers
@ 2007-10-31 15:58 ` Alan Stern
2007-10-31 16:11 ` James Bottomley
1 sibling, 1 reply; 28+ messages in thread
From: Alan Stern @ 2007-10-31 15:58 UTC (permalink / raw)
To: James Bottomley
Cc: Greg KH, Kay Sievers, Hannes Reinecke, SCSI development list
On Wed, 31 Oct 2007, James Bottomley wrote:
> On Wed, 2007-10-31 at 07:32 -0700, Greg KH wrote:
> > Hm, I seem to have missed the part in this thread where someone said
> > that it was valid to have a parent reference a child device. That's
> > just wrong and needs to be fixed. Is that in the scsi layer somewhere?
> > The block layer? It sure isn't in the driver core...
>
> This is the piece I'm still not clear on. It's something to do with the
> gendisk. I'd have to look in block, but I believe the queue takes a ref
> to the gendisk.
>
> The scsi_device has a ref to the queue and the scsi_disk (in sd) has a
> ref to both the scsi_device and the gendisk. That means, until sd is
> unbound and the scsi_disk released, there's an implied unbreakable
> reference chain.
>
> at least, I think that's what the problem is.
No, you haven't got it right.
Parent Child Grandchild
------ ----- ----------
scsi_device gendisk request_queue
The odd part is that the scsi_device holds a reference to the queue.
That creates a reference loop:
scsi_device holds ref to request_queue (done explicitly)
request_queue holds ref to gendisk (implicit, parent-child)
gendisk holds ref to scsi_device (implicit, parent-child)
The scsi_disk adds confusion to the picture but it doesn't make things
any worse than they already are.
Alan Stern
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-31 15:47 ` James Bottomley
@ 2007-10-31 16:04 ` Alan Stern
2007-10-31 16:13 ` James Bottomley
0 siblings, 1 reply; 28+ messages in thread
From: Alan Stern @ 2007-10-31 16:04 UTC (permalink / raw)
To: James Bottomley
Cc: Kay Sievers, Greg KH, Hannes Reinecke, SCSI development list
On Wed, 31 Oct 2007, James Bottomley wrote:
> > Yes, the queue is a child of the disk.
>
> Right, so this goes gendisk->queue (-> meaning parent of, or takes
> reference to)
No, no! The _child_ takes an implicit reference to the _parent_, not
the other way around.
> > > The scsi_device has a ref to the queue
> >
> > Yeah, while the queue is a grandchild of the scsi_device with the
> > unified sysfs layout.
>
> No, the scsi_device is a direct parent of the queue, so we have
>
> scsi_device->queue
Wrong -- the gendisk is the direct parent of the queue. The relevant
line is in ll_rw_blk.c:blk_register_queue():
q->kobj.parent = kobject_get(&disk->dev.kobj);
> > Yes, sounds right. We need to break that deleted-but-wait-for-cleanup at
> > least at one of the devices involved.
>
> But it's broken when the driver is unbound. Diagrammatically it's:
>
> scsi_disk -> scsi_device -> queue
> -> gendisk ->
>
> It's not circular, it's released when scsi_disk is released. It can
> become circular if there's some hidden dependency between any of the
> components ... but I don't think there is.
Forget about the scsi_disk. It isn't part of the problem. Just
concentrate on the scsi_device, the gendisk, and the queue. We have:
scsi_device <- gendisk <- queue <- scsi_device,
where "A <- B" means that B holds a reference to A.
Alan Stern
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-31 15:58 ` Alan Stern
@ 2007-10-31 16:11 ` James Bottomley
0 siblings, 0 replies; 28+ messages in thread
From: James Bottomley @ 2007-10-31 16:11 UTC (permalink / raw)
To: Alan Stern; +Cc: Greg KH, Kay Sievers, Hannes Reinecke, SCSI development list
On Wed, 2007-10-31 at 11:58 -0400, Alan Stern wrote:
> On Wed, 31 Oct 2007, James Bottomley wrote:
>
> > On Wed, 2007-10-31 at 07:32 -0700, Greg KH wrote:
> > > Hm, I seem to have missed the part in this thread where someone said
> > > that it was valid to have a parent reference a child device. That's
> > > just wrong and needs to be fixed. Is that in the scsi layer somewhere?
> > > The block layer? It sure isn't in the driver core...
> >
> > This is the piece I'm still not clear on. It's something to do with the
> > gendisk. I'd have to look in block, but I believe the queue takes a ref
> > to the gendisk.
> >
> > The scsi_device has a ref to the queue and the scsi_disk (in sd) has a
> > ref to both the scsi_device and the gendisk. That means, until sd is
> > unbound and the scsi_disk released, there's an implied unbreakable
> > reference chain.
> >
> > at least, I think that's what the problem is.
>
> No, you haven't got it right.
>
> Parent Child Grandchild
> ------ ----- ----------
> scsi_device gendisk request_queue
This is what I think doesn't happen. The scsi_device should never be
parented to the gendisk.
>
> The odd part is that the scsi_device holds a reference to the queue.
> That creates a reference loop:
>
> scsi_device holds ref to request_queue (done explicitly)
> request_queue holds ref to gendisk (implicit, parent-child)
agreed.
> gendisk holds ref to scsi_device (implicit, parent-child)
Where exactly does this last part happen? the scsi_device is a mid layer
object; the mid-layer doesn't know about gendisks.
James
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-31 16:04 ` Alan Stern
@ 2007-10-31 16:13 ` James Bottomley
2007-10-31 16:24 ` Kay Sievers
0 siblings, 1 reply; 28+ messages in thread
From: James Bottomley @ 2007-10-31 16:13 UTC (permalink / raw)
To: Alan Stern; +Cc: Kay Sievers, Greg KH, Hannes Reinecke, SCSI development list
On Wed, 2007-10-31 at 12:04 -0400, Alan Stern wrote:
> On Wed, 31 Oct 2007, James Bottomley wrote:
>
> > > Yes, the queue is a child of the disk.
> >
> > Right, so this goes gendisk->queue (-> meaning parent of, or takes
> > reference to)
>
> No, no! The _child_ takes an implicit reference to the _parent_, not
> the other way around.
>
> > > > The scsi_device has a ref to the queue
> > >
> > > Yeah, while the queue is a grandchild of the scsi_device with the
> > > unified sysfs layout.
> >
> > No, the scsi_device is a direct parent of the queue, so we have
> >
> > scsi_device->queue
>
> Wrong -- the gendisk is the direct parent of the queue. The relevant
> line is in ll_rw_blk.c:blk_register_queue():
>
> q->kobj.parent = kobject_get(&disk->dev.kobj);
>
> > > Yes, sounds right. We need to break that deleted-but-wait-for-cleanup at
> > > least at one of the devices involved.
> >
> > But it's broken when the driver is unbound. Diagrammatically it's:
> >
> > scsi_disk -> scsi_device -> queue
> > -> gendisk ->
> >
> > It's not circular, it's released when scsi_disk is released. It can
> > become circular if there's some hidden dependency between any of the
> > components ... but I don't think there is.
>
> Forget about the scsi_disk. It isn't part of the problem. Just
> concentrate on the scsi_device, the gendisk, and the queue. We have:
>
> scsi_device <- gendisk <- queue <- scsi_device,
OK, so where does the gendisk get a reference to the scsi device?
James
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-31 16:13 ` James Bottomley
@ 2007-10-31 16:24 ` Kay Sievers
2007-10-31 16:31 ` James Bottomley
0 siblings, 1 reply; 28+ messages in thread
From: Kay Sievers @ 2007-10-31 16:24 UTC (permalink / raw)
To: James Bottomley
Cc: Alan Stern, Greg KH, Hannes Reinecke, SCSI development list
On Wed, 2007-10-31 at 11:13 -0500, James Bottomley wrote:
> On Wed, 2007-10-31 at 12:04 -0400, Alan Stern wrote:
> > On Wed, 31 Oct 2007, James Bottomley wrote:
> >
> > > > Yes, the queue is a child of the disk.
> > >
> > > Right, so this goes gendisk->queue (-> meaning parent of, or takes
> > > reference to)
> >
> > No, no! The _child_ takes an implicit reference to the _parent_, not
> > the other way around.
> >
> > > > > The scsi_device has a ref to the queue
> > > >
> > > > Yeah, while the queue is a grandchild of the scsi_device with the
> > > > unified sysfs layout.
> > >
> > > No, the scsi_device is a direct parent of the queue, so we have
> > >
> > > scsi_device->queue
> >
> > Wrong -- the gendisk is the direct parent of the queue. The relevant
> > line is in ll_rw_blk.c:blk_register_queue():
> >
> > q->kobj.parent = kobject_get(&disk->dev.kobj);
> >
> > > > Yes, sounds right. We need to break that deleted-but-wait-for-cleanup at
> > > > least at one of the devices involved.
> > >
> > > But it's broken when the driver is unbound. Diagrammatically it's:
> > >
> > > scsi_disk -> scsi_device -> queue
> > > -> gendisk ->
> > >
> > > It's not circular, it's released when scsi_disk is released. It can
> > > become circular if there's some hidden dependency between any of the
> > > components ... but I don't think there is.
> >
> > Forget about the scsi_disk. It isn't part of the problem. Just
> > concentrate on the scsi_device, the gendisk, and the queue. We have:
> >
> > scsi_device <- gendisk <- queue <- scsi_device,
>
> OK, so where does the gendisk get a reference to the scsi device?
In the unified sysfs layout where the silly and conceptual broken idea
of "class devices" gets removed.
Everything that has a "device" link today will just live below the
device the "device" link points to. The whole current kernel is already
converted to do this, besides the "raw kobject" gendisk's, and the SCSI
subsystem. The gendisk patch is queued in Greg's tree (see subject of
this mail), and the conversion from "struct class_device" to "struct
device" for the whole SCSI directory is coming soon.
With the gendisk pointing to "driverfs_dev" ("device" link) it will
become a child of the scsi_device.
Kay
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-31 16:24 ` Kay Sievers
@ 2007-10-31 16:31 ` James Bottomley
2007-10-31 16:42 ` Kay Sievers
2007-10-31 16:44 ` Alan Stern
0 siblings, 2 replies; 28+ messages in thread
From: James Bottomley @ 2007-10-31 16:31 UTC (permalink / raw)
To: Kay Sievers; +Cc: Alan Stern, Greg KH, Hannes Reinecke, SCSI development list
On Wed, 2007-10-31 at 17:24 +0100, Kay Sievers wrote:
> On Wed, 2007-10-31 at 11:13 -0500, James Bottomley wrote:
> > On Wed, 2007-10-31 at 12:04 -0400, Alan Stern wrote:
> > > On Wed, 31 Oct 2007, James Bottomley wrote:
> > >
> > > > > Yes, the queue is a child of the disk.
> > > >
> > > > Right, so this goes gendisk->queue (-> meaning parent of, or takes
> > > > reference to)
> > >
> > > No, no! The _child_ takes an implicit reference to the _parent_, not
> > > the other way around.
> > >
> > > > > > The scsi_device has a ref to the queue
> > > > >
> > > > > Yeah, while the queue is a grandchild of the scsi_device with the
> > > > > unified sysfs layout.
> > > >
> > > > No, the scsi_device is a direct parent of the queue, so we have
> > > >
> > > > scsi_device->queue
> > >
> > > Wrong -- the gendisk is the direct parent of the queue. The relevant
> > > line is in ll_rw_blk.c:blk_register_queue():
> > >
> > > q->kobj.parent = kobject_get(&disk->dev.kobj);
> > >
> > > > > Yes, sounds right. We need to break that deleted-but-wait-for-cleanup at
> > > > > least at one of the devices involved.
> > > >
> > > > But it's broken when the driver is unbound. Diagrammatically it's:
> > > >
> > > > scsi_disk -> scsi_device -> queue
> > > > -> gendisk ->
> > > >
> > > > It's not circular, it's released when scsi_disk is released. It can
> > > > become circular if there's some hidden dependency between any of the
> > > > components ... but I don't think there is.
> > >
> > > Forget about the scsi_disk. It isn't part of the problem. Just
> > > concentrate on the scsi_device, the gendisk, and the queue. We have:
> > >
> > > scsi_device <- gendisk <- queue <- scsi_device,
> >
> > OK, so where does the gendisk get a reference to the scsi device?
>
> In the unified sysfs layout where the silly and conceptual broken idea
> of "class devices" gets removed.
> Everything that has a "device" link today will just live below the
> device the "device" link points to. The whole current kernel is already
> converted to do this, besides the "raw kobject" gendisk's, and the SCSI
> subsystem. The gendisk patch is queued in Greg's tree (see subject of
> this mail), and the conversion from "struct class_device" to "struct
> device" for the whole SCSI directory is coming soon.
>
> With the gendisk pointing to "driverfs_dev" ("device" link) it will
> become a child of the scsi_device.
OK, light beginning to go on now.
The problem is that you've fallen into the conceptual trap we tried very
hard to avoid in the initial go around of joining SCSI upper layer
drivers to gendisks. That's why no gendisk references are held by the
mid-layer, only by the entities that represent the objects created by
upper layer drivers.
Doesn't this circularity now exist for everything? Every device that
creates a queue has a reference to the queue, every queue has a
reference to its attached gendisk and now every gendisk has a reference
to the device creating the queue? This doesn't look to be a SCSI
specific problem.
James
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-31 16:31 ` James Bottomley
@ 2007-10-31 16:42 ` Kay Sievers
2007-10-31 16:46 ` James Bottomley
2007-10-31 16:44 ` Alan Stern
1 sibling, 1 reply; 28+ messages in thread
From: Kay Sievers @ 2007-10-31 16:42 UTC (permalink / raw)
To: James Bottomley
Cc: Alan Stern, Greg KH, Hannes Reinecke, SCSI development list
On Wed, 2007-10-31 at 11:31 -0500, James Bottomley wrote:
> On Wed, 2007-10-31 at 17:24 +0100, Kay Sievers wrote:
> > On Wed, 2007-10-31 at 11:13 -0500, James Bottomley wrote:
> > > On Wed, 2007-10-31 at 12:04 -0400, Alan Stern wrote:
> > > > On Wed, 31 Oct 2007, James Bottomley wrote:
> > > >
> > > > > > Yes, the queue is a child of the disk.
> > > > >
> > > > > Right, so this goes gendisk->queue (-> meaning parent of, or takes
> > > > > reference to)
> > > >
> > > > No, no! The _child_ takes an implicit reference to the _parent_, not
> > > > the other way around.
> > > >
> > > > > > > The scsi_device has a ref to the queue
> > > > > >
> > > > > > Yeah, while the queue is a grandchild of the scsi_device with the
> > > > > > unified sysfs layout.
> > > > >
> > > > > No, the scsi_device is a direct parent of the queue, so we have
> > > > >
> > > > > scsi_device->queue
> > > >
> > > > Wrong -- the gendisk is the direct parent of the queue. The relevant
> > > > line is in ll_rw_blk.c:blk_register_queue():
> > > >
> > > > q->kobj.parent = kobject_get(&disk->dev.kobj);
> > > >
> > > > > > Yes, sounds right. We need to break that deleted-but-wait-for-cleanup at
> > > > > > least at one of the devices involved.
> > > > >
> > > > > But it's broken when the driver is unbound. Diagrammatically it's:
> > > > >
> > > > > scsi_disk -> scsi_device -> queue
> > > > > -> gendisk ->
> > > > >
> > > > > It's not circular, it's released when scsi_disk is released. It can
> > > > > become circular if there's some hidden dependency between any of the
> > > > > components ... but I don't think there is.
> > > >
> > > > Forget about the scsi_disk. It isn't part of the problem. Just
> > > > concentrate on the scsi_device, the gendisk, and the queue. We have:
> > > >
> > > > scsi_device <- gendisk <- queue <- scsi_device,
> > >
> > > OK, so where does the gendisk get a reference to the scsi device?
> >
> > In the unified sysfs layout where the silly and conceptual broken idea
> > of "class devices" gets removed.
> > Everything that has a "device" link today will just live below the
> > device the "device" link points to. The whole current kernel is already
> > converted to do this, besides the "raw kobject" gendisk's, and the SCSI
> > subsystem. The gendisk patch is queued in Greg's tree (see subject of
> > this mail), and the conversion from "struct class_device" to "struct
> > device" for the whole SCSI directory is coming soon.
> >
> > With the gendisk pointing to "driverfs_dev" ("device" link) it will
> > become a child of the scsi_device.
>
> OK, light beginning to go on now.
>
> The problem is that you've fallen into the conceptual trap we tried very
> hard to avoid in the initial go around of joining SCSI upper layer
> drivers to gendisks. That's why no gendisk references are held by the
> mid-layer, only by the entities that represent the objects created by
> upper layer drivers.
That will not change, only the disk will reference the device which it
points to. It's not a problem, we can "orphan" the disk on delete, or we
do the "orphaning" for all devices in the core, which is probably the
right fix anyway.
> Doesn't this circularity now exist for everything? Every device that
> creates a queue has a reference to the queue, every queue has a
> reference to its attached gendisk and now every gendisk has a reference
> to the device creating the queue? This doesn't look to be a SCSI
> specific problem.
It's only SCSI so far, everything else seems fine.
But, the real problem is that the core seems to deadlock if two devices
reference each other (or build a larger circle), even when they are
deleted, that's the problem we are running in.
Thanks,
Kay
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-31 16:31 ` James Bottomley
2007-10-31 16:42 ` Kay Sievers
@ 2007-10-31 16:44 ` Alan Stern
2007-10-31 17:07 ` James Bottomley
1 sibling, 1 reply; 28+ messages in thread
From: Alan Stern @ 2007-10-31 16:44 UTC (permalink / raw)
To: James Bottomley
Cc: Kay Sievers, Greg KH, Hannes Reinecke, SCSI development list
On Wed, 31 Oct 2007, James Bottomley wrote:
> OK, light beginning to go on now.
>
> The problem is that you've fallen into the conceptual trap we tried very
> hard to avoid in the initial go around of joining SCSI upper layer
> drivers to gendisks. That's why no gendisk references are held by the
> mid-layer, only by the entities that represent the objects created by
> upper layer drivers.
>
> Doesn't this circularity now exist for everything? Every device that
> creates a queue has a reference to the queue, every queue has a
> reference to its attached gendisk and now every gendisk has a reference
> to the device creating the queue? This doesn't look to be a SCSI
> specific problem.
It probably isn't. I haven't looked at other subsystems but the
scenario you described could well be the common case.
Dropping a kobject's reference to its parent when the kobject is
removed rather than when it is deleted will solve all these problems.
The queue's reference to the gendisk and the gendisk's reference to the
device will both be dropped when the device is unregistered, allowing
the device and gendisk to be released. When the device is released it
can then drop its reference to the queue, allowing the queue to be
released.
In other words, reverting that 4-year-old patch would fix everything.
I'd still like to get your acknowledgement for my patch, posted here:
http://marc.info/?l=linux-scsi&m=119368368904151&w=2
Alan Stern
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-31 16:42 ` Kay Sievers
@ 2007-10-31 16:46 ` James Bottomley
2007-10-31 17:32 ` Kay Sievers
2007-10-31 18:36 ` Alan Stern
0 siblings, 2 replies; 28+ messages in thread
From: James Bottomley @ 2007-10-31 16:46 UTC (permalink / raw)
To: Kay Sievers; +Cc: Alan Stern, Greg KH, Hannes Reinecke, SCSI development list
On Wed, 2007-10-31 at 17:42 +0100, Kay Sievers wrote:
> On Wed, 2007-10-31 at 11:31 -0500, James Bottomley wrote:
> > On Wed, 2007-10-31 at 17:24 +0100, Kay Sievers wrote:
> > > On Wed, 2007-10-31 at 11:13 -0500, James Bottomley wrote:
> > > > On Wed, 2007-10-31 at 12:04 -0400, Alan Stern wrote:
> > > > > On Wed, 31 Oct 2007, James Bottomley wrote:
> > > > >
> > > > > > > Yes, the queue is a child of the disk.
> > > > > >
> > > > > > Right, so this goes gendisk->queue (-> meaning parent of, or takes
> > > > > > reference to)
> > > > >
> > > > > No, no! The _child_ takes an implicit reference to the _parent_, not
> > > > > the other way around.
> > > > >
> > > > > > > > The scsi_device has a ref to the queue
> > > > > > >
> > > > > > > Yeah, while the queue is a grandchild of the scsi_device with the
> > > > > > > unified sysfs layout.
> > > > > >
> > > > > > No, the scsi_device is a direct parent of the queue, so we have
> > > > > >
> > > > > > scsi_device->queue
> > > > >
> > > > > Wrong -- the gendisk is the direct parent of the queue. The relevant
> > > > > line is in ll_rw_blk.c:blk_register_queue():
> > > > >
> > > > > q->kobj.parent = kobject_get(&disk->dev.kobj);
> > > > >
> > > > > > > Yes, sounds right. We need to break that deleted-but-wait-for-cleanup at
> > > > > > > least at one of the devices involved.
> > > > > >
> > > > > > But it's broken when the driver is unbound. Diagrammatically it's:
> > > > > >
> > > > > > scsi_disk -> scsi_device -> queue
> > > > > > -> gendisk ->
> > > > > >
> > > > > > It's not circular, it's released when scsi_disk is released. It can
> > > > > > become circular if there's some hidden dependency between any of the
> > > > > > components ... but I don't think there is.
> > > > >
> > > > > Forget about the scsi_disk. It isn't part of the problem. Just
> > > > > concentrate on the scsi_device, the gendisk, and the queue. We have:
> > > > >
> > > > > scsi_device <- gendisk <- queue <- scsi_device,
> > > >
> > > > OK, so where does the gendisk get a reference to the scsi device?
> > >
> > > In the unified sysfs layout where the silly and conceptual broken idea
> > > of "class devices" gets removed.
> > > Everything that has a "device" link today will just live below the
> > > device the "device" link points to. The whole current kernel is already
> > > converted to do this, besides the "raw kobject" gendisk's, and the SCSI
> > > subsystem. The gendisk patch is queued in Greg's tree (see subject of
> > > this mail), and the conversion from "struct class_device" to "struct
> > > device" for the whole SCSI directory is coming soon.
> > >
> > > With the gendisk pointing to "driverfs_dev" ("device" link) it will
> > > become a child of the scsi_device.
> >
> > OK, light beginning to go on now.
> >
> > The problem is that you've fallen into the conceptual trap we tried very
> > hard to avoid in the initial go around of joining SCSI upper layer
> > drivers to gendisks. That's why no gendisk references are held by the
> > mid-layer, only by the entities that represent the objects created by
> > upper layer drivers.
>
> That will not change, only the disk will reference the device which it
> points to. It's not a problem, we can "orphan" the disk on delete, or we
> do the "orphaning" for all devices in the core, which is probably the
> right fix anyway.
But you're assuming the relationship gendisk->queue exists when we
delete the scsi device. What should happen is that the gendisk should
be released first in the ULD detach. Queues are allowed to exist
without gendisks (when a SCSI queue is first created, that's what it
should look like), so I think we're perhaps simply missing an API that
would detach the queue and the gendisk (and get the gendisk to delete
itself and give up its device link) ... doesn't del_gendisk() do this?
> > Doesn't this circularity now exist for everything? Every device that
> > creates a queue has a reference to the queue, every queue has a
> > reference to its attached gendisk and now every gendisk has a reference
> > to the device creating the queue? This doesn't look to be a SCSI
> > specific problem.
>
> It's only SCSI so far, everything else seems fine.
But you didn't respond to the stated problem: there are very few other
non-scsi block devices ... have you tried looking at cciss?
> But, the real problem is that the core seems to deadlock if two devices
> reference each other (or build a larger circle), even when they are
> deleted, that's the problem we are running in.
Yes ... we sort of evaded that one by having class devices hang off the
sides of real devices ... making everything a peer with crosslinks
brings it back with a vengeance.
James
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-31 16:44 ` Alan Stern
@ 2007-10-31 17:07 ` James Bottomley
2007-10-31 18:38 ` Alan Stern
0 siblings, 1 reply; 28+ messages in thread
From: James Bottomley @ 2007-10-31 17:07 UTC (permalink / raw)
To: Alan Stern; +Cc: Kay Sievers, Greg KH, Hannes Reinecke, SCSI development list
On Wed, 2007-10-31 at 12:44 -0400, Alan Stern wrote:
> On Wed, 31 Oct 2007, James Bottomley wrote:
>
> > OK, light beginning to go on now.
> >
> > The problem is that you've fallen into the conceptual trap we tried very
> > hard to avoid in the initial go around of joining SCSI upper layer
> > drivers to gendisks. That's why no gendisk references are held by the
> > mid-layer, only by the entities that represent the objects created by
> > upper layer drivers.
> >
> > Doesn't this circularity now exist for everything? Every device that
> > creates a queue has a reference to the queue, every queue has a
> > reference to its attached gendisk and now every gendisk has a reference
> > to the device creating the queue? This doesn't look to be a SCSI
> > specific problem.
>
> It probably isn't. I haven't looked at other subsystems but the
> scenario you described could well be the common case.
>
> Dropping a kobject's reference to its parent when the kobject is
> removed rather than when it is deleted will solve all these problems.
> The queue's reference to the gendisk and the gendisk's reference to the
> device will both be dropped when the device is unregistered, allowing
> the device and gendisk to be released. When the device is released it
> can then drop its reference to the queue, allowing the queue to be
> released.
>
> In other words, reverting that 4-year-old patch would fix everything.
> I'd still like to get your acknowledgement for my patch, posted here:
>
> http://marc.info/?l=linux-scsi&m=119368368904151&w=2
It sounds plausible, but I'm not really up to speed on all the changes
going on in sysfs at the moment, so it would be dangerous to rely on
what I think sounds reasonable.
James
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-31 16:46 ` James Bottomley
@ 2007-10-31 17:32 ` Kay Sievers
2007-10-31 18:36 ` Alan Stern
1 sibling, 0 replies; 28+ messages in thread
From: Kay Sievers @ 2007-10-31 17:32 UTC (permalink / raw)
To: James Bottomley
Cc: Alan Stern, Greg KH, Hannes Reinecke, SCSI development list
On Wed, 2007-10-31 at 11:46 -0500, James Bottomley wrote:
> On Wed, 2007-10-31 at 17:42 +0100, Kay Sievers wrote:
> > On Wed, 2007-10-31 at 11:31 -0500, James Bottomley wrote:
> > > Doesn't this circularity now exist for everything? Every device that
> > > creates a queue has a reference to the queue, every queue has a
> > > reference to its attached gendisk and now every gendisk has a reference
> > > to the device creating the queue? This doesn't look to be a SCSI
> > > specific problem.
> >
> > It's only SCSI so far, everything else seems fine.
>
> But you didn't respond to the stated problem: there are very few other
> non-scsi block devices ... have you tried looking at cciss?
I tried ub instead of usb-storage and sd card driver which is a raw
block driver, i didn't try cciss, or other more advanced blockdev
drivers.
> > But, the real problem is that the core seems to deadlock if two devices
> > reference each other (or build a larger circle), even when they are
> > deleted, that's the problem we are running in.
>
> Yes ... we sort of evaded that one by having class devices hang off the
> sides of real devices ... making everything a peer with crosslinks
> brings it back with a vengeance.
Yeah, which made sysfs a horrible mess as soon as people started putting
hierarchy in class devices. The whole idea of "physical", "virtual",
"class", "bus" causes nothing but problems in userspace, and exposes
kernel implementation details, which change all the time. There are even
subsystem which moved from class to bus because they need driver
binding. And there is really no point in exposing that to userspace.
It's a design mistake, we try to fix now. We really need a single
classification and a single hierarchy and not the 3 completely different
access rules for 3 types of devices, which don't have any interesting
difference, besides the way the kernel handles them.
Kay
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-31 16:46 ` James Bottomley
2007-10-31 17:32 ` Kay Sievers
@ 2007-10-31 18:36 ` Alan Stern
1 sibling, 0 replies; 28+ messages in thread
From: Alan Stern @ 2007-10-31 18:36 UTC (permalink / raw)
To: James Bottomley
Cc: Kay Sievers, Greg KH, Hannes Reinecke, SCSI development list
On Wed, 31 Oct 2007, James Bottomley wrote:
> But you're assuming the relationship gendisk->queue exists when we
> delete the scsi device. What should happen is that the gendisk should
> be released first in the ULD detach. Queues are allowed to exist
> without gendisks (when a SCSI queue is first created, that's what it
> should look like), so I think we're perhaps simply missing an API that
> would detach the queue and the gendisk (and get the gendisk to delete
> itself and give up its device link) ... doesn't del_gendisk() do this?
It tries to, but it fails. What happens is del_gendisk() calls
unlink_gendisk(), which calls blk_unregister_queue(). If the kobject
core worked rationally, this would cause the queue to drop its
reference to the gendisk and allow the gendisk to be released.
But it doesn't. The kobject core doesn't drop the reference from a
child to a parent when the child is unregistered; it waits until the
child is released. This means that the gendisk can't be released until
the queue is released. Similarly, the scsi_device can't be released
until the gendisk is released. And obviously this constitutes a deadly
cycle.
The proposed patch will make the kobject core behave as it should,
dropping references during unregister instead of waiting until release.
Alan Stern
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
2007-10-31 17:07 ` James Bottomley
@ 2007-10-31 18:38 ` Alan Stern
0 siblings, 0 replies; 28+ messages in thread
From: Alan Stern @ 2007-10-31 18:38 UTC (permalink / raw)
To: James Bottomley
Cc: Kay Sievers, Greg KH, Hannes Reinecke, SCSI development list
On Wed, 31 Oct 2007, James Bottomley wrote:
> > Dropping a kobject's reference to its parent when the kobject is
> > removed rather than when it is deleted will solve all these problems.
> > The queue's reference to the gendisk and the gendisk's reference to the
> > device will both be dropped when the device is unregistered, allowing
> > the device and gendisk to be released. When the device is released it
> > can then drop its reference to the queue, allowing the queue to be
> > released.
> >
> > In other words, reverting that 4-year-old patch would fix everything.
> > I'd still like to get your acknowledgement for my patch, posted here:
> >
> > http://marc.info/?l=linux-scsi&m=119368368904151&w=2
>
> It sounds plausible, but I'm not really up to speed on all the changes
> going on in sysfs at the moment, so it would be dangerous to rely on
> what I think sounds reasonable.
Okay. It sounds like there's not much SCSI-specific remaining in this
thread, so anything further should move back to LKML.
Alan Stern
^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads:[~2007-10-31 18:38 UTC | newest]
Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-10-29 15:18 BUG in: Driver core: convert block from raw kobjects to core devices (fwd) Alan Stern
2007-10-29 15:35 ` James Bottomley
2007-10-29 16:38 ` Alan Stern
2007-10-29 16:45 ` James Bottomley
2007-10-29 17:04 ` Alan Stern
2007-10-29 18:47 ` Alan Stern
2007-10-29 19:13 ` Kay Sievers
2007-10-31 4:25 ` Greg KH
2007-10-31 10:46 ` Kay Sievers
2007-10-31 14:32 ` Greg KH
2007-10-31 15:15 ` James Bottomley
2007-10-31 15:40 ` Kay Sievers
2007-10-31 15:47 ` James Bottomley
2007-10-31 16:04 ` Alan Stern
2007-10-31 16:13 ` James Bottomley
2007-10-31 16:24 ` Kay Sievers
2007-10-31 16:31 ` James Bottomley
2007-10-31 16:42 ` Kay Sievers
2007-10-31 16:46 ` James Bottomley
2007-10-31 17:32 ` Kay Sievers
2007-10-31 18:36 ` Alan Stern
2007-10-31 16:44 ` Alan Stern
2007-10-31 17:07 ` James Bottomley
2007-10-31 18:38 ` Alan Stern
2007-10-31 15:58 ` Alan Stern
2007-10-31 16:11 ` James Bottomley
2007-10-31 4:24 ` Greg KH
2007-10-31 15:51 ` Alan Stern
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox