* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
  [not found] <1193848942.6621.18.camel@lov.site>
From: Alan Stern @ 2007-11-05 21:49 UTC
To: Greg KH; +Cc: Kay Sievers, Kernel development list

Greg:

So what's our status? Do you think it's worthwhile adding the
"drop reference to parent kobject at remove time instead of release
time" patch?

Also, what's the story on the updates to the USB uevent routines? Do
you want separate patches from Kay and me or should we combine them
into a single patch?

Alan Stern

* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
From: Greg KH @ 2007-11-05 21:59 UTC
To: Alan Stern; +Cc: Kay Sievers, Kernel development list

On Mon, Nov 05, 2007 at 04:49:21PM -0500, Alan Stern wrote:
> Greg:
>
> So what's our status? Do you think it's worthwhile adding the
> "drop reference to parent kobject at remove time instead of release
> time" patch?

No.

I still need to take the time and read this thread and find the real
problem here. The fact that the issue does not show up for other,
non-scsi block devices, makes me feel this is a scsi-specific problem
with how it deals with the driver model, but I need to take the time to
sit down and figure it out for sure.

> Also, what's the story on the updates to the USB uevent routines? Do
> you want separate patches from Kay and me or should we combine them
> into a single patch?

I'll take a combined patch, as I don't think I got a signed-off for your
version, I was waiting for a "final" version.

Back to the kobject/kset cleanup/debug mess :)

thanks,

greg k-h

* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
From: Alan Stern @ 2007-11-06 19:49 UTC
To: Greg KH; +Cc: Kay Sievers, Kernel development list

On Mon, 5 Nov 2007, Greg KH wrote:

> On Mon, Nov 05, 2007 at 04:49:21PM -0500, Alan Stern wrote:
> > Greg:
> >
> > So what's our status? Do you think it's worthwhile adding the
> > "drop reference to parent kobject at remove time instead of release
> > time" patch?
>
> No.
>
> I still need to take the time and read this thread and find the real
> problem here. The fact that the issue does not show up for other,
> non-scsi block devices, makes me feel this is a scsi-specific problem
> with how it deals with the driver model, but I need to take the time to
> sit down and figure it out for sure.

Here's the story as far as the SCSI stack goes. To what extent other
subsystems have analogous problems, I don't know.

1. In drivers/scsi/scsi_scan.c, scsi_alloc_sdev() creates a
   scsi_device structure and calls scsi_alloc_queue(), which ends up
   calling blk_init_queue(). As the creator of the request_queue,
   the SCSI core owns the initial reference to q->kobj.

2. This reference is released as part of the scsi_device's release
   routine. In scsi_sysfs.c, scsi_device_dev_release_usercontext()
   calls scsi_free_queue(), which does nothing but call
   blk_cleanup_queue(), which calls blk_put_queue(), which does the
   final kobject_put() on q->kobj.

As a result of 1 and 2, the request_queue isn't released until the
scsi_device is released.

3. In sd.c, sd_probe() does "gd = alloc_disk()" and it sets
   gd->driverfs_dev to point to the scsi_device's embedded struct
   device (named sdev_gendev). It then calls add_disk() in
   block/genhd.c, which calls register_disk() in
   fs/partitions/check.c. register_disk() sets disk->dev.parent to
   disk->driverfs_dev and then calls device_add(&disk->dev).

   Setting disk->dev.parent and calling device_add() in this way is
   new to Kay's reworking of the driver core. Previously
   disk->dev.kobj had been registered directly, as the gendisk was
   some sort of class device rather than a regular device.

Anyway, the upshot of 3 is that sdev->sdev_gendev.kobj is the parent of
disk->dev.kobj, and consequently the scsi_device can't be released
until the gendisk is released.

4. add_disk() goes on to call blk_register_queue(disk), which sets
   q->kobj.parent to disk->dev.kobj and then calls
   kobject_add(&q->kobj).

As a result of 4, the gendisk can't be released until the request_queue
is released.

Thus we have a cycle:

    1&2: request_queue isn't released before scsi_device;

    3: scsi_device isn't released before gendisk;

    4: gendisk isn't released before request_queue.

The dependency in 1&2 is hard-coded into the SCSI core. If I
understand correctly, the core really does need the request_queue to
hang around as long as the scsi_device is still present. According to
James Bottomley, any block device driver should be expected to have a
similar requirement.

But the dependencies in 3 and 4 are unnecessary. They are artifacts,
caused by the fact that a kobject doesn't drop its reference to its
parent until it is released. If instead the reference to the parent
were dropped when the kobject was removed then 3 and 4 wouldn't apply.

Alan Stern

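The cycle in steps 1 to 4 can be reproduced in a small user-space model.
What follows is a minimal sketch written for illustration only; it is
not kernel code and is not taken from the thread. struct obj, obj_add(),
obj_del(), obj_put() and simulate() are hypothetical names that only
mimic the reference-counting behaviour described above, and the teardown
order in simulate() is an assumption. The "owned" pointer stands in for
the SCSI core dropping the queue's initial reference from the
scsi_device release routine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model of a refcounted object; NOT the kernel's kobject. */
struct obj {
    int refs;           /* reference count */
    struct obj *parent; /* parent we hold a reference on, or NULL */
    struct obj *owned;  /* object whose reference we drop in our release */
    bool released;      /* set once the last reference is gone */
};

static void obj_put(struct obj *o);

static void obj_init(struct obj *o)
{
    o->refs = 1;        /* the creator owns the initial reference */
    o->parent = NULL;
    o->owned = NULL;
    o->released = false;
}

/* Link o under parent, taking a reference on the parent. */
static void obj_add(struct obj *o, struct obj *parent)
{
    parent->refs++;
    o->parent = parent;
}

static void obj_drop_parent(struct obj *o)
{
    struct obj *p = o->parent;

    o->parent = NULL;
    if (p)
        obj_put(p);
}

/* Unlink o; the parent reference is dropped here only under the
 * "remove time" policy. */
static void obj_del(struct obj *o, bool drop_at_remove)
{
    if (drop_at_remove)
        obj_drop_parent(o);
}

static void obj_put(struct obj *o)
{
    assert(o->refs > 0);
    if (--o->refs > 0)
        return;
    o->released = true;     /* the "release" step runs here */
    obj_drop_parent(o);     /* no-op if already dropped at remove time */
    if (o->owned)
        obj_put(o->owned);
}

/* Build sdev <- disk <- queue, tear it down, count released objects. */
static int simulate(bool drop_at_remove)
{
    struct obj sdev, disk, queue;

    obj_init(&sdev);
    obj_init(&queue);
    sdev.owned = &queue;        /* steps 1&2: queue freed by sdev release */
    obj_init(&disk);
    obj_add(&disk, &sdev);      /* step 3: sdev is the disk's parent */
    obj_add(&queue, &disk);     /* step 4: disk is the queue's parent */

    obj_del(&queue, drop_at_remove);
    obj_del(&disk, drop_at_remove);
    obj_put(&disk);             /* creator's reference to the disk */
    obj_del(&sdev, drop_at_remove);
    obj_put(&sdev);             /* creator's reference to the sdev */

    return sdev.released + disk.released + queue.released;
}
```

In this model, simulate(false) leaves all three objects unreleased
because each one is still pinned by its child, while simulate(true)
releases all three. That matches the claim that only the dependency in
1&2 is essential and that 3 and 4 go away once the parent reference is
dropped at remove time.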
* Re: Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
From: Hannes Reinecke @ 2007-11-07 12:21 UTC
To: Alan Stern; +Cc: Greg KH, Kay Sievers, Kernel development list

Alan Stern wrote:
> On Mon, 5 Nov 2007, Greg KH wrote:
>
>> On Mon, Nov 05, 2007 at 04:49:21PM -0500, Alan Stern wrote:
>>> Greg:
>>>
>>> So what's our status? Do you think it's worthwhile adding the
>>> "drop reference to parent kobject at remove time instead of release
>>> time" patch?
>> No.
>>
>> I still need to take the time and read this thread and find the real
>> problem here. The fact that the issue does not show up for other,
>> non-scsi block devices, makes me feel this is a scsi-specific problem
>> with how it deals with the driver model, but I need to take the time to
>> sit down and figure it out for sure.
>
[ .. ]
>
> Thus we have a cycle:
>
> 1&2: request_queue isn't released before scsi_device;
>
> 3: scsi_device isn't released before gendisk;
>
> 4: gendisk isn't released before request_queue.
>
> The dependency in 1&2 is hard-coded into the SCSI core. If I
> understand correctly, the core really does need the request_queue to
> hang around as long as the scsi_device is still present. According to
> James Bottomley, any block device driver should be expected to have a
> similar requirement.
>
This is actually true, but as other block device drivers create the
LUN (or the equivalent thereof), the request queue, and the block device
at the same time or under control of the driver itself they don't have
this problem.
It's only due to the decoupling of the block driver from the underlying
device (ie sd driver and scsi_device) that this problem arises.

> But the dependencies in 3 and 4 are unnecessary. They are artifacts,
> caused by the fact that a kobject doesn't drop its reference to its
> parent until it is released. If instead the reference to the parent
> were dropped when the kobject was removed then 3 and 4 wouldn't apply.
>
And that should be okay, as the device isn't accessible from userland
anyway after doing a device_del(). And the implication is that it's
going to be removed entirely soon. So we're just moving the timing
of the eventual call to the ->release() function; the events will
be triggered by device_del() and won't be changed.
And if some device actually requires a reference to the parent
during ->release() it can as well acquire it manually and shouldn't
rely on the core logic to do that automatically.

Cheers,

Hannes
--
Dr. Hannes Reinecke                   zSeries & Storage
hare@suse.de                          +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)

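The last point, that a device needing its parent during ->release() can
take its own reference, can be sketched in the same toy style. This is
a minimal illustration under stated assumptions, not kernel code:
struct node, node_del(), node_pin_parent() and
parent_alive_during_child_release() are hypothetical names, node_del()
models the proposed "drop the parent at remove time" behaviour, and the
"late holder" stands for any user that keeps the child alive after it
has been removed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; every name here is illustrative. */
struct node {
    int refs;
    struct node *parent;  /* core-managed link, dropped in node_del() */
    struct node *pinned;  /* extra reference taken explicitly by the owner */
    bool released;
    bool parent_alive_in_release;
};

static void node_put(struct node *n);

static void node_init(struct node *n)
{
    n->refs = 1;
    n->parent = NULL;
    n->pinned = NULL;
    n->released = false;
    n->parent_alive_in_release = false;
}

static void node_add(struct node *n, struct node *parent)
{
    parent->refs++;
    n->parent = parent;
}

/* Remove-time policy: the core drops the parent reference here. */
static void node_del(struct node *n)
{
    struct node *p = n->parent;

    n->parent = NULL;
    if (p)
        node_put(p);
}

/* The owner explicitly pins the parent for use in its release path. */
static void node_pin_parent(struct node *n)
{
    if (n->parent) {
        n->parent->refs++;
        n->pinned = n->parent;
    }
}

/* Assumes node_del() was called before the final put. */
static void node_put(struct node *n)
{
    assert(n->refs > 0);
    if (--n->refs > 0)
        return;
    n->released = true;
    if (n->pinned) {
        /* Release path: record whether the parent is still around. */
        n->parent_alive_in_release = !n->pinned->released;
        node_put(n->pinned);
    }
}

static bool parent_alive_during_child_release(bool pin)
{
    struct node parent, child;

    node_init(&parent);
    node_init(&child);
    node_add(&child, &parent);
    if (pin)
        node_pin_parent(&child);
    child.refs++;          /* some late holder of the child */

    node_del(&child);      /* core drops the parent link */
    node_put(&child);      /* creator's reference to the child */
    node_put(&parent);     /* parent's creator lets go */
    node_put(&child);      /* late holder lets go: release runs */

    return child.parent_alive_in_release;
}
```

With the explicit pin the parent outlives the child's release; without
it the parent is already gone by the time the child is released, which
is the case a driver would have to handle itself under the proposed
change.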
* Re: Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
From: Alan Stern @ 2007-11-07 15:54 UTC
To: Hannes Reinecke; +Cc: Greg KH, Kay Sievers, Kernel development list

On Wed, 7 Nov 2007, Hannes Reinecke wrote:

> Alan Stern wrote:
> >
> > Thus we have a cycle:
> >
> > 1&2: request_queue isn't released before scsi_device;
> >
> > 3: scsi_device isn't released before gendisk;
> >
> > 4: gendisk isn't released before request_queue.
> >
> > The dependency in 1&2 is hard-coded into the SCSI core. If I
> > understand correctly, the core really does need the request_queue to
> > hang around as long as the scsi_device is still present. According to
> > James Bottomley, any block device driver should be expected to have a
> > similar requirement.
> >
> This is actually true, but as other block device drivers create the
> LUN (or the equivalent thereof), the request queue, and the block device
> at the same time or under control of the driver itself they don't have
> this problem.
> It's only due to the decoupling of the block driver from the underlying
> device (ie sd driver and scsi_device) when this problem arises.

I don't understand your reasoning. If the same parent-child
relationships exist then it doesn't matter who creates the data
structures. All that matters is that the block device's reference to
the request_queue isn't dropped until the device is released.

> > But the dependencies in 3 and 4 are unnecessary. They are artifacts,
> > caused by the fact that a kobject doesn't drop its reference to its
> > parent until it is released. If instead the reference to the parent
> > were dropped when the kobject was removed then 3 and 4 wouldn't apply.
> >
> And should be okay as the device isn't accessible from userland
> anyway after doing a device_del(). And the implication is that it's
> going to be removed soon entirely. So we're just moving the timing
> of the eventual call to the ->release() function; the events will
> be triggered by device_del() and won't be changed.
> And if some device actually requires a reference to the parent
> during ->release() it can as well acquire it manually and shouldn't
> rely on the core logic to do that automatically.

My thinking exactly.

Alan Stern

* Re: Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
From: Kay Sievers @ 2007-11-07 19:36 UTC
To: Alan Stern; +Cc: Hannes Reinecke, Greg KH, Kernel development list

On Wed, 2007-11-07 at 10:54 -0500, Alan Stern wrote:
> On Wed, 7 Nov 2007, Hannes Reinecke wrote:
>
> > Alan Stern wrote:
> > >
> > > Thus we have a cycle:
> > >
> > > 1&2: request_queue isn't released before scsi_device;
> > >
> > > 3: scsi_device isn't released before gendisk;
> > >
> > > 4: gendisk isn't released before request_queue.
> > >
> > > The dependency in 1&2 is hard-coded into the SCSI core. If I
> > > understand correctly, the core really does need the request_queue to
> > > hang around as long as the scsi_device is still present. According to
> > > James Bottomley, any block device driver should be expected to have a
> > > similar requirement.
> > >
> > This is actually true, but as other block device drivers create the
> > LUN (or the equivalent thereof), the request queue, and the block device
> > at the same time or under control of the driver itself they don't have
> > this problem.
> > It's only due to the decoupling of the block driver from the underlying
> > device (ie sd driver and scsi_device) when this problem arises.
>
> I don't understand your reasoning. If the same parent-child
> relationships exist then it doesn't matter who creates the data
> structures. All that matters is that the block device's reference to
> the request_queue isn't dropped until the device is released.
>
> > > But the dependencies in 3 and 4 are unnecessary. They are artifacts,
> > > caused by the fact that a kobject doesn't drop its reference to its
> > > parent until it is released. If instead the reference to the parent
> > > were dropped when the kobject was removed then 3 and 4 wouldn't apply.
> > >
> > And should be okay as the device isn't accessible from userland
> > anyway after doing a device_del(). And the implication is that it's
> > going to be removed soon entirely. So we're just moving the timing
> > of the eventual call to the ->release() function; the events will
> > be triggered by device_del() and won't be changed.
> > And if some device actually requires a reference to the parent
> > during ->release() it can as well acquire it manually and shouldn't
> > rely on the core logic to do that automatically.
>
> My thinking exactly.

It would remove another implicit "magic" from the core, which is good.

Otherwise we will need to introduce a kobject_orphan(), to disassociate
an object from its parent, which would be kind of weird, just to break
out of the default core logic.

I would expect this patch to have an effect only on the pretty complex
refcounting users of the driver core, which are SCSI and USB, and I
expect the people involved are well prepared now to fix such possible
bugs, should they show up. :)

Kay

* Re: Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
From: Alan Stern @ 2007-11-07 20:00 UTC
To: Kay Sievers; +Cc: Hannes Reinecke, Greg KH, Kernel development list

On Wed, 7 Nov 2007, Kay Sievers wrote:

> It would remove another implicit "magic" from the core, which is good.

Yes.

> Otherwise we will need to introduce a kobject_orphan(), to disassociate
> an object from its parent, which would be kind of weird, just to break
> out of the default core logic.
>
> I would expect this patch to have an effect only on the pretty complex
> refcounting users of the driver core, which are SCSI and USB, and I
> expect the people involved are well prepared now to fix such possible
> bugs, should they show up. :)

The person who has to be convinced is Greg. :-)

Alan Stern