public inbox for linux-kernel@vger.kernel.org
* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
       [not found] <1193848942.6621.18.camel@lov.site>
@ 2007-11-05 21:49 ` Alan Stern
  2007-11-05 21:59   ` Greg KH
  0 siblings, 1 reply; 7+ messages in thread
From: Alan Stern @ 2007-11-05 21:49 UTC (permalink / raw)
  To: Greg KH; +Cc: Kay Sievers, Kernel development list

Greg:

So what's our status?  Do you think it's worthwhile adding the 
"drop reference to parent kobject at remove time instead of release 
time" patch?

Also, what's the story on the updates to the USB uevent routines?  Do 
you want separate patches from Kay and me or should we combine them 
into a single patch?

Alan Stern



* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
  2007-11-05 21:49 ` BUG in: Driver core: convert block from raw kobjects to core devices (fwd) Alan Stern
@ 2007-11-05 21:59   ` Greg KH
  2007-11-06 19:49     ` Alan Stern
  0 siblings, 1 reply; 7+ messages in thread
From: Greg KH @ 2007-11-05 21:59 UTC (permalink / raw)
  To: Alan Stern; +Cc: Kay Sievers, Kernel development list

On Mon, Nov 05, 2007 at 04:49:21PM -0500, Alan Stern wrote:
> Greg:
> 
> So what's our status?  Do you think it's worthwhile adding the 
> "drop reference to parent kobject at remove time instead of release 
> time" patch?

No.

I still need to take the time and read this thread and find the real
problem here.  The fact that the issue does not show up for other,
non-scsi block devices, makes me feel this is a scsi-specific problem
with how it deals with the driver model, but I need to take the time to
sit down and figure it out for sure.

> Also, what's the story on the updates to the USB uevent routines?  Do 
> you want separate patches from Kay and me or should we combine them 
> into a single patch?

I'll take a combined patch, as I don't think I got a Signed-off-by for
your version; I was waiting for a "final" version.

Back to the kobject/kset cleanup/debug mess :)

thanks,

greg k-h


* Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
  2007-11-05 21:59   ` Greg KH
@ 2007-11-06 19:49     ` Alan Stern
  2007-11-07 12:21       ` Hannes Reinecke
  0 siblings, 1 reply; 7+ messages in thread
From: Alan Stern @ 2007-11-06 19:49 UTC (permalink / raw)
  To: Greg KH; +Cc: Kay Sievers, Kernel development list

On Mon, 5 Nov 2007, Greg KH wrote:

> On Mon, Nov 05, 2007 at 04:49:21PM -0500, Alan Stern wrote:
> > Greg:
> > 
> > So what's our status?  Do you think it's worthwhile adding the 
> > "drop reference to parent kobject at remove time instead of release 
> > time" patch?
> 
> No.
> 
> I still need to take the time and read this thread and find the real
> problem here.  The fact that the issue does not show up for other,
> non-scsi block devices, makes me feel this is a scsi-specific problem
> with how it deals with the driver model, but I need to take the time to
> sit down and figure it out for sure.

Here's the story as far as the SCSI stack goes.  To what extent other 
subsystems have analogous problems, I don't know.

     1. In drivers/scsi/scsi_scan.c, scsi_alloc_sdev() creates a
	scsi_device structure and calls scsi_alloc_queue(), which
	ends up calling blk_init_queue().  As the creator of the
	request_queue, the SCSI core owns the initial reference
	to q->kobj.

     2. This reference is released as part of the scsi_device's
	release routine.  In scsi_sysfs.c,
	scsi_device_dev_release_usercontext() calls scsi_free_queue(),
	which does nothing but call blk_cleanup_queue(), which
	calls blk_put_queue(), which does the final kobject_put()
	on q->kobj.

As a result of 1 and 2, the request_queue isn't released until the
scsi_device is released.

     3. In sd.c, sd_probe() does "gd = alloc_disk()" and it sets
	gd->driverfs_dev to point to the scsi_device's embedded
	struct device (named sdev_gendev).  It then calls add_disk()
	in block/genhd.c, which calls register_disk() in
	fs/partitions/check.c.  register_disk() sets disk->dev.parent
	to disk->driverfs_dev and then calls device_add(&disk->dev).

Setting disk->dev.parent and calling device_add() in this way is new to
Kay's reworking of the driver core.  Previously disk->dev.kobj had been
registered directly, as the gendisk was some sort of class device
rather than a regular device.

Anyway, the upshot of 3 is that sdev->sdev_gendev.kobj is the parent of
disk->dev.kobj, and consequently the scsi_device can't be released
until the gendisk is released.

     4. add_disk() goes on to call blk_register_queue(disk), which sets
	q->kobj.parent to disk->dev.kobj and then calls
	kobject_add(&q->kobj).

As a result of 4, the gendisk can't be released until the request_queue 
is released.

Thus we have a cycle:

	1&2: request_queue isn't released before scsi_device;

	3: scsi_device isn't released before gendisk;

	4: gendisk isn't released before request_queue.

The dependency in 1&2 is hard-coded into the SCSI core.  If I 
understand correctly, the core really does need the request_queue to 
hang around as long as the scsi_device is still present.  According to 
James Bottomley, any block device driver should be expected to have a 
similar requirement.

But the dependencies in 3 and 4 are unnecessary.  They are artifacts,
caused by the fact that a kobject doesn't drop its reference to its
parent until it is released.  If instead the reference to the parent
were dropped when the kobject was removed then 3 and 4 wouldn't apply.
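
The cycle described above can be sketched with a toy reference-counting
model.  This is only an illustration under simplifying assumptions, not
kernel code: struct toy_kobj, toy_add(), toy_del(), toy_put() and
simulate_teardown() are all hypothetical names, and the teardown order
is a guess at a plausible sequence rather than what sd.c actually does.
It compares dropping the parent reference at release time with dropping
it at remove time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a kobject; this is NOT the kernel's struct kobject. */
struct toy_kobj {
	int refcount;
	struct toy_kobj *parent; /* one reference on the parent is held while set */
	struct toy_kobj *owned;  /* reference dropped by the release step (sdev -> queue) */
	bool released;
};

enum parent_policy { DROP_AT_RELEASE, DROP_AT_REMOVE };

static void toy_put(struct toy_kobj *k);

static void toy_init(struct toy_kobj *k)
{
	k->refcount = 1; /* the initial reference belongs to the creator */
	k->parent = NULL;
	k->owned = NULL;
	k->released = false;
}

/* Link k under parent; the child pins the parent with one reference. */
static void toy_add(struct toy_kobj *k, struct toy_kobj *parent)
{
	k->parent = parent;
	parent->refcount++;
}

static void toy_drop_parent(struct toy_kobj *k)
{
	struct toy_kobj *p = k->parent;

	k->parent = NULL;
	if (p)
		toy_put(p);
}

/* "Remove" step: only the proposed policy lets go of the parent here. */
static void toy_del(struct toy_kobj *k, enum parent_policy policy)
{
	if (policy == DROP_AT_REMOVE)
		toy_drop_parent(k);
}

static void toy_put(struct toy_kobj *k)
{
	struct toy_kobj *o;

	assert(k->refcount > 0);
	if (--k->refcount > 0)
		return;
	/* "Release" step: runs only when the last reference goes away. */
	k->released = true;
	toy_drop_parent(k); /* no-op if already dropped at remove time */
	o = k->owned;
	k->owned = NULL;
	if (o)
		toy_put(o);
}

/* Returns how many of the three objects end up released. */
int simulate_teardown(enum parent_policy policy)
{
	struct toy_kobj sdev, queue, disk;

	toy_init(&sdev);
	toy_init(&queue);
	sdev.owned = &queue;    /* 1&2: queue lives until sdev is released */
	toy_init(&disk);
	toy_add(&disk, &sdev);  /* 3: sdev is the parent of the gendisk */
	toy_add(&queue, &disk); /* 4: gendisk is the parent of the queue */

	toy_del(&queue, policy);
	toy_del(&disk, policy);
	toy_put(&disk);         /* disk creator drops its reference */
	toy_del(&sdev, policy);
	toy_put(&sdev);         /* sdev creator drops its reference */

	return sdev.released + queue.released + disk.released;
}
```

In this model simulate_teardown(DROP_AT_RELEASE) returns 0, because
each object is still pinned by the next one in the cycle, while
simulate_teardown(DROP_AT_REMOVE) returns 3.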

Alan Stern



* Re: Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
  2007-11-06 19:49     ` Alan Stern
@ 2007-11-07 12:21       ` Hannes Reinecke
  2007-11-07 15:54         ` Alan Stern
  0 siblings, 1 reply; 7+ messages in thread
From: Hannes Reinecke @ 2007-11-07 12:21 UTC (permalink / raw)
  To: Alan Stern; +Cc: Greg KH, Kay Sievers, Kernel development list

Alan Stern wrote:
> On Mon, 5 Nov 2007, Greg KH wrote:
> 
>> On Mon, Nov 05, 2007 at 04:49:21PM -0500, Alan Stern wrote:
>>> Greg:
>>>
>>> So what's our status?  Do you think it's worthwhile adding the 
>>> "drop reference to parent kobject at remove time instead of release 
>>> time" patch?
>> No.
>>
>> I still need to take the time and read this thread and find the real
>> problem here.  The fact that the issue does not show up for other,
>> non-scsi block devices, makes me feel this is a scsi-specific problem
>> with how it deals with the driver model, but I need to take the time to
>> sit down and figure it out for sure.
> 
[ .. ]
> 
> Thus we have a cycle:
> 
> 	1&2: request_queue isn't released before scsi_device;
> 
> 	3: scsi_device isn't released before gendisk;
> 
> 	4: gendisk isn't released before request_queue.
> 
> The dependency in 1&2 is hard-coded into the SCSI core.  If I 
> understand correctly, the core really does need the request_queue to 
> hang around as long as the scsi_device is still present.  According to 
> James Bottomley, any block device driver should be expected to have a 
> similar requirement.
> 
This is actually true, but as other block device drivers create the
LUN (or the equivalent thereof), the request queue, and the block device
at the same time or under control of the driver itself they don't have
this problem.
This problem arises only because of the decoupling of the block driver
from the underlying device (i.e. the sd driver and the scsi_device).

> But the dependencies in 3 and 4 are unnecessary.  They are artifacts,
> caused by the fact that a kobject doesn't drop its reference to its
> parent until it is released.  If instead the reference to the parent
> were dropped when the kobject was removed then 3 and 4 wouldn't apply.
> 
And that should be okay, as the device isn't accessible from userland
anyway after a device_del(), and the implication is that it's going to
be removed entirely soon.  So we're just moving the timing of the
eventual call to the ->release() function; the events will still be
triggered by device_del() and won't change.
And if some device actually requires a reference to the parent
during ->release(), it can just as well acquire one manually rather
than rely on the core logic to do that automatically.
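
As a rough sketch of that last point, assuming the core drops the
parent reference at remove time: an object that needs its parent during
release can take its own reference when it is added and drop it in its
release step.  The names here (struct toy_obj, pinned_parent,
parent_alive_during_release()) are made up for illustration and are not
kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy object; NOT the kernel's struct kobject or struct device. */
struct toy_obj {
	int refcount;
	struct toy_obj *parent;        /* core-managed link, dropped at remove time */
	struct toy_obj *pinned_parent; /* explicit reference taken by the object itself */
	bool released;
	bool parent_alive_at_release;
};

static void toy_put(struct toy_obj *o)
{
	struct toy_obj *p;

	assert(o->refcount > 0);
	if (--o->refcount > 0)
		return;
	/* Release step: the parent is usable only if we pinned it ourselves. */
	p = o->pinned_parent;
	o->pinned_parent = NULL;
	o->parent_alive_at_release = (p != NULL && !p->released);
	o->released = true;
	if (p)
		toy_put(p); /* drop the manually acquired reference */
}

/* Returns whether the child could still use its parent while being released. */
bool parent_alive_during_release(bool pin_parent)
{
	struct toy_obj parent = { .refcount = 1 };
	struct toy_obj child = { .refcount = 1 };

	/* Add: the core links the child to the parent and takes a reference. */
	child.parent = &parent;
	parent.refcount++;
	if (pin_parent) {
		/* The child takes an extra reference of its own. */
		child.pinned_parent = &parent;
		parent.refcount++;
	}

	/* Remove: the core drops its parent reference immediately. */
	child.parent = NULL;
	toy_put(&parent);

	/* The parent's creator drops its reference. */
	toy_put(&parent);

	/* The last reference on the child goes away; release runs now. */
	toy_put(&child);

	assert(parent.released && child.released);
	return child.parent_alive_at_release;
}
```

With pin_parent set, the parent outlives the child's release step;
without it, the parent is already gone by then.  Both objects are
released either way.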

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)


* Re: Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
  2007-11-07 12:21       ` Hannes Reinecke
@ 2007-11-07 15:54         ` Alan Stern
  2007-11-07 19:36           ` Kay Sievers
  0 siblings, 1 reply; 7+ messages in thread
From: Alan Stern @ 2007-11-07 15:54 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: Greg KH, Kay Sievers, Kernel development list

On Wed, 7 Nov 2007, Hannes Reinecke wrote:

> Alan Stern wrote:
> > 
> > Thus we have a cycle:
> > 
> > 	1&2: request_queue isn't released before scsi_device;
> > 
> > 	3: scsi_device isn't released before gendisk;
> > 
> > 	4: gendisk isn't released before request_queue.
> > 
> > The dependency in 1&2 is hard-coded into the SCSI core.  If I 
> > understand correctly, the core really does need the request_queue to 
> > hang around as long as the scsi_device is still present.  According to 
> > James Bottomley, any block device driver should be expected to have a 
> > similar requirement.
> > 
> This is actually true, but as other block device drivers create the
> LUN (or the equivalent thereof), the request queue, and the block device
> at the same time or under control of the driver itself they don't have
> this problem.
> This problem arises only because of the decoupling of the block driver
> from the underlying device (i.e. the sd driver and the scsi_device).

I don't understand your reasoning.  If the same parent-child
relationships exist then it doesn't matter who creates the data
structures.  All that matters is that the block device's reference to
the request_queue isn't dropped until the device is released.

> > But the dependencies in 3 and 4 are unnecessary.  They are artifacts,
> > caused by the fact that a kobject doesn't drop its reference to its
> > parent until it is released.  If instead the reference to the parent
> > were dropped when the kobject was removed then 3 and 4 wouldn't apply.
> > 
> And that should be okay, as the device isn't accessible from userland
> anyway after a device_del(), and the implication is that it's going to
> be removed entirely soon.  So we're just moving the timing of the
> eventual call to the ->release() function; the events will still be
> triggered by device_del() and won't change.
> And if some device actually requires a reference to the parent
> during ->release(), it can just as well acquire one manually rather
> than rely on the core logic to do that automatically.

My thinking exactly.

Alan Stern



* Re: Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
  2007-11-07 15:54         ` Alan Stern
@ 2007-11-07 19:36           ` Kay Sievers
  2007-11-07 20:00             ` Alan Stern
  0 siblings, 1 reply; 7+ messages in thread
From: Kay Sievers @ 2007-11-07 19:36 UTC (permalink / raw)
  To: Alan Stern; +Cc: Hannes Reinecke, Greg KH, Kernel development list

On Wed, 2007-11-07 at 10:54 -0500, Alan Stern wrote:
> On Wed, 7 Nov 2007, Hannes Reinecke wrote:
> 
> > Alan Stern wrote:
> > > 
> > > Thus we have a cycle:
> > > 
> > > 	1&2: request_queue isn't released before scsi_device;
> > > 
> > > 	3: scsi_device isn't released before gendisk;
> > > 
> > > 	4: gendisk isn't released before request_queue.
> > > 
> > > The dependency in 1&2 is hard-coded into the SCSI core.  If I 
> > > understand correctly, the core really does need the request_queue to 
> > > hang around as long as the scsi_device is still present.  According to 
> > > James Bottomley, any block device driver should be expected to have a 
> > > similar requirement.
> > > 
> > This is actually true, but as other block device drivers create the
> > LUN (or the equivalent thereof), the request queue, and the block device
> > at the same time or under control of the driver itself they don't have
> > this problem.
> > This problem arises only because of the decoupling of the block driver
> > from the underlying device (i.e. the sd driver and the scsi_device).
> 
> I don't understand your reasoning.  If the same parent-child
> relationships exist then it doesn't matter who creates the data
> structures.  All that matters is that the block device's reference to
> the request_queue isn't dropped until the device is released.
> 
> > > But the dependencies in 3 and 4 are unnecessary.  They are artifacts,
> > > caused by the fact that a kobject doesn't drop its reference to its
> > > parent until it is released.  If instead the reference to the parent
> > > were dropped when the kobject was removed then 3 and 4 wouldn't apply.
> > > 
> > And that should be okay, as the device isn't accessible from userland
> > anyway after a device_del(), and the implication is that it's going to
> > be removed entirely soon.  So we're just moving the timing of the
> > eventual call to the ->release() function; the events will still be
> > triggered by device_del() and won't change.
> > And if some device actually requires a reference to the parent
> > during ->release(), it can just as well acquire one manually rather
> > than rely on the core logic to do that automatically.
> 
> My thinking exactly.

It would remove another implicit "magic" from the core, which is good.

Otherwise we will need to introduce a kobject_orphan(), to disassociate
an object from its parent, which would be kind of weird, just to break
out of the default core logic.

I would expect this patch to affect only the fairly complex
refcounting users of the driver core, namely SCSI and USB, and I
expect the people involved are now well prepared to fix any such
bugs, should they show up. :)

Kay



* Re: Re: BUG in: Driver core: convert block from raw kobjects to core devices (fwd)
  2007-11-07 19:36           ` Kay Sievers
@ 2007-11-07 20:00             ` Alan Stern
  0 siblings, 0 replies; 7+ messages in thread
From: Alan Stern @ 2007-11-07 20:00 UTC (permalink / raw)
  To: Kay Sievers; +Cc: Hannes Reinecke, Greg KH, Kernel development list

On Wed, 7 Nov 2007, Kay Sievers wrote:

> It would remove another implicit "magic" from the core, which is good.

Yes.

> Otherwise we will need to introduce a kobject_orphan(), to disassociate
> an object from its parent, which would be kind of weird, just to break
> out of the default core logic.
> 
> I would expect this patch to affect only the fairly complex
> refcounting users of the driver core, namely SCSI and USB, and I
> expect the people involved are now well prepared to fix any such
> bugs, should they show up. :)

The person who has to be convinced is Greg.  :-)

Alan Stern


