* [PATCH] Fix reference counting for failed SCSI devices
@ 2005-05-24 9:38 Hannes Reinecke
2005-05-24 22:57 ` James Bottomley
0 siblings, 1 reply; 9+ messages in thread
From: Hannes Reinecke @ 2005-05-24 9:38 UTC (permalink / raw)
To: James Bottomley; +Cc: SCSI Mailing List, Linux Kernel
[-- Attachment #1: Type: text/plain, Size: 486 bytes --]
Hi James,
whenever the scsi-ml tries to scan non-existent devices the reference
count in scsi_alloc_sdev() and scsi_probe_and_add_lun() is not adjusted
properly. Every call to XXX_initialize in the driver core sets the
reference count to 1, so for a proper deallocation an explicit XXX_put()
has to be done.
Please apply.
Cheers,
Hannes
--
Dr. Hannes Reinecke hare@suse.de
SuSE Linux AG S390 & zSeries
Maxfeldstraße 5 +49 911 74053 688
90409 Nürnberg http://www.suse.de
[-- Attachment #2: linux-2.6.12-rc4-scsi-scan-fix-refcount-on-fail --]
[-- Type: text/plain, Size: 1279 bytes --]
From: Hannes Reinecke <hare@suse.de>
Subject: Fix refcount for failed devices
When a non-present device is scanned it is not properly deregistered
from the driver core. Calling XXX_initialize() functions from the driver
core sets the reference count to 1, so for proper deallocation a
XXX_put() has to be issued.
--- linux-2.6.12-rc4/drivers/scsi/scsi_scan.c.orig 2005-05-24 10:26:46.000000000 +0200
+++ linux-2.6.12-rc4/drivers/scsi/scsi_scan.c 2005-05-24 10:55:52.000000000 +0200
@@ -273,6 +273,7 @@ static struct scsi_device *scsi_alloc_sd
*/
if (ret == -ENXIO)
display_failure_msg = 0;
+ put_device(&starget->dev);
goto out_device_destroy;
}
}
@@ -282,6 +283,7 @@ static struct scsi_device *scsi_alloc_sd
out_device_destroy:
transport_destroy_device(&sdev->sdev_gendev);
scsi_free_queue(sdev->request_queue);
+ class_device_put(&sdev->sdev_classdev);
put_device(&sdev->sdev_gendev);
out:
if (display_failure_msg)
@@ -855,6 +857,8 @@ static int scsi_probe_and_add_lun(struct
if (sdev->host->hostt->slave_destroy)
sdev->host->hostt->slave_destroy(sdev);
transport_destroy_device(&sdev->sdev_gendev);
+ class_device_put(&sdev->sdev_classdev);
+ put_device(sdev->sdev_gendev.parent);
put_device(&sdev->sdev_gendev);
}
out:
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Fix reference counting for failed SCSI devices
2005-05-24 9:38 [PATCH] Fix reference counting for failed SCSI devices Hannes Reinecke
@ 2005-05-24 22:57 ` James Bottomley
2005-05-25 6:50 ` Hannes Reinecke
0 siblings, 1 reply; 9+ messages in thread
From: James Bottomley @ 2005-05-24 22:57 UTC (permalink / raw)
To: Hannes Reinecke; +Cc: SCSI Mailing List, Linux Kernel
On Tue, 2005-05-24 at 11:38 +0200, Hannes Reinecke wrote:
> whenever the scsi-ml tries to scan non-existent devices the reference
> count in scsi_alloc_sdev() and scsi_probe_and_add_lun() is not adjusted
> properly. Every call to XXX_initialize in the driver core sets the
> reference count to 1, so for a proper deallocation an explicit XXX_put()
> has to be done.
That's true, but I don't see what the problem is if the device has never
been made visible.
> + put_device(&starget->dev);
this would amount to a double put, since the parent put method is called
in the device release.
> + class_device_put(&sdev->sdev_classdev);
This is unnecessary since the class device is simply occupying a private
area in the scsi_device. As long as its never made visible to the
system, its refcount is irrelevant
> put_device(&sdev->sdev_gendev);
> out:
> if (display_failure_msg)
> @@ -855,6 +857,8 @@ static int scsi_probe_and_add_lun(struct
> if (sdev->host->hostt->slave_destroy)
> sdev->host->hostt->slave_destroy(sdev);
> transport_destroy_device(&sdev->sdev_gendev);
> + class_device_put(&sdev->sdev_classdev);
> + put_device(sdev->sdev_gendev.parent);
same should apply here. As long as this cascade occurs before
scsi_add_lun() (which calls scsi_sysfs_add_sdev()), which is what makes
the whole set of devices and classes visible.
James
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Fix reference counting for failed SCSI devices
2005-05-24 22:57 ` James Bottomley
@ 2005-05-25 6:50 ` Hannes Reinecke
2005-05-25 12:27 ` James Bottomley
0 siblings, 1 reply; 9+ messages in thread
From: Hannes Reinecke @ 2005-05-25 6:50 UTC (permalink / raw)
To: James Bottomley; +Cc: SCSI Mailing List, Linux Kernel
[-- Attachment #1: Type: text/plain, Size: 2168 bytes --]
James Bottomley wrote:
> On Tue, 2005-05-24 at 11:38 +0200, Hannes Reinecke wrote:
>>whenever the scsi-ml tries to scan non-existent devices the reference
>>count in scsi_alloc_sdev() and scsi_probe_and_add_lun() is not adjusted
>>properly. Every call to XXX_initialize in the driver core sets the
>>reference count to 1, so for a proper deallocation an explicit XXX_put()
>>has to be done.
>
> That's true, but I don't see what the problem is if the device has never
> been made visible.
>
It's not visible but it's still allocated and referenced. So on doing a
rmmod these class_devices are being deallocated which crashes as the
class device is not connected properly.
>>+ put_device(&starget->dev);
>
> this would amount to a double put, since the parent put method is called
> in the device release.
>
Oops. Correct.
>>+ class_device_put(&sdev->sdev_classdev);
>
> This is unnecessary since the class device is simply occupying a private
> area in the scsi_device. As long as its never made visible to the
> system, its refcount is irrelevant
>
It's not. Whenever you try to rmmod the adapter it becomes highly
relevant. If it doesn't crash you've at least generated a memleak as the
class device is never freed.
(And these are quite a few for Wide-SCSI Double-channel adapters ...)
>> put_device(&sdev->sdev_gendev);
>> out:
>> if (display_failure_msg)
>>@@ -855,6 +857,8 @@ static int scsi_probe_and_add_lun(struct
>> if (sdev->host->hostt->slave_destroy)
>> sdev->host->hostt->slave_destroy(sdev);
>> transport_destroy_device(&sdev->sdev_gendev);
>>+ class_device_put(&sdev->sdev_classdev);
>>+ put_device(sdev->sdev_gendev.parent);
>
> same should apply here. As long as this cascade occurs before
> scsi_add_lun() (which calls scsi_sysfs_add_sdev()), which is what makes
> the whole set of devices and classes visible.
>
Correct for the parent device. The class device has to be deallocated
properly if a rmmod should work properly.
New patch attached. Please apply.
Cheers,
Hannes
--
Dr. Hannes Reinecke hare@suse.de
SuSE Linux AG S390 & zSeries
Maxfeldstraße 5 +49 911 74053 688
90409 Nürnberg http://www.suse.de
[-- Attachment #2: linux-2.6.12-rc4-scsi-scan-fix-refcount-on-fail --]
[-- Type: text/plain, Size: 1047 bytes --]
From: Hannes Reinecke <hare@suse.de>
Subject: Fix refcount for failed devices
When a non-present device is scanned it is not properly deregistered
from the driver core. Calling XXX_initialize() functions from the driver
core sets the reference count to 1, so for proper deallocation a
XXX_put() has to be issued.
--- linux-2.6.12-rc4/drivers/scsi/scsi_scan.c.orig 2005-05-24 10:26:46.000000000 +0200
+++ linux-2.6.12-rc4/drivers/scsi/scsi_scan.c 2005-05-24 10:55:52.000000000 +0200
@@ -282,6 +282,7 @@ static struct scsi_device *scsi_alloc_sd
out_device_destroy:
transport_destroy_device(&sdev->sdev_gendev);
scsi_free_queue(sdev->request_queue);
+ class_device_put(&sdev->sdev_classdev);
put_device(&sdev->sdev_gendev);
out:
if (display_failure_msg)
@@ -855,6 +856,7 @@ static int scsi_probe_and_add_lun(struct
if (sdev->host->hostt->slave_destroy)
sdev->host->hostt->slave_destroy(sdev);
transport_destroy_device(&sdev->sdev_gendev);
+ class_device_put(&sdev->sdev_classdev);
put_device(&sdev->sdev_gendev);
}
out:
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Fix reference counting for failed SCSI devices
2005-05-25 6:50 ` Hannes Reinecke
@ 2005-05-25 12:27 ` James Bottomley
2005-05-25 12:46 ` Hannes Reinecke
0 siblings, 1 reply; 9+ messages in thread
From: James Bottomley @ 2005-05-25 12:27 UTC (permalink / raw)
To: Hannes Reinecke; +Cc: SCSI Mailing List, Linux Kernel
On Wed, 2005-05-25 at 08:50 +0200, Hannes Reinecke wrote:
> >>+ class_device_put(&sdev->sdev_classdev);
> >
> > This is unnecessary since the class device is simply occupying a private
> > area in the scsi_device. As long as its never made visible to the
> > system, its refcount is irrelevant
> >
> It's not. Whenever you try to rmmod the adapter it becomes highly
> relevant. If it doesn't crash you've at least generated a memleak as the
> class device is never freed.
> (And these are quite a few for Wide-SCSI Double-channel adapters ...)
? Look at the code; you're not doing a put on a pointer to the
sdev_classdev, you're doing a put on a reference to it.
It's defined in scsi_device.h:
struct scsi_device {
...
struct class_device sdev_classdev;
...
};
so it's contained within the scsi_device. Freeing the scsi_device frees
the classdev (and the gendev).
James
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Fix reference counting for failed SCSI devices
2005-05-25 12:27 ` James Bottomley
@ 2005-05-25 12:46 ` Hannes Reinecke
2005-05-25 14:58 ` James Bottomley
0 siblings, 1 reply; 9+ messages in thread
From: Hannes Reinecke @ 2005-05-25 12:46 UTC (permalink / raw)
To: James Bottomley; +Cc: SCSI Mailing List, Linux Kernel
James Bottomley wrote:
> On Wed, 2005-05-25 at 08:50 +0200, Hannes Reinecke wrote:
>>>>+ class_device_put(&sdev->sdev_classdev);
>>>This is unnecessary since the class device is simply occupying a private
>>>area in the scsi_device. As long as its never made visible to the
>>>system, its refcount is irrelevant
>>>
>>It's not. Whenever you try to rmmod the adapter it becomes highly
>>relevant. If it doesn't crash you've at least generated a memleak as the
>>class device is never freed.
>>(And these are quite a few for Wide-SCSI Double-channel adapters ...)
>
> ? Look at the code; you're not doing a put on a pointer to the
> sdev_classdev, you're doing a put on a reference to it.
>
> It's defined in scsi_device.h:
>
> struct scsi_device {
> ...
> struct class_device sdev_classdev;
> ...
> };
>
> so it's contained within the scsi_device. Freeing the scsi_device frees
> the classdev (and the gendev).
>
But does not call the ->release function.
Put it the other way round: does 'rmmod aic7xxx' work for you?
It certainly did _not_ work for aic79xx, hence the fix.
Cheers,
Hannes
--
Dr. Hannes Reinecke hare@suse.de
SuSE Linux AG S390 & zSeries
Maxfeldstraße 5 +49 911 74053 688
90409 Nürnberg http://www.suse.de
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Fix reference counting for failed SCSI devices
2005-05-25 12:46 ` Hannes Reinecke
@ 2005-05-25 14:58 ` James Bottomley
2005-05-25 15:16 ` Hannes Reinecke
0 siblings, 1 reply; 9+ messages in thread
From: James Bottomley @ 2005-05-25 14:58 UTC (permalink / raw)
To: Hannes Reinecke; +Cc: SCSI Mailing List, Linux Kernel
On Wed, 2005-05-25 at 14:46 +0200, Hannes Reinecke wrote:
> > so it's contained within the scsi_device. Freeing the scsi_device frees
> > the classdev (and the gendev).
> >
> But does not call the ->release function.
Please just read the code like I asked. If you do, you'll find that the
sdev_classdev release method is NULL until scsi_sysfs_add_sdev()
precisely for the reason that the class references don't matter until
that point. We're free to kill the whole thing without bothering about
the class devices until scsi_add_lun detects something and calls
scsi_sysfs_add_sdev() to make the whole thing visible. Then all
classdevs get a ref on the parent gendev which their release method
relinquishes.
> Put it the other way round: does 'rmmod aic7xxx' work for you?
> It certainly did _not_ work for aic79xx, hence the fix.
Well, I know aic7xxx works perfectly on a dual channel card, because I
actually test the failure paths and insmod/rmmod is one of my tests. I
can't comment on aic79xx because I don't have the hardware.
James
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Fix reference counting for failed SCSI devices
2005-05-25 14:58 ` James Bottomley
@ 2005-05-25 15:16 ` Hannes Reinecke
2005-05-25 17:51 ` James Bottomley
0 siblings, 1 reply; 9+ messages in thread
From: Hannes Reinecke @ 2005-05-25 15:16 UTC (permalink / raw)
To: James Bottomley; +Cc: SCSI Mailing List, Linux Kernel
James Bottomley wrote:
> On Wed, 2005-05-25 at 14:46 +0200, Hannes Reinecke wrote:
>>>so it's contained within the scsi_device. Freeing the scsi_device frees
>>>the classdev (and the gendev).
>>>
>>But does not call the ->release function.
>
> Please just read the code like I asked. If you do, you'll find that the
> sdev_classdev release method is NULL until scsi_sysfs_add_sdev()
> precisely for the reason that the class references don't matter until
> that point. We're free to kill the whole thing without bothering about
> the class devices until scsi_add_lun detects something and calls
> scsi_sysfs_add_sdev() to make the whole thing visible. Then all
> classdevs get a ref on the parent gendev which their release method
> relinquishes.
>
Oh, right. Indeed, you are totally correct. Sorry for the noise.
>>Put it the other way round: does 'rmmod aic7xxx' work for you?
>>It certainly did _not_ work for aic79xx, hence the fix.
>
> Well, I know aic7xxx works perfectly on a dual channel card, because I
> actually test the failure paths and insmod/rmmod is one of my tests. I
> can't comment on aic79xx because I don't have the hardware.
>
I guess it's time to convert aic79xx to the spi_transport class.
Unfortunately my attempt segfaults when removing a device in
attribute_container_device_trigger(); someone is overwriting the ->match
function. Oh well, further debugging needed.
Cheers,
Hannes
--
Dr. Hannes Reinecke hare@suse.de
SuSE Linux AG S390 & zSeries
Maxfeldstraße 5 +49 911 74053 688
90409 Nürnberg http://www.suse.de
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Fix reference counting for failed SCSI devices
2005-05-25 15:16 ` Hannes Reinecke
@ 2005-05-25 17:51 ` James Bottomley
2005-05-25 18:01 ` Matthew Wilcox
0 siblings, 1 reply; 9+ messages in thread
From: James Bottomley @ 2005-05-25 17:51 UTC (permalink / raw)
To: Hannes Reinecke; +Cc: SCSI Mailing List, Linux Kernel
On Wed, 2005-05-25 at 17:16 +0200, Hannes Reinecke wrote:
> I guess it's time to convert aic79xx to the spi_transport class.
> Unfortunately my attempt segfaults when removing a device in
> attribute_container_device_trigger(); someone is overwriting the ->match
> function. Oh well, further debugging needed.
Well ... best of luck. Given the problems incurred doing this for
aic7xxx (mainly in initial inquiry and domain validation, one of which
is outstanding still) it's not going to be easy.
Large segments of the aic79xx driver are identical to the aic7xxx driver
except that all the functions being ahd_ instead of ahc_, so you should
just be able to mirror quite a lot of the aic7xxx updates. However, you
need to include a large number of rather nasty u320 parameters in the
SPI transport Class (and make sure they're coupled correctly) to get
domain validation to work ... the transport class has also only been
tested on speeds up to u160, so going to u320 will be a first for it
too...
James
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Fix reference counting for failed SCSI devices
2005-05-25 17:51 ` James Bottomley
@ 2005-05-25 18:01 ` Matthew Wilcox
0 siblings, 0 replies; 9+ messages in thread
From: Matthew Wilcox @ 2005-05-25 18:01 UTC (permalink / raw)
To: James Bottomley
Cc: Hannes Reinecke, SCSI Mailing List, Linux Kernel,
Moore, Eric Dean
On Wed, May 25, 2005 at 01:51:29PM -0400, James Bottomley wrote:
> Large segments of the aic79xx driver are identical to the aic7xxx driver
> except that all the functions being ahd_ instead of ahc_, so you should
> just be able to mirror quite a lot of the aic7xxx updates. However, you
> need to include a large number of rather nasty u320 parameters in the
> SPI transport Class (and make sure they're coupled correctly) to get
> domain validation to work ... the transport class has also only been
> tested on speeds up to u160, so going to u320 will be a first for it
> too...
I wonder if LSI have any preliminary work in that area for the Fusion
driver. Eric?
--
"Next the statesmen will invent cheap lies, putting the blame upon
the nation that is attacked, and every man will be glad of those
conscience-soothing falsities, and will diligently study them, and refuse
to examine any refutations of them; and thus he will by and by convince
himself that the war is just, and will thank God for the better sleep
he enjoys after this process of grotesque self-deception." -- Mark Twain
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2005-05-25 18:00 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-05-24 9:38 [PATCH] Fix reference counting for failed SCSI devices Hannes Reinecke
2005-05-24 22:57 ` James Bottomley
2005-05-25 6:50 ` Hannes Reinecke
2005-05-25 12:27 ` James Bottomley
2005-05-25 12:46 ` Hannes Reinecke
2005-05-25 14:58 ` James Bottomley
2005-05-25 15:16 ` Hannes Reinecke
2005-05-25 17:51 ` James Bottomley
2005-05-25 18:01 ` Matthew Wilcox
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox