* linux-2.6.0-test11 [BUG] -- scsi_add/remove_device - out of memory
@ 2003-12-10 9:12 Heiko Carstens
2003-12-10 14:02 ` Christoph Hellwig
0 siblings, 1 reply; 8+ messages in thread
From: Heiko Carstens @ 2003-12-10 9:12 UTC (permalink / raw)
To: linux-scsi
Hello all,
it looks to me as if the SCSI stack has a memory leak. If I have the following
endless loop in our ZFCP LLD we end up within a very short time in an out
of memory situation (tested on an S390 virtual machine with 128MB):
while(1) {
        scsi_add_device(...);
        scsi_remove_device(...);
}
It takes only about 40 iterations to reach the out of memory situation.
The device that gets added and removed does respond to Inquiry commands.
Since I'm not sure which fixes are pending for the SCSI stack, I'm wondering
whether this is a known bug.
Heiko
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: linux-2.6.0-test11 [BUG] -- scsi_add/remove_device - out of memory
2003-12-10 9:12 linux-2.6.0-test11 [BUG] -- scsi_add/remove_device - out of memory Heiko Carstens
@ 2003-12-10 14:02 ` Christoph Hellwig
0 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2003-12-10 14:02 UTC (permalink / raw)
To: Heiko Carstens; +Cc: linux-scsi
On Wed, Dec 10, 2003 at 10:12:17AM +0100, Heiko Carstens wrote:
> Hello all,
>
> it looks to me as if the SCSI stack has a memory leak. If I have the following
> endless loop in our ZFCP LLD we end up within a very short time in an out
> of memory situation (tested on an S390 virtual machine with 128MB):
>
> while(1) {
> scsi_add_device(...)
> scsi_remove_device(...);
> }
>
> It takes only about 40 iterations to reach the out of memory situation.
> The device that gets added and removed does respond to Inquiry commands.
> Since I'm not sure what are the pending fixes for the SCSI stack I'm wondering
> if this is a known bug.
Sounds like there's a scsi_device_put missing somewhere in your driver.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: linux-2.6.0-test11 [BUG] -- scsi_add/remove_device - out of memory
@ 2003-12-10 14:18 Heiko Carstens
0 siblings, 0 replies; 8+ messages in thread
From: Heiko Carstens @ 2003-12-10 14:18 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-scsi
Hi,
>> while(1) {
>> scsi_add_device(...)
>> scsi_remove_device(...);
>> }
>>
>> It takes only about 40 iterations to reach the out of memory situation.
>> The device that gets added and removed does respond to Inquiry commands.
>> Since I'm not sure what are the pending fixes for the SCSI stack I'm
>> wondering if this is a known bug.
>
>Sounds like there's a scsi_device_put missing somewhere in your driver.
Why should I have to issue a scsi_device_put or scsi_device_get in my
device driver?
Actually, scsi_add_device calls the slave_alloc function I provide and
hands over a struct scsi_device pointer, which should be valid until
scsi_destroy is called. That happens when I call scsi_remove_device.
Currently I don't use any scsi_device_get or scsi_device_put functions.
^ permalink raw reply [flat|nested] 8+ messages in thread
[parent not found: <OFE5F05878.12C1E995-ONC1256DF8.004D7413-C1256DF8.004E538F@de.ibm.com>]
* Re: linux-2.6.0-test11 [BUG] -- scsi_add/remove_device - out of memory
[not found] <OFE5F05878.12C1E995-ONC1256DF8.004D7413-C1256DF8.004E538F@de.ibm.com>
@ 2003-12-10 15:09 ` Christoph Hellwig
0 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2003-12-10 15:09 UTC (permalink / raw)
To: Heiko Carstens; +Cc: linux-scsi
On Wed, Dec 10, 2003 at 03:14:19PM +0100, Heiko Carstens wrote:
> >Sounds like there's a scsi_device_put missing somewhere in your driver.
>
> Why should I be supposed to issue a scsi_device_put or scsi_device_get in
> my device driver?
scsi_remove_device just marks the device deleted; the memory is only freed
after the final scsi_device_put. That might be the one your driver issues
after it's completely done with the device, or for example the one issued
when the last sysfs file referring to the device is closed. The lifetime of
struct scsi_device is not under the control of the driver.
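The lifetime rule described above can be illustrated with a small userspace
sketch. This is a toy model, not kernel code: struct toy_device, toy_add,
toy_get, toy_put and toy_remove are hypothetical stand-ins, and the
assumption that removal drops exactly one initial reference is made only for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy userspace model of a reference-counted device; NOT kernel code.
 * Every name in this block is a hypothetical stand-in. */
struct toy_device {
        int refcount;   /* number of outstanding references */
        bool deleted;   /* set on removal, memory may still be in use */
};

static int toy_live_objects;    /* allocated and not yet freed */

/* Allocate a device that starts with one reference (assumed to be
 * owned by whoever added it). Returns NULL on allocation failure. */
static struct toy_device *toy_add(void)
{
        struct toy_device *d = calloc(1, sizeof(*d));

        if (!d)
                return NULL;
        d->refcount = 1;
        toy_live_objects++;
        return d;
}

/* Take an extra reference, e.g. on behalf of an open file. */
static void toy_get(struct toy_device *d)
{
        assert(d->refcount > 0);
        d->refcount++;
}

/* Drop one reference; the memory is freed only on the final put. */
static void toy_put(struct toy_device *d)
{
        assert(d->refcount > 0);
        if (--d->refcount == 0) {
                toy_live_objects--;
                free(d);
        }
}

/* Mark the device deleted and drop the initial reference. The object
 * stays allocated for as long as anyone else still holds a reference. */
static void toy_remove(struct toy_device *d)
{
        d->deleted = true;
        toy_put(d);
}
```

In this model, calling toy_remove while another holder has done a toy_get
leaves the object allocated but flagged as deleted; it is released only when
that holder calls toy_put.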
^ permalink raw reply [flat|nested] 8+ messages in thread
* RE: linux-2.6.0-test11 [BUG] -- scsi_add/remove_device - out of memory
@ 2004-01-13 9:44 Heiko Carstens
2004-01-13 16:26 ` Mike Anderson
0 siblings, 1 reply; 8+ messages in thread
From: Heiko Carstens @ 2004-01-13 9:44 UTC (permalink / raw)
To: Adam Radford; +Cc: linux-scsi
Hi,
yes, now that I had some more time to look into it I found a leak.
Actually the scsi mid layer creates a lot of sysfs attributes but
doesn't remove a single one if a scsi device or a host gets removed.
Since e.g. device_del (as it is called by scsi_remove_device) does
not automatically remove all previously registered sysfs
attributes, we end up with objects that have a reference count > 0
and thus will never be released and eat up memory.
Heiko
To: Heiko Carstens/Germany/IBM@IBMDE
cc:
Subject: RE: linux-2.6.0-test11 [BUG] -- scsi_add/remove_device - out of memory
Heiko,
Did you figure out where the bug is that is causing the memory leak in the
scsi layer? Is it in scsi_proc.c?
-Adam
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: linux-2.6.0-test11 [BUG] -- scsi_add/remove_device - out of memory
2004-01-13 9:44 Heiko Carstens
@ 2004-01-13 16:26 ` Mike Anderson
2004-01-13 18:11 ` Mike Anderson
0 siblings, 1 reply; 8+ messages in thread
From: Mike Anderson @ 2004-01-13 16:26 UTC (permalink / raw)
To: Heiko Carstens; +Cc: Adam Radford, linux-scsi
Heiko Carstens [Heiko.Carstens@de.ibm.com] wrote:
> Hi,
>
> yes, now that I had some more time to look into it I found a leak.
> Actually the scsi mid layer creates a lot of sysfs attributes but
> doesn't remove a single one if a scsi device or a host gets removed.
> Since e.g. device_del (as it is called by scsi_remove_device) does
> not automatically remove all previously registered sysfs
> attributes, we end up with objects that have a reference count > 0
> and thus will never be released and eat up memory.
>
> Heiko
Maybe something has changed, but device_del calls kobject_del, which
calls sysfs_remove_dir. sysfs_remove_dir removes all child entries
created by sysfs_create_file. This should clean up the attributes.
-andmike
--
Michael Anderson
andmike@us.ibm.com
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: linux-2.6.0-test11 [BUG] -- scsi_add/remove_device - out of memory
@ 2004-01-14 10:39 Heiko Carstens
0 siblings, 0 replies; 8+ messages in thread
From: Heiko Carstens @ 2004-01-14 10:39 UTC (permalink / raw)
To: Mike Anderson; +Cc: Adam Radford, linux-scsi
I guess you are right and I am wrong :)
What actually happened: when adding and removing scsi devices in an
endless loop the system ran out of memory and the oom killer started
killing processes. So I thought that there must be a memory leak.
But if you only add and remove a certain number of disks, then wait
a moment, and then do the same with the next batch, this will not happen.
The reason for the memory shortage was actually that the kernel
generated lots of hotplug events, and each of these events started
an instance of /sbin/hotplug, each eating up some memory. This also
explains why memory was available again after waiting for a short moment.
So, the solution to this problem seems to be: don't do it.
Heiko
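The effect described above, where a tight loop piles up helper processes
while batching with a pause does not, can be sketched with a toy userspace
simulation. This is not a model of the real hotplug path: peak_helpers is a
hypothetical function, time is counted in abstract ticks, and it assumes that
each event starts one helper that lives for a fixed number of ticks.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy simulation, NOT kernel code. Each event takes one tick to issue
 * and starts one helper that stays alive for `lifetime` ticks. After
 * every `batch` events the caller pauses for `pause` ticks (batch == 0
 * means no pausing at all). Returns the peak number of helpers alive at
 * the same time, or -1 if the bookkeeping array cannot be allocated. */
static int peak_helpers(int events, int batch, int pause, int lifetime)
{
        int head = 0, tail = 0, now = 0, peak = 0;
        int *spawn;

        assert(batch >= 0 && pause >= 0 && lifetime > 0);
        if (events <= 0)
                return 0;

        spawn = malloc(sizeof(*spawn) * (size_t)events);
        if (!spawn)
                return -1;

        for (int i = 0; i < events; i++) {
                /* Retire helpers whose lifetime has elapsed. */
                while (head < tail && spawn[head] + lifetime <= now)
                        head++;

                /* Start the helper for this event. */
                spawn[tail++] = now;
                if (tail - head > peak)
                        peak = tail - head;

                now++;
                if (batch > 0 && (i + 1) % batch == 0)
                        now += pause;   /* wait before the next batch */
        }

        free(spawn);
        return peak;
}
```

With arbitrary illustrative numbers, peak_helpers(40, 0, 0, 100) keeps all 40
helpers alive at once, while peak_helpers(40, 10, 100, 100) never has more
than 10 alive, because each batch has finished before the next one starts.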
^ permalink raw reply [flat|nested] 8+ messages in thread