* [PATCH 1/1] block: CFQ refcounting fix
From: brking @ 2005-08-30 22:41 UTC (permalink / raw)
To: axboe; +Cc: linux-kernel, brking
I ran across a memory leak in the cfq I/O scheduler. The cfq init function
increments the refcnt of the associated request_queue, and that reference is
only dropped in cfq's exit function. However, blk_cleanup_queue only calls the
elevator exit function once the queue's refcnt reaches zero, which cannot
happen while cfq still holds its own reference, so the request_queue never gets
cleaned up. No other I/O scheduler appears to take this reference, so I removed
the increment, and that fixed the memory leak for me.
To reproduce the problem, select cfq as the I/O scheduler, then repeatedly
write "- - -" to the scan sysfs attribute of a scsi_host and watch free memory
vanish.
Signed-off-by: Brian King <brking@us.ibm.com>
---
linux-2.6-bjking1/drivers/block/cfq-iosched.c | 1 -
1 files changed, 1 deletion(-)
diff -puN drivers/block/cfq-iosched.c~cfq_refcnt_fix drivers/block/cfq-iosched.c
--- linux-2.6/drivers/block/cfq-iosched.c~cfq_refcnt_fix 2005-08-30 17:26:55.000000000 -0500
+++ linux-2.6-bjking1/drivers/block/cfq-iosched.c 2005-08-30 17:26:55.000000000 -0500
@@ -2318,7 +2318,6 @@ static int cfq_init_queue(request_queue_
 	e->elevator_data = cfqd;
 
 	cfqd->queue = q;
-	atomic_inc(&q->refcnt);
 
 	cfqd->max_queued = q->nr_requests / 4;
 	q->nr_batching = cfq_queued;
_
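The reference cycle described in the report can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions, not kernel code: struct model_queue and every model_* function are hypothetical stand-ins, and the cleanup path is modelled exactly as the report describes it (the exit hook runs only once the count reaches zero). The owner holds one reference; the scheduler optionally takes a second one at init and drops it only in its exit hook.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for request_queue; only what the model needs. */
struct model_queue {
	int refcnt;           /* models q->refcnt */
	bool sched_holds_ref; /* scheduler init took its own reference */
	bool freed;           /* queue has been released */
};

/* Models the scheduler init hook; take_ref mirrors the removed atomic_inc(). */
static void model_sched_init(struct model_queue *q, bool take_ref)
{
	q->sched_holds_ref = take_ref;
	if (take_ref)
		q->refcnt++;
}

/* Models the scheduler exit hook, the only place the init reference is dropped. */
static void model_sched_exit(struct model_queue *q)
{
	if (q->sched_holds_ref) {
		q->sched_holds_ref = false;
		q->refcnt--;
	}
}

/* Models cleanup as described in the report: drop the caller's reference
 * and run the exit hook only if the count has reached zero. */
static void model_cleanup_queue(struct model_queue *q)
{
	assert(q->refcnt > 0);
	if (--q->refcnt != 0)
		return; /* another reference is still outstanding */
	model_sched_exit(q);
	q->freed = true;
}

/* Returns true if the queue is freed once its owner drops its reference. */
bool model_queue_is_freed(bool sched_takes_ref)
{
	struct model_queue q = { .refcnt = 1 }; /* the owner's reference */

	model_sched_init(&q, sched_takes_ref);
	model_cleanup_queue(&q);
	return q.freed;
}
```

In this model, model_queue_is_freed(true) reports that the queue is never freed, because the only code that would drop the scheduler's reference is gated on the count already being zero. model_queue_is_freed(false) corresponds to the patched behaviour.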
* Re: [PATCH 1/1] block: CFQ refcounting fix
From: Jens Axboe @ 2005-08-31 7:28 UTC (permalink / raw)
To: brking; +Cc: linux-kernel
On Tue, Aug 30 2005, brking@us.ibm.com wrote:
>
> I ran across a memory leak related to the cfq scheduler. The cfq
> init function increments the refcnt of the associated request_queue.
> This refcount gets decremented in cfq's exit function. Since blk_cleanup_queue
> only calls the elevator exit function when its refcnt goes to zero, the
> request_q never gets cleaned up. It didn't look like other io schedulers were
> incrementing this refcnt, so I removed the refcnt increment and it fixed the
> memory leak for me.
>
> To reproduce the problem, simply use cfq and use the scsi_host scan sysfs
> attribute to scan "- - -" repeatedly on a scsi host and watch the memory
> vanish.
Yeah, that actually looks like a dangling reference. I assume you tested
this properly?
--
Jens Axboe
* Re: [PATCH 1/1] block: CFQ refcounting fix
From: Brian King @ 2005-08-31 13:40 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-kernel
Jens Axboe wrote:
> Yeah, that actually looks like a dangling reference. I assume you tested
> this properly?
Yes. I applied the patch and booted my system, which had previously been
crashing during boot with out-of-memory errors caused by the leak. I then ran
the scan a few times, verified that free memory in /proc/meminfo no longer kept
decreasing as it did without the patch, and rebooted again.
If there is anything else you would like me to test, I would be happy to do so.
Thanks
Brian
--
Brian King
eServer Storage I/O
IBM Linux Technology Center
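The /proc/meminfo check described above can be automated. The following is a minimal sketch in user-space C, under stated assumptions: meminfo_value_kb() and read_memfree_kb() are hypothetical helper names, the choice of the MemFree field is an assumption (the message only says /proc/meminfo was watched), and the parser only handles the usual "Key:   value kB" line layout.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return the kB value of a "Key:   12345 kB" line in meminfo-style text,
 * or -1 if the key is absent or its value cannot be parsed. */
long meminfo_value_kb(const char *text, const char *key)
{
	size_t keylen;
	const char *line = text;

	assert(text && key);
	keylen = strlen(key);

	while (line && *line) {
		/* Match the key at the start of a line, followed by ':'. */
		if (strncmp(line, key, keylen) == 0 && line[keylen] == ':') {
			long kb;

			if (sscanf(line + keylen + 1, "%ld", &kb) == 1)
				return kb;
			return -1;
		}
		line = strchr(line, '\n');
		if (line)
			line++; /* step past the newline to the next line */
	}
	return -1;
}

/* Read /proc/meminfo and return MemFree in kB, or -1 on failure. */
long read_memfree_kb(void)
{
	char buf[8192];
	size_t n;
	FILE *f = fopen("/proc/meminfo", "r");

	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return meminfo_value_kb(buf, "MemFree");
}
```

Calling read_memfree_kb() before and after each scan and comparing the values gives a rough indication of whether memory keeps shrinking. Free memory fluctuates for unrelated reasons, so only a steady downward trend over many scans is meaningful.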
* Re: [PATCH 1/1] block: CFQ refcounting fix
From: Jens Axboe @ 2005-08-31 13:43 UTC (permalink / raw)
To: Brian King; +Cc: linux-kernel
On Wed, Aug 31 2005, Brian King wrote:
> Yes. I applied the patch, booted my system (which was crashing on
> bootup before due to out of memory errors due to the leak) ran the
> scan a few times and verified /proc/meminfo didn't continually
> decrease like without it, and rebooted again. If there is anything
> else you would like me to do, I would be happy to do so.
I think you need to remove the blk_put_queue() in cfq_put_cfqd() as
well. Otherwise that put no longer has a matching get, and I don't see
how this can work without looking at freed memory. I'll audit the other
paths as well.
--
Jens Axboe
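The concern raised here is a get/put imbalance: once the increment at init is gone, a put left in the teardown path drops a reference the scheduler never took. A minimal user-space sketch of that imbalance follows. It is not kernel code: struct model_queue, model_get(), model_put() and model_freed_under_owner() are hypothetical names, and the model ignores the actual call ordering in the block layer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted request_queue. */
struct model_queue {
	int refcnt;
	bool freed;
};

/* Take a reference; only legal while the object is still alive. */
static void model_get(struct model_queue *q)
{
	assert(!q->freed);
	q->refcnt++;
}

/* Drop a reference and release the object when the last one goes away. */
static void model_put(struct model_queue *q)
{
	assert(q->refcnt > 0);
	if (--q->refcnt == 0)
		q->freed = true;
}

/* Returns true if the queue is freed while its owner still believes it
 * holds a valid reference, i.e. the owner would touch freed memory. */
bool model_freed_under_owner(bool init_takes_ref, bool teardown_puts_ref)
{
	struct model_queue q = { .refcnt = 1, .freed = false }; /* owner's ref */

	if (init_takes_ref)
		model_get(&q);  /* models atomic_inc(&q->refcnt) at init */
	if (teardown_puts_ref)
		model_put(&q);  /* models blk_put_queue(q) at teardown */

	/* The owner has not dropped its own reference at this point. */
	return q.freed;
}
```

Only the combination (false, true), which corresponds to removing the increment while keeping the put, frees the queue out from under its owner. Taking both or removing both keeps the count balanced.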
* Re: [PATCH 1/1] block: CFQ refcounting fix
From: Brian King @ 2005-08-31 13:57 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 1538 bytes --]
Jens Axboe wrote:
> I think you need to remove the blk_put_queue() in cfq_put_cfqd() as
> well, otherwise I don't see how this can work without looking at freed
> memory. I'll audit the other paths as well.
Good catch. Here is an updated patch.
--
Brian King
eServer Storage I/O
IBM Linux Technology Center
[-- Attachment #2: cfq_refcnt_fix.patch --]
[-- Type: text/plain, Size: 1442 bytes --]
I ran across a memory leak in the cfq I/O scheduler. The cfq init function
increments the refcnt of the associated request_queue, and that reference is
only dropped in cfq's exit function. However, blk_cleanup_queue only calls the
elevator exit function once the queue's refcnt reaches zero, which cannot
happen while cfq still holds its own reference, so the request_queue never gets
cleaned up. No other I/O scheduler appears to take this reference, so I removed
the increment, and that fixed the memory leak for me.
To reproduce the problem, select cfq as the I/O scheduler, then repeatedly
write "- - -" to the scan sysfs attribute of a scsi_host and watch free memory
vanish.
Signed-off-by: Brian King <brking@us.ibm.com>
---
linux-2.6-bjking1/drivers/block/cfq-iosched.c | 3 ---
1 files changed, 3 deletions(-)
diff -puN drivers/block/cfq-iosched.c~cfq_refcnt_fix drivers/block/cfq-iosched.c
--- linux-2.6/drivers/block/cfq-iosched.c~cfq_refcnt_fix 2005-08-30 17:26:55.000000000 -0500
+++ linux-2.6-bjking1/drivers/block/cfq-iosched.c 2005-08-31 08:48:30.000000000 -0500
@@ -2260,8 +2260,6 @@ static void cfq_put_cfqd(struct cfq_data
 	if (!atomic_dec_and_test(&cfqd->ref))
 		return;
 
-	blk_put_queue(q);
-
 	cfq_shutdown_timer_wq(cfqd);
 	q->elevator->elevator_data = NULL;
 
@@ -2318,7 +2316,6 @@ static int cfq_init_queue(request_queue_
 	e->elevator_data = cfqd;
 
 	cfqd->queue = q;
-	atomic_inc(&q->refcnt);
 
 	cfqd->max_queued = q->nr_requests / 4;
 	q->nr_batching = cfq_queued;
_
* Re: [PATCH 1/1] block: CFQ refcounting fix
From: Jens Axboe @ 2005-08-31 15:50 UTC (permalink / raw)
To: Brian King; +Cc: linux-kernel
On Wed, Aug 31 2005, Brian King wrote:
> diff -puN drivers/block/cfq-iosched.c~cfq_refcnt_fix drivers/block/cfq-iosched.c
> --- linux-2.6/drivers/block/cfq-iosched.c~cfq_refcnt_fix 2005-08-30 17:26:55.000000000 -0500
> +++ linux-2.6-bjking1/drivers/block/cfq-iosched.c 2005-08-31 08:48:30.000000000 -0500
> @@ -2260,8 +2260,6 @@ static void cfq_put_cfqd(struct cfq_data
> if (!atomic_dec_and_test(&cfqd->ref))
> return;
>
> - blk_put_queue(q);
> -
> cfq_shutdown_timer_wq(cfqd);
> q->elevator->elevator_data = NULL;
>
> @@ -2318,7 +2316,6 @@ static int cfq_init_queue(request_queue_
> e->elevator_data = cfqd;
>
> cfqd->queue = q;
> - atomic_inc(&q->refcnt);
>
> cfqd->max_queued = q->nr_requests / 4;
> q->nr_batching = cfq_queued;
> _
That looks better. I'll add this to my outgoing queue, thanks!
--
Jens Axboe