public inbox for linux-mtd@lists.infradead.org
* Memory leak problem with JFFS2
@ 2002-07-03 13:38 Frederic Giasson
  2002-07-10  9:47 ` David Woodhouse
  0 siblings, 1 reply; 7+ messages in thread
From: Frederic Giasson @ 2002-07-03 13:38 UTC (permalink / raw)
  To: linux-mtd

Hi all,

I discovered what looks like a memory leak in JFFS2.
I set up a test in which a script does the following operations in order:
	1- mount JFFS2;
	2- sync;
	3- Erase linux kernel file from JFFS2; 
	4- sync;	
	5- Copy a new file to the JFFS2 partition (gzip-compressed Linux kernel);
	6- sync;
	7- umount JFFS2;
	8- vmstat;
	9- Goto step 1.
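
The nine steps above can be sketched as a small shell function. This is only an illustration: MTD_DEV, MNT, OLD_KERNEL and NEW_KERNEL are hypothetical names (the post gives no device or file paths; the device and mount point defaults are borrowed from the one-liner that appears later in this thread), and the DRY_RUN switch is an addition so the sequence can be printed without touching the flash.

```shell
#!/bin/sh
# Sketch of the first test loop. All paths below are hypothetical defaults,
# not values taken from the original report.
MTD_DEV=${MTD_DEV:-mtd0}
MNT=${MNT:-/mnt/spare}
OLD_KERNEL=${OLD_KERNEL:-$MNT/vmlinux.gz}
NEW_KERNEL=${NEW_KERNEL:-/tmp/vmlinux.gz}

# With DRY_RUN set, print each command instead of executing it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# One pass through steps 1 to 8; returns non-zero if mount or umount fails.
one_cycle() {
    run mount -t jffs2 "$MTD_DEV" "$MNT" || return 1
    run sync
    run rm -f "$OLD_KERNEL"
    run sync
    run cp "$NEW_KERNEL" "$MNT/"
    run sync
    run umount "$MNT" || return 1
    run vmstat
}

# Step 9 (repeat until a step fails):
#   while one_cycle; do :; done
```

Running it with DRY_RUN=1 first is a cheap way to check the command sequence before letting it loop on real hardware.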

After numerous loops of this test script (between 110 and 470, it is never
the same), the system hangs at the mount command.  Looking at the log, each
time vmstat is called I observe that free memory has gone down by 8KB since
the previous loop of the script.  This very likely means that there is a
memory leak related to JFFS2's memory usage (free/dirty/very dirty lists and
so forth, maybe).  I have not dug into the code yet; I thought that maybe
someone else had encountered the problem and already investigated it.

So I set up a second test script, which is:

	1- mount JFFS2;
	2- sync;
	3- umount JFFS2;
	4- vmstat;
	5- goto step 1;

And I observed that free memory was going down by 8KB each loop, same as in
the first test.

So, to confirm that the leak comes from repeated mount/umount, I set up a
third test which is exactly like the first, except that I removed the mount
and umount commands.  That time free memory remained constant, which makes me
think that the leak is probably related to memory not being freed correctly
at umount time.  Nevertheless, the third test also hung, after 170 loops, at
the copy command.

By the way, note that the sync commands sprinkled through the test loops are
there to speed up the process.  In fact, I observed that doing a sync after a
file erasure triggers the gc and waits for it to complete its task, so when I
write data to JFFS2 afterwards it does not need to garbage collect in the
meantime.  This makes write times to JFFS2 much more deterministic.

I am using Linux kernel 2.4.19-pre7, and JFFS2 code from CVS as of May 14th,
2002.  By the way, can anybody tell me if there are any branches or labels in
JFFS2 CVS, other than time stamps?  Is there a label per Linux release, for
instance?

Thanks in advance,

Frédéric Giasson

	 


* Re: Memory leak problem with JFFS2
  2002-07-03 13:38 Frederic Giasson
@ 2002-07-10  9:47 ` David Woodhouse
  0 siblings, 0 replies; 7+ messages in thread
From: David Woodhouse @ 2002-07-10  9:47 UTC (permalink / raw)
  To: Frederic Giasson; +Cc: linux-mtd

fgiasson@mediatrix.com said:
> So, to confirm that the leak comes from repeated mount/umount, I set
> up a third test which is exactly like the first, except that I removed
> the mount and umount commands. That time free memory remained
> constant, which makes me think that the leak is probably related to
> memory not being freed correctly at umount time. Nevertheless, the
> third test also hung, after 170 loops, at the copy command.

Strange. Most JFFS2 memory allocations are done through dedicated slabs, 
rather than with kmalloc(). Can you reproduce this with JFFS2 as a module, 
unloading and reloading the module each time the file system is unmounted? 
That should cause a BUG() if any slab objects are still allocated when you 
try to unload the module. 
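
The module reload test suggested here could be scripted roughly as follows. This is a sketch under assumptions: the module is assumed to be named jffs2, MTD_DEV and MNT are hypothetical defaults, and the DRY_RUN switch is an addition that prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: load and unload the JFFS2 module around every mount/umount cycle.
# MTD_DEV and MNT are hypothetical; "jffs2" is assumed to be the module name.
MTD_DEV=${MTD_DEV:-mtd0}
MNT=${MNT:-/mnt/spare}

# With DRY_RUN set, print each command instead of executing it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

reload_cycle() {
    run modprobe jffs2 || return 1
    run mount -t jffs2 "$MTD_DEV" "$MNT" || return 1
    run sync
    run umount "$MNT" || return 1
    # Unload step: per the reply above, slab objects that are still
    # allocated at this point should show up as a BUG().
    run rmmod jffs2
}

# Repeat until a step fails:
#   while reload_cycle; do :; done
```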

If it's a kmalloc(), there are few enough of them that it should be 
possible to track it down.

> That time free memory remained constant, which makes me think that
> the leak is probably related to memory not being freed correctly at
> umount time. Nevertheless, the third test also hung, after 170 loops,
> at the copy command.

That shouldn't ever happen. Where was it spending the time? (SysRq-P is 
your friend). If you can reproduce with debugging and show what it's 
actually _doing_ during this time, that would be useful.

> By the way, note that the sync commands sprinkled through the test
> loops are there to speed up the process. In fact, I observed that
> doing a sync after a file erasure triggers the gc and waits for it to
> complete its task, so when I write data to JFFS2 afterwards it does
> not need to garbage collect in the meantime. This makes write times to
> JFFS2 much more deterministic.

Interesting. I'm not entirely convinced I know why that happens. sync() 
shouldn't really do anything at all.

> I am using Linux kernel 2.4.19-pre7, and JFFS2 code from CVS as of
> May 14th, 2002. By the way, can anybody tell me if there are any
> branches or labels in JFFS2 CVS, other than time stamps? Is there a
> label per Linux release, for instance?

There is occasionally a branch made for really experimental code, like when 
we started on NAND support. At the moment there aren't any such branches. 
All we have is the development on the trunk, and the jffs2-2_4-branch which 
contains the same code as is in Marcelo's 2.4.19-rc2. 

For anyone using JFFS2 in production, I would recommend using that version, 
rather than the trunk code. 

--
dwmw2


* RE: Memory leak problem with JFFS2
@ 2002-07-31 14:48 Frederic Giasson
  2002-07-31 15:17 ` David Woodhouse
  0 siblings, 1 reply; 7+ messages in thread
From: Frederic Giasson @ 2002-07-31 14:48 UTC (permalink / raw)
  To: 'David Woodhouse'; +Cc: linux-mtd

Hi David,

I took some time to track down the memory leak that happens after umount.
First I compiled JFFS2 as a module and loaded/unloaded it, doing a
mount/umount in between.  No BUG occurred, meaning that no slab object is
still allocated when I unload the module.  Then I investigated the
kmalloc/kfree calls, but I found nothing.  I put traces and counters on every
kmalloc / kfree call, and the numbers of kmalloc and kfree calls are balanced.
Therefore, I can conclude that the leak is neither a slab object still
allocated at umount time nor a kmalloc without a matching kfree.  Do you have
any other idea about it?  The mount / umount mechanism itself is unlikely to
be the cause: I tried mounting and unmounting NFS and the memory usage before
and after does not change.
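
A balance check of this kind can also be done offline on the captured trace output. The sketch below assumes a purely hypothetical trace format, one line per call of the form "JFFS2_TRACE kmalloc <address>" or "JFFS2_TRACE kfree <address>"; the traces actually added to the kernel are not shown in this thread, so the pattern would need adapting.

```shell
#!/bin/sh
# Sketch: tally kmalloc/kfree trace lines read from stdin.
# The "JFFS2_TRACE <call> <address>" line format is hypothetical.
tally_allocs() {
    awk '
        $1 == "JFFS2_TRACE" && $2 == "kmalloc" { allocs++; live[$3] = 1 }
        $1 == "JFFS2_TRACE" && $2 == "kfree"   { frees++; delete live[$3] }
        END {
            # Addresses still in "live" were allocated but never freed.
            n = 0
            for (addr in live) n++
            printf "kmalloc=%d kfree=%d unmatched=%d\n", allocs, frees, n
        }
    '
}

# Example: dmesg | tally_allocs
```

Tracking addresses as well as counts guards against the case where the totals happen to match but a particular allocation is never released.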

By the way, we still don't know why doing a sync after an "rm" command
triggers the garbage collector.  Do you have any clue about it?

Frédéric Giasson


* Re: Memory leak problem with JFFS2
  2002-07-31 14:48 Frederic Giasson
@ 2002-07-31 15:17 ` David Woodhouse
  0 siblings, 0 replies; 7+ messages in thread
From: David Woodhouse @ 2002-07-31 15:17 UTC (permalink / raw)
  To: Frederic Giasson; +Cc: linux-mtd

fgiasson@mediatrix.com said:
> I took some time to track down the memory leak that happens after
> umount. First I compiled JFFS2 as a module and loaded/unloaded it,
> doing a mount/umount in between. No BUG occurred, meaning that no slab
> object is still allocated when I unload the module. Then I
> investigated the kmalloc/kfree calls, but I found nothing. I put
> traces and counters on every kmalloc / kfree call, and the numbers of
> kmalloc and kfree calls are balanced. Therefore, I can conclude that
> the leak is neither a slab object still allocated at umount time nor
> a kmalloc without a matching kfree. Do you have any other idea about
> it? The mount / umount mechanism itself is unlikely to be the cause:
> I tried mounting and unmounting NFS and the memory usage before and
> after does not change.

Can you remind me what you have to do to reproduce this?

With 2.5.29, which contains the latest JFFS2 code from CVS...

# vmstat ; while true ; do mount -t jffs2 mtd0 /mnt/spare ; sync ; umount /mnt/spare ; vmstat|tail -1 ; done
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  0  0      0  81800      0  26244   0   0     2     5 1003     8   2   3  95
 1  0  0      0  81696      0  26244   0   0     2     5 1003     8   2   3  95
 1  0  0      0  81700      0  26244   0   0     2     5 1003     9   2   3  95
 1  0  0      0  81712      0  26244   0   0     2     5 1003     9   2   3  95
 1  0  0      0  81696      0  26244   0   0     2     5 1003     9   2   3  95
 2  0  0      0  81704      0  26244   0   0     2     5 1003     9   2   3  95
 2  0  0      0  81708      0  26248   0   0     2     5 1003     9   2   3  95
 2  0  0      0  81696      0  26248   0   0     2     5 1003    10   2   3  95
 2  0  0      0  81696      0  26248   0   0     2     5 1003    10   2   3  95
 2  0  0      0  81712      0  26248   0   0     2     5 1003    10   2   3  95
 2  0  1      0  81688      0  26248   0   0     2     5 1003    10   2   3  95
 2  0  1      0  81696      0  26248   0   0     2     5 1003    10   2   3  95
 2  0  0      0  81708      0  26248   0   0     2     5 1003    10   2   3  95
 2  0  0      0  81692      0  26248   0   0     2     5 1003    10   2   3  95
 2  0  0      0  81696      0  26248   0   0     2     5 1003    10   2   3  95
 2  0  0      0  81708      0  26248   0   0     2     5 1003    10   2   3  95
 2  0  0      0  81692      0  26248   0   0     2     5 1003    10   2   3  95
 2  0  0      0  81696      0  26248   0   0     2     5 1003    11   2   3  95
 2  0  0      0  81708      0  26248   0   0     2     5 1003    11   2   3  95
 2  0  0      0  81680      0  26248   0   0     2     5 1003    11   2   3  95
 2  0  0      0  81724      0  26248   0   0     2     5 1003    11   2   3  95
 2  0  0      0  81712      0  26248   0   0     2     5 1003    11   2   3  95
 2  0  0      0  81704      0  26248   0   0     2     5 1003    11   2   3  95
 2  0  0      0  81696      0  26248   0   0     2     5 1003    11   2   3  95
 2  0  1      0  81708      0  26248   0   0     2     5 1003    11   2   3  95
 2  0  0      0  81688      0  26248   0   0     2     5 1003    11   2   3  95
 2  0  0      0  81696      0  26248   0   0     2     5 1003    11   2   3  95
 2  0  0      0  81708      0  26248   0   0     2     5 1003    12   2   3  95
 2  0  0      0  81692      0  26248   0   0     2     5 1003    12   2   3  95
 2  0  0      0  81696      0  26248   0   0     2     5 1003    12   2   3  95
 2  0  0      0  81708      0  26248   0   0     2     5 1003    12   2   3  95
 2  0  0      0  81692      0  26248   0   0     2     5 1003    12   2   3  95
 2  0  0      0  81696      0  26248   0   0     2     5 1003    12   2   3  95
 2  0  0      0  81708      0  26248   0   0     2     5 1003    12   2   3  95
Interrupt
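
To judge whether a log like the one above is drifting or merely fluctuating, the "free" column can be summarized with a short awk filter. This is a sketch only: free_trend is a hypothetical helper name, and it assumes the input starts with vmstat's two header lines, as in the output above. It locates the column by its header name rather than by position, since the layout differs between vmstat versions.

```shell
#!/bin/sh
# Sketch: summarize the "free" column of vmstat output read from stdin.
# Expects two header lines followed by one row per loop iteration.
free_trend() {
    awk '
        # Second header line names the columns; find the one called "free".
        NR == 2 { for (i = 1; i <= NF; i++) if ($i == "free") col = i; next }
        # Data rows: skip anything without a numeric value in that column.
        NR > 2 && col && $col ~ /^[0-9]+$/ {
            v = $col + 0
            if (!seen) { first = v; min = v; max = v; seen = 1 }
            if (v < min) min = v
            if (v > max) max = v
            last = v
        }
        END {
            if (seen)
                printf "first=%d last=%d min=%d max=%d net=%d\n",
                       first, last, min, max, last - first
        }
    '
}

# Example: free_trend < vmstat.log
```

A steady loss per loop would show up as a net change that keeps growing with the number of iterations, whereas noise around a constant keeps min and max close together.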

> By the way, we still don't know why doing a sync after an "rm" command
> triggers the garbage collector. Do you have any clue about it?

Not off-hand. 

--
dwmw2


* RE: Memory leak problem with JFFS2
@ 2002-07-31 16:14 Frederic Giasson
  2002-07-31 18:04 ` David Woodhouse
  0 siblings, 1 reply; 7+ messages in thread
From: Frederic Giasson @ 2002-07-31 16:14 UTC (permalink / raw)
  To: 'David Woodhouse'; +Cc: linux-mtd

You have the right procedure to reproduce the problem ( vmstat ; while true
; do mount -t jffs2 mtd0 /mnt/spare ; ).  I can reproduce the problem using
this line on my system.

I updated my JFFS2 code with the latest from CVS, and I am using kernel
2.4.19-rc2, and the problem is still there.  Do you think you could run the
same test under 2.4.19-rc2?

Thanks,

Frédéric Giasson 


* Re: Memory leak problem with JFFS2
  2002-07-31 16:14 Memory leak problem with JFFS2 Frederic Giasson
@ 2002-07-31 18:04 ` David Woodhouse
  0 siblings, 0 replies; 7+ messages in thread
From: David Woodhouse @ 2002-07-31 18:04 UTC (permalink / raw)
  To: Frederic Giasson; +Cc: linux-mtd

fgiasson@mediatrix.com said:
> You have the right procedure to reproduce the problem ( vmstat ; while
> true ; do mount -t jffs2 mtd0 /mnt/spare ; ).  I can reproduce the
> problem using this line on my system.

> I updated my JFFS2 code with the latest from CVS, and I am using
> kernel 2.4.19-rc2, and the problem is still there.  Do you think you
> could do the same test under 2.4.19-rc2?

I'll have a go. This happens in the CVS code, not the jffs2-2_4-branch 
code, right?

--
dwmw2


* RE: Memory leak problem with JFFS2
@ 2002-07-31 18:18 Frederic Giasson
  0 siblings, 0 replies; 7+ messages in thread
From: Frederic Giasson @ 2002-07-31 18:18 UTC (permalink / raw)
  To: 'David Woodhouse'; +Cc: linux-mtd

Yes, this happens in the CVS code.  I did not try with the jffs2-2_4-branch
(which I understand is the one in kernel 2.4.19-rc2?).

