* jffs2 fragmentation
@ 2003-10-18 14:20 J B
2003-10-30 17:45 ` Jörn Engel
2003-10-30 18:53 ` David Woodhouse
0 siblings, 2 replies; 6+ messages in thread
From: J B @ 2003-10-18 14:20 UTC (permalink / raw)
To: jffs-dev, linux-mtd
Ok, maybe I am just completely missing something here, but I am stumped. I
am running some tests (similar to the jitter tests Vipin did a long time
ago), and I am seeing some very bad performance. Below is a summary of
what I am doing and what I am seeing.
I have a jffs2 filesystem that is mounted and starts out at ~87% full.
There are various small files on there, and one big file, ~15MiB (good
compression :). Here is what I do:
rm /opt/big_file; cp /home/big_file /opt/big_file; rm /opt/big_file; cp
/home/big_file2 /opt/big_file
/opt is the jffs2 filesystem, /home is an nfs mounted directory. big_file
and big_file2 are the same size, but have different content (i.e. possibly
different compression ratios).
Normally, a rm/cp pair takes about 2 minutes on my system. After about 10
iterations, the copies begin to take longer, about 3-4 minutes. After
about 10 more iterations they take upwards of half an hour.
Looking at the dirty, used, and wasted size from a df command, I see this:
<7>STATFS:
<7>flash_size: 00680000
<7>used_size: 0041afe4
<7>dirty_size: 00199d2c
<7>wasted_size: 0000012c
<7>free_size: 000cb1c4
<7>erasing_size: 00000000
<7>bad_size: 00000000
<7>sector_size: 00020000
<7>nextblock: 0x00660000
<7>gcblock: 0x000a0000
So almost 13 eraseblocks worth of dirty space (NOR flash with 128KiB
eraseblocks). But, there is only 1 block on the dirty_list, 6 on the
free_list, and 44 on the clean_list. Garbage collection is going on during
the writes obviously, but it doesn't seem to be making any difference. In
fact, the dirty_size _increases_.
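As a quick sanity check (illustrative Python, not jffs2 code; all values are copied from the STATFS output above), the figures do add up, and the dirty space really is close to 13 eraseblocks:

```python
# Back-of-envelope check of the STATFS figures above.
ERASEBLOCK = 0x00020000  # 128 KiB NOR eraseblock

stats = {
    "flash_size":   0x00680000,
    "used_size":    0x0041afe4,
    "dirty_size":   0x00199d2c,
    "wasted_size":  0x0000012c,
    "free_size":    0x000cb1c4,
    "erasing_size": 0x00000000,
    "bad_size":     0x00000000,
}

# The accounted sizes add up to the whole flash.
parts = ("used_size", "dirty_size", "wasted_size",
         "free_size", "erasing_size", "bad_size")
assert sum(stats[k] for k in parts) == stats["flash_size"]

dirty_blocks = stats["dirty_size"] / ERASEBLOCK
non_free = (stats["used_size"] + stats["dirty_size"]
            + stats["wasted_size"]) / stats["flash_size"]

print(f"dirty space: {dirty_blocks:.1f} eraseblocks")  # ~12.8
print(f"not free:    {non_free:.0%} of the flash")     # ~88%
```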
So here is my question: because of the size of the file involved and the
relative lack of free eraseblocks to start with, is it possible that the
filesystem is so fragmented that the dirty space is spread across many
eraseblocks and garbage collection is incapable of actually freeing up any
space?
That is the only conclusion I could come to. Any thoughts would be
appreciated.
Thx,
J
* Re: jffs2 fragmentation
2003-10-18 14:20 jffs2 fragmentation J B
@ 2003-10-30 17:45 ` Jörn Engel
2003-10-30 18:53 ` David Woodhouse
1 sibling, 0 replies; 6+ messages in thread
From: Jörn Engel @ 2003-10-30 17:45 UTC (permalink / raw)
To: J B; +Cc: linux-mtd, jffs-dev
No one has cared yet; how odd.
On Sat, 18 October 2003 09:20:23 -0500, J B wrote:
>
> Ok, maybe I am just completely missing something here, but I am stumped. I
> am running some tests (similar to the jitter tests Vipin did a long time
> ago), and I am seeing some very bad performance. Below is a summary of
> what I am doing and what I am seeing.
>
> I have a jffs2 filesystem that is mounted and starts out at ~87% full.
> There are various small files on there, and one big file, ~15MiB (good
> compression :). Here is what I do:
>
> rm /opt/big_file; cp /home/big_file /opt/big_file; rm /opt/big_file; cp
> /home/big_file2 /opt/big_file
>
> /opt is the jffs2 filesystem, /home is an nfs mounted directory. big_file
> and big_file2 are the same size, but have different content (i.e. possibly
> different compression ratios).
>
> Normally, a rm/cp pair takes about 2 minutes on my system. After about 10
> iterations, the copies begin to take longer, about 3-4 minutes. After
> about 10 more iterations they take upwards of half an hour.
The first slowdown is expected, the second is not.
When flash is full, jffs2 has to wait for garbage collection before
writing. A flash erase takes roughly as long as a flash write, so double
the time is expected. Try adding a sleep after each rm and this should
go away.
Half an hour is a surprise. Does it still appear when you add a sleep?
> Looking at the dirty, used, and wasted size from a df command, I see this:
>
> <7>STATFS:
> <7>flash_size: 00680000
> <7>used_size: 0041afe4
> <7>dirty_size: 00199d2c
> <7>wasted_size: 0000012c
> <7>free_size: 000cb1c4
> <7>erasing_size: 00000000
> <7>bad_size: 00000000
> <7>sector_size: 00020000
> <7>nextblock: 0x00660000
> <7>gcblock: 0x000a0000
>
> So almost 13 eraseblocks worth of dirty space (NOR flash with 128KiB
> eraseblocks). But, there is only 1 block on the dirty_list, 6 on the
> free_list, and 44 on the clean_list. Garbage collection is going on during
> the writes obviously, but it doesn't seem to be making any difference. In
> fact, the dirty_size _increases_.
>
> So here is my question: because of the size of the file involved and the
> relative lack of free eraseblocks to start with, is it possible that the
> filesystem is so fragmented that the dirty space is spread across many
> eraseblocks and garbage collection is incapable of actually freeing up any
> space?
Fragmentation is hard to achieve for log-structured file systems. ;)
Honestly, I have no idea. Your test is not the usual workload for
jffs2, so you may have uncovered some hidden problem (and one that
most people don't care about).
> That is the only conclusion I could come to. Any thoughts would be
> appreciated.
Jörn
--
Homo Sapiens is a goal, not a description.
-- unknown
* Re: jffs2 fragmentation
2003-10-18 14:20 jffs2 fragmentation J B
2003-10-30 17:45 ` Jörn Engel
@ 2003-10-30 18:53 ` David Woodhouse
2003-10-31 11:24 ` Jörn Engel
1 sibling, 1 reply; 6+ messages in thread
From: David Woodhouse @ 2003-10-30 18:53 UTC (permalink / raw)
To: J B; +Cc: linux-mtd, jffs-dev
On Sat, 2003-10-18 at 09:20 -0500, J B wrote:
> Normally, a rm/cp pair takes about 2 minutes on my system. After about 10
> iterations, the copies begin to take longer, about 3-4 minutes. After
> about 10 more iterations they take upwards of half an hour.
I suspect you've triggered the worst case of a performance bug which
I've known about for a while.
We should write new data out to one empty block while writing
garbage-collected data out to another. We don't do that at the moment;
we interleave old and new data, and then you delete your new file,
leaving us with a very suboptimal mix of valid and obsolete nodes in
each eraseblock we've been writing to.
I'm still a bit surprised it takes half an hour though.
--
dwmw2
* Re: jffs2 fragmentation
2003-10-30 18:53 ` David Woodhouse
@ 2003-10-31 11:24 ` Jörn Engel
2003-10-31 11:53 ` David Woodhouse
0 siblings, 1 reply; 6+ messages in thread
From: Jörn Engel @ 2003-10-31 11:24 UTC (permalink / raw)
To: David Woodhouse; +Cc: linux-mtd, jffs-dev, J B
On Thu, 30 October 2003 18:53:08 +0000, David Woodhouse wrote:
> On Sat, 2003-10-18 at 09:20 -0500, J B wrote:
> > Normally, a rm/cp pair takes about 2 minutes on my system. After about 10
> > iterations, the copies begin to take longer, about 3-4 minutes. After
> > about 10 more iterations they take upwards of half an hour.
>
> I suspect you've triggered the worst case of a performance bug which
> I've known about for a while.
>
> We should write new data out to one empty block while writing
> garbage-collected data out to another. We don't do that at the moment;
> we interleave old and new data, and then you delete your new file,
> leaving us with a very suboptimal mix of valid and obsolete nodes in
> each eraseblock we've been writing to.
>
> I'm still a bit surprised it takes half an hour though.
If your explanation is correct, a shift from 4 to 28 minutes would
correspond to 6 clean nodes reused for every 1 dirty node deleted and
new node written.
Doesn't make a lot of sense with a filesystem that should be >80% free
or dirty, does it?
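The arithmetic above, spelled out (illustrative Python): if a copy slows from ~4 to ~28 minutes and the extra time is all garbage collection, total flash writes per unit of new data went up ~7x, i.e. ~6 clean nodes copied for every node of new data written.

```python
# A 4 -> 28 minute slowdown implies ~7x the flash writes per unit of
# new data, i.e. ~6 clean nodes copied by GC per node of new data.
baseline_min = 4
slow_min = 28

write_amplification = slow_min / baseline_min   # total writes per useful write
clean_copied_per_new = write_amplification - 1  # GC overhead writes

print(write_amplification, clean_copied_per_new)  # 7.0 6.0
```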
Jörn
--
Fools ignore complexity. Pragmatists suffer it.
Some can avoid it. Geniuses remove it.
-- Perlis's Programming Proverb #58, SIGPLAN Notices, Sept. 1982
* Re: jffs2 fragmentation
2003-10-31 11:24 ` Jörn Engel
@ 2003-10-31 11:53 ` David Woodhouse
2003-10-31 12:50 ` Jörn Engel
0 siblings, 1 reply; 6+ messages in thread
From: David Woodhouse @ 2003-10-31 11:53 UTC (permalink / raw)
To: Jörn Engel; +Cc: linux-mtd, jffs-dev, J B
On Fri, 2003-10-31 at 12:24 +0100, Jörn Engel wrote:
> If your explanation is correct, a shift from 4 to 28 minutes would
> correspond to 6 clean nodes reused for every 1 dirty node deleted and
> new node written.
>
> Doesn't make a lot of sense with a filesystem that should be >80% free
> or dirty, does it?
Hmmm. The figure of 87% was _with_ the large file, wasn't it? How full
is it when the large file is deleted?
When it's 80% full it does make sense. It's 80% full. 20% "free or
dirty". Your 20% free space is mixed in with the clean data; you have to
move 6 nodes out of the way for every node's worth of space you recover.
Consider the case where every eraseblock has 80% clean data and 20% of
each contains part of the large file you've just deleted, and is hence
now dirty. Then you write the same large file again. Garbage collection
happens -- each time we GC a full eraseblock we recover and rewrite 80%
of an eraseblock of clean data, and we manage to write 20% of an
eraseblock of the new file. The 80/20 ratio hence remains stable.
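The steady state described above can be sketched as a toy model (illustrative Python, not jffs2 code): if every eraseblock is a `clean` fraction live data and the rest dirty, GC-ing a full eraseblock rewrites the clean part and frees only the dirty part for new data.

```python
# Toy model of GC write amplification: each block of new data costs
# 1/(1 - clean) blocks of writing, because GC must rewrite the clean
# fraction of every block it reclaims.
def gc_write_amplification(clean: float) -> float:
    return 1.0 / (1.0 - clean)

# 80% clean per block -> 5 blocks written per block of new file data
# (1 of new data + 4 of rewritten clean data); since the new data
# replaces what was deleted, the 80/20 mix never improves.
print(round(gc_write_amplification(0.80), 6))
```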
--
dwmw2
* Re: jffs2 fragmentation
2003-10-31 11:53 ` David Woodhouse
@ 2003-10-31 12:50 ` Jörn Engel
0 siblings, 0 replies; 6+ messages in thread
From: Jörn Engel @ 2003-10-31 12:50 UTC (permalink / raw)
To: David Woodhouse; +Cc: linux-mtd, jffs-dev, J B
On Fri, 31 October 2003 11:53:20 +0000, David Woodhouse wrote:
> On Fri, 2003-10-31 at 12:24 +0100, Jörn Engel wrote:
> > If your explanation is correct, a shift from 4 to 28 minutes would
> > correspond to 6 clean nodes reused for every 1 dirty node deleted and
> > new node written.
> >
> > Doesn't make a lot of sense with a filesystem that should be >80% free
> > or dirty, does it?
>
> Hmmm. The figure of 87% was _with_ the large file, wasn't it? How full
> is it when the large file is deleted?
>
> When it's 80% full it does make sense. It's 80% full. 20% "free or
> dirty". Your 20% free space is mixed in with the clean data; you have to
> move 6 nodes out of the way for every node's worth of space you recover.
>
> Consider the case where every eraseblock has 80% clean data and 20% of
> each contains part of the large file you've just deleted, and is hence
> now dirty. Then you write the same large file again. Garbage collection
> happens -- each time we GC a full eraseblock we recover and rewrite 80%
> of an eraseblock of clean data, and we manage to write 20% of an
> eraseblock of the new file. The 80/20 ratio hence remains stable.
Sorry, I should have reread the original message. The fs under
pressure is permanently 87% full, not 87% free. That makes perfect
sense now.
Jörn
--
When in doubt, punt. When somebody actually complains, go back and fix it...
The 90% solution is a good thing.
-- Rob Landley