public inbox for linux-mtd@lists.infradead.org
 help / color / mirror / Atom feed
* jffs2_get_inode_nodes() very very slow
@ 2005-02-01 15:32 Rudi Engelbertink
  2005-02-01 16:03 ` Artem B. Bityuckiy
  2005-02-01 16:10 ` Thomas Gleixner
  0 siblings, 2 replies; 9+ messages in thread
From: Rudi Engelbertink @ 2005-02-01 15:32 UTC (permalink / raw)
  To: linux-mtd

Hello,

On a 64 MiB NAND flash I created a JFFS2 file system. After several tests,
especially power-fail tests, it ended up with a lot of CRC and data CRC errors.
This appears not to be a problem in itself, except that it takes a very long
time to check the file system.
The initial check (scanning for erased blocks) is done in approximately 10
seconds, but after that the jffs2_get_inode_nodes() check runs.
This process takes up to 8 minutes.
During this time the file system is inaccessible, and in our case a
watchdog then decides to reboot the system, making it even worse.
It appears that the check is done with a step size of 16 bytes; on a 64 MiB
NAND flash it performs this check roughly 4 million times.

Is there a way to reduce the time needed to check the file system, and/or
how can I recover from these errors?

Kind Regards,
Rudi.
-- 
They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety.
   Benjamin Franklin (1706-1790), Letter to Josiah Quincy, Sept. 11, 1773.
GnuPG Key fingerprint = 706C E2AC 7AE2 BCEE 04EB  A962 0A75 7F9B 07A1 83E8

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: jffs2_get_inode_nodes() very very slow
  2005-02-01 15:32 jffs2_get_inode_nodes() very very slow Rudi Engelbertink
@ 2005-02-01 16:03 ` Artem B. Bityuckiy
  2005-02-02  9:05   ` Rudi Engelbertink
  2005-02-01 16:10 ` Thomas Gleixner
  1 sibling, 1 reply; 9+ messages in thread
From: Artem B. Bityuckiy @ 2005-02-01 16:03 UTC (permalink / raw)
  To: Rudi Engelbertink; +Cc: linux-mtd

On Tue, 1 Feb 2005, Rudi Engelbertink wrote:

> Hello,
> 
> On a 64 MiB NAND flash I created a JFFS2 file system. After several tests,
> especially power-fail tests, it ended up with a lot of CRC and data CRC errors.
How do you do your power-fail tests?

> This appears not to be a problem in itself, except that it takes a very long
> time to check the file system.
> The initial check (scanning for erased blocks) is done in approximately 10
> seconds, but after that the jffs2_get_inode_nodes() check runs.
> This process takes up to 8 minutes.
Do you mean this happens every time or in case of "powerfails"?

> During this time the file system is inaccessible, and in our case a
> watchdog then decides to reboot the system, making it even worse.
> It appears that the check is done with a step size of 16 bytes; on a 64 MiB
> NAND flash it performs this check roughly 4 million times.
That's strange. JFFS2 should share processor cycles during the scan.

> 
> Is there a way to reduce the time needed to check the file system, and/or
> how can I recover from these errors?
> 
> Kind Regards,
> Rudi.

--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: jffs2_get_inode_nodes() very very slow
  2005-02-01 15:32 jffs2_get_inode_nodes() very very slow Rudi Engelbertink
  2005-02-01 16:03 ` Artem B. Bityuckiy
@ 2005-02-01 16:10 ` Thomas Gleixner
  1 sibling, 0 replies; 9+ messages in thread
From: Thomas Gleixner @ 2005-02-01 16:10 UTC (permalink / raw)
  To: Rudi Engelbertink; +Cc: linux-mtd

On Tue, 2005-02-01 at 16:32 +0100, Rudi Engelbertink wrote:
> Hello,
> 
> On a 64 MiB NAND flash I created a JFFS2 file system. After several tests,
> especially power-fail tests, it ended up with a lot of CRC and data CRC errors.
> This appears not to be a problem in itself, except that it takes a very long
> time to check the file system.
> The initial check (scanning for erased blocks) is done in approximately 10
> seconds, but after that the jffs2_get_inode_nodes() check runs.
> This process takes up to 8 minutes.
> During this time the file system is inaccessible, and in our case a
> watchdog then decides to reboot the system, making it even worse.
> It appears that the check is done with a step size of 16 bytes; on a 64 MiB
> NAND flash it performs this check roughly 4 million times.

Hmm, which kernel version?

Can you switch on JFFS2 debugging (debug level 1), log the output on a
serial line and send me the log, please?

tglx

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: jffs2_get_inode_nodes() very very slow
  2005-02-01 16:03 ` Artem B. Bityuckiy
@ 2005-02-02  9:05   ` Rudi Engelbertink
  2005-02-02 10:26     ` Thomas Gleixner
  0 siblings, 1 reply; 9+ messages in thread
From: Rudi Engelbertink @ 2005-02-02  9:05 UTC (permalink / raw)
  To: Artem B. Bityuckiy; +Cc: linux-mtd

On Tue, 1 Feb 2005 16:03:31 +0000 (GMT), Artem B. Bityuckiy wrote
> On Tue, 1 Feb 2005, Rudi Engelbertink wrote:
> 
> > Hello,
> > 
> > On a 64 MiB NAND flash I created a JFFS2 file system. After several tests,
> > especially power-fail tests, it ended up with a lot of CRC and data CRC errors.
> How do you do your power-fail tests?
The power-fail tests are done by:
A. A clock: just turn the power off and on every 15 minutes, and start an
application which logs two 40-60 byte events every second.
B. An internal (hardware) watchdog which reboots the system when the
application appears to be dead for 10 minutes.
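
For illustration, the logging load described above can be sketched with a
small shell loop (the path and event text here are made-up stand-ins, not
the real application):

```shell
#!/bin/sh
# Simulate the test load: pairs of small (40-60 byte) events, each
# appended with its own write(). The path is a hypothetical stand-in.
LOG=/tmp/events.log
: > "$LOG"                 # start with an empty log

i=0
while [ "$i" -lt 10 ]      # bounded here; the real test runs until power-off
do
	printf 'event %s: sensor reading recorded here\n' "$i" >> "$LOG"
	printf 'event %s: status heartbeat ok, still alive\n' "$i" >> "$LOG"
	i=$((i+1))
	# sleep 1            # the real test paces this at one pair per second
done

wc -l < "$LOG"             # 20 appends -> 20 tiny JFFS2 nodes on flash
```

On JFFS2 every such append ends up as its own node on flash, which is what
the mount-time scan later has to crawl through.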

> 
> > This appears not to be a problem in itself, except that it takes a very long
> > time to check the file system.
> > The initial check (scanning for erased blocks) is done in approximately 10
> > seconds, but after that the jffs2_get_inode_nodes() check runs.
> > This process takes up to 8 minutes.
> Do you mean this happens every time or in case of "powerfails"?
Yes, it happens every time the system reboots or whenever the (NAND) file
system is mounted.
> 
> > During this time the file system is inaccessible, and in our case a
> > watchdog then decides to reboot the system, making it even worse.
> > It appears that the check is done with a step size of 16 bytes; on a 64 MiB
> > NAND flash it performs this check roughly 4 million times.
> That's strange. JFFS2 should share processor's cycles.
Yes, the root is accessible, but the directory where the logging is stored
is unavailable for several minutes.
> 
> > 
> > Is there a way to reduce the time needed to check the file system, and/or
> > how can I recover from these errors?
> > 
> > Kind Regards,
> > Rudi.


RGDS Rudi.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: jffs2_get_inode_nodes() very very slow
  2005-02-02  9:05   ` Rudi Engelbertink
@ 2005-02-02 10:26     ` Thomas Gleixner
  2005-02-02 10:35       ` David Woodhouse
  2005-02-02 12:26       ` Rudi Engelbertink
  0 siblings, 2 replies; 9+ messages in thread
From: Thomas Gleixner @ 2005-02-02 10:26 UTC (permalink / raw)
  To: Rudi Engelbertink; +Cc: linux-mtd

On Wed, 2005-02-02 at 10:05 +0100, Rudi Engelbertink wrote:
> The power-fail tests are done by:
> A. A clock: just turn the power off and on every 15 minutes, and start an
> application which logs two 40-60 byte events every second.
> ...
> Yes, the root is accessible, but the directory where the logging is stored
> is unavailable for several minutes.

You hit the worst case for JFFS2.

Your event logging creates tons of small nodes for your log files.
There are currently about 96,000 very small nodes on the chip, so the mount
time is not surprising. This will also use quite a big amount of memory.

We have no real cure for this at the moment; this scenario is on our design
list for JFFS3. I remember that somebody else came up with this issue some
time ago. IIRC, changing the logging method helped a bit:

cnt=0
while true
do
	log_event                  # appends one small record to log.small
	cnt=$((cnt+1))
	if [ $cnt -ge $LIMIT ]
	then
		closelog           # release log.small so it can be rotated
		cat log.small >>log.big
		rm log.small
		cnt=0              # restart the count for the next batch
	fi
done

This converts the small nodes to bigger nodes when the data are appended
to log.big. I guess garbage collection should kick in quite fast and
clean up the small nodes. It might not totally go away, but it should be
much better than now. This will also give you more capacity on your
partition as the small nodes consist mostly of node overhead. 

You may also try YAFFS for the logging partition. It should deal with
this situation a bit better.

tglx

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: jffs2_get_inode_nodes() very very slow
  2005-02-02 10:26     ` Thomas Gleixner
@ 2005-02-02 10:35       ` David Woodhouse
  2005-02-02 11:23         ` Artem B. Bityuckiy
  2005-02-02 12:26       ` Rudi Engelbertink
  1 sibling, 1 reply; 9+ messages in thread
From: David Woodhouse @ 2005-02-02 10:35 UTC (permalink / raw)
  To: tglx; +Cc: linux-mtd

On Wed, 2005-02-02 at 11:26 +0100, Thomas Gleixner wrote:
> This converts the small nodes to bigger nodes when the data are
> appended to log.big. I guess garbage collection should kick in quite
> fast and clean up the small nodes. It might not totally go away, but
> it should be much better than now. This will also give you more
> capacity on your partition as the small nodes consist mostly of node
> overhead. 

We could possibly do this for ourselves quite easily. When writing to a
page which already has more than N fragments, obsolete them all and
write the whole page out again.

Or maybe, when writing the _last_ byte in a page which already has N
fragments, rewrite the whole page?

-- 
dwmw2

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: jffs2_get_inode_nodes() very very slow
  2005-02-02 10:35       ` David Woodhouse
@ 2005-02-02 11:23         ` Artem B. Bityuckiy
  0 siblings, 0 replies; 9+ messages in thread
From: Artem B. Bityuckiy @ 2005-02-02 11:23 UTC (permalink / raw)
  To: David Woodhouse; +Cc: tglx, linux-mtd

On Wed, 2 Feb 2005, David Woodhouse wrote:

> On Wed, 2005-02-02 at 11:26 +0100, Thomas Gleixner wrote:
> > This converts the small nodes to bigger nodes when the data are
> > appended to log.big. I guess garbage collection should kick in quite
> > fast and clean up the small nodes. It might not totally go away, but
> > it should be much better than now. This will also give you more
> > capacity on your partition as the small nodes consist mostly of node
> > overhead. 
> 
> We could possibly do this for ourselves quite easily. When writing to a
> page which already has more than N fragments, obsolete them all and
> write the whole page out again.
IMO, a perfectly good idea! I think a reasonable threshold is about 3-6
fragments.

> 
> Or maybe, when writing the _last_ byte in a page which already has N
> fragments, rewrite the whole page?
> 

--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: jffs2_get_inode_nodes() very very slow
  2005-02-02 10:26     ` Thomas Gleixner
  2005-02-02 10:35       ` David Woodhouse
@ 2005-02-02 12:26       ` Rudi Engelbertink
  2005-02-02 12:41         ` Thomas Gleixner
  1 sibling, 1 reply; 9+ messages in thread
From: Rudi Engelbertink @ 2005-02-02 12:26 UTC (permalink / raw)
  To: tglx; +Cc: linux-mtd

On Wed, 02 Feb 2005 11:26:51 +0100, Thomas Gleixner wrote
> On Wed, 2005-02-02 at 10:05 +0100, Rudi Engelbertink wrote:
> > The power-fail tests are done by:
> > A. A clock: just turn the power off and on every 15 minutes, and start an
> > application which logs two 40-60 byte events every second.
> > ...
> > Yes, the root is accessible, but the directory where the logging is stored
> > is unavailable for several minutes.
> 
> You hit the worst case for JFFS2.
> 
> Your event logging creates tons of small nodes for your log files.
> There are currently about 96,000 very small nodes on the chip, so the mount
> time is not surprising. This will also use quite a big amount of memory.
> 
> We have no real cure for this at the moment; this scenario is on our design
> list for JFFS3. I remember that somebody else came up with this issue some
> time ago. IIRC, changing the logging method helped a bit:
> 
> cnt=0
> while true
> do
> 	log_event                  # appends one small record to log.small
> 	cnt=$((cnt+1))
> 	if [ $cnt -ge $LIMIT ]
> 	then
> 		closelog           # release log.small so it can be rotated
> 		cat log.small >>log.big
> 		rm log.small
> 		cnt=0              # restart the count for the next batch
> 	fi
> done
> 
> This converts the small nodes to bigger nodes when the data are appended
> to log.big. I guess garbage collection should kick in quite fast and
> clean up the small nodes. It might not totally go away, but it 
> should be much better than now. This will also give you more 
> capacity on your partition as the small nodes consist mostly of node 
> overhead.

Digging through the log file, I noticed that several inodes had a lot of
versions. Especially inode #737, which had 60,000+ versions.
This is a bit strange, because this is a 16-byte pointer file which is
opened once and never closed; that should not result in 60,000+ versions.
A copy-move action (cp -p file file.cp && mv file.cp file) indeed gives a big
performance improvement.
Would a regular close/open of this file improve the stored version
information of this inode? During normal operations this file, and some
other files, never get closed, so there is never a 'commit'.
So after a power fail, all the inode version information must be checked.
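
For reference, the copy-move workaround can be sketched like this (paths and
contents are hypothetical stand-ins for the real pointer file):

```shell
#!/bin/sh
# Rewrite a heavily-versioned file in one pass so JFFS2 stores it as a
# fresh node and can obsolete the old versions. Paths are hypothetical.
F=/tmp/pointer.file
printf '0123456789abcdef' > "$F"       # stand-in for the 16-byte pointer file

cp -p "$F" "$F.cp"                     # -p keeps mode, owner and timestamps
mv "$F.cp" "$F"                        # rename the fresh copy over the original

wc -c < "$F"                           # still 16 bytes, now in a single node
```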

> 
> You may also try YAFFS for the logging partition. It should deal with
> this situation a bit better.
> 
> tglx

RGDS Rudi.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: jffs2_get_inode_nodes() very very slow
  2005-02-02 12:26       ` Rudi Engelbertink
@ 2005-02-02 12:41         ` Thomas Gleixner
  0 siblings, 0 replies; 9+ messages in thread
From: Thomas Gleixner @ 2005-02-02 12:41 UTC (permalink / raw)
  To: Rudi Engelbertink; +Cc: linux-mtd

On Wed, 2005-02-02 at 13:26 +0100, Rudi Engelbertink wrote:
> > This converts the small nodes to bigger nodes when the data are appended
> > to log.big. I guess garbage collection should kick in quite fast and
> > clean up the small nodes. It might not totally go away, but it 
> > should be much better than now. This will also give you more 
> > capacity on your partition as the small nodes consist mostly of node 
> > overhead.
> 
> Digging through the log file, I noticed that several inodes had a lot of
> versions. Especially inode #737, which had 60,000+ versions.
> This is a bit strange, because this is a 16-byte pointer file which is
> opened once and never closed; that should not result in 60,000+ versions.

It's because JFFS2 does synchronous writes. That means every write() you do
is written to the flash immediately. So every small log write goes into
flash and builds a node.
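
One way to soften this, sketched here with hypothetical paths: stage the
small records in a RAM-backed file and append them to the flash-resident log
in one large write, so each flush produces one bigger node instead of many
tiny ones:

```shell
#!/bin/sh
# Batch small records in a staging file, then append the whole batch
# to the main log with a single large write. Paths are hypothetical;
# the staging file should ideally live on tmpfs, not on flash.
BUF=$(mktemp)
BIG=/tmp/log.big
: > "$BIG"

for i in 1 2 3 4 5 6 7 8 9 10
do
	echo "event $i" >> "$BUF"      # cheap append, never touches flash
done

cat "$BUF" >> "$BIG"               # one large append -> one node on flash
rm "$BUF"

wc -l < "$BIG"                     # 10 records delivered in a single write
```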

The version count is just incremented and the old nodes are declared
obsolete. I'm not sure, why the blocks are not garbage collected.

> A copy-move action (cp -p file file.cp && mv file.cp file) indeed gives a big
> performance improvement.
> Would a regular close/open of this file improve the stored version
> information of this inode? During normal operations this file, and some
> other files, never get closed, so there is never a 'commit'.

Close/open actually does not help. Without changes to the JFFS2 code, only
the copy to a different file will help you.

tglx

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2005-02-02 12:41 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-02-01 15:32 jffs2_get_inode_nodes() very very slow Rudi Engelbertink
2005-02-01 16:03 ` Artem B. Bityuckiy
2005-02-02  9:05   ` Rudi Engelbertink
2005-02-02 10:26     ` Thomas Gleixner
2005-02-02 10:35       ` David Woodhouse
2005-02-02 11:23         ` Artem B. Bityuckiy
2005-02-02 12:26       ` Rudi Engelbertink
2005-02-02 12:41         ` Thomas Gleixner
2005-02-01 16:10 ` Thomas Gleixner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox