jffs2 robustness against powerfailure

public inbox for linux-mtd@lists.infradead.org

* jffs2 robustness against powerfailure
From: David Jander @ 2005-10-14  9:35 UTC
  To: linux-mtd; +Cc: linuxppc-embedded

Hi,

We have a custom embedded Linux board, based on an MPC852T processor, running 
the 2.4.25 kernel from denx. Jffs2 has certain backported patches after cvs 
from 03/2005.
I wanted to try stress-testing the flash using jffs2 and the "checkfs" 
tool which comes as part of the jffs2 sources. I set up a "power-cycle-box" as 
described in the README and started logging everything the system produced.
Since jffs2 claims to be robust against power failures, I set the threshold for 
the maximum number of corrupt files allowed to 0. The test procedure rewrites 
all test files using a single write() call for each file, so that should be ok.
After 279 power-cycles, it stopped with a CRC error in "file13". Of course 
"file13" was the one being written to when power was cut off the last time.

Question: Is this a known shortcoming of jffs2, or must I assume that my 
hardware is broken?

The latter seems relatively unlikely once I try to explain the contents of the 
file:

diskles9:/flash # hexdump file13
0000000 0000 0300 0000 036d 0000 0942 0000 20b0
0000010 0000 08dd 0000 0715 0000 1da1 0000 043c
0000020 0000 05c2 0000 228d 0000 10ad 0000 1c35
...
00002e0 0000 14f1 0000 0d94 0000 1911 0000 12dd
00002f0 0000 09e9 0000 0686 0000 2380 0000 2294
0000300 0000 18f1 0000 01be 0000 25bb 0000 1af9
0000310 0000 1b94 0000 02b0 0000 2511 0000 1f79
0000320 0000 1f97 0000 0b53 0000 1eb7 0000 10bb
0000330 0000 2529 0000 2130 0000 0361 0000 0ff8
0000340 0000 1428 0000 10ab 0000 0364 0000 1b89
0000350 b110

As one can easily see, the first int (0x00000300) indicates the file-length, 
after which the 16-bit CRC should be placed. At offset 0000300 in the file 
there seems to be just more random data (a CRC of 0x0000 is unlikely and 
known wrong in this case).
At the end of the file (offset 0x0000350) there is something that looks more 
like a checksum.
Apparently the previous file was 0x0352 bytes long and the new file was going 
to be 0x0302 bytes long, but was never written completely. 
How come I get to see a valid file containing a mix of old and new data if 
it was written with a single write() call?????
Shouldn't jffs2 throw away the new incomplete node and keep the old version of 
the file?
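
The offsets in the paragraph above can be checked with a few lines of
arithmetic. This is only a sketch of the layout as described in this message:
it assumes the length field is a big-endian 32-bit integer (the board is
PowerPC), and it does not attempt to recompute the CRC, since the algorithm
checkfs uses is not stated here. The function name `describe_layout` is
made up for illustration.

```python
import struct

CRC_SIZE = 2  # the 16-bit CRC mentioned in the message


def describe_layout(header: bytes, actual_size: int) -> dict:
    """Derive where the CRC should sit and how much stale data follows it.

    Assumption: the first four bytes hold the length as a big-endian
    32-bit integer, and the CRC is stored directly after `length` bytes.
    """
    (length,) = struct.unpack(">I", header[:4])
    expected_size = length + CRC_SIZE
    return {
        "crc_offset": length,
        "expected_size": expected_size,
        "stale_tail": actual_size - expected_size,
    }


# First four bytes of file13 from the hexdump, and the observed size:
# the last dump line starts at 0x350 and holds two bytes, so 0x352 in total.
info = describe_layout(bytes.fromhex("00000300"), 0x352)
print({name: hex(value) for name, value in info.items()})
```

Under these assumptions the header calls for a 0x302-byte file with its CRC
at offset 0x300, so the last 0x50 bytes on flash belong to the older version.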

Can anyone explain what happened here??

Greetings,

-- 
David Jander


* jffs2 robustness against powerfailure
From: David Jander @ 2005-10-17  6:51 UTC
  To: linux-mtd

Hi,

We have a custom embedded Linux board, based on an MPC852T processor, running 
the 2.4.25 kernel from denx. Jffs2 has certain backported patches after cvs 
from 03/2005.
I wanted to try stress-testing the flash using jffs2 and the "checkfs" 
tool which comes as part of the jffs2 sources. I set up a "power-cycle-box" as 
described in the README and started logging everything the system produced.
Since jffs2 claims to be robust against power failures, I set the threshold for 
the maximum number of corrupt files allowed to 0. The test procedure rewrites 
all test files using a single write() call for each file, so that should be ok.
After 279 power-cycles, it stopped with a CRC error in "file13". Of course 
"file13" was the one being written to when power was cut off the last time.

Question: Is this a known shortcoming of jffs2, or must I assume that my 
hardware is broken?

The latter seems relatively unlikely once I try to explain the contents of the 
file:

diskles9:/flash # hexdump file13
0000000 0000 0300 0000 036d 0000 0942 0000 20b0
0000010 0000 08dd 0000 0715 0000 1da1 0000 043c
0000020 0000 05c2 0000 228d 0000 10ad 0000 1c35
...
00002e0 0000 14f1 0000 0d94 0000 1911 0000 12dd
00002f0 0000 09e9 0000 0686 0000 2380 0000 2294
0000300 0000 18f1 0000 01be 0000 25bb 0000 1af9
0000310 0000 1b94 0000 02b0 0000 2511 0000 1f79
0000320 0000 1f97 0000 0b53 0000 1eb7 0000 10bb
0000330 0000 2529 0000 2130 0000 0361 0000 0ff8
0000340 0000 1428 0000 10ab 0000 0364 0000 1b89
0000350 b110

As one can easily see, the first int (0x00000300) indicates the file-length, 
of 0x0300 bytes, after which the 16-bit CRC should be placed. At offset 
0000300 in the file there seems to be just more random data (a CRC of 0x0000 
is unlikely and known wrong in this case).
At the end of the file (offset 0x0000350) there is something that looks more 
like a checksum.
Apparently the previous file was 0x0352 bytes long and the new file was going 
to be 0x0302 bytes long, but was never written completely. 
How come I get to see a valid file containing a mix of old and new data if 
it was written with a single write() call?????
Shouldn't jffs2 throw away the new incomplete node and keep the old version of 
the file?

Can anyone please explain what happened here??

Greetings,

-- 
David Jander


* Re: jffs2 robustness against powerfailure
From: David Woodhouse @ 2005-10-17 12:37 UTC
  To: David Jander; +Cc: linux-mtd, linuxppc-embedded

On Fri, 2005-10-14 at 11:35 +0200, David Jander wrote:
> We have a custom embedded linux board, based on a MPC852T processor, running 
> 2.4.25 kernel from denx. Jffs2 has certain backported patches after cvs from 
> 03/2005.

That sounds like a recipe for pain. March 2005 wasn't a good time to
take a snapshot from CVS; that just happens to be the time that we
stopped bothering to make it build in obsolete kernels.

If you want _stable_ JFFS2 code, you should use the code which is in the
2.4.31 kernel, or use the code which is in the 2.6 kernel (perhaps
updated from current CVS). 

> How come I get to see a valid file containing a mix of old and new
> data if it was written with a single write() call?????

Linux doesn't guarantee atomicity of writes larger than a single page,
but since your case is smaller than a page, it should have been atomic.

> Shouldn't jffs2 throw away the new incomplete node and keep the old
> version of the file?

Yes, it should. It's acceptable that there are extra data in the file
after 0x300 bytes, because the test program first does a write() call
and then a subsequent truncate() call. But it's not expected that the
0x300-byte write was not atomic; except in certain circumstances (like
reaching the end of an eraseblock and writing a smaller node there) you
should have seen all of it, or none. 
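
The write()-then-truncate() sequence mentioned above can be reproduced on any
ordinary filesystem. The sketch below is not the checkfs source; it only
imitates the two steps, with the sizes from this thread, to show what the file
looks like in the window between them. The fill bytes "O" and "N" are
placeholders for old and new data.

```python
import os
import tempfile

OLD_SIZE = 0x352  # previous version: 0x350 bytes plus a 2-byte CRC
NEW_SIZE = 0x302  # new version: 0x300 bytes plus a 2-byte CRC

fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"O" * OLD_SIZE)      # previous contents of the file
    os.lseek(fd, 0, os.SEEK_SET)

    os.write(fd, b"N" * NEW_SIZE)      # step 1: a single write() of new data
    size_before_truncate = os.fstat(fd).st_size
    os.lseek(fd, 0, os.SEEK_SET)
    snapshot = os.read(fd, OLD_SIZE)   # what a reader sees between the steps

    os.ftruncate(fd, NEW_SIZE)         # step 2: cut off the old tail
    size_after_truncate = os.fstat(fd).st_size
finally:
    os.close(fd)
    os.unlink(path)

print(hex(size_before_truncate), hex(size_after_truncate))
```

If power is lost after step 1 but before step 2, the file keeps its old
length, with new data in front and 0x50 bytes of old data behind it. That
accounts for the trailing bytes, though not for a partially applied write.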

Please could you reproduce on a sane kernel and show the output of the
checkfs program during your test just before the power down, and also if
possible take an image of the contents of the flash _before_ mounting it
again after the power cycle. I'd like to see precisely the log nodes
which were present on the flash. If it's difficult to take a snapshot
before remounting, then running with CONFIG_JFFS2_FS_DEBUG=1 and
capturing all the KERN_DEBUG output via a serial console would suffice.

-- 
dwmw2


* Re: jffs2 robustness against powerfailure
From: David Jander @ 2005-10-19  8:10 UTC
  To: David Woodhouse; +Cc: linux-mtd, linuxppc-embedded

On Monday 17 October 2005 14:37, David Woodhouse wrote:
> On Fri, 2005-10-14 at 11:35 +0200, David Jander wrote:
> > We have a custom embedded linux board, based on a MPC852T processor,
> > running 2.4.25 kernel from denx. Jffs2 has certain backported patches
> > after cvs from 03/2005.
>
> That sounds like a recipe for pain. March 2005 wasn't a good time to
> take a snapshot from CVS; that just happens to be the time that we
> stopped bothering to make it build in obsolete kernels.

That's why I posted to the linuxppc-embedded list: I know quite a few people 
there use the same version (denx CVS kernel) and might have had issues of 
this kind too, although I mostly hear that it seems pretty stable and doesn't 
give problems.

> If you want _stable_ JFFS2 code, you should use the code which is in the
> 2.4.31 kernel, or use the code which is in the 2.6 kernel (perhaps
> updated from current CVS).

2.6 is not an option yet for the mpc8xx architecture, so I'll have to stick 
with either what I have now or 2.4.31, but I fear the tradeoff of using 
vanilla 2.4.31 jffs2 will be a much slower fs, prohibitively long mount 
times, etc... am I right?

>[...]
> Please could you reproduce on a sane kernel and show the output of the
> checkfs program during your test just before the power down, and also if
> possible take an image of the contents of the flash _before_ mounting it
> again after the power cycle. I'd like to see precisely the log nodes
> which were present on the flash. If it's difficult to take a snapshot
> before remounting, then running with CONFIG_JFFS2_FS_DEBUG=1 and
> capturing all the KERN_DEBUG output via a serial console would suffice.

I am still busy doing experiments, please have a little patience.
So far I have turned on debug info in the same kernel as before, and I get 
literally tons of log info. My monitor script had a bug, so the board was 
reset a little too soon on several occasions (that shouldn't do any harm, 
should it?), so now I have a jffs2 image which produces a BUG() in gc.c line 
139 when the system boots. This is not what I am looking for right now, and I 
still have to rule out the possibility that it happened due to other problems 
(RAM issues, etc). Once I finish sorting this out, I'd be glad to send you a 
few megabytes of debug output along with a "broken" jffs2 image if you like. 
Actually I'd be very grateful if you could take some time to look at it and 
give me your opinion, because I am still slightly clueless about jffs2.

Greetings,

-- 
David Jander


* Re: jffs2 robustness against powerfailure
From: David Woodhouse @ 2005-10-19  9:50 UTC
  To: David Jander; +Cc: linux-mtd, linuxppc-embedded

On Wed, 2005-10-19 at 10:10 +0200, David Jander wrote:
> 2.6 is not an option yet for mpc8xx architecture, so I'll have to stick with 
> either what I have now or 2.4.31, but I fear the tradeoff of using vanilla 
> 2.4.31 jffs2 will be much slower fs, prohibitively long mount-times, etc... 
> am I right?

If it's all running perfectly for you and you have no work to do, then
yes, perhaps you're right. But since that's evidently _not_ the case,
then no, I would disagree.

If I were you, the first thing I'd do would be to get a current kernel
working. It should only take a week or so -- porting from 2.4 to 2.6
really isn't that difficult.

> I am still busy doing experiments, please have a little patience.
> Until now I have turned on debug info in the same kernel as before, and get 
> literally tons of log info. My monitor script had a bug, so the board was 
> reset a little too soon on several occasions (shouldn't harm, should it), so 
> now I have an image of jffs2 which on boot of the system produces a BUG() in 
> gc.c line 139.

That should never happen, regardless of when the board is reset.
Assuming it still happens with JFFS2 code I care about (either 2.4 or
2.6), please could I have a copy of this image?

The problem you first reported doesn't seem too worrying to me. Writes
aren't always atomic -- in fact the Linux VFS¹ _guarantees_ that writes
larger than a page are _not_ atomic, because it splits pages up to call
prepare_write() and commit_write() on each one.

JFFS2 will mostly write each page out in a single node, but when there
is only a small amount of space at the end of an eraseblock it will
split writes still further, filling the eraseblock with as much data as
possible before writing the remainder of the page into a new eraseblock.
I suspect that's what happened in the case you showed. 
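
The two levels of splitting described above can be modelled in a few lines.
This is a toy model, not kernel code: the function names are made up, the
4096-byte page size and the 0x100 bytes of free space are assumed values for
illustration, and node headers and compression are ignored.

```python
def split_into_pages(offset, length, page_size=4096):
    """Split a write into chunks that never cross a page boundary."""
    chunks = []
    end = offset + length
    while offset < end:
        page_end = (offset // page_size + 1) * page_size
        n = min(end, page_end) - offset
        chunks.append((offset, n))
        offset += n
    return chunks


def split_at_eraseblock(chunk, free_in_block):
    """Split one page-sized chunk if the current eraseblock is nearly full."""
    offset, length = chunk
    if free_in_block <= 0 or length <= free_in_block:
        return [chunk]
    return [
        (offset, free_in_block),                           # fills the old block
        (offset + free_in_block, length - free_in_block),  # goes to a new block
    ]


# A 0x302-byte write fits in one page, so the first level leaves it whole.
pages = split_into_pages(0, 0x302)
# With only 0x100 bytes left in the eraseblock, it becomes two nodes.
nodes = split_at_eraseblock(pages[0], free_in_block=0x100)
print(pages, nodes)
```

In this model a power cut between the two nodes leaves only the first part of
the new data on flash, which would produce a mix of old and new contents even
for a write smaller than a page.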

-- 
dwmw2

¹ Assuming you use generic_file_write()

