public inbox for linux-mtd@lists.infradead.org
* File system testing
@ 2007-06-27 17:02 James Graves
  2007-06-28 19:07 ` David Woodhouse
  0 siblings, 1 reply; 4+ messages in thread
From: James Graves @ 2007-06-27 17:02 UTC (permalink / raw)
  To: linux-mtd


Hello all,

We're running into some possible data corruption issues with some log 
files stored on a JFFS2 partition.

Our current configuration is admittedly a bit unusual.  We've got a 
2.6.11 kernel on a PowerPC processor.  However, we ported the JFFS2 and 
MTD code from a 2.6.21 kernel.  As you might guess, there were some 
other issues that prevented us from just moving to a recent kernel, 
though we're still considering that.

To start with, however, I've got some general questions on testing the 
system.

We've got a script that will write a series of files to the filesystem, 
and then read them back, checking the contents.

The script will then unmount the filesystem, and re-mount it. 
Sometimes, when reading the files again, we'll see a problem.  For 
instance, seeing nulls (0x00) instead of the expected characters.

We'll also see at least one message indicating that a node has invalid 
size (would go over the end of the erase block) and after that some 
"magic bitmask 0x1985 not found" messages.

We then modified the script to sleep after writing each file.  That 
seemed to help: the test ran longer without seeing a data corruption 
problem, though we did eventually see one after a long time.
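
For anyone who wants to try the same kind of test, here is a minimal 
Python sketch of the write/read-back part.  It is not our actual script. 
The function names, the file naming scheme, and the SHA-256 based 
payload are all assumptions made up for illustration, and the 
unmount/re-mount step is left to the caller.

```python
import hashlib
import os
import time


def pattern(index, size):
    """Deterministic payload for file number `index` (repeated SHA-256 digest)."""
    seed = hashlib.sha256(str(index).encode()).digest()
    return (seed * (size // len(seed) + 1))[:size]


def write_files(directory, count, size, pause=0.0):
    """Write `count` files of `size` bytes, optionally sleeping after each."""
    for i in range(count):
        path = os.path.join(directory, f"test_{i:04d}.bin")
        with open(path, "wb") as f:
            f.write(pattern(i, size))
            f.flush()
            os.fsync(f.fileno())  # ask the kernel to push the data out
        if pause:
            time.sleep(pause)


def verify_files(directory, count, size):
    """Return [(name, first_bad_offset)] for every file that differs."""
    bad = []
    for i in range(count):
        name = f"test_{i:04d}.bin"
        with open(os.path.join(directory, name), "rb") as f:
            got = f.read()
        want = pattern(i, size)
        if got != want:
            diffs = (k for k, (a, b) in enumerate(zip(got, want)) if a != b)
            # If one is a prefix of the other, report the shorter length.
            bad.append((name, next(diffs, min(len(got), len(want)))))
    return bad
```

The idea would be to call write_files(), then verify_files(), then 
unmount and re-mount the partition by whatever means the target 
provides, and call verify_files() again.  An empty list means every 
file read back intact.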

So, questions:

Is this a good way to test the filesystem in general?

Is it possible to push the filesystem too hard, by reading and writing a 
lot of data all at once?  Will the reads block the writes from being 
completed?  Or block the erasing of dirty eraseblocks?

Thanks,

James Graves

* Re: File system testing
  2007-06-27 17:02 File system testing James Graves
@ 2007-06-28 19:07 ` David Woodhouse
  2007-06-28 19:34   ` James Graves
  0 siblings, 1 reply; 4+ messages in thread
From: David Woodhouse @ 2007-06-28 19:07 UTC (permalink / raw)
  To: James Graves; +Cc: linux-mtd

On Wed, 2007-06-27 at 12:02 -0500, James Graves wrote:
> We've got a script that will write a series of files to the filesystem, 
> and then read them back, checking the contents.
> 
> The script will then unmount the filesystem, and re-mount it. 
> Sometimes, when reading the files again, we'll see a problem.  For 
> instance, seeing nulls (0x00) instead of the expected characters.
> 
> We'll also see at least one message indicating that a node has invalid 
> size (would go over the end of the erase block) and after that some 
> "magic bitmask 0x1985 not found" messages.

Hm. That's very bad. Can you reproduce with CONFIG_JFFS2_FS_DEBUG=1 and
a serial console?

> We then modified the script to sleep after writing each file.  That 
> seemed to help: the test ran longer without seeing a data corruption 
> problem, though we did eventually see one after a long time.
> 
> So, questions:
> 
> Is this a good way to test the filesystem in general?

Yes. We have a test in mtd-utils.git which does something very similar.

> Is it possible to push the filesystem too hard, by reading and writing a 
> lot of data all at once?  Will the reads block the writes from being 
> completed?  Or block the erasing of dirty eraseblocks?

No, none of that should happen.

-- 
dwmw2

* Re: File system testing
  2007-06-28 19:07 ` David Woodhouse
@ 2007-06-28 19:34   ` James Graves
  2007-06-28 19:41     ` David Woodhouse
  0 siblings, 1 reply; 4+ messages in thread
From: James Graves @ 2007-06-28 19:34 UTC (permalink / raw)
  To: David Woodhouse; +Cc: linux-mtd


Thanks for the reply David.

Turns out to apparently be a hardware problem.

Data written to the last 512 of flash memory (of a 32Mbyte flash part) 
sometimes doesn't work.  Sometimes, even though the programming 
apparently worked, you'll just get back 0xff anyway.  The rest of the 
flash memory works fine, and we were able to run our filesystem 
exerciser for over 12 hours solid, with no errors, once we excluded that 
part of the flash memory.

So, obviously, any block at the end of memory was getting corrupted.

I'm expecting there's some kind of memory mapping conflict, but that'll 
take further investigation.

----------------------------------------------------------------

What's funny is that we'd previously done some informal testing of the 
raw partition, reading and writing images to the /dev/mtdblock3 device 
directly.

And it seemed to be working fine.  But, as it turns out, we were only 
testing approximately 32Mbytes... and not _exactly_ 32Mbytes.
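
A check aimed squarely at the end of the device would have caught this. 
Below is a rough Python sketch of that idea, not something we actually 
ran.  The function name, the fill byte, and the tail_bytes parameter are 
assumptions for illustration.  It works on any seekable file or block 
device path, it overwrites the tail, and on a real device the read-back 
may be served from the page cache unless the cache is dropped or the 
device is reopened after a re-mount or reboot.

```python
import os


def check_tail(path, tail_bytes, fill=0xA5):
    """Overwrite the final `tail_bytes` of `path` with `fill`, read them
    back, and return the absolute offsets that do not hold `fill`."""
    with open(path, "r+b") as dev:
        size = dev.seek(0, os.SEEK_END)  # total size in bytes
        start = size - tail_bytes
        if start < 0:
            raise ValueError("tail_bytes is larger than the device")
        dev.seek(start)
        dev.write(bytes([fill]) * tail_bytes)
        dev.flush()
        os.fsync(dev.fileno())
    with open(path, "rb") as dev:
        dev.seek(start)
        got = dev.read(tail_bytes)
    # A short read counts as a mismatch for every missing byte.
    return [start + i for i in range(tail_bytes)
            if i >= len(got) or got[i] != fill]
```

An empty list means the whole tail read back as written.  Anything else 
gives the exact offsets to look at.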

Let that be a lesson to you all!

Thanks again,

James Graves

* Re: File system testing
  2007-06-28 19:34   ` James Graves
@ 2007-06-28 19:41     ` David Woodhouse
  0 siblings, 0 replies; 4+ messages in thread
From: David Woodhouse @ 2007-06-28 19:41 UTC (permalink / raw)
  To: James Graves; +Cc: linux-mtd

On Thu, 2007-06-28 at 14:34 -0500, James Graves wrote:
> Thanks for the reply David.
> 
> Turns out to apparently be a hardware problem.

Good. I was hoping it might be :)

-- 
dwmw2
