JFFS2 on NAND flash/DiskOnChip.


* JFFS2 on NAND flash/DiskOnChip.
@ 2001-12-01  9:51 David Woodhouse
  2002-01-31 10:04 ` JFFS2 on NAND flash Thomas Gleixner
  0 siblings, 1 reply; 6+ messages in thread
From: David Woodhouse @ 2001-12-01  9:51 UTC (permalink / raw)
  To: linux-mtd; +Cc: jffs-dev

Quite a few people have asked about JFFS2 on NAND flash recently. 

At least one has been serious enough to have approached a local contractor
and offered money for it - but that company didn't want to put up _all_ the
money to get it done.

That contractor didn't want to abuse the mailing lists by trolling for 
business, but I _want_ him to get this contract and get it working - 
largely because I'm tired of saying 'no, sorry, we can't do that yet'.

So I'd like to know if anyone else would be interested in contributing to
the same contract. Otherwise, it's likely to stay on my TODO list until Red
Hat gets a contract to do it, or unless the contractor in question manages
to do it for next-to-no money.

Please contact me in private if you'd like to pursue this.

(Sorry for the list abuse.)

--
dwmw2


* JFFS2 on NAND flash
  2001-12-01  9:51 JFFS2 on NAND flash/DiskOnChip David Woodhouse
@ 2002-01-31 10:04 ` Thomas Gleixner
  2002-01-31 10:22   ` David Woodhouse
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Gleixner @ 2002-01-31 10:04 UTC (permalink / raw)
  To: David Woodhouse, linux-mtd

Smile !

I tried JFFS2 on a NAND flash. After tracking down one ugly bug in
fs/jffs2/write.c, I got it basically working!
Thanks to all of you who did the work on the jffs2/nand code, obviously
without having access to a NAND device.

Bug description:
When you write a file, an empty dnode is created first.
jffs2_do_create calls
	jffs2_write_dnode(c, f, ri, NULL, 0, phys_ofs, &writtenlen);

jffs2_write_dnode puts NULL and 0 (data and datalen) into
	vecs[1].iov_base = (unsigned char *)data;
	vecs[1].iov_len = datalen;
and then calls
	ret = jffs2_flash_writev(c, vecs, 2, flash_ofs, &retlen);
where 2 is the number of vecs passed for this job.

jffs2_flash_writev calls mtd->writev if it is available; otherwise it calls
mtd_fake_writev. The NAND driver supports writev and does not check whether
the vecs[1] entry is empty; mtd_fake_writev does handle this case.
I suggest doing the check in jffs2_write_dnode (where the wrong count comes
from), although I will also include a check in nand.c.
This problem did not show up before because apparently none of the other
flash drivers supports writev. (I could not find one in mtd.)
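
A minimal userspace sketch of the two checks discussed above, not the
actual kernel code. The names fake_flash, sketch_writev and dnode_vec_count
are hypothetical and exist only for illustration; the flash device is
modelled as a plain byte buffer. The first function skips empty vectors on
the driver side, the second shows the caller-side alternative of passing a
count of 1 when there is no data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>   /* struct iovec */

/* Hypothetical stand-in for a flash device: a flat byte buffer. */
struct fake_flash {
	unsigned char *mem;
	size_t size;
};

/*
 * Write each vector in turn, skipping entries that carry no data
 * (NULL base or zero length). Returns 0 on success, -1 if a write
 * would run past the end of the device. *retlen receives the total
 * number of bytes written.
 */
static int sketch_writev(struct fake_flash *dev, const struct iovec *vecs,
			 unsigned long count, size_t to, size_t *retlen)
{
	size_t total = 0;
	unsigned long i;

	for (i = 0; i < count; i++) {
		if (vecs[i].iov_base == NULL || vecs[i].iov_len == 0)
			continue;	/* driver-side check for an empty entry */
		if (to > dev->size || vecs[i].iov_len > dev->size - to) {
			*retlen = total;
			return -1;
		}
		memcpy(dev->mem + to, vecs[i].iov_base, vecs[i].iov_len);
		to += vecs[i].iov_len;
		total += vecs[i].iov_len;
	}
	*retlen = total;
	return 0;
}

/* Caller-side alternative: only count the data vector if there is data. */
static unsigned long dnode_vec_count(const void *data, size_t datalen)
{
	return (data != NULL && datalen > 0) ? 2 : 1;
}
```

With a header vector followed by an empty data vector, sketch_writev
writes only the header bytes, and dnode_vec_count(NULL, 0) yields 1 so the
empty entry would never reach the driver in the first place.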

Patch is committed to CVS

Thomas
__________________________________________________
Thomas Gleixner, autronix automation GmbH
auf dem berg 3, d-88690 uhldingen-muehlhofen
fon: +49 7556 919891 , fax: +49 7556 919886
mail: gleixner@autronix.de, http://www.autronix.de  


* Re: JFFS2 on NAND flash
  2002-01-31 10:04 ` JFFS2 on NAND flash Thomas Gleixner
@ 2002-01-31 10:22   ` David Woodhouse
  2002-01-31 12:26     ` Thomas Gleixner
  0 siblings, 1 reply; 6+ messages in thread
From: David Woodhouse @ 2002-01-31 10:22 UTC (permalink / raw)
  To: gleixner; +Cc: linux-mtd

gleixner@autronix.de said:
>  I tried JFFS2 on a NAND flash. After tracking down one ugly bug in
> fs/jffs2/write.c, I got it basically working! Thanks to all of you who
> did the work on the jffs2/nand code, obviously without having access to
> a NAND device.

Be careful. You have nothing there to make sure that it doesn't violate the 
constraints on the number of write cycles per page. You have no ECC, you 
have no real chance of it working in the wild.

Also note that the locking in jffs2_garbage_collect_deletion_dirent() is 
broken. We need to lock the erase_completion_lock while we go through the 
list, and drop the lock when we read the nodes. 

> jffs2_flash_writev calls mtd->writev if it is available; otherwise it
> calls mtd_fake_writev. The NAND driver supports writev and does not
> check whether the vecs[1] entry is empty; mtd_fake_writev does handle
> this case. I suggest doing the check in jffs2_write_dnode (where the
> wrong count comes from), although I will also include a check in
> nand.c. This problem did not show up before because apparently none of
> the other flash drivers supports writev. (I could not find one in mtd.)

All the writev stuff was put there for the benefit of NAND flash - so yes,
nobody's used it yet. This problem had come up recently in the eCos port,
but the fix hadn't yet propagated to the main tree.

--
dwmw2


* Re: JFFS2 on NAND flash
  2002-01-31 10:22   ` David Woodhouse
@ 2002-01-31 12:26     ` Thomas Gleixner
  2002-01-31 14:29       ` Thomas Gleixner
  2002-01-31 14:30       ` David Woodhouse
  0 siblings, 2 replies; 6+ messages in thread
From: Thomas Gleixner @ 2002-01-31 12:26 UTC (permalink / raw)
  To: David Woodhouse, gleixner; +Cc: linux-mtd

On Thursday, 31. January 2002 11:22, David Woodhouse wrote:
> Be careful. You have nothing there to make sure that it doesn't violate the
> constraints on the number of write cycles per page. You have no ECC, you
> have no real chance of it working in the wild.
I know that, and I was trying to put a workaround for the write cycle problem
into the NAND driver. I think that's the correct location for it. Are there
other chips with the same problem, or is it specific to NAND?
My current solution would be:
In nand.c, the write function checks the number of writes to a page. If there
have already been three writes to this page, the function reads back the
block data, erases the block and writes the block data back to the chip.

Is this also a problem for JFFS1? I ran JFFS1 for a couple of weeks on my
board and did not have a single problem.

> Also note that the locking in jffs2_garbage_collect_deletion_dirent() is
> broken. We need to lock the erase_completion_lock while we go through the
> list, and drop the lock when we read the nodes.
I'm not deep enough into this to see the necessary change. Could you please
explain in more detail?

> All the writev stuff was put there for the benefit of NAND flash - so yes,
> nobody's used it yet. This problem had come up recently in the eCos port,
> but the fix hadn't yet propagated to the main tree.
No problem; it only took some time to understand what happens there.

Thomas
__________________________________________________
Thomas Gleixner, autronix automation GmbH
auf dem berg 3, d-88690 uhldingen-muehlhofen
fon: +49 7556 919891 , fax: +49 7556 919886
mail: gleixner@autronix.de, http://www.autronix.de  


* Re: JFFS2 on NAND flash
  2002-01-31 12:26     ` Thomas Gleixner
@ 2002-01-31 14:29       ` Thomas Gleixner
  2002-01-31 14:30       ` David Woodhouse
  1 sibling, 0 replies; 6+ messages in thread
From: Thomas Gleixner @ 2002-01-31 14:29 UTC (permalink / raw)
  To: David Woodhouse; +Cc: linux-mtd

On Thursday, 31. January 2002 13:26, Thomas Gleixner wrote:
> Is this also a problem for JFFS1? I ran JFFS1 for a couple of weeks on my
> board and did not have a single problem.

Stupid question! It happens on JFFS1 too. I built in a check for consecutive
page writes, and it does happen sometimes. The maximum number of consecutive
writes to a page was 4. But there was never any loss of data or anything like
that. Strange!
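
A check of this kind could be sketched roughly as follows. This is an
illustration only, not the code used on the board: the write_tracker
structure, the tracker_* functions and the toy geometry are all
hypothetical. The limit of three writes per page is the figure used in
this thread; the counters are assumed to live in RAM, so they would be
lost on reboot.

```c
#include <assert.h>
#include <string.h>

#define PAGES_PER_BLOCK 4	/* toy geometry, chosen for illustration */
#define NUM_PAGES       16
#define MAX_PAGE_WRITES 3	/* limit mentioned in this thread */

/* Per-page write counters, reset whenever the containing block is erased. */
struct write_tracker {
	unsigned char count[NUM_PAGES];
	unsigned int max_seen;	/* highest count observed on any page */
};

static void tracker_init(struct write_tracker *t)
{
	memset(t, 0, sizeof *t);
}

/* Record one write to a page; returns 1 if the page is now over the limit. */
static int tracker_note_write(struct write_tracker *t, unsigned int page)
{
	assert(page < NUM_PAGES);
	if (t->count[page] < 255)
		t->count[page]++;
	if (t->count[page] > t->max_seen)
		t->max_seen = t->count[page];
	return t->count[page] > MAX_PAGE_WRITES;
}

/* Record a block erase: every page in that block starts from zero again. */
static void tracker_note_erase(struct write_tracker *t, unsigned int block)
{
	assert(block < NUM_PAGES / PAGES_PER_BLOCK);
	memset(&t->count[block * PAGES_PER_BLOCK], 0, PAGES_PER_BLOCK);
}
```

Calling tracker_note_write from the page write path and
tracker_note_erase from the erase path would flag the fourth write to a
page between erases, and max_seen would report the worst case seen so far.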

> I know that, and I was trying to put a workaround for the write cycle
> problem into the NAND driver. I think that's the correct location for it.
> Are there other chips with the same problem, or is it specific to
> NAND?
> My current solution would be:
> In nand.c, the write function checks the number of writes to a page. If
> there have already been three writes to this page, the function reads back
> the block data, erases the block and writes the block data back to the chip.

It works, but if power is lost between the erase and the writeback, the
filesystem is left corrupted.
There would be an easy solution for this inside the NAND driver. The driver
reserves some blocks at the end of the device and uses them to store the
block data. Once the data is successfully stored, the block concerned can be
erased safely and the data written back into it. If a reset happens between
erase and writeback, the data is safe. When bringing up the NAND driver we
could check whether there is a saved block in the buffer and copy it back to
its original location before mounting. This would keep all these NAND
problems out of jffs and jffs2.
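
The scheme described above can be sketched as a small simulation. This is
only an illustration under stated assumptions, not driver code: toy_nand,
refresh_block and recover are hypothetical names, the geometry is a toy
one, a single reserved block is used, and the saved_for marker is assumed
to be stored persistently on the device in some way that is not shown
here. The stop_after argument simulates a power loss after a given step.

```c
#include <assert.h>
#include <string.h>

#define BLOCK_SIZE 16			/* toy size, not a real NAND geometry */
#define NUM_BLOCKS 4
#define SPARE      (NUM_BLOCKS - 1)	/* last block reserved as backup */
#define NO_BLOCK   (-1)

struct toy_nand {
	unsigned char blocks[NUM_BLOCKS][BLOCK_SIZE];
	int saved_for;	/* block whose copy sits in SPARE, or NO_BLOCK;
			 * assumed to survive power loss */
};

static void toy_erase(struct toy_nand *dev, int b)
{
	memset(dev->blocks[b], 0xff, BLOCK_SIZE);
}

/*
 * Refresh block b via the spare block. stop_after simulates power loss:
 * 1 = after the backup is stored, 2 = after the erase, 0 = run to the end.
 */
static void refresh_block(struct toy_nand *dev, int b, int stop_after)
{
	/* Step 1: store a copy in the spare block, then set the marker. */
	toy_erase(dev, SPARE);
	memcpy(dev->blocks[SPARE], dev->blocks[b], BLOCK_SIZE);
	dev->saved_for = b;
	if (stop_after == 1)
		return;

	/* Step 2: the original can now be erased safely. */
	toy_erase(dev, b);
	if (stop_after == 2)
		return;

	/* Step 3: write the data back and clear the marker. */
	memcpy(dev->blocks[b], dev->blocks[SPARE], BLOCK_SIZE);
	dev->saved_for = NO_BLOCK;
}

/* Run at driver start, before mounting: finish any interrupted refresh. */
static void recover(struct toy_nand *dev)
{
	if (dev->saved_for == NO_BLOCK)
		return;
	toy_erase(dev, dev->saved_for);
	memcpy(dev->blocks[dev->saved_for], dev->blocks[SPARE], BLOCK_SIZE);
	dev->saved_for = NO_BLOCK;
}
```

If the simulated power loss hits after step 2, the block is empty but the
marker still points at it, so recover() restores the contents from the
spare block. One trade-off of this sketch is that the reserved block is
erased on every refresh and would therefore wear faster than the others.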

Thomas
_________________________________________________
Thomas Gleixner, autronix automation GmbH
auf dem berg 3, d-88690 uhldingen-muehlhofen
fon: +49 7556 919891 , fax: +49 7556 919886
mail: gleixner@autronix.de, http://www.autronix.de  


* Re: JFFS2 on NAND flash
  2002-01-31 12:26     ` Thomas Gleixner
  2002-01-31 14:29       ` Thomas Gleixner
@ 2002-01-31 14:30       ` David Woodhouse
  1 sibling, 0 replies; 6+ messages in thread
From: David Woodhouse @ 2002-01-31 14:30 UTC (permalink / raw)
  To: gleixner; +Cc: jffs-dev

( moved to jffs-dev list )

gleixner@autronix.de said:
> I know that, and I was trying to put a workaround for the write cycle
> problem into the NAND driver. I think that's the correct location for
> it. Are there other chips with the same problem, or is it
> specific to NAND?

Some new ST chips have similar problems, I think. You can only write once 
to any given 8-byte region before erasing it.

> My current solution would be: In nand.c, the
> write function checks the number of writes to a page. If there have
> already been three writes to this page, the function reads back the block
> data, erases the block and writes the block data back to the chip.

You can't do it like that. What if you lose power while the block is erased,
but before you have written the data back again? You lose all the
information from the remainder of that erase block.

See http://mhonarc.axis.se/jffs-dev/msg01140.html

>  Is this also a problem for JFFS1? I ran JFFS1 for a couple of weeks
> on my board and did not have a single problem.

I think you get away with it on JFFS1. You don't have nodes small enough 
that you'll fit ten of them in a page, and you can go for a couple of weeks 
without the lack of ECC biting you.

gleixner@autronix.de said:
> > Also note that the locking in jffs2_garbage_collect_deletion_dirent()
> > is broken. We need to lock the erase_completion_lock while we go through
> > the list, and drop the lock when we read the nodes. 

> I'm not deep enough into this to see the necessary change. Could you
> please explain in more detail?

On completion of a block erase, we walk through all the node lists, and we 
remove and free all nodes in the block which has just been erased. Anything 
which walks the node lists (as this function does) must hold the
erase_completion_lock, so that the erase can't happen simultaneously.

So we must hold the erase_completion_lock while walking the lists, and 
because it's a spinlock we must unlock it before calling the flash read 
function. Then when we get it again the node we were looking at and the 
next node in the list may both have gone and been freed - we have to start 
again from the beginning.
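
The lock, drop and restart pattern described above might look roughly like
this in a userspace sketch. It is not the JFFS2 code: toy_lock stands in
for the erase_completion_lock spinlock, and node, node_list, walk_and_read
and demo_read are hypothetical names. The checked flag is an assumption
added here so that restarting from the head always makes progress; the
real function may track its position differently.

```c
#include <assert.h>
#include <stddef.h>

/* Toy lock standing in for erase_completion_lock; asserts catch misuse. */
struct toy_lock { int held; };
static void toy_lock_acquire(struct toy_lock *l) { assert(!l->held); l->held = 1; }
static void toy_lock_release(struct toy_lock *l) { assert(l->held); l->held = 0; }

struct node {
	struct node *next;
	unsigned int offset;	/* where the node lives on flash */
	int checked;		/* assumption: marks nodes already read */
};

struct node_list {
	struct toy_lock lock;
	struct node *head;
};

/* Slow flash read; must never be called with the lock held. */
typedef int (*read_fn)(struct node_list *list, unsigned int offset);

/* Read every node once. Returns the number of reads performed. */
static int walk_and_read(struct node_list *list, read_fn do_read)
{
	int reads = 0;

	toy_lock_acquire(&list->lock);
restart:
	for (struct node *n = list->head; n; n = n->next) {
		if (n->checked)
			continue;
		unsigned int ofs = n->offset;	/* copy what we need under the lock */
		n->checked = 1;			/* mark before dropping the lock */
		toy_lock_release(&list->lock);
		do_read(list, ofs);		/* list may change while unlocked */
		reads++;
		toy_lock_acquire(&list->lock);
		goto restart;			/* n and n->next may have been freed */
	}
	toy_lock_release(&list->lock);
	return reads;
}

/*
 * Demo read: checks that the lock is free and, on its first call,
 * simulates an erase completing by unlinking the second node.
 */
static int demo_read(struct node_list *list, unsigned int offset)
{
	static int erased_once;

	(void)offset;
	assert(!list->lock.held);
	if (!erased_once && list->head && list->head->next) {
		toy_lock_acquire(&list->lock);
		list->head->next = list->head->next->next;
		toy_lock_release(&list->lock);
		erased_once = 1;
	}
	return 0;
}
```

With a three-node list and demo_read as the callback, the second node
disappears during the first read; because the walk restarts from the head
instead of following a stale pointer, it only ever touches nodes that are
still on the list.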

--
dwmw2
