* NAND flash and JFFS(2)
@ 2002-02-05 14:41 Veli-Pekka Ylönen
2002-02-05 15:30 ` David Woodhouse
0 siblings, 1 reply; 20+ messages in thread
From: Veli-Pekka Ylönen @ 2002-02-05 14:41 UTC (permalink / raw)
To: linux-mtd
I read the TODO file in the JFFS2 sources and found this:
- NAND flash support:
  - Fix locking in jffs2_garbage_collect_deletion_dirent().
  - Move CLEANMARKER into the 'spare' area.
  - Write batching - build up a NAND-page worth of data and write out all in
    one go, using the hardware ECC or block-based software ECC. This gives us
    some interesting problems, but it's not that bad:
    - When we go to erase a block from which we've been garbage-collecting,
      we have to make sure that the nodes in it _really_ are obsolete, and
      the new node which finally obsoletes the block we want to erase isn't
      still waiting in the write-buffer. We can do this by sticking such
      blocks not on the erase_pending_list, but on a new erase_pending_wbuf
      list, and then moving them to the erase_pending_list when the buffer is
      flushed.
    - fsync() becomes a non-NOP.
    - Deal with write errors. Data don't get lost - we just have to write
      the affected node(s) out again somewhere else.
Is somebody already working on this?
I also read in the archive that JFFS should support NAND. Is this true?
I have made the raw NAND interface work, but JFFS and JFFS2 don't
do the batching and try to write multiple times to the same
page.
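The write batching from the TODO item can be sketched roughly as follows. This is only an illustration under stated assumptions: the structure, the function names, the 512-byte page size and the 0xff padding are all hypothetical and are not taken from the JFFS2 sources. Blocks that become obsolete while data is still buffered are parked on an "erase_pending_wbuf" counter and only become erasable after the next flush.

```c
#include <stddef.h>
#include <string.h>

#define NAND_PAGE_SIZE 512u   /* assumed page size */

/* Hypothetical state; the names are illustrative, not JFFS2's. */
struct wbuf_state {
    unsigned char buf[NAND_PAGE_SIZE];
    size_t len;                   /* bytes waiting in the buffer */
    unsigned pages_written;       /* stand-in for real page programs */
    unsigned erase_pending_wbuf;  /* blocks waiting for the next flush */
    unsigned erase_pending;       /* blocks that are safe to erase */
};

/* Pad the tail with 0xff (bits stay unprogrammed) and write one page. */
static void wbuf_flush(struct wbuf_state *w)
{
    if (w->len == 0)
        return;
    memset(w->buf + w->len, 0xff, NAND_PAGE_SIZE - w->len);
    w->pages_written++;
    w->len = 0;
    /* Every buffered node is on flash now, so parked blocks may be erased. */
    w->erase_pending += w->erase_pending_wbuf;
    w->erase_pending_wbuf = 0;
}

/* Collect node data and write it out one full page at a time. */
static void wbuf_write(struct wbuf_state *w, const unsigned char *data, size_t n)
{
    while (n > 0) {
        size_t room = NAND_PAGE_SIZE - w->len;
        size_t chunk = n < room ? n : room;

        memcpy(w->buf + w->len, data, chunk);
        w->len += chunk;
        data += chunk;
        n -= chunk;
        if (w->len == NAND_PAGE_SIZE)
            wbuf_flush(w);
    }
}

/* A block became fully obsolete: erase now only if nothing is buffered. */
static void block_obsolete(struct wbuf_state *w)
{
    if (w->len > 0)
        w->erase_pending_wbuf++;
    else
        w->erase_pending++;
}
```

With this shape each page is programmed exactly once, which is the property the raw NAND interface needs.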
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: NAND flash and JFFS(2)
2002-02-05 14:41 NAND flash and JFFS(2) Veli-Pekka Ylönen
@ 2002-02-05 15:30 ` David Woodhouse
2002-02-05 17:28 ` Thomas Gleixner
2002-02-05 17:35 ` Veli-Pekka Ylönen
0 siblings, 2 replies; 20+ messages in thread
From: David Woodhouse @ 2002-02-05 15:30 UTC (permalink / raw)
To: Veli-Pekka Ylönen; +Cc: linux-mtd
On Tue, 5 Feb 2002, Veli-Pekka Ylönen wrote:
> I read the TODO file in the JFFS2 sources and found this:
>
> - NAND flash support:
>   - Fix locking in jffs2_garbage_collect_deletion_dirent().
>   - Move CLEANMARKER into the 'spare' area.
>   - Write batching - build up a NAND-page worth of data and write out all in
>     one go, using the hardware ECC or block-based software ECC. This gives us
>     some interesting problems, but it's not that bad:
>     - When we go to erase a block from which we've been garbage-collecting,
>       we have to make sure that the nodes in it _really_ are obsolete, and
>       the new node which finally obsoletes the block we want to erase isn't
>       still waiting in the write-buffer. We can do this by sticking such
>       blocks not on the erase_pending_list, but on a new erase_pending_wbuf
>       list, and then moving them to the erase_pending_list when the buffer is
>       flushed.
>     - fsync() becomes a non-NOP.
>     - Deal with write errors. Data don't get lost - we just have to write
>       the affected node(s) out again somewhere else.
>
> Is somebody already working on this?
Vaguely.
I've had a go at some of these - I got terminally bored on the plane to
linux.conf.au so I've just checked in code to the experimental
jffs2-nand-branch in CVS which does the CLEANMARKER bit, although we
probably want to check where we should put the cleanmarker in NAND flash
to avoid all the hardware ECC arrangements. I know this one is OK with the
DiskOnChip but don't remember offhand where SmartMedia puts its ECC data.
I've done some of the write batching too - we have code to set up a write
buffer, flush it when it's full, etc., do the erase_pending_wbuf_list
thing, etc.
It doesn't yet deal nicely with write errors, although we know how, and I
haven't implemented fsync(). Neither have I tested that any of it does
anything more than actually compile - it's just there as a more verbose
outline of what I think needs doing than what I put in the TODO list.
The whole thing wants a locking audit too. It's not just the
gc_deletion_dirent code, although that is known broken and a little hard
to fix - we need to drop the lock for reading the block in question, but
then when we lock again we have to start again from the beginning of the
list - and how do we know where we got to, given that the block we just
looked at, the block before it and the block after it could all have gone
from the list?
Perhaps we want to make the erase_completion_lock into a semaphore. I'm
sure we could persuade the MTD maintainer to retrospectively declare that
all erase callbacks must happen in process context. In fact, the freeing
of the node refs in erased blocks already happens from process context -
we could possibly just invent a new semaphore which protects us from that.
> I read also in the archive that JFFS should support NAND. Is this true?
>
> I have made the raw NAND interface work, but JFFS and JFFS2 don't
> do the batching and try to write multiple times to the same
> page.
JFFS ought to be OK on chips that can take ten write cycles per 512-byte
page, because it uses writev to ensure that nodes are written in one go,
and no node will be less than 52 bytes.
On NAND flash chips which can do fewer than ten writes per page, JFFS
probably won't work either.
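The arithmetic behind the "ten write cycles" figure can be sketched as below. This is a back-of-the-envelope illustration, not code from JFFS: the function names are made up, and it assumes the first node starts exactly on a page boundary (a node straddling the boundary could add one more write to the page).

```c
/* Writes that land in one page if each write is at least min_node bytes
 * and the first write starts on the page boundary: ceil(page / min_node). */
static unsigned writes_per_page(unsigned page_size, unsigned min_node)
{
    return (page_size + min_node - 1) / min_node;
}

/* Nonzero if unbatched node writes stay within the chip's per-page limit. */
static int unbatched_writes_ok(unsigned page_size, unsigned min_node,
                               unsigned chip_limit)
{
    return writes_per_page(page_size, min_node) <= chip_limit;
}
```

For a 512-byte page and 52-byte minimum nodes, writes_per_page() gives 10, so a chip that tolerates fewer partial writes per page fails the check.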
--
dwmw2
* Re: NAND flash and JFFS(2)
2002-02-05 15:30 ` David Woodhouse
@ 2002-02-05 17:28 ` Thomas Gleixner
2002-02-05 21:18 ` David Woodhouse
2002-02-05 17:35 ` Veli-Pekka Ylönen
1 sibling, 1 reply; 20+ messages in thread
From: Thomas Gleixner @ 2002-02-05 17:28 UTC (permalink / raw)
To: David Woodhouse, Veli-Pekka Ylönen; +Cc: linux-mtd, jffs-dev
On Tuesday, 5. February 2002 16:30, David Woodhouse wrote:
> I've had a go at some of these - I got terminally bored on the plane to
> linux.conf.au so I've just checked in code to the experimental
> jffs2-nand-branch in CVS which does the CLEANMARKER bit, although we
> probably want to check where we should put the cleanmarker in NAND flash
> to avoid all the hardware ECC arrangements. I know this one is OK with the
> DiskOnChip but don't remember offhand where SmartMedia puts its ECC data.
A SmartMedia card is nothing more than a raw NAND chip assembled into a thin
plastic card. The "Smart" thing is the adaptor they sell for the PC. This
adaptor simulates a DOS-FAT filesystem. The filesystem on the SmartMedia card
itself is similar to DOS-FAT, with some extensions for bad block management. I
doubt that it's possible to put jffs2 on top of these adaptors. If you use a
SmartMedia card directly in your hardware as a removable NAND chip with software
ECC, you can put the ECC data wherever you like in the spare area.
I think we can keep the cleanmarker where it is. The code actually writes the
next inode after the cleanmarker, and there is no problem. All NAND devices I
reviewed allow at least 2 consecutive writes to a page. If we use different
write modes for the cleanmarker and the full page (which we actually do), we
can skip the ECC when writing the cleanmarker and write the ECC when we write
the page data. I spent an additional byte in the ECC area, which indicates
whether ECC is available or not.
This byte is 0xff when we have written the cleanmarker; after writing the
page data with ECC I write it to 0. On read I check whether the ECC available
flag is != 0xff.
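The two-step write with an "ECC available" flag could look roughly like this. It is a sketch under assumptions: the flag offset, the structure and all function names are hypothetical, the ECC bytes are simply passed in rather than computed, and NAND programming is modelled as a bitwise AND because a program operation can only clear bits.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_BYTES      512
#define SPARE_BYTES     16
#define ECC_FLAG_OFFSET 7     /* hypothetical position of the flag byte */

struct nand_page {
    uint8_t data[PAGE_BYTES];
    uint8_t spare[SPARE_BYTES];
};

/* Erased flash reads back as all 0xff. */
static void page_erase(struct nand_page *p)
{
    memset(p->data, 0xff, sizeof p->data);
    memset(p->spare, 0xff, sizeof p->spare);
}

/* Programming can only turn 1 bits into 0 bits. */
static void nand_program(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] &= src[i];
}

/* First write: cleanmarker only, no ECC, flag byte stays 0xff. */
static void write_cleanmarker(struct nand_page *p, const uint8_t *cm, size_t len)
{
    nand_program(p->data, cm, len);
}

/* Second write: full page data plus ECC, then clear the flag to 0. */
static void write_page_with_ecc(struct nand_page *p, const uint8_t *data,
                                const uint8_t ecc[3])
{
    const uint8_t flag = 0x00;

    nand_program(p->data, data, PAGE_BYTES);
    nand_program(p->spare, ecc, 3);
    nand_program(&p->spare[ECC_FLAG_OFFSET], &flag, 1);
}

/* On read: ECC is only checked when the flag is no longer 0xff. */
static int page_has_ecc(const struct nand_page *p)
{
    return p->spare[ECC_FLAG_OFFSET] != 0xff;
}
```

The second write has to carry the cleanmarker bytes unchanged (or 0xff) at the start of the page, so that programming them again does not alter them.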
> I've done some of the write batching too - we have code to set up a write
> buffer, flush it when it's full, etc., do the erase_pending_wbuf_list
> thing, etc.
I got this out of CVS yesterday and got it running.
> It doesn't yet deal nicely with write errors, although we know how,
That's a bit of a headache; I don't know exactly how to handle it. The only
tricky thing is a write error when we flush the buffer and have data from the
previous write in the buffer. A write error which occurs on non-buffered data
can be handled by the existing JFFS2 code already. When the flush-buffer write
fails, we must return retlen = 0. The upper layer in write.c can then
check whether c->wbuf_len is > 0 and start rescuing the data.
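A rough sketch of that error path follows. It is not the code in write.c: the context structure, the flush_fails switch and both function names are invented for illustration, and the sketch assumes a node always fits into the room left in the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical context; only wbuf_len mirrors a name used in the thread. */
struct wctx {
    unsigned char wbuf[512];
    size_t wbuf_len;      /* bytes from earlier writes still buffered */
    int flush_fails;      /* test switch: make the next page write fail */
};

/* Returns 0 on success. On a failed flush it returns -1 with *retlen = 0
 * and leaves wbuf_len counting only the older, already accepted data. */
static int buffered_write(struct wctx *c, const unsigned char *buf,
                          size_t len, size_t *retlen)
{
    size_t room = sizeof c->wbuf - c->wbuf_len;

    assert(len <= room);  /* sketch: nodes are not split across pages */
    *retlen = 0;
    memcpy(c->wbuf + c->wbuf_len, buf, len);
    if (len == room) {            /* the page is full, so flush it */
        if (c->flush_fails)
            return -1;
        c->wbuf_len = 0;          /* page programmed successfully */
    } else {
        c->wbuf_len += len;
    }
    *retlen = len;
    return 0;
}

/* Upper layer: returns how many bytes must be written again elsewhere. */
static size_t write_node(struct wctx *c, const unsigned char *buf, size_t len)
{
    size_t retlen;
    size_t to_rescue;

    if (buffered_write(c, buf, len, &retlen) == 0)
        return 0;
    to_rescue = len;              /* the node that triggered the flush */
    if (c->wbuf_len > 0) {        /* plus earlier nodes that never hit flash */
        to_rescue += c->wbuf_len;
        c->wbuf_len = 0;
    }
    return to_rescue;
}
```

The point of returning retlen = 0 is that the caller can tell a failed flush apart from a partial write and knows the whole buffer content is suspect.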
> and I haven't implemented fsync().
I implemented a basic version and it works.
> > I read also in the archive that JFFS should support NAND. Is this true?
> JFFS ought to be OK on chips that can take ten write cycles per 512-byte
> page, because it uses writev to ensure that nodes are written in one go,
> and no node will be less than 52 bytes.
> On NAND flash chips which can do fewer than ten writes per page, JFFS
> probably won't work either.
I tried it already and it does work, as long as you have enough free space on
the chip and you don't run into garbage collection. If you do a copy, remove,
copy loop, your filesystem is totally corrupted after the first block
wraparound.
Thomas
__________________________________________________
Thomas Gleixner, autronix automation GmbH
auf dem berg 3, d-88690 uhldingen-muehlhofen
fon: +49 7556 919891 , fax: +49 7556 919886
mail: gleixner@autronix.de, http://www.autronix.de
* Re: NAND flash and JFFS(2)
2002-02-05 15:30 ` David Woodhouse
2002-02-05 17:28 ` Thomas Gleixner
@ 2002-02-05 17:35 ` Veli-Pekka Ylönen
1 sibling, 0 replies; 20+ messages in thread
From: Veli-Pekka Ylönen @ 2002-02-05 17:35 UTC (permalink / raw)
To: linux-mtd
I am using a 64 MByte chip and only 3 partial page writes are allowed. I
guess that Samsung models from 16 MB to 64 MB allow 5 partial page writes and
the Toshiba 64 MB chip - the one I am using - allows 3.
Is there any possibility to increase the node size in JFFS to make it work
with my chip as well?
On Wed, 6 Feb 2002, David Woodhouse wrote:
>
> JFFS ought to be OK on chips that can take ten write cycles per 512-byte
> page, because it uses writev to ensure that nodes are written in one go,
> and no node will be less than 52 bytes.
>
> On NAND flash chips which can do fewer than ten writes per page, JFFS
> probably won't work either.
* Re: NAND flash and JFFS(2)
2002-02-05 17:28 ` Thomas Gleixner
@ 2002-02-05 21:18 ` David Woodhouse
0 siblings, 0 replies; 20+ messages in thread
From: David Woodhouse @ 2002-02-05 21:18 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Veli-Pekka Ylönen, linux-mtd, jffs-dev
On Tue, 5 Feb 2002, Thomas Gleixner wrote:
> If you use a SmartMedia card directly in your hardware as a removable
> NAND chip with software ECC, you can put the ECC data wherever you like in
> the spare area.
I was assuming the existence of SmartMedia adaptors which do only the ECC,
allowing you to do the rest of the SmartMedia format in software. In that
case, it would be best to allow the ECC to go where SmartMedia wants it to
go, so we don't _have_ to do software ECC.
> I think we can keep the cleanmarker where it is. The code actually writes the
> next inode after the cleanmarker, and there is no problem. All NAND devices I
> reviewed allow at least 2 consecutive writes to a page. If we use different
> write modes for the cleanmarker and the full page (which we actually do), we
> can skip the ECC when writing the cleanmarker and write the ECC when we write
> the page data. I spent an additional byte in the ECC area, which indicates
> whether ECC is available or not.
> This byte is 0xff when we have written the cleanmarker; after writing the
> page data with ECC I write it to 0. On read I check whether the ECC available
> flag is != 0xff.
This would be feasible. I thought I'd heard of NAND flash devices
which only allowed one write per page though - and I've already
written the code to move the cleanmarker.
> > I've done some of the write batching too - we have code to set up a write
> > buffer, flush it when it's full, etc., do the erase_pending_wbuf_list
> > thing, etc.
> I got this out of CVS yesterday and got it running.
Cool.
> > It doesn't yet deal nicely with write errors, although we know how,
> That's a bit of a headache; I don't know exactly how to handle it. The only
> tricky thing is a write error when we flush the buffer and have data from the
> previous write in the buffer. A write error which occurs on non-buffered data
> can be handled by the existing JFFS2 code already. When the flush-buffer write
> fails, we must return retlen = 0. The upper layer in write.c can then
> check whether c->wbuf_len is > 0 and start rescuing the data.
The comments say what you need to do in that case - still a PITA to
actually write though.
> > and I haven't implemented fsync().
> I implemented a basic version and it works.
How did you implement this?
> > > I read also in the archive that JFFS should support NAND. Is this true?
> > JFFS ought to be OK on chips that can take ten write cycles per 512-byte
> > page, because it uses writev to ensure that nodes are written in one go,
> > and no node will be less than 52 bytes.
> > On NAND flash chips which can do fewer than ten writes per page, JFFS
> > probably won't work either.
> I tried it already and it does work, as long as you have enough free space on
> the chip and you don't run into garbage collection. If you do a copy, remove,
> copy loop, your filesystem is totally corrupted after the first block
> wraparound.
This is JFFS1, yes?
--
dwmw2
* Re: NAND flash and JFFS(2)
[not found] <02020523405103.11497@thomas>
@ 2002-02-06 4:54 ` David Woodhouse
2002-02-06 22:47 ` Thomas Gleixner
0 siblings, 1 reply; 20+ messages in thread
From: David Woodhouse @ 2002-02-06 4:54 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Veli-Pekka Ylönen, linux-mtd, jffs-dev
On Tue, 5 Feb 2002, Thomas Gleixner wrote:
> There are different types of adaptors available:
> - Very smart, with a controller (useless; they force this DOS/FAT-style fs)
> - Less smart, with ECC hardware (OK, but we should have a close look at it
> regarding the ECC stuff, especially writes to the ECC area and consecutive
> writes to a page). I have schematics and CPLD code. I'll try to figure it out.
We should be able to use their ECC hardware. As with the DiskOnChip, this
effectively limits us to one write per page, even if the NAND chip can do
more than that.
> - Simple, without anything, like on the Cirrus evaluation boards. This is
> exactly the same as a soldered NAND chip. (No restrictions)
We already have software ECC code which we can use on these.
> Maybe we can use the SmartMedia scheme for ECC placement in general for all
> NAND chips.
Except DiskOnChip, which has its own layout.
> > This would be feasible. I thought I'd heard of NAND flash devices
> > which only allowed one write per page though - and I've already
> > written the code to move the cleanmarker.
> I'll have a look at the data sheets again. I only know about a 2 writes/page
> limit. But maybe we need this when we have a less smart adaptor with ECC hardware.
Yes, the ECC makes supporting multiple writes quite hard. We could invent
new node types and do ECC per-node, but I think it would be better to do
block-based ECC and use the hardware capabilities, as we have to do write
batching into blocks anyway.
> I already had a look at your changes.
> Is there a possibility to get a mail when CVS updates are done?
http://lists.infradead.org/mailman/listinfo/linux-mtd-cvs
>
> > > > I've done some of the write batching too - we have code to set up a
> > Cool.
> tricky hack, but now oops is gone.
OK. Could you show me the changes you had to make?
> > > That's a bit of headache i don't know exactly how to handle. The only
> > > tricky thing is a write error when we flush the buffer and have data of
> > > the previous write in the buffer.
> >
> > The comments say what you need to do in that case - still a PITA to
> > actually write though.
> I know, but I'm not deep enough inside the jffs2 stuff to do this. If you
> provide a basic implementation, I could do the same as for the wbuf stuff.
OK, I'll have a go at it. I was leaving it till last, because it's not
particularly likely to happen in real life. Once the rest works, I'll be
all happy and motivated to do the evil bits of the error handling :)
> > > > and I haven't implemented fsync().
> > > I implemented a basic version and it works.
> > How did you implement this?
> I check whether the inode of the file to be synced is partially or
> fully in the write buffer. Then I pad and flush the buffer and adjust
> c->nextsize->free_size and dirty_size. It's simple, but it works.
OK. It would be nice if we could trigger GC to fill the wbuf, rather than
padding - and only pad it if we really can't use up the space with GC
writes.
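The "fill from GC first, pad only the remainder" idea could be sketched like this. Everything here is an assumption made for illustration: the structure, the field layout and the gc_bytes_available parameter are hypothetical, and the sketch only accounts for the padding, on the assumption that the GC write path does its own accounting for the nodes it moves.

```c
#include <stddef.h>

/* Hypothetical accounting for the block currently being written. */
struct wbuf_acct {
    size_t len;         /* bytes currently in the write buffer */
    size_t page_size;   /* NAND page size */
    size_t free_size;   /* free bytes left in the current eraseblock */
    size_t dirty_size;  /* wasted bytes in the current eraseblock */
};

/* Sync the write buffer: let GC fill what it can, pad the rest.
 * gc_bytes_available is how much data GC could move right now.
 * Returns the number of padding bytes that were needed. */
static size_t sync_wbuf(struct wbuf_acct *w, size_t gc_bytes_available)
{
    size_t room, from_gc, pad;

    if (w->len == 0)
        return 0;                       /* nothing buffered, nothing to do */

    room = w->page_size - w->len;
    from_gc = gc_bytes_available < room ? gc_bytes_available : room;
    pad = room - from_gc;

    /* Padding uses up free space and is dirty until the block is erased. */
    w->free_size -= pad;
    w->dirty_size += pad;
    w->len = 0;                         /* page written out */
    return pad;
}
```

Called from an fsync() path, this still returns quickly when GC has nothing to offer, because the fallback is plain padding.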
--
dwmw2
* Re: NAND flash and JFFS(2)
2002-02-06 4:54 ` David Woodhouse
@ 2002-02-06 22:47 ` Thomas Gleixner
2002-02-06 22:55 ` David Woodhouse
0 siblings, 1 reply; 20+ messages in thread
From: Thomas Gleixner @ 2002-02-06 22:47 UTC (permalink / raw)
To: David Woodhouse; +Cc: Veli-Pekka Ylönen, linux-mtd, jffs-dev
On Wednesday, 6. February 2002 05:54, David Woodhouse wrote:
>
> We should be able to use their ECC hardware. As with the DiskOnChip, this
> effectively limits us to one write per page, even if the NAND chip can do
> more than that.
The ECC hardware does not limit us to one write. The ECC hardware just
calculates the ECC if requested, and you must read it back out of the CPLD.
Then you program it into the spare area. On read you must also enable the ECC
hardware, read the page, read the ECC .... You can also read/write without
ECC. There are no restrictions on which part of the spare area to use. You can
also do everything in software. Anyway, we need a different driver for these
adaptors, because they implement the access to the select lines in their
CPLD.
> > Maybe we can use the SmartMedia scheme for ECC placement in general for
> > all NAND chips.
> Except DiskOnChip, which has its own layout.
OK
> > I'll have a look at the data sheets again. I only know about a 2 writes/page
> > limit. But maybe we need this when we have a less smart adaptor with ECC
> > hardware.
> Yes, the ECC makes supporting multiple writes quite hard. We could invent
> new node types and do ECC per-node, but I think it would be better to do
> block-based ECC and use the hardware capabilities, as we have to do write
> batching into blocks anyway.
See above
> OK. Could you show me the changes you had to make?
Should I put them into CVS on the jffs2-nand-branch or send them by mail?
> OK, I'll have a go at it. I was leaving it till last, because it's not
> particularly likely to happen in real life. Once the rest works, I'll be
> all happy and motivated to do the evil bits of the error handling :)
Sounds good to me.
> OK. It would be nice if we could trigger GC to fill the wbuf, rather than
> padding - and only pad it if we really can't use up the space with GC
> writes.
Sure, but we must have something that makes sure that data is written
immediately to the flash if we have sensitive data loggers or configuration
data, where a loss is not acceptable.
Another thought about cleanmarker nodes. If we keep them in the page, we can
easily burn a jffs2 image into a raw flash. This makes life easier for
bootloaders....
Thomas
__________________________________________________
Thomas Gleixner, autronix automation GmbH
auf dem berg 3, d-88690 uhldingen-muehlhofen
fon: +49 7556 919891 , fax: +49 7556 919886
mail: gleixner@autronix.de, http://www.autronix.de
* Re: NAND flash and JFFS(2)
2002-02-06 22:47 ` Thomas Gleixner
@ 2002-02-06 22:55 ` David Woodhouse
2002-02-07 6:51 ` Thomas Gleixner
0 siblings, 1 reply; 20+ messages in thread
From: David Woodhouse @ 2002-02-06 22:55 UTC (permalink / raw)
To: gleixner; +Cc: Veli-Pekka Ylönen, linux-mtd, jffs-dev
On Wed, 6 Feb 2002, Thomas Gleixner wrote:
> The ECC hardware does not limit us to one write. The ECC hardware just
> calculates the ECC if requested, and you must read it back out of the CPLD.
> Then you program it into the spare area. On read you must also enable the ECC
> hardware, read the page, read the ECC .... You can also read/write without
> ECC. There are no restrictions on which part of the spare area to use.
True - but in order to implement ECC we have to either write partial
blocks _without_ ECC, until we fill the block and write the ECC data with
the 512th byte, or batch the writes into pages.
> Should I put them into CVS on the jffs2-nand-branch or send them by mail?
Mail's probably best at this stage - thanks.
> > OK. It would be nice if we could trigger GC to fill the wbuf, rather than
> > padding - and only pad it if we really can't use up the space with GC
> > writes.
> Sure, but we must have something that makes sure that data is written
> immediately to the flash if we have sensitive data loggers or configuration
> data, where a loss is not acceptable.
We can do that. Triggering a GC pass to try to fill the wbuf won't take
long - it can be done in the context of a fsync() call, and if we can't
find anything else to write out, _then_ we pad it and still return
(almost) immediately.
We probably need to rethink our (ab)use of the s_dirt member of the
superblock, and actually make write_super() do the sync too.
> Another thought about cleanmarker nodes. If we keep them in the page, we can
> easily burn a jffs2 image into a raw flash. This makes life easier for
> bootloaders....
Bootloaders need to know about JFFS2 anyway - currently they erase
blocks without putting a cleanmarker in, then Linux will re-erase all
those blocks again on first mount.
--
dwmw2
* Re: NAND flash and JFFS(2)
2002-02-06 22:55 ` David Woodhouse
@ 2002-02-07 6:51 ` Thomas Gleixner
2002-02-07 7:04 ` David Woodhouse
0 siblings, 1 reply; 20+ messages in thread
From: Thomas Gleixner @ 2002-02-07 6:51 UTC (permalink / raw)
To: David Woodhouse, gleixner; +Cc: Veli-Pekka Ylönen, linux-mtd, jffs-dev
On Wednesday, 6. February 2002 23:55, David Woodhouse wrote:
> On Wed, 6 Feb 2002, Thomas Gleixner wrote:
> True - but in order to implement ECC we have to either write partial
> blocks _without_ ECC, until we fill the block and write the ECC data with
> the 512th byte, or batch the writes into pages.
I mean we should keep the batch write. That's correct. But we can leave the
cleanmarker where it is now, write it without ECC and write the rest of the
page with ECC. On small NAND devices we have only 8 bytes of spare area per
page. We need 3 bytes for ECC and at least 1 byte for flags. Then the
cleanmarker area is reduced to 4 bytes. Is that enough?
> > Should I put them into CVS on the jffs2-nand-branch or send them by mail?
> Mail's probably best at this stage - thanks.
I will send it in the afternoon.
Thomas
__________________________________________________
Thomas Gleixner, autronix automation GmbH
auf dem berg 3, d-88690 uhldingen-muehlhofen
fon: +49 7556 919891 , fax: +49 7556 919886
mail: gleixner@autronix.de, http://www.autronix.de
* Re: NAND flash and JFFS(2)
2002-02-07 6:51 ` Thomas Gleixner
@ 2002-02-07 7:04 ` David Woodhouse
2002-02-11 13:42 ` Thomas Gleixner
0 siblings, 1 reply; 20+ messages in thread
From: David Woodhouse @ 2002-02-07 7:04 UTC (permalink / raw)
To: gleixner; +Cc: Veli-Pekka Ylönen, linux-mtd, jffs-dev
On Thu, 7 Feb 2002, Thomas Gleixner wrote:
> I mean we should keep the batch write. That's correct. But we can leave the
> cleanmarker where it is now, write it without ECC and write the rest of the
> page with ECC. On small NAND devices we have only 8 bytes of spare area per
> page. We need 3 bytes for ECC and at least 1 byte for flags. Then the
> cleanmarker area is reduced to 4 bytes. Is that enough?
Yeah, that's fine. We can do either - I'm not sure it matters too much.
--
dwmw2
* Re: NAND flash and JFFS(2)
2002-02-07 7:04 ` David Woodhouse
@ 2002-02-11 13:42 ` Thomas Gleixner
2002-02-11 13:53 ` David Woodhouse
0 siblings, 1 reply; 20+ messages in thread
From: Thomas Gleixner @ 2002-02-11 13:42 UTC (permalink / raw)
To: David Woodhouse; +Cc: linux-mtd, jffs-dev
On Thursday, 7. February 2002 08:04, David Woodhouse wrote:
> On Thu, 7 Feb 2002, Thomas Gleixner wrote:
> > I mean we should keep the batch write. That's correct. But we can leave
> > the cleanmarker where it is now, write it without ECC and write the rest
> > of the page with ECC. On small NAND devices we have only 8 bytes of spare
> > area per page. We need 3 bytes for ECC and at least 1 byte for flags. Then
> > the cleanmarker area is reduced to 4 bytes. Is that enough?
>
> Yeah, that's fine. We can do either - I'm not sure it matters too much.
I'm going to merge your current CVS code with my wbuf changes.
But we should decide what to do with the cleanmarker stuff and the usage of
the spare area.
On devices with 512 byte pagesize we have a spare area of 16 bytes. On smaller
devices with 256 byte pagesize we have only 8 bytes.
We already have some layouts for the spare area:

DOC
  Byte 0-5    ECC
  Byte 6-7    block used status (0x55,0x55)
  Byte 8-15   not assigned

SmartMediaCard, 256 byte pagesize
  Byte 0-5    not assigned
  Byte 6      bad block status
  Byte 7      not assigned

SmartMediaCard, 512 byte pagesize
  Byte 0-5    not assigned
  Byte 6      bad block status
  Byte 7      not assigned
  Byte 8-15   not assigned

Raw NAND chips, 256 byte pagesize
  Byte 0-7    not assigned

Raw NAND chips, 512 byte pagesize
  Byte 0-15   not assigned
The bad block status on the SmartMediaCards is programmed to a value
different from 0xff by the manufacturer. We should use this too.
Bad blocks on raw NAND chips are marked by values different from 0xff in the
first two pages of a block.
I don't know whether there's a similar technique on DOC.
I think the SmartMedia bad block indication is much more convenient than the
raw NAND solution. Maybe we can implement a utility to move the bad block
info into the spare area on raw NAND chips, so the bad block indication would
be the same. This could also be done at production time or be part of a
bootloader, because it has to be done only once.
The NAND driver should exclude these blocks from erase/write to keep this
information intact. But then we should read the bad block information at
mount time too. This can easily be done with the read_cleanmarker_oob
function. If we detect a bad block during operation, we could mark it the
same way, so the block is excluded forever.
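A mount-time bad block scan along those lines might look like the sketch below. It is illustrative only: the structure and function names are invented, the status byte offset is passed in as a parameter because the exact position is still under discussion, and the check covers the spare areas of the first two pages of each block.

```c
#include <stdbool.h>
#include <stdint.h>

#define SPARE_BYTES 16

/* Spare areas of the first two pages of one eraseblock (hypothetical). */
struct block_oob {
    uint8_t spare[2][SPARE_BYTES];
};

/* A block is bad if the status byte of either page is not 0xff. */
static bool block_is_bad(const struct block_oob *b, unsigned status_offset)
{
    for (int page = 0; page < 2; page++)
        if (b->spare[page][status_offset] != 0xff)
            return true;
    return false;
}

/* Fill bad_table at mount time; returns the number of bad blocks found. */
static unsigned scan_bad_blocks(const struct block_oob *blocks, unsigned nblocks,
                                unsigned status_offset, bool *bad_table)
{
    unsigned count = 0;

    for (unsigned i = 0; i < nblocks; i++) {
        bad_table[i] = block_is_bad(&blocks[i], status_offset);
        if (bad_table[i])
            count++;
    }
    return count;
}

/* The driver refuses to erase a bad block so the marker stays intact. */
static int erase_block(const bool *bad_table, unsigned blk)
{
    if (bad_table[blk])
        return -1;
    return 0;   /* a real driver would issue the erase here */
}
```

Keeping the table in memory means the erase and write paths never have to touch the spare area of a bad block again.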
Layout with cleanmarker in spare area:

DOC
  Byte 0-5    ECC
  Byte 6-7    block used status (0x55,0x55) or bad block (value != 0x55 && != 0xff)
  Byte 8-15   cleanmarker

SmartMedia and raw NAND, 512 byte pagesize
  Byte 0-5    ECC
  Byte 6      bad block status
  Byte 7      page data valid flag
  Byte 8-15   cleanmarker

SmartMedia and raw NAND, 256 byte pagesize
  Byte 0-2    ECC for even page
  Byte 3-5    cleanmarker
  Byte 6      bad block status
  Byte 7      page data valid flag
Some information about the current state: I ran a 24h test copying,
moving and deleting files with random power-downs. The filesystem is clean and
we have no data loss except some broken files on power-down. But nothing in the
world can help if we write a new file and lose power before we have finished it.
Thomas
__________________________________________________
Thomas Gleixner, autronix automation GmbH
auf dem berg 3, d-88690 uhldingen-muehlhofen
fon: +49 7556 919891 , fax: +49 7556 919886
mail: gleixner@autronix.de, http://www.autronix.de
* Re: NAND flash and JFFS(2)
2002-02-11 13:42 ` Thomas Gleixner
@ 2002-02-11 13:53 ` David Woodhouse
[not found] ` <0202121847440K.00764@thomas>
0 siblings, 1 reply; 20+ messages in thread
From: David Woodhouse @ 2002-02-11 13:53 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: linux-mtd, jffs-dev
On Mon, 11 Feb 2002, Thomas Gleixner wrote:
> On Thursday, 7. February 2002 08:04, David Woodhouse wrote:
> I'm going to merge your current CVS code with my wbuf changes.
> But we should decide what to do with the cleanmarker stuff and the usage of
> the spare area.
> On devices with 512 byte pagesize we have a spare area of 16 bytes. On smaller
> devices with 256 byte pagesize we have only 8 bytes.
> We already have some layouts for the spare area:
> DOC
>   Byte 0-5    ECC
>   Byte 6-7    block used status (0x55,0x55)
>   Byte 8-15   not assigned
Yes it is, at least for the first three pages in each eraseblock. It holds
the 'Unit Control Information' structures.
> SmartMediaCard, 256 byte pagesize
>   Byte 0-5    not assigned
>   Byte 6      bad block status
>   Byte 7      not assigned
Where's the ECC when we're using the SmartMedia format on them?
> The bad block status on the SmartMediaCards is programmed to a value
> different to 0xff by the manufacturer. We should use this too.
> Bad blocks on raw NAND chips are marked by values different to 0xff in the
> first two pages of a block.
> I don't know, if there's a similar technique on DOC.
Same as raw NAND I believe. The DiskOnChip driver lists the NAND chips
which are used - all the data sheets are available.
> I think the SmartMedia bad block indication is much more convenient than the
> raw NAND solution.
Either way, you mustn't ever erase a block which is marked bad, right?
> Maybe we can implement a utility to move the bad block
> info into the spare area on raw NAND chips, so the bad block indication would
> be the same. This could also be done at production time or be part of a
> bootloader, because it has to be done only once.
> The NAND driver should exclude these blocks from erase/write to keep this
> information intact. But then we should read the bad block information at
> mount time too. This can easily be done with the read_cleanmarker_oob
> function. If we detect a bad block during operation, we could mark it the
> same way, so the block is excluded forever.
Seems sane.
> Layout with cleanmarker in spare area:
> DOC
>   Byte 0-5    ECC
>   Byte 6-7    block used status (0x55,0x55) or bad block (value != 0x55 && != 0xff)
>   Byte 8-15   cleanmarker
>
> SmartMedia and raw NAND, 512 byte pagesize
>   Byte 0-5    ECC
>   Byte 6      bad block status
>   Byte 7      page data valid flag
>   Byte 8-15   cleanmarker
>
> SmartMedia and raw NAND, 256 byte pagesize
>   Byte 0-2    ECC for even page
>   Byte 3-5    cleanmarker
>   Byte 6      bad block status
>   Byte 7      page data valid flag
Those also look sane.
> Some information about the current state. I ran a 24h test with copying,
> moving, deleting files with random power down. The filesystem is clean and we
> have no data loss except some broken files on powerdown. But nothing in the
> world can help if we write a new file and lose power before we have finished it.
Cool. Does this leave some files on the fs and make lots of GC happen, and
check the contents of all files on restart? Vipin's tests did this quite
well, IIRC.
Now I suppose we need to look at using write_super and sb->s_dirt the way
that God intended, for flushing pending writes in the wbuf rather than
just triggering erases.
--
dwmw2
* Re: NAND flash and JFFS(2)
@ 2002-02-11 15:32 Thomas Gleixner
2002-02-11 15:48 ` Thomas Gleixner
0 siblings, 1 reply; 20+ messages in thread
From: Thomas Gleixner @ 2002-02-11 15:32 UTC (permalink / raw)
To: linux-mtd; +Cc: jffs-dev
On Monday, 11. February 2002 14:53, David Woodhouse wrote:
> Where's the ECC when we're using the SmartMedia format on them?
The layout for SmartMedia DOS with 256 byte pagesize:

Even page:
  Byte 0-3    Reserved Area
  Byte 4      Data Status Flag
  Byte 5      Block Status Flag (bad block marker)
  Byte 6-7    Block Address 1
Odd page:
  Byte 0-2    ECC Area-2 (odd page)
  Byte 3-4    Block Address 2
  Byte 5-7    ECC Area-1 (even page)

The layout for SmartMedia DOS with 512 byte pagesize:

  Byte 0-3    Reserved Area
  Byte 4      Data Status Flag
  Byte 5      Block Status Flag (bad block marker)
  Byte 6-7    Block Address 1
  Byte 8-10   ECC Area-2 (byte 256-511)
  Byte 11-12  Block Address 2
  Byte 13-15  ECC Area-1 (byte 0-255)
They build a virtual blocksize of 512 byte on the small devices. I thought
about doing the same, but IMHO it's not a good idea.
> Either way, you mustn't ever erase a block which is marked bad, right?
The datasheet says it's not allowed to do this, because you would
lose the bad block information and maybe you can't write the bad block info
again. So you would always run into this bad block after a restart.
Sorry, as you can see above, I had some trouble counting. The bad block status
on SmartMedia is at byte 5, not at byte 6. We want to keep it there, so we
have to split the ECC data on 512 byte devices.
SmartMedia and raw NAND 512 byte pagesize
Byte 0-3 ECC part 1
Byte 4 page data valid flag
Byte 5 bad block status
Byte 6-7 ECC part 2
Byte 8-15 cleanmarker
SmartMedia and raw NAND 256 Byte pagesize
Byte 0-2 ECC
Byte 3-4 cleanmarker
Byte 5 bad block status
Byte 6 page data valid flag
Byte 7 spare
Please let me know if we can settle on this layout. Then I will implement and
test it.
> Cool. Does this leave some files on the fs and make lots of GC happen, and
> check the contents of all files on restart? Vipin's tests did this quite
> well, IIRC.
I have some files untouched on the fs. I check them with diff after reboot.
There was no failure.
> Now I suppose we need to look at using write_super and sb->s_dirt the way
> that God intended, for flushing pending writes in the wbuf rather than
> just triggering erases.
What exactly do you mean, and what about the cleanup when we fail during
flush_wbuf?
Thomas
__________________________________________________
Thomas Gleixner, autronix automation GmbH
auf dem berg 3, d-88690 uhldingen-muehlhofen
fon: +49 7556 919891 , fax: +49 7556 919886
mail: gleixner@autronix.de, http://www.autronix.de
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: NAND flash and JFFS(2)
2002-02-11 15:32 Thomas Gleixner
@ 2002-02-11 15:48 ` Thomas Gleixner
2002-02-11 19:28 ` Thomas Gleixner
0 siblings, 1 reply; 20+ messages in thread
From: Thomas Gleixner @ 2002-02-11 15:48 UTC (permalink / raw)
To: David Woodhouse; +Cc: jffs-dev, linux-mtd
On Monday, 11. February 2002 16:32, Thomas Gleixner wrote:
> On Monday, 11. February 2002 14:53, David Woodhouse wrote:
> > Where's the ECC when we're using the SmartMedia format on them?
>
> The layout for SmartMedia DOS for 256 byte pagesize:
>
> Even page:
> Byte 0-3 reserved area
> Byte 4 data status flag
> Byte 5 block status flag (bad block marker)
> Byte 6-7 block address 1
>
> Odd page:
> Byte 0-2 ECC Area-2 (odd page)
> Byte 3-4 Block Address 2
> Byte 5-7 ECC Area-1 (even page)
>
> The layout for SmartMedia DOS for 512 byte pagesize:
> Byte 0-3 Reserved Area
> Byte 4 Data Status Flag
> Byte 5 block status flag (bad block marker)
> Byte 6-7 Block Address-1
> Byte 8-10 ECC Area-2 (byte 256-511)
> Byte 11-12 Block Address 2
> Byte 13-15 ECC Area-1 (byte 0-255)
>
> They build a virtual blocksize of 512 byte on the small devices. I thought
> about doing the same, but IMHO it's not a good idea.
>
I thought about this question again, and we should consider it very carefully
before implementation. If we choose the same layout, somebody would be able to
implement the SmartMedia DOS fs on top of SmartMedia and raw NAND flash.
I personally have no interest in this fs; it's OK for MP3 players and
digicams but not for industrial use.
Thomas
__________________________________________________
Thomas Gleixner, autronix automation GmbH
auf dem berg 3, d-88690 uhldingen-muehlhofen
fon: +49 7556 919891 , fax: +49 7556 919886
mail: gleixner@autronix.de, http://www.autronix.de
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: NAND flash and JFFS(2)
2002-02-11 15:48 ` Thomas Gleixner
@ 2002-02-11 19:28 ` Thomas Gleixner
0 siblings, 0 replies; 20+ messages in thread
From: Thomas Gleixner @ 2002-02-11 19:28 UTC (permalink / raw)
To: David Woodhouse; +Cc: jffs-dev, linux-mtd
On Monday, 11. February 2002 16:48, Thomas Gleixner wrote:
> I thought about this question again, and we should consider it very carefully
> before implementation. If we choose the same layout, somebody would be able to
> implement the SmartMedia DOS fs on top of SmartMedia and raw NAND flash.
> I personally have no interest in this fs; it's OK for MP3 players and
> digicams but not for industrial use.
I had a look at the nand driver, where we have to make some changes anyway to
support a different ECC address. To get the flexibility to use either a JFFS2
specific ECC scheme or a different scheme like the SmartMedia DOS-FAT, I will
add an array to the nand structure, which holds the positions of the ECC bytes
inside the spare area, and a flag, which can be used to support the virtual
pagesize of 512 bytes for small NAND devices (if anybody is going to implement
this).
The filesystem driver can set the ECC byte positions, so they match its
requirements. We could add config options to select a default ECC scheme, so
we have access to the chip as a char device too. I'm not sure if we really
need this.
So we have the flexibility to do what we want and we don't prevent anybody
from implementing a different system.
I will try it with the following scheme tonight:
SmartMedia and raw NAND 512 byte pagesize
Byte 0-3 ECC part 1
Byte 4 page data valid flag
Byte 5 bad block status
Byte 6-7 ECC part 2
Byte 8-15 cleanmarker
SmartMedia and raw NAND 256 Byte pagesize
Byte 0-2 ECC
Byte 3 spare
Byte 4 page data valid flag
Byte 5 bad block status
Byte 6-7 cleanmarker
I have both card types, so I can verify that it works.
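The position array described above could look roughly like the following sketch. The struct and function names are hypothetical and not taken from the nand driver; the tables encode the two layouts listed above, assuming 3 ECC bytes per 256 data bytes (so 6 bytes for a 512 byte page, split across bytes 0-3 and 6-7).

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ECC_BYTES 6

/* Hypothetical description of where the ECC bytes sit in the spare area. */
struct oob_ecc_layout {
    size_t oob_size;                      /* spare bytes per page */
    size_t ecc_bytes;                     /* number of ECC bytes per page */
    unsigned char ecc_pos[MAX_ECC_BYTES]; /* spare offsets, in ECC byte order */
};

/* 512 byte pages: ECC part 1 in bytes 0-3, ECC part 2 in bytes 6-7. */
static const struct oob_ecc_layout layout_512 = {
    16, 6, { 0, 1, 2, 3, 6, 7 }
};

/* 256 byte pages: ECC in bytes 0-2. */
static const struct oob_ecc_layout layout_256 = {
    8, 3, { 0, 1, 2 }
};

/* Scatter computed ECC bytes into the spare buffer; other bytes stay as-is. */
static void ecc_to_oob(const struct oob_ecc_layout *l,
                       const unsigned char *ecc, unsigned char *oob)
{
    size_t i;

    assert(l != NULL && l->ecc_bytes <= MAX_ECC_BYTES);
    for (i = 0; i < l->ecc_bytes; i++) {
        assert(l->ecc_pos[i] < l->oob_size);
        oob[l->ecc_pos[i]] = ecc[i];
    }
}

/* Gather the stored ECC bytes back out of the spare buffer. */
static void ecc_from_oob(const struct oob_ecc_layout *l,
                         const unsigned char *oob, unsigned char *ecc)
{
    size_t i;

    assert(l != NULL && l->ecc_bytes <= MAX_ECC_BYTES);
    for (i = 0; i < l->ecc_bytes; i++) {
        assert(l->ecc_pos[i] < l->oob_size);
        ecc[i] = oob[l->ecc_pos[i]];
    }
}
```

With a table like this, a different scheme such as the SmartMedia DOS layout would only need another oob_ecc_layout instance, and bytes 4 and 5 (page data valid flag and bad block status) are never touched by the ECC code.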
Thomas
__________________________________________________
Thomas Gleixner, autronix automation GmbH
auf dem berg 3, d-88690 uhldingen-muehlhofen
fon: +49 7556 919891 , fax: +49 7556 919886
mail: gleixner@autronix.de, http://www.autronix.de
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: NAND flash and JFFS(2)
@ 2002-02-11 23:42 Steve_Chen
2002-02-12 0:10 ` Thomas Gleixner
0 siblings, 1 reply; 20+ messages in thread
From: Steve_Chen @ 2002-02-11 23:42 UTC (permalink / raw)
To: gleixner; +Cc: David Woodhouse, jffs-dev, linux-mtd
Hi Thomas,
I have two questions regarding the layout scheme in oob.
1. What does "page data valid flag" mean? How do you use it?
2. If jffs2 detects a bad block, does it set a flag in one of the bytes in
oob? I suppose we don't want to use byte 5 (bad block status), which was
set by the chip manufacturer.
Thanks.
Regards,
Steve
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: NAND flash and JFFS(2)
2002-02-11 23:42 Steve_Chen
@ 2002-02-12 0:10 ` Thomas Gleixner
0 siblings, 0 replies; 20+ messages in thread
From: Thomas Gleixner @ 2002-02-12 0:10 UTC (permalink / raw)
To: Steve_Chen, gleixner; +Cc: David Woodhouse, jffs-dev, linux-mtd
On Tuesday, 12. February 2002 00:42, Steve_Chen@kingston.com wrote:
> 1. What does "page data valid flag" mean ? How do you use it ?
This flag is programmed to 0 once a complete page has been written; it tells
the ECC code that the data can be read with ECC.
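As a rough sketch of how a read path might use this flag, assuming it sits at byte 4 of the spare area (as in the most recently proposed layout) and that an erased spare area reads 0xFF. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define OOB_PAGE_VALID_POS 4   /* "page data valid flag" in the spare area */

/*
 * Returns 1 if the page was completely written, so the stored ECC is
 * meaningful and should be checked on read; 0 for a page whose flag
 * was never programmed.
 */
static int page_needs_ecc(const unsigned char *oob, size_t oob_len)
{
    assert(oob != NULL && oob_len > OOB_PAGE_VALID_POS);
    return oob[OOB_PAGE_VALID_POS] == 0x00;
}

/* Record in the spare buffer that the whole page has been written. */
static void mark_page_valid(unsigned char *oob, size_t oob_len)
{
    assert(oob != NULL && oob_len > OOB_PAGE_VALID_POS);
    oob[OOB_PAGE_VALID_POS] = 0x00;
}
```

Because flash programming can only clear bits, going from 0xFF to 0x00 needs no erase, so the flag can be written together with, or after, the last data of the page.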
> 2. If jffs2 detects a bad block, does it set a flag in one of the bytes in
> oob ? I suppose we don't want to use byte 05 (bad block status) which was
> set by the chip manufacturer.
Not yet, but we are going to implement this. We will definitely use byte 5. We
detect the bad blocks marked by the manufacturer, which we never touch again,
and mark the bad blocks we detect ourselves in the same place.
Thomas
__________________________________________________
Thomas Gleixner, autronix automation GmbH
auf dem berg 3, d-88690 uhldingen-muehlhofen
fon: +49 7556 919891 , fax: +49 7556 919886
mail: gleixner@autronix.de, http://www.autronix.de
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: NAND flash and JFFS(2)
[not found] ` <0202121847440K.00764@thomas>
@ 2002-02-13 11:40 ` Thomas Gleixner
0 siblings, 0 replies; 20+ messages in thread
From: Thomas Gleixner @ 2002-02-13 11:40 UTC (permalink / raw)
To: David Woodhouse; +Cc: linux-mtd, jffs-dev
On Tuesday, 12. February 2002 18:47, Thomas Gleixner wrote:
>
> Actual state of JFFS2 on NAND
>
> I will run some more tests tonight.
Excellent test result.
Test is as follows:
SmartMediaCard: 16MB, two partitions of 8MB each
1st partition is root
2nd partition is data
data contains 20 files, which are not touched.
3 tasks running in loops
task 1:
  copy 40 files from root to data, remove the files on root
  diff against original files on nfs
  copy 40 files from data to root, remove the files on data
  diff against original files on nfs
task 2:
  open file A on data for write every 1 sec.
  append 1000 bytes and close the file
  loop until filesize is >1MB
  open file B on data for write every 1 sec.
  append 1000 bytes and close the file
  loop until filesize is >1MB
task 3:
  wait until A is >1MB
  copy file A to nfs
  check file integrity of A
  remove file A on data
  wait until B is >1MB
  copy file B to nfs
  check file integrity of B
  remove file B on data
An independent controller removes power after ~1-2 hours without warning.
After reboot:
  The untouched 20 files on data are diffed against the original files on nfs
  The 40 files, which are shuffled between root and data, are diffed against
  the original files on nfs.
  task 1 starts
  task 2 picks up the file that was being filled last and starts
  task 3 starts
The test ran for 8 hours without any problem.
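For readers who want to reproduce something similar, the append loop of task 2 could be sketched as below. This is not the original test code: the function name and parameters are hypothetical, and the only details taken from the description are the 1000 byte chunk, reopening the file for each append, and stopping once the file exceeds a size limit.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK 1000  /* bytes appended per iteration, as in task 2 */

/*
 * Append CHUNK bytes per iteration, reopening the file each time,
 * until the file is larger than limit bytes. delay_sec is the pause
 * between appends (1 for the test described, 0 for a quick dry run).
 * Returns the final file size, or -1 on error.
 */
static long append_until(const char *path, long limit, unsigned delay_sec)
{
    char buf[CHUNK];
    struct stat st;

    assert(path != NULL && limit >= 0);
    memset(buf, 'A', sizeof(buf));

    for (;;) {
        int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
        ssize_t n;

        if (fd < 0)
            return -1;
        n = write(fd, buf, sizeof(buf));
        if (n != (ssize_t)sizeof(buf) || fstat(fd, &st) < 0) {
            close(fd);
            return -1;
        }
        if (close(fd) < 0)
            return -1;
        if (st.st_size > limit)
            return (long)st.st_size;
        if (delay_sec)
            sleep(delay_sec);
    }
}
```

Opening and closing the file for every append keeps each write a separate, small operation, which is what makes a random power cut likely to land in the middle of one.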
Thomas
__________________________________________________
Thomas Gleixner, autronix automation GmbH
auf dem berg 3, d-88690 uhldingen-muehlhofen
fon: +49 7556 919891 , fax: +49 7556 919886
mail: gleixner@autronix.de, http://www.autronix.de
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: NAND flash and JFFS(2)
@ 2002-02-13 20:03 Lance Nakamura
2002-02-13 20:19 ` Thomas Gleixner
0 siblings, 1 reply; 20+ messages in thread
From: Lance Nakamura @ 2002-02-13 20:03 UTC (permalink / raw)
To: gleixner, David Woodhouse; +Cc: linux-mtd, jffs-dev
2 things:
1) can the test be set to remove power more frequently,
as in every minute rather than every hour? our experience with
power fail safety is that you have to run at least 100K cycles, preferably
over a number of different units with parts from different manufacturers,
in order to have reasonable confidence. that's probably something left
to product implementors and not the current developers. however, there is
the order of magnitude rule that says every time you get an order of magnitude
more experience, you find another layer of problems. a thousand cycles
would be pretty easy to come by on your setup if you can get random power
fails every minute or so, and that would yield a very high level of confidence
at this stage of development.
2) an important test is to remove power while the file
system is trying to recover. in the real world, power failures
are rarely clean, isolated instances. you may get multiple failures in
as little as 1 second (failures a few hundred milliseconds apart).
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: NAND flash and JFFS(2)
2002-02-13 20:03 Lance Nakamura
@ 2002-02-13 20:19 ` Thomas Gleixner
0 siblings, 0 replies; 20+ messages in thread
From: Thomas Gleixner @ 2002-02-13 20:19 UTC (permalink / raw)
To: Lance Nakamura, David Woodhouse; +Cc: linux-mtd, jffs-dev
On Wednesday, 13. February 2002 21:03, Lance Nakamura wrote:
> 2 things:
> 1) can the test be set to remove power more frequently,
> as in every minute rather than every hour? our experience with
> power fail safety is that you have to run at least 100K cycles,
>
> 2) an important test is to remove power while the file
> system is trying to recover. in the real world, power failures
> are rarely clean, isolated instances. you may get multiple failures in
> as little as 1 second (failures a few hundred milliseconds apart).
You're right, but this was a test for a big hack inside the fs.
The power on/off was just a little add-on. Once we have finished some
remaining issues, I will set up a real power fail safety test.
Thomas
__________________________________________________
Thomas Gleixner, autronix automation GmbH
auf dem berg 3, d-88690 uhldingen-muehlhofen
fon: +49 7556 919891 , fax: +49 7556 919886
mail: gleixner@autronix.de, http://www.autronix.de
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2002-02-13 20:07 UTC | newest]
Thread overview: 20+ messages
2002-02-05 14:41 NAND flash and JFFS(2) Veli-Pekka Ylönen
2002-02-05 15:30 ` David Woodhouse
2002-02-05 17:28 ` Thomas Gleixner
2002-02-05 21:18 ` David Woodhouse
2002-02-05 17:35 ` Veli-Pekka Ylönen
[not found] <02020523405103.11497@thomas>
2002-02-06 4:54 ` David Woodhouse
2002-02-06 22:47 ` Thomas Gleixner
2002-02-06 22:55 ` David Woodhouse
2002-02-07 6:51 ` Thomas Gleixner
2002-02-07 7:04 ` David Woodhouse
2002-02-11 13:42 ` Thomas Gleixner
2002-02-11 13:53 ` David Woodhouse
[not found] ` <0202121847440K.00764@thomas>
2002-02-13 11:40 ` Thomas Gleixner
-- strict thread matches above, loose matches on Subject: below --
2002-02-11 15:32 Thomas Gleixner
2002-02-11 15:48 ` Thomas Gleixner
2002-02-11 19:28 ` Thomas Gleixner
2002-02-11 23:42 Steve_Chen
2002-02-12 0:10 ` Thomas Gleixner
2002-02-13 20:03 Lance Nakamura
2002-02-13 20:19 ` Thomas Gleixner