public inbox for linux-mtd@lists.infradead.org
* mtd-utils/nandwrite: what if write fails?
@ 2006-10-17 14:17 Ricard Wanderlof
  2006-10-17 14:36 ` Artem Bityutskiy
  2006-10-17 15:33 ` Josh Boyer
  0 siblings, 2 replies; 7+ messages in thread
From: Ricard Wanderlof @ 2006-10-17 14:17 UTC (permalink / raw)
  To: Linux mtd


I have a question regarding writes to NAND flash; for instance, in 
mtd-utils/nandwrite.c, prior to writing to a block, it is checked so that 
it isn't bad (using the MEMGETBADBLOCK ioctl). However, what happens if 
the block goes bad during write? If the pwrite() call which writes out the 
page data fails, the application says perror() and exits. Shouldn't it 
mark the block as bad, and re-write the data so far written to the block 
to the next good block? As I understand it, mtd doesn't mark a block bad, 
it is up to the application or overlying file system (e.g. JFFS2). So it 
won't even help to run nandwrite again as the block has not been marked 
bad.

Or have I missed something here?

(Or is it simply that normally nandwrite is only used during testing, or 
writing an initial filesystem, and the likelihood of a block failing at 
precisely this time is rather small, compared to the rest of the lifetime 
of the memory (i.e. repeated JFFS2 accesses)?)

/Ricard
--
Ricard Wolf Wanderlöf                           ricardw(at)axis.com
Axis Communications AB, Lund, Sweden            www.axis.com
Phone +46 46 272 2016                           Fax +46 46 13 61 30


* Re: mtd-utils/nandwrite: what if write fails?
  2006-10-17 14:17 mtd-utils/nandwrite: what if write fails? Ricard Wanderlof
@ 2006-10-17 14:36 ` Artem Bityutskiy
  2006-10-27 13:07   ` Ricard Wanderlof
  2006-10-17 15:33 ` Josh Boyer
  1 sibling, 1 reply; 7+ messages in thread
From: Artem Bityutskiy @ 2006-10-17 14:36 UTC (permalink / raw)
  To: Ricard Wanderlof; +Cc: Linux mtd

Hi Ricard,

On Tue, 2006-10-17 at 16:17 +0200, Ricard Wanderlof wrote:
> I have a question regarding writes to NAND flash; for instance, in 
> mtd-utils/nandwrite.c, prior to writing to a block, it is checked so that 
> it isn't bad (using the MEMGETBADBLOCK ioctl). However, what happens if 
> the block goes bad during write? If the pwrite() call which writes out the 
> page data fails, the application says perror() and exits. Shouldn't it 
> mark the block as bad, and re-write the data so far written to the block 
> to the next good block? 

If we are talking about nandwrite - I would prefer it not to do this. Or
to do this only if it was requested via an explicit option ...

> As I understand it, mtd doesn't mark a block bad,

Yes, MTD provides you mechanisms to do this, but it is up to you (MTD
user) to do this or not.
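
For illustration, a minimal user-space sketch of those mechanisms, assuming
the standard <mtd/mtd-user.h> header. The helper names block_is_bad() and
mark_block_bad() are made up for this example and are not part of nandwrite:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <mtd/mtd-user.h>

/* Hypothetical helper: returns 1 if the eraseblock starting at
 * block_start is marked bad, 0 if it is good, -1 if the ioctl failed. */
int block_is_bad(int fd, loff_t block_start)
{
    assert(block_start >= 0);
    int ret = ioctl(fd, MEMGETBADBLOCK, &block_start);
    if (ret < 0)
        return -1;
    return ret > 0;
}

/* Hypothetical helper: asks MTD to mark the eraseblock bad.
 * Returns 0 on success, -1 if the ioctl failed. */
int mark_block_bad(int fd, loff_t block_start)
{
    assert(block_start >= 0);
    return ioctl(fd, MEMSETBADBLOCK, &block_start) < 0 ? -1 : 0;
}
```

A caller that decides a block really is bad after a failed pwrite() could
call mark_block_bad() with the start offset of that eraseblock and carry on
with the next good one.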

> it is up to the application or overlying file system (e.g. JFFS2).

Yes, not only JFFS2, just any software which works on top of MTD and
knows what it does. For example, UBI can do this. Or some user-space
tools.

>  So it 
> won't even help to run nandwrite again as the block has not been marked 
> bad.

Not sure, but probably yes. You may explore fixing this by adding a
corresponding option to nandwrite. Or you may write a utility which
tortures the flash and identifies bad eraseblocks, e.g., flash_check.

> (Or is it simply that normally nandwrite is only used during testing, or 
> writing an initial filesystem, and the likelihood of a block failing at 
> precisely this time is rather small, compared to the rest of the lifetime 
> of the memory (i.e. repeated JFFS2 accesses)?)
Not sure, but I personally used it only for debugging purposes.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)


* Re: mtd-utils/nandwrite: what if write fails?
  2006-10-17 14:17 mtd-utils/nandwrite: what if write fails? Ricard Wanderlof
  2006-10-17 14:36 ` Artem Bityutskiy
@ 2006-10-17 15:33 ` Josh Boyer
  2006-10-18  8:18   ` Ricard Wanderlof
  2006-10-27 13:09   ` Ricard Wanderlof
  1 sibling, 2 replies; 7+ messages in thread
From: Josh Boyer @ 2006-10-17 15:33 UTC (permalink / raw)
  To: Ricard Wanderlof; +Cc: Linux mtd

On Tue, 2006-10-17 at 16:17 +0200, Ricard Wanderlof wrote:
> I have a question regarding writes to NAND flash; for instance, in 
> mtd-utils/nandwrite.c, prior to writing to a block, it is checked so that 
> it isn't bad (using the MEMGETBADBLOCK ioctl). However, what happens if 
> the block goes bad during write? If the pwrite() call which writes out the 
> page data fails, the application says perror() and exits. Shouldn't it 
> mark the block as bad, and re-write the data so far written to the block 
> to the next good block? As I understand it, mtd doesn't mark a block bad, 
> it is up to the application or overlying file system (e.g. JFFS2). So it 
> won't even help to run nandwrite again as the block has not been marked 
> bad.

pwrite can fail because of a transient error.  NAND can get soft bit
flips that can be cleared by erasing the block.  So to mark a block bad
without at least trying to write to it a couple of times seems a bit
drastic.

josh


* Re: mtd-utils/nandwrite: what if write fails?
  2006-10-17 15:33 ` Josh Boyer
@ 2006-10-18  8:18   ` Ricard Wanderlof
  2006-10-27 13:09   ` Ricard Wanderlof
  1 sibling, 0 replies; 7+ messages in thread
From: Ricard Wanderlof @ 2006-10-18  8:18 UTC (permalink / raw)
  To: Linux mtd


On Tue, 17 Oct 2006, Josh Boyer wrote:

> On Tue, 2006-10-17 at 16:17 +0200, Ricard Wanderlof wrote:
>> ...
>> it isn't bad (using the MEMGETBADBLOCK ioctl). However, what happens if
>> the block goes bad during write? If the pwrite() call which writes out the
>> page data fails, the application says perror() and exits. Shouldn't it
>> mark the block as bad, and re-write the data so far written to the block
>> to the next good block?
>> ..
>
> pwrite can fail because of a transient error.  NAND can get soft bit
> flips that can be cleared by erasing the block.  So to mark a block bad
> without at least trying to write to it a couple of times seems a bit
> drastic.

I see. Well, then in that case, a course of action would be to erase the 
block, try to rewrite, and then mark it bad should pwrite still fail 
after the second attempt?
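
That policy could be sketched roughly as follows. The flash_ops callbacks
are hypothetical stand-ins (a real tool might back them with pwrite(), the
MEMERASE ioctl and the MEMSETBADBLOCK ioctl), and the fake_* functions only
simulate a flash so that the logic can be exercised without hardware:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callbacks; each returns 0 on success, nonzero on failure. */
struct flash_ops {
    int (*write_block)(void *ctx, unsigned block, const void *data, size_t len);
    int (*erase_block)(void *ctx, unsigned block);
    int (*mark_bad)(void *ctx, unsigned block);
};

enum write_result { WRITE_OK = 0, WRITE_BLOCK_BAD = 1, WRITE_FATAL = -1 };

/* Write one whole block; on failure erase and retry once, then give up
 * and mark the block bad. Treating a failed erase like a failed second
 * write is an assumption of this sketch. */
enum write_result write_with_retry(const struct flash_ops *ops, void *ctx,
                                   unsigned block, const void *data, size_t len)
{
    assert(ops != NULL);
    if (ops->write_block(ctx, block, data, len) == 0)
        return WRITE_OK;
    /* The first failure may be transient: erase and try once more. */
    if (ops->erase_block(ctx, block) == 0 &&
        ops->write_block(ctx, block, data, len) == 0)
        return WRITE_OK;
    if (ops->mark_bad(ctx, block) != 0)
        return WRITE_FATAL;
    return WRITE_BLOCK_BAD; /* caller should rewrite to the next good block */
}

/* Simulated flash: the first write_failures_left writes fail. */
struct fake_flash { int write_failures_left; int erases; int marked_bad; };

static int fake_write(void *ctx, unsigned block, const void *data, size_t len)
{
    struct fake_flash *f = ctx;
    (void)block; (void)data; (void)len;
    if (f->write_failures_left > 0) {
        f->write_failures_left--;
        return -1;
    }
    return 0;
}

static int fake_erase(void *ctx, unsigned block)
{
    struct fake_flash *f = ctx;
    (void)block;
    f->erases++;
    return 0;
}

static int fake_mark_bad(void *ctx, unsigned block)
{
    struct fake_flash *f = ctx;
    (void)block;
    f->marked_bad++;
    return 0;
}

const struct flash_ops fake_ops = { fake_write, fake_erase, fake_mark_bad };
```

Since the retry erases the block, the caller has to hand in the data for
the whole block, not just the page that failed.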

Would another option be to erase each block immediately prior to writing? 
However, this would make it impossible to write only part of a block, so 
again, it should not be the default behavior.

I'm curious about the 'soft bit flipping' though. I suppose both the case 
of a bit going from 1->0 as well as 0->1 are possible? That would mean 
that a previously erased block could contain random 0's, resulting in the 
potential need for re-erase before writing.
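
A check along these lines could detect that case before writing. It relies
on erased NAND reading back as all 0xFF bytes; the function name is made up
for this sketch, and the caller is assumed to have read the block (or page)
into buf already:

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if every byte in buf is 0xFF (looks erased), 0 otherwise.
 * A stray 0 bit anywhere makes the buffer count as not erased. */
int buffer_is_erased(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0xFF)
            return 0;
    }
    return 1;
}
```

If the check fails on a block that is supposed to be empty, re-erasing it
before the write would be the obvious next step.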

Does anyone have a reference to more info on this subject? I've scanned 
the 'net, but most of the hard-core information I've found is in 
application notes from various chip manufacturers. While informative, they 
tend to be a bit restrictive on practical aspects of NAND flash 
management.

/Ricard


* Re: mtd-utils/nandwrite: what if write fails?
  2006-10-17 14:36 ` Artem Bityutskiy
@ 2006-10-27 13:07   ` Ricard Wanderlof
  2006-10-27 14:13     ` Artem Bityutskiy
  0 siblings, 1 reply; 7+ messages in thread
From: Ricard Wanderlof @ 2006-10-27 13:07 UTC (permalink / raw)
  To: Artem Bityutskiy; +Cc: Linux mtd


On Tue, 17 Oct 2006, Artem Bityutskiy wrote:

>On Tue, 2006-10-17 at 16:17 +0200, Ricard Wanderlof wrote:
>> I have a question regarding writes to NAND flash; for instance, in 
>> mtd-utils/nandwrite.c, prior to writing to a block, it is checked so that 
>> it isn't bad (using the MEMGETBADBLOCK ioctl). However, what happens if 
>> the block goes bad during write? If the pwrite() call which writes out the 
>> page data fails, the application says perror() and exits. Shouldn't it 
>> mark the block as bad, and re-write the data so far written to the block 
>> to the next good block? 
>
> If we are talking about nandwrite - I would prefer it not to do this. Or 
> to do this only if it was requested via an explicit option ...

I could live with an option.

>> As I understand it, mtd doesn't mark a block bad,
>> it is up to the application or overlying file system (e.g. JFFS2).
>
> Yes, not only JFFS2, just any software which works on top of MTD and 
> knows what it does. For example, UBI can do this. Or some user-space 
> tools.

I assume that the mtd block devices don't provide any bad block management 
either? (Hm, one could imagine a device which when written to simply 
skipped bad blocks ... ?)

>> So it won't even help to run nandwrite again as the block has not been 
>> marked bad.
>
> Not sure, but probably yes. You may explore fixing this by adding a 
> corresponding option to nandwrite. Or you may write a utility which 
> tortures the flash and identifies bad eraseblocks, e.g., flash_check. 
> [...] but I personally used it only for debugging purposes.

I would like to use it for writing filesystem images to nand flash, i.e. a 
more production-oriented environment. Either during initial production, or 
during upgrades. So some form of (perhaps option-controlled) bad block 
management would be nice.

A torture test would be nice too, but it's not really the same thing. 
Blocks can go bad with time, and when one actually does go bad, it has to 
be handled at that time.

Another option would be to integrate erasure into nandwrite, so that it 
could erase blocks prior to writing them, to give a completely integrated 
utility. Writing to a non-erased (nand) flash is rather pointless anyway 
isn't it? Naturally, one would want to set limits for the erasure so that 
not the whole flash would have to be erased just to write a small image.
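
Such a limit could be computed roughly like this: round the image's start
and end out to eraseblock boundaries and erase only that range. The struct
and function names are hypothetical, and the sketch ignores the extra
blocks that would be needed if bad blocks are skipped along the way:

```c
#include <assert.h>
#include <stdint.h>

/* Byte range to erase, aligned to eraseblock boundaries. */
struct erase_range {
    uint64_t start;  /* offset of the first eraseblock to erase */
    uint64_t length; /* number of bytes to erase, a multiple of erasesize */
};

/* Smallest eraseblock-aligned range covering image_len bytes written
 * at the given flash offset. An empty image yields an empty range. */
struct erase_range erase_range_for_image(uint64_t offset, uint64_t image_len,
                                         uint32_t erasesize)
{
    assert(erasesize > 0);
    struct erase_range r = {0, 0};
    if (image_len == 0)
        return r;
    uint64_t first = offset / erasesize;                   /* first block touched */
    uint64_t last = (offset + image_len - 1) / erasesize;  /* last block touched */
    r.start = first * erasesize;
    r.length = (last - first + 1) * erasesize;
    return r;
}
```

With 128 KiB eraseblocks, an image of 0x30001 bytes written at offset
0x20000 touches blocks 1 and 2, so only 0x40000 bytes starting at 0x20000
would be erased.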

One would still have to handle the case of erase or write failing though.

/Ricard


* Re: mtd-utils/nandwrite: what if write fails?
  2006-10-17 15:33 ` Josh Boyer
  2006-10-18  8:18   ` Ricard Wanderlof
@ 2006-10-27 13:09   ` Ricard Wanderlof
  1 sibling, 0 replies; 7+ messages in thread
From: Ricard Wanderlof @ 2006-10-27 13:09 UTC (permalink / raw)
  To: Josh Boyer; +Cc: Linux mtd


On Tue, 17 Oct 2006, Josh Boyer wrote:

> pwrite can fail because of a transient error.  NAND can get soft bit
> flips that can be cleared by erasing the block.  So to mark a block bad
> without at least trying to write to it a couple of times seems a bit
> drastic.

What if erase were integrated into nandwrite (controlled by an option, not 
by default), so that blocks were erased immediately prior to writing? In 
that case a failed erase or write would really indicate a bad block.

Or have there been cases where a block fails to erase or write, but 
recovers after a couple of erase operations? Surely such a block would be 
deemed too uncertain to contain data, and should be discarded (= marked 
bad)?

/Ricard


* Re: mtd-utils/nandwrite: what if write fails?
  2006-10-27 13:07   ` Ricard Wanderlof
@ 2006-10-27 14:13     ` Artem Bityutskiy
  0 siblings, 0 replies; 7+ messages in thread
From: Artem Bityutskiy @ 2006-10-27 14:13 UTC (permalink / raw)
  To: Ricard Wanderlof; +Cc: Linux mtd

On Fri, 2006-10-27 at 15:07 +0200, Ricard Wanderlof wrote:
> I assume that the mtd block devices don't provide any bad block management 
> either? (Hm, one could imagine a device which when written to simply 
> skipped bad blocks ... ?)
Yes. I personally have never used it because it is obviously a very poor
FTL and is barely usable except for testing/debugging or such.

> A torture test would be nice too, but it's not really the same thing. 
> Blocks can go bad with time, and when one actually does go bad, it has to 
> be handled at that time.

I suggest you test it. Select an eraseblock, erase it many times in a
loop, and see what happens. It may be interesting. I can send you a test
module which does this.
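
A rough user-space sketch of such a loop, assuming the MEMERASE ioctl from
<mtd/mtd-user.h>. The function name torture_erase() is hypothetical and
this is not the test module mentioned above. It destroys the contents of
the block, so it is only meant for a block you can afford to lose:

```c
#include <assert.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <mtd/mtd-user.h>

/* Erase one eraseblock up to max_cycles times and stop at the first
 * failure. Returns the number of erase cycles that succeeded. */
long torture_erase(int fd, uint32_t block_start, uint32_t erasesize,
                   long max_cycles)
{
    assert(max_cycles >= 0);
    struct erase_info_user ei = { .start = block_start, .length = erasesize };
    long done = 0;

    while (done < max_cycles) {
        if (ioctl(fd, MEMERASE, &ei) < 0)
            break; /* erase failed: the block may have worn out */
        done++;
    }
    return done;
}
```

A return value smaller than max_cycles means an erase failed on that
cycle; errno from the failing ioctl is left for the caller to inspect.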

> Another option would be to integrate erasure into nandwrite, so that it 
> could erase blocks prior to writing them, to give a completely integrated 
> utility. Writing to a non-erased (nand) flash is rather pointless anyway 
> isn't it? Naturally, one would want to set limits for the erasure so that 
> not the whole flash would have to be erased just to write a small image.

IMO it makes sense to write a nice image flashing utility for this
from scratch instead.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)

