* Is it an atomic operation for writing a page in NAND flash
@ 2010-01-20 9:58 Liu Hui
2010-01-20 10:13 ` Ricard Wanderlof
2010-01-20 13:41 ` Jamie Lokier
0 siblings, 2 replies; 19+ messages in thread
From: Liu Hui @ 2010-01-20 9:58 UTC (permalink / raw)
To: linux-mtd
Hi guys,
This is a question that has confused me for a long time. As far as I know,
writing a sector to a hard disk is atomic. That is to say, if a power
failure happens while we are writing a sector to the hard disk, the sector
will be written completely or not at all.
For NAND flash, I haven't seen an atomicity guarantee in any material.
Could you please tell me whether writing a page to NAND flash is atomic?
This is very important for a transaction-based file system.
--
Thanks & Best Regards
Liu Hui
--
* Re: Is it an atomic operation for writing a page in NAND flash
2010-01-20 9:58 Is it an atomic operation for writing a page in NAND flash Liu Hui
@ 2010-01-20 10:13 ` Ricard Wanderlof
2010-01-20 13:11 ` Liu Hui
2010-01-20 13:41 ` Jamie Lokier
1 sibling, 1 reply; 19+ messages in thread
From: Ricard Wanderlof @ 2010-01-20 10:13 UTC (permalink / raw)
To: Liu Hui; +Cc: linux-mtd@lists.infradead.org
On Wed, 20 Jan 2010, Liu Hui wrote:
> Hi guys,
>
> This is a question confused me for a long time. As I know, writing a
> sector for a hard disk is atomic. That is to say, when we are writing
> a sector to hard disk and power failure happen, the sector will be
> written completely or not at all.
>
> For NAND flash, I didn't see the atomic guarantee in any material.
> Could you please tell me if writing a page for NAND flash is atomic?
> This is very important for a transaction based file system.
It is my understanding that if you get a power failure while the nand
flash chip is writing the page the page could get partly written.
The only way around something like this would be to monitor the power line
prior to the supply regulator, and not start a write if it can be detected
that a power failure has occurred and there is not enough power to
complete the write.
/Ricard
--
Ricard Wolf Wanderlöf ricardw(at)axis.com
Axis Communications AB, Lund, Sweden www.axis.com
Phone +46 46 272 2016 Fax +46 46 13 61 30
* Re: Is it an atomic operation for writing a page in NAND flash
2010-01-20 10:13 ` Ricard Wanderlof
@ 2010-01-20 13:11 ` Liu Hui
2010-01-20 13:33 ` Jamie Lokier
0 siblings, 1 reply; 19+ messages in thread
From: Liu Hui @ 2010-01-20 13:11 UTC (permalink / raw)
To: Ricard Wanderlof; +Cc: linux-mtd@lists.infradead.org
Ricard,
Thank you for your confirmation and the good idea.
I have also thought about your idea before, that is, when a power failure
happens, generate an interrupt and block any further write requests in
the interrupt handler. But this is a little complex.
Now I think I can use ECC to check for a partial write: if a write was
not finished, the ECC should be wrong, so we can detect the partial
write and discard it. Do you think this is a good idea?
Thanks,
Hui
--
Thanks & Best Regards
Liu Hui
--
* Re: Is it an atomic operation for writing a page in NAND flash
2010-01-20 13:11 ` Liu Hui
@ 2010-01-20 13:33 ` Jamie Lokier
2010-01-20 14:25 ` Liu Hui
0 siblings, 1 reply; 19+ messages in thread
From: Jamie Lokier @ 2010-01-20 13:33 UTC (permalink / raw)
To: Liu Hui; +Cc: linux-mtd@lists.infradead.org, Ricard Wanderlof
Liu Hui wrote:
> Richard,
>
> Thank you for your confirmation and good idea.
>
> I also think about your idea before, that is, when power failure
> happens, generate an interrupt and blocks any other write requests in
> interrupt handler. But this is a little complex.
Ideally, you would design the hardware so that power failure can be
detected early near the power input, but with enough on-board power
retention (i.e. capacitor) that there is guaranteed enough continuous
power for the CPU to react and the NAND chip to have enough stable
power to complete the write reliably.
There is no need for an interrupt, if you have a fast GPIO that you
can read before each write command that tells if the input power has
not dropped.
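A minimal sketch of the poll-before-write idea, in Python for brevity. Both `power_ok` and `program_page` are hypothetical callables standing in for a GPIO read and a driver's page-program routine; neither is an MTD API, and the hold-up time of the hardware is assumed to cover one full page program after the check.

```python
from typing import Callable


class PowerFailing(Exception):
    """Raised when the supply is already dropping, so no write is started."""


def guarded_program(
    power_ok: Callable[[], bool],
    program_page: Callable[[int, bytes], None],
    page: int,
    data: bytes,
) -> None:
    # Sample the (hypothetical) power-good GPIO immediately before issuing
    # the program command. The on-board capacitor must still cover one full
    # page-program time after this check succeeds.
    if not power_ok():
        raise PowerFailing(f"refusing to program page {page}")
    program_page(page, data)


# Usage with an in-memory stand-in for the flash array.
flash = {}
guarded_program(lambda: True, flash.__setitem__, 7, b"\xa3" * 4)
try:
    guarded_program(lambda: False, flash.__setitem__, 8, b"\xa3" * 4)
except PowerFailing:
    pass  # page 8 was never touched
```

The check costs one GPIO read per write; it only helps if the hardware guarantees enough stored energy after the flag drops.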
> Now, I think I can use ECC to check the partial write, if a write was
> not finished, the ECC should be wrong, so we can detect this partial
> write and discard this write. Do you think this is a good idea?
It's good, but not perfect: In principle a power-failed write could
successfully store the correct bits including ECC so they read back
correctly, but with the cell charges not completely stable. But I
guess that's rare enough that it is just included in the normal NAND
bad block possibilities.
-- Jamie
* Re: Is it an atomic operation for writing a page in NAND flash
2010-01-20 9:58 Is it an atomic operation for writing a page in NAND flash Liu Hui
2010-01-20 10:13 ` Ricard Wanderlof
@ 2010-01-20 13:41 ` Jamie Lokier
2010-01-20 13:58 ` Artem Bityutskiy
2010-01-20 14:01 ` Liu Hui
1 sibling, 2 replies; 19+ messages in thread
From: Jamie Lokier @ 2010-01-20 13:41 UTC (permalink / raw)
To: Liu Hui; +Cc: linux-mtd
Liu Hui wrote:
> This is a question confused me for a long time. As I know, writing a
> sector for a hard disk is atomic. That is to say, when we are writing
> a sector to hard disk and power failure happen, the sector will be
> written completely or not at all.
Are you sure about that?
I have never seen a reliable confirmation of it.
-- Jamie
* Re: Is it an atomic operation for writing a page in NAND flash
2010-01-20 13:41 ` Jamie Lokier
@ 2010-01-20 13:58 ` Artem Bityutskiy
2010-01-20 14:06 ` Liu Hui
2010-01-20 14:01 ` Liu Hui
1 sibling, 1 reply; 19+ messages in thread
From: Artem Bityutskiy @ 2010-01-20 13:58 UTC (permalink / raw)
To: Jamie Lokier; +Cc: linux-mtd, Liu Hui
On Wed, 2010-01-20 at 13:41 +0000, Jamie Lokier wrote:
> Liu Hui wrote:
> > This is a question confused me for a long time. As I know, writing a
> > sector for a hard disk is atomic. That is to say, when we are writing
> > a sector to hard disk and power failure happen, the sector will be
> > written completely or not at all.
>
> Are you sure about that?
> I have never seen a reliable confirmation of it.
Theo sent a longish e-mail some time ago to fs-devel or lkml, explaining
why this is almost never true in practice.
--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)
* Re: Is it an atomic operation for writing a page in NAND flash
2010-01-20 13:41 ` Jamie Lokier
2010-01-20 13:58 ` Artem Bityutskiy
@ 2010-01-20 14:01 ` Liu Hui
1 sibling, 0 replies; 19+ messages in thread
From: Liu Hui @ 2010-01-20 14:01 UTC (permalink / raw)
To: Jamie Lokier; +Cc: linux-mtd
http://kerneltrap.org/node/6741
This article says: "Disks assure atomicity at the sector level.
This means that a write to a sector either goes through completely or
not at all."
I also know that some transaction-based file systems depend on the atomic
sector write feature of hard disks.
But I am not entirely sure about this...
--
Thanks & Best Regards
Liu Hui
--
* Re: Is it an atomic operation for writing a page in NAND flash
2010-01-20 13:58 ` Artem Bityutskiy
@ 2010-01-20 14:06 ` Liu Hui
2010-01-20 14:38 ` Artem Bityutskiy
0 siblings, 1 reply; 19+ messages in thread
From: Liu Hui @ 2010-01-20 14:06 UTC (permalink / raw)
To: dedekind1; +Cc: linux-mtd, Jamie Lokier
I didn't see this e-mail of Theo and I also can't find it in my
fs-devel mail list, could you please share it with us?
much appreciated!
--
Thanks & Best Regards
Liu Hui
--
* Re: Is it an atomic operation for writing a page in NAND flash
2010-01-20 13:33 ` Jamie Lokier
@ 2010-01-20 14:25 ` Liu Hui
2010-01-20 14:54 ` Ricard Wanderlof
0 siblings, 1 reply; 19+ messages in thread
From: Liu Hui @ 2010-01-20 14:25 UTC (permalink / raw)
To: Jamie Lokier; +Cc: linux-mtd@lists.infradead.org, Ricard Wanderlof
OK. In fact, I am a software developer, and I want to find a way to design
a power-cut-safe FTL. I can't control the hardware design, so I have
to find a general way to ensure atomic operation.
> There is no need for an interrupt, if you have a fast GPIO that you
> can read before each write command that tells if the input power has
> not dropped.
This is no good for performance.
> It's good, but not perfect: In principle a power-failed write could
> successfully store the correct bits including ECC so they read back
> correctly, but with the cell charges not completely stable. But I
> guess that's rare enough that it is just included in the normal NAND
> bad block possibilities.
OK, ECC can detect a partial write but can't detect unstable cell
charges. I think this is enough, since NAND flash is an unstable medium.
Thanks for your information!
Hui
--
Thanks & Best Regards
Liu Hui
--
* Re: Is it an atomic operation for writing a page in NAND flash
2010-01-20 14:06 ` Liu Hui
@ 2010-01-20 14:38 ` Artem Bityutskiy
2010-01-20 14:46 ` Artem Bityutskiy
0 siblings, 1 reply; 19+ messages in thread
From: Artem Bityutskiy @ 2010-01-20 14:38 UTC (permalink / raw)
To: Liu Hui; +Cc: linux-mtd, Jamie Lokier
On Wed, 2010-01-20 at 22:06 +0800, Liu Hui wrote:
> I didn't see this e-mail of Theo and I also can't find it in my
> fs-devel mail list, could you please share it with us?
>
> much appreciated!
Found it on lkml. Enjoy. IMO, very good writing.
--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)
* Re: Is it an atomic operation for writing a page in NAND flash
2010-01-20 14:38 ` Artem Bityutskiy
@ 2010-01-20 14:46 ` Artem Bityutskiy
2010-01-20 14:52 ` Liu Hui
0 siblings, 1 reply; 19+ messages in thread
From: Artem Bityutskiy @ 2010-01-20 14:46 UTC (permalink / raw)
To: Liu Hui; +Cc: Jamie Lokier, linux-mtd
On Wed, 2010-01-20 at 16:38 +0200, Artem Bityutskiy wrote:
> On Wed, 2010-01-20 at 22:06 +0800, Liu Hui wrote:
> > I didn't see this e-mail of Theo and I also can't find it in my
> > fs-devel mail list, could you please share it with us?
> >
> > much appreciated!
>
> Found it on lkml. Enjoy. IMO, very good writing.
Sorry:
http://lkml.org/lkml/2009/8/24/156
(forgot to paste the link)
--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)
* Re: Is it an atomic operation for writing a page in NAND flash
2010-01-20 14:46 ` Artem Bityutskiy
@ 2010-01-20 14:52 ` Liu Hui
0 siblings, 0 replies; 19+ messages in thread
From: Liu Hui @ 2010-01-20 14:52 UTC (permalink / raw)
To: dedekind1; +Cc: Jamie Lokier, linux-mtd
Great, so kind of you.
Thanks,
Hui
--
Thanks & Best Regards
Liu Hui
--
* Re: Is it an atomic operation for writing a page in NAND flash
2010-01-20 14:25 ` Liu Hui
@ 2010-01-20 14:54 ` Ricard Wanderlof
2010-01-20 15:11 ` Liu Hui
2010-01-20 16:17 ` David Parkinson
0 siblings, 2 replies; 19+ messages in thread
From: Ricard Wanderlof @ 2010-01-20 14:54 UTC (permalink / raw)
To: Liu Hui; +Cc: linux-mtd@lists.infradead.org, Jamie Lokier
On Wed, 20 Jan 2010, Liu Hui wrote:
>> It's good, but not perfect: In principle a power-failed write could
>> successfully store the correct bits including ECC so they read back
>> correctly, but with the cell charges not completely stable. But I
>> guess that's rare enough that it is just included in the normal NAND
>> bad block possibilities.
> Ok, ECC can detect partial write but can't detect unstable cell
> charges, I think this is enough since NAND flash is unstable media.
ECC is designed to correct a small number of bits (1 bit for the software
ECC algorithm used by mtd) and detect failure if a couple of more bits are
bad (2 bits for the mtd algorithm). Beyond that, the results cannot be
trusted. That means, that if there are, say, 16 incorrect bits in the
data, the ECC algorithm will not necessarily indicate that there is a
failure. It might very well indicate that there is a single bit that needs
correction, or that all bits are correct. It is not a CRC.
The end result is that you can't say "if the ECC says it's ok, the data
hasn't been corrupted" (which you could with a CRC). The only thing you
can say (in the case of the mtd ECC algorithm) is "if there is a one-bit
error in the data, the ECC will correct it" and "if there is a two-bit
error in the data, the ECC will detect it".
If you really need that kind of check for data integrity, I suppose you
could add a CRC algorithm to the ECC calculations already being performed
by mtd, so that any change in the data would be flagged due to a
mismatching CRC.
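One way to sketch that idea is to store a CRC-32 next to the payload and compare it on read. The helpers `seal` and `is_intact` and the 252+4 byte layout are assumptions made for illustration; only `zlib.crc32` is a real library call (Python standard library), and nothing here is how mtd lays out its out-of-band data.

```python
import zlib

CRC_BYTES = 4


def seal(payload: bytes) -> bytes:
    """Append a little-endian CRC-32 of the payload."""
    return payload + zlib.crc32(payload).to_bytes(CRC_BYTES, "little")


def is_intact(stored: bytes) -> bool:
    """Recompute the CRC over the payload and compare with the stored one."""
    payload, tail = stored[:-CRC_BYTES], stored[-CRC_BYTES:]
    return zlib.crc32(payload).to_bytes(CRC_BYTES, "little") == tail


good = seal(b"\xa3" * 252)
# Simulate a torn write: bit 2 of every payload byte failed to program.
torn = b"\xa7" * 252 + good[-CRC_BYTES:]
# Simulate a single flipped bit in the first payload byte.
one_bit = bytes([good[0] ^ 0x04]) + good[1:]

print(is_intact(good), is_intact(torn), is_intact(one_bit))
```

The CRC is computed over the data as it was meant to be written, so a page that was only partly programmed is very unlikely to carry a matching value.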
/Ricard
--
Ricard Wolf Wanderlöf ricardw(at)axis.com
Axis Communications AB, Lund, Sweden www.axis.com
Phone +46 46 272 2016 Fax +46 46 13 61 30
* Re: Is it an atomic operation for writing a page in NAND flash
2010-01-20 14:54 ` Ricard Wanderlof
@ 2010-01-20 15:11 ` Liu Hui
2010-01-20 16:09 ` Ricard Wanderlof
2010-01-20 16:17 ` David Parkinson
1 sibling, 1 reply; 19+ messages in thread
From: Liu Hui @ 2010-01-20 15:11 UTC (permalink / raw)
To: Ricard Wanderlof; +Cc: linux-mtd@lists.infradead.org, Jamie Lokier
Thank you very much, CRC is the real solution.
But I don't understand: if a partial write happens and we use ECC to
correct the data, we will find that the data can't be corrected, and
-EBADMSG will be returned (see nand_correct_data()), so we know that
this page is corrupted. IMHO, this works.
Anyway, I will think about CRC seriously.
Thanks,
Hui
--
Thanks & Best Regards
Liu Hui
--
* Re: Is it an atomic operation for writing a page in NAND flash
2010-01-20 15:11 ` Liu Hui
@ 2010-01-20 16:09 ` Ricard Wanderlof
2010-01-21 1:37 ` Liu Hui
0 siblings, 1 reply; 19+ messages in thread
From: Ricard Wanderlof @ 2010-01-20 16:09 UTC (permalink / raw)
To: Liu Hui; +Cc: Jamie Lokier, Ricard Wanderlöf,
linux-mtd@lists.infradead.org
On Wed, 20 Jan 2010, Liu Hui wrote:
> Thanks you very much, CRC is the real solution.
>
> But I don't understand, if a partial write happens, we use ECC to
> correct the data, we will find the data can't be corrected, then
> -EBADMSG will be returned(see nand_correct_data()), then we can know
> this page are corrupted. IMHO, this works.
Assuming the ECC algorithm used by mtd, it only produces correct results
in the case of 0, 1 or 2 bit errors in the data. For more bit errors than
that, the result is undefined.
Let's assume a partial write occurs, which leads to 57 bit errors compared
to what was originally supposed to be there. Since there are more than 2
bit errors, the algorithm output is undefined; it may say that the data
can't be corrected, or it may say that the data is ok, or it may say that
the data can be corrected; it's impossible to tell. As far as I
understand, it is not uncommon for ECC to say the data is correct when in
fact it isn't.
A slightly trivial case:
Again assuming the ECC algorithm used by mtd, the ECC bytes for a chunk of
data where all the bytes have the same value are 0xFFFFFF, regardless of
the actual value. So, say you have a page full of 0xA3; the ECC is then
0xFFFFFF. Now assume a partial write causes bit 2 of every byte to
not change from 1 to 0 when programming. The result is a page full of
0xA7, in effect, 256 bit errors (assuming a page size of 256 bytes, or at
least, assuming an ECC calculation encompassing that many bytes). But the
ECC will still be 0xFFFFFF, and the corresponding ECC calculation will say
that the data is correct. That is, as I mentioned before, because the
result of an ECC calculation on data with >2 bit errors is undefined.
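The uniform-page case can be reproduced with a small Hamming-style sketch over 256 bytes. The function `ecc256` and its bit layout are assumptions chosen for readability (22 parity bits, stored inverted in 3 bytes); it follows the same row/column parity idea but is not the kernel's exact byte layout.

```python
def ecc256(data: bytes) -> int:
    """22 Hamming parity bits over 256 bytes, returned inverted in 24 bits."""
    assert len(data) == 256
    odd = 0    # bit k = parity of set data bits whose address has bit k = 1
    total = 0  # parity of all set data bits
    for byte_index, value in enumerate(data):
        for bit in range(8):
            if (value >> bit) & 1:
                odd ^= (byte_index << 3) | bit
                total ^= 1
    # Parity over the complementary half (address bit k = 0) of each pair.
    even = odd ^ (0x7FF if total else 0)
    code = (odd << 11) | even
    # Stored inverted, unused top two bits set, so an erased chunk
    # (all 0xFF) yields 0xFFFFFF.
    return (~code & 0x3FFFFF) | 0xC00000


intended = bytes([0xA3]) * 256
torn = bytes([0xA7]) * 256  # bit 2 of every byte failed to program

bit_errors = sum(bin(a ^ b).count("1") for a, b in zip(intended, torn))
print(hex(ecc256(intended)), hex(ecc256(torn)), bit_errors)
# prints: 0xffffff 0xffffff 256
```

Every parity covers an even number of identical bytes, so all parities cancel to zero whatever the byte value is, and the two pages are indistinguishable to the ECC.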
Note that there are other ECC algorithms which can correct more error
bits. For MLC flash it is recommended to use an algorithm which corrects 4
bit errors rather than a single bit error in a block of data. Such
algorithms require more ECC bits though.
One has a tendency to think of ECC as a checking algorithm. It is not. It
corrects and detects bit errors under certain circumstances. Outside those
circumstances it is worthless. For the case of the software algorithm used
in mtd, it is worthless if there are more than 2 bit errors. A failed
write could cause any number of bit errors, so it is worthless to check
the result using the ECC algorithm. The normal failure mode of a nand
flash chip is random single-bit errors with low probability. ECC
handles this elegantly.
Elaborating on this slightly, a devil's advocate would consider ECC
worthless as a correction algorithm for a flash chip. Assume one bit error
occurs, which is corrected by ECC. Then another bit error occurs in the
same page. ECC then detects a failure. Then another bit error occurs. The
ECC algorithm is now worthless, it may detect the error, it may say that
the data is correct, or it may even try to correct it (erroneously).
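The one, two, three error progression can be played through with a toy single-error-correcting decoder. The names `parities`, `check` and `flip` and the parity layout are assumptions for illustration, in the spirit of a 256-byte Hamming ECC rather than a copy of the mtd code.

```python
def parities(data: bytes) -> int:
    """22-bit Hamming parity word over 256 bytes (layout is an assumption)."""
    odd = total = 0
    for byte_index, value in enumerate(data):
        for bit in range(8):
            if (value >> bit) & 1:
                odd ^= (byte_index << 3) | bit
                total ^= 1
    return (odd << 11) | (odd ^ (0x7FF if total else 0))


def check(data: bytearray, stored: int) -> str:
    """Classify and, for an apparent single-bit error, 'correct' in place."""
    syndrome = stored ^ parities(data)
    if syndrome == 0:
        return "ok"
    odd, even = syndrome >> 11, syndrome & 0x7FF
    if (odd ^ even) == 0x7FF:        # every parity pair disagrees
        data[odd >> 3] ^= 1 << (odd & 7)
        return "corrected"
    if bin(syndrome).count("1") == 1:
        return "ok"                  # single flipped bit in the ECC itself
    return "uncorrectable"


def flip(data: bytearray, *addresses: int) -> bytearray:
    """Return a copy with the data bits at the given bit addresses inverted."""
    out = bytearray(data)
    for a in addresses:
        out[a >> 3] ^= 1 << (a & 7)
    return out


original = bytearray(range(256))
stored = parities(original)

one = flip(original, 100)         # one bit error
two = flip(original, 100, 900)    # two bit errors
three = flip(original, 1, 2, 4)   # three bit errors

results = [check(page, stored) for page in (one, two, three)]
print(results, one == original, three == original)
# prints: ['corrected', 'uncorrectable', 'corrected'] True False
```

With this layout, three flipped bits look exactly like a single-bit error at a fourth position, so the decoder reports success while making the data worse.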
The reason that all this works in practice is that the probability of a
bit error occurring is so low that the probability of two bit errors
occurring in the same page is very low, in some respect lower than other
failure modes in the system, so that we don't have to worry about it. (For
example: What about bit errors occurring in RAM chips from cosmic
radiation? It is a real risk, but so small that most systems don't have to
worry about it.)
It can be a real concern though, and that is why things like UBI provide
so-called bit scrubbing: whenever it detects that ECC has done a bit
correction in a block, it erases that block and rewrites the data [lots of
details omitted here] so that the chance of two bit errors ever occurring
in the same page will be very small indeed.
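A heavily simplified sketch of the scrubbing idea follows. The `Block` and `Volume` classes are hypothetical, the "corrected bit" count is simulated rather than produced by a real ECC, and the data is moved to a spare block here; UBI's real mechanism has many more details.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    data: bytes = b""
    pending_bitflips: int = 0  # errors ECC would have to correct on read


@dataclass
class Volume:
    blocks: dict = field(default_factory=dict)   # physical number -> Block
    free: list = field(default_factory=list)     # erased physical blocks
    mapping: dict = field(default_factory=dict)  # logical -> physical

    def read(self, logical: int) -> bytes:
        physical = self.mapping[logical]
        block = self.blocks[physical]
        if block.pending_bitflips:     # ECC had to correct something
            self._scrub(logical, physical)
        return block.data

    def _scrub(self, logical: int, old: int) -> None:
        new = self.free.pop()
        # Rewrite the corrected data to a fresh block, then erase the old one.
        self.blocks[new] = Block(self.blocks[old].data)
        self.mapping[logical] = new
        self.blocks[old] = Block()     # erased, error count reset
        self.free.append(old)


vol = Volume(
    blocks={0: Block(b"payload", pending_bitflips=1), 1: Block()},
    free=[1],
    mapping={0: 0},
)
data = vol.read(0)
print(data, vol.mapping, vol.free)
```

The point of the policy is timing: the data is refreshed while it still has only one correctable error, before a second one can appear in the same page.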
Especially in these days of larger and larger flash chips resulting from
shrinking chip geometries, this is a problem that is getting worse and worse.
It also tends to vary hugely among manufacturers.
/Ricard
--
Ricard Wolf Wanderlöf ricardw(at)axis.com
Axis Communications AB, Lund, Sweden www.axis.com
Phone +46 46 272 2016 Fax +46 46 13 61 30
* Re: Is it an atomic operation for writing a page in NAND flash
2010-01-20 14:54 ` Ricard Wanderlof
2010-01-20 15:11 ` Liu Hui
@ 2010-01-20 16:17 ` David Parkinson
2010-01-20 16:35 ` Ricard Wanderlof
1 sibling, 1 reply; 19+ messages in thread
From: David Parkinson @ 2010-01-20 16:17 UTC (permalink / raw)
To: linux-mtd
At 14:54 20/01/2010, Ricard Wanderlof wrote:
>...
>The end result is that you can't say "if the ECC says it's ok, the data
>hasn't been corrupted" (which you could with a CRC).
>...
Apologies for nit-picking (and small digression), but a CRC is no
guarantee either. Whilst error correcting codes have additional
information so that small errors can be corrected both CRCs and ECCs
work in the same way in detecting likely errors in the communications
channel. (It's all maths and statistics....).
A side question here is whether the check algorithms have been matched to
the characteristics of the MTDs. For example, a weakish radio signal is
likely to have errors randomly distributed across the message. With
a magnetic disk drive the errors are likely to be caused by a blemish on
the surface and therefore will come in bursts. Some algorithms will
be better than others in the respective cases.
Regards
David
* Re: Is it an atomic operation for writing a page in NAND flash
2010-01-20 16:17 ` David Parkinson
@ 2010-01-20 16:35 ` Ricard Wanderlof
2010-01-20 23:08 ` Charles Manning
0 siblings, 1 reply; 19+ messages in thread
From: Ricard Wanderlof @ 2010-01-20 16:35 UTC (permalink / raw)
To: David Parkinson; +Cc: linux-mtd@lists.infradead.org
On Wed, 20 Jan 2010, David Parkinson wrote:
> At 14:54 20/01/2010, Ricard Wanderlof wrote:
> >...
> >The end result is that you can't say "if the ECC says it's ok, the data
> >hasn't been corrupted" (which you could with a CRC).
> >...
>
> Apologies for nit-picking (and small digression), but a CRC is no
> guarantee either. Whilst error correcting codes have additional
> information so that small errors can be corrected both CRCs and ECCs
> work in the same way in detecting likely errors in the communications
> channel. (It's all maths and statistics....).
You are right of course. Indeed, any mapping of N bits to n bits (where
N > n) must result in a number of bit patterns for N which map to identical
bit patterns for n. Still, CRCs used for data checking are designed so
that the different bit patterns for N that map to the same n are
reasonably different from each other, so that a CRC is unlikely to show a
correct result if there has been a 'typical' failure on the channel. At
least the ECC algorithm used for mtd is not intended for that level of
error detection; it is optimized for correcting single-bit errors.
> A side question here is have the check algorithms been matched to the
> characteristics of the MTDs? For example a weakish radio signal is
> likely to have errors randomly distributed across the message. With
> a magnetic disk drive the errors are likely be caused by a blemish on
> the surface and therefore will come in bursts. Some algorithms will
> be better than others in the respective cases.
The algorithm used in mtd comes from Toshiba, I think, and was
originally designed for an old flash chip of theirs with 256-byte pages.
But I would think all 1-bit-error-correction ECCs are basically the same.
I don't know, but I think the basic premise is that bit errors are rare,
and when they do occur, they will be single bit errors occurring in random
places. Indeed, the algorithm used seems to be ideally suited to this
case.
/Ricard
--
Ricard Wolf Wanderlöf ricardw(at)axis.com
Axis Communications AB, Lund, Sweden www.axis.com
Phone +46 46 272 2016 Fax +46 46 13 61 30
* Re: Is it an atomic operation for writing a page in NAND flash
2010-01-20 16:35 ` Ricard Wanderlof
@ 2010-01-20 23:08 ` Charles Manning
0 siblings, 0 replies; 19+ messages in thread
From: Charles Manning @ 2010-01-20 23:08 UTC (permalink / raw)
To: linux-mtd; +Cc: Ricard Wanderlof, David Parkinson
On Thursday 21 January 2010 05:35:09 Ricard Wanderlof wrote:
> On Wed, 20 Jan 2010, David Parkinson wrote:
> > At 14:54 20/01/2010, Ricard Wanderlof wrote:
> > >...
> > >The end result is that you can't say "if the ECC says it's ok, the data
> > >hasn't been corrupted" (which you could with a CRC).
> > >...
> >
> > Apologies for nit-picking (and small digression), but a CRC is no
> > guarantee either. Whilst error correcting codes have additional
> > information so that small errors can be corrected both CRCs and ECCs
> > work in the same way in detecting likely errors in the communications
> > channel. (It's all maths and statistics....).
While you might be technically and theoretically correct, in practical terms
CRC is a copper-bottomed guarantee when compared with ECC.
A 32-bit CRC is very difficult to randomly spoof. An ECC is extremely easy to
spoof with random errors. The difference is in the order of millions.
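A rough back-of-the-envelope check of that ratio, under loudly stated assumptions: a single-bit-correcting Hamming code with 22 parity bits over 256 bytes, and corruption heavy enough that the resulting syndrome is effectively uniformly random. This is a model, not a measurement.

```python
ADDRESS_BITS = 11                # 256 bytes * 8 bits = 2**11 data bits
PARITY_BITS = 2 * ADDRESS_BITS   # one parity pair per address bit

# Syndromes a single-bit-correcting decoder accepts:
#   1 all-zero syndrome ("no error"),
#   2**11 syndromes that name one data bit to flip,
#   22 syndromes with one bit set (error in the ECC bytes themselves).
accepted = 1 + 2**ADDRESS_BITS + PARITY_BITS

p_ecc = accepted / 2**PARITY_BITS  # chance random garbage passes the ECC
p_crc = 1 / 2**32                  # chance random garbage passes a 32-bit CRC
ratio = p_ecc / p_crc

print(accepted, f"{p_ecc:.2e}", f"{p_crc:.2e}", round(ratio))
```

Under these assumptions roughly one random pattern in two thousand is accepted by the ECC, against one in about four billion for the CRC, a ratio of about two million.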
The best protection for getting good NAND writes/erases is to make sure you
don't launch a programming (write/erase op) unless you know power is good. If
your system has a "power OK" flag then check it before doing the write.
Don't wire the WP pin to hardware power fail flags either since a falling WP
will abort the current write.
>
> You are right of course. Indeed, any mapping of N bits to n bits (where
> N > n) must result in a number of bit patterns for N which map to identical
> bit patterns for n. Still, CRCs used for data checking are designed so
> that the different bit patterns for N that map to the same n are
> reasonably different from each other, so that a CRC is unlikely to show a
> correct result if there has been a 'typical' failure on the channel. At
> least the ECC algorithm used for mtd is not intended for that level of
> error detection; it is optimized for correcting single-bit errors.
>
> > A side question here is have the check algorithms been matched to the
> > characteristics of the MTDs? For example a weakish radio signal is
> > likely to have errors randomly distributed across the message. With
> > a magnetic disk drive the errors are likely be caused by a blemish on
> > the surface and therefore will come in bursts. Some algorithms will
> > be better than others in the respective cases.
Actually radio errors are often bursty due to interference. The electric fence
clicks I get here knock out a few adjacent bits.
>
> The algorithm used in mtd comes from Toshiba I think and was
> originally designed for an old 256 page flash of theirs. But I would think
> all 1-bit-error-correction ECC's are basically the same.
This still treats the pages as blocks of 256 bytes so if you have a 512-byte
page it will be treated as two 256-byte ECC regions.
>
> I don't know, but I think the basic premise is that bit errors are rare,
> and when they do occur, they will be single bit errors occurring in random
> places. Indeed, the algorithm used seems to be ideally suited to this
> case.
It depends...
For MLC flash you can expect quite a few errors, which is why there has been a
shift to multi-bit ECC for these. If you have, say, 4-bit ECC then you might
choose to treat 2 or fewer errors as "no error".
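Such a threshold policy might look like the sketch below. The function `classify`, its `quiet` parameter and the three labels are hypothetical; `corrected_bits` stands for whatever count a multi-bit ECC decoder reports, with `None` meaning the decoder gave up.

```python
from typing import Optional


def classify(corrected_bits: Optional[int], quiet: int = 2) -> str:
    """Map a multi-bit ECC decoder result to a policy decision."""
    if corrected_bits is None:
        return "failed"   # more errors than the code can correct
    if corrected_bits <= quiet:
        return "clean"    # expected MLC noise, treat as "no error"
    return "scrub"        # corrected, but worth rewriting the data soon


decisions = [classify(n) for n in (0, 2, 3, 4, None)]
print(decisions)
# prints: ['clean', 'clean', 'scrub', 'scrub', 'failed']
```

The threshold trades wear against risk: a lower value rewrites data more often, a higher one lets more errors accumulate before acting.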
There are basically two types of multi-bit ECC: RS and BCH. RS is more suited
to "burst" errors like you'll see on CD or radio. BCH is more suited to
random errors.
>
> /Ricard
* Re: Is it an atomic operation for writing a page in NAND flash
2010-01-20 16:09 ` Ricard Wanderlof
@ 2010-01-21 1:37 ` Liu Hui
0 siblings, 0 replies; 19+ messages in thread
From: Liu Hui @ 2010-01-21 1:37 UTC (permalink / raw)
To: Ricard Wanderlof; +Cc: linux-mtd@lists.infradead.org, Jamie Lokier
Very informative, it expands my view. Thanks!
2010/1/21 Ricard Wanderlof <ricard.wanderlof@axis.com>:
>
> On Wed, 20 Jan 2010, Liu Hui wrote:
>
>> Thank you very much, CRC is the real solution.
>>
>> But I don't understand, if a partial write happens, we use ECC to
>> correct the data, we will find the data can't be corrected, then
>> -EBADMSG will be returned (see nand_correct_data()), then we can know
>> this page is corrupted. IMHO, this works.
>
> Assuming the ECC algorithm used by mtd, it only produces correct results in
> the case of 0, 1 or 2 bit errors in the data. For more bit errors than that,
> the result is undefined.
>
> Let's assume a partial write occurs, which leads to 57 bit errors compared
> to what was originally supposed to be there. Since there are more than 2 bit
> errors, the algorithm output is undefined; it may say that the data can't be
> corrected, or it may say that the data is ok, or it may say that the data
> can be corrected; it's impossible to tell. As far as I understand, it is not
> uncommon for ECC to say the data is correct when in fact it isn't.
>
> A slightly trivial case:
>
> Again assuming the ECC algorithm used by mtd, the ECC bytes for a chunk of
> data where all the bytes have the same value is 0xFFFFFF, regardless of the
> actual value. So, say you have a page full of 0xA3; the ECC is then
> 0xFFFFFF. Now, assume a partial write causes bit 2 of every byte to
> not change from 1 to 0 when programming. The result is a page full of 0xA7,
> in effect, 256 bit errors (assuming a page size of 256 bytes, or at least,
> assuming an ECC calculation encompassing that many bytes). But the ECC will
> still be 0xFFFFFF, and the corresponding ECC calculation will say that the
> data is correct. That is, as I mentioned before, because the result of an
> ECC calculation on data with >2 bit errors is undefined.
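
The 0xA3 / 0xA7 case above can be reproduced with a small model. This is
a minimal sketch of a 256-byte Hamming-style code (one parity bit per
address bit and one per inverted address bit, result stored inverted),
not a copy of the kernel's nand_ecc code. The name ecc256 and the bit
ordering inside the 24-bit result are assumptions, so only the all-ones
result for uniform data should be compared with the real thing.

```python
def ecc256(chunk: bytes) -> int:
    """Return a 24-bit Hamming-style ECC for a 256-byte chunk (toy model)."""
    if len(chunk) != 256:
        raise ValueError("this model covers exactly 256 bytes")
    odd = 0    # bit k: parity of all data bits whose address has bit k set
    even = 0   # bit k: parity of all data bits whose address has bit k clear
    for byte_index, value in enumerate(chunk):
        for bit in range(8):
            if (value >> bit) & 1:
                addr = byte_index * 8 + bit      # 11-bit address, 0..2047
                odd ^= addr
                even ^= ~addr & 0x7FF
    code = (odd << 11) | even                    # 22 parity bits
    return ~code & 0xFFFFFF                      # inverted; 2 unused bits read as 1

intended = bytes([0xA3]) * 256
torn = bytes([0xA7]) * 256                       # bit 2 failed to program everywhere
flipped_bits = sum(bin(a ^ b).count("1") for a, b in zip(intended, torn))

print(hex(ecc256(intended)), hex(ecc256(torn)), flipped_bits)
```

Both chunks give the same ECC even though 256 bits differ, because every
parity group contains an even number of identical bits. A single flipped
bit, by contrast, changes 11 of the 22 parity bits in this model.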
>
> Note that there are other ECC algorithms which can correct more error bits.
> For MLC flash it is recommended to use an algorithm which corrects 4 bit
> errors rather than a single bit error in a block of data. Such algorithms
> require more ECC bits though.
>
> One has a tendency to think of ECC as a checking algorithm. It is not. It
> corrects and detects bit errors under certain circumstances. Outside those
> circumstances it is worthless. For the case of the software algorithm used
> in mtd, it is worthless if there are more than 2 bit errors. A failed write
> could cause any number of bit errors, so it is worthless to check the result
> using the ECC algorithm. The normal failure mode of a nand flash chip is
> random single-bit errors with low probability. ECC handles this
> elegantly.
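
Since ECC is not a checking algorithm, the checking has to come from
something like the CRC mentioned earlier. A minimal sketch, assuming the
file system is free to reserve 4 bytes of each page for a CRC32; the
names seal and is_intact and the little-endian layout are invented for
illustration and do not come from any existing file system.

```python
import struct
import zlib

def seal(payload: bytes) -> bytes:
    """Append a CRC32 of the payload so a torn write can be detected later."""
    return payload + struct.pack("<I", zlib.crc32(payload))

def is_intact(stored: bytes) -> bool:
    """Check the trailing CRC32 against the payload that precedes it."""
    if len(stored) < 4:
        return False
    payload, tail = stored[:-4], stored[-4:]
    return zlib.crc32(payload) == struct.unpack("<I", tail)[0]

page = seal(bytes([0xA3]) * 252)      # 252 payload bytes + 4 CRC bytes
damaged = bytearray(page)
damaged[5] ^= 0x04                    # one bit that failed to program
```

Here is_intact(page) is true and is_intact(bytes(damaged)) is false. A
CRC32 is guaranteed to catch single-bit errors and short bursts; for
arbitrary damage it is only a very strong probabilistic check, which is
still far better than asking a 1-bit ECC to do the job.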
>
>
> Elaborating on this slightly, a devil's advocate would consider ECC
> worthless as a correction algorithm for a flash chip. Assume one bit error
> occurs, which is corrected by ECC. Then another bit error occurs in the same
> page. ECC then detects a failure. Then another bit error occurs. The ECC
> algorithm is now worthless, it may detect the error, it may say that the
> data is correct, or it may even try to correct it (erroneously).
>
> The reason that all this works in practice is that the probability of a bit
> error occurring is so low that the probability of two bit errors occurring
> in the same page is very low, in some respect lower than other failure modes
> in the system, so that we don't have to worry about it. (For example: What
> about bit errors occurring in RAM chips from cosmic radiation? It is a real
> risk, but so small that most systems don't have to worry about it.)
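
To put rough numbers on this argument, here is a small sketch that
assumes bit errors are independent with the same probability for every
bit. That is a simplification, and the error rate in the example is an
arbitrary illustration, not a datasheet figure. The function name
p_at_least_two is made up.

```python
def p_at_least_two(n_bits: int, p_bit: float) -> float:
    """Probability of 2 or more bit errors among n_bits independent bits."""
    p_none = (1.0 - p_bit) ** n_bits
    p_one = n_bits * p_bit * (1.0 - p_bit) ** (n_bits - 1)
    return 1.0 - p_none - p_one

n = 256 * 8        # bits covered by one 256-byte ECC region
p = 1e-9           # assumed per-bit error probability, for illustration only
single = n * p     # approximate chance of one error in the region
double = p_at_least_two(n, p)
```

For small p the result is close to (n*p)^2 / 2, so the chance of a
second error in the same region is roughly the square of the chance of
the first one, which is why it can usually be ignored.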
>
> It can be a real concern though, and that is why things like UBI provide
> so-called bit scrubbing: whenever it detects that ECC has done a bit
> correction in a block, it erases that block and rewrites the data [lots of
> details omitted here] so that the chance of two bit errors ever occurring in
> the same page will be very small indeed.
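
The scrubbing idea, reduced to a toy. This is not UBI code: the function
scrub_if_corrected and the erase/program callables are hypothetical
stand-ins, and it follows the simplified "erase and rewrite" description
above. As far as I know, the real implementation moves the data to a
different physical eraseblock first, so that a power cut during
scrubbing cannot lose it.

```python
def scrub_if_corrected(block_data, corrected_bits, erase, program):
    """Rewrite a block as soon as ECC had to correct something in it.

    block_data:     data as returned by the read path (already corrected)
    corrected_bits: number of bits ECC fixed during that read
    erase, program: hypothetical callables standing in for the flash driver
    Returns True if the block was scrubbed.
    """
    if corrected_bits == 0:
        return False          # nothing was corrected, leave the block alone
    erase()                   # toy model only: real code must stay power-cut safe
    program(block_data)       # write back the corrected data, bit errors now gone
    return True
```

Rewriting on the first corrected bit keeps each region at zero known
errors, so two errors would have to appear between two reads of the
same block before the ECC is out of its depth.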
>
> Especially in these days of larger and larger flash chips resulting from
> shrinking chip geometries this is a problem that is getting worse and worse.
> It also tends to vary hugely among manufacturers.
>
> /Ricard
> --
> Ricard Wolf Wanderlöf ricardw(at)axis.com
> Axis Communications AB, Lund, Sweden www.axis.com
> Phone +46 46 272 2016 Fax +46 46 13 61 30
>
--
Thanks & Best Regards
Liu Hui
--
Thread overview: 19+ messages
2010-01-20 9:58 Is it an atomic operation for writing a page in NAND flash Liu Hui
2010-01-20 10:13 ` Ricard Wanderlof
2010-01-20 13:11 ` Liu Hui
2010-01-20 13:33 ` Jamie Lokier
2010-01-20 14:25 ` Liu Hui
2010-01-20 14:54 ` Ricard Wanderlof
2010-01-20 15:11 ` Liu Hui
2010-01-20 16:09 ` Ricard Wanderlof
2010-01-21 1:37 ` Liu Hui
2010-01-20 16:17 ` David Parkinson
2010-01-20 16:35 ` Ricard Wanderlof
2010-01-20 23:08 ` Charles Manning
2010-01-20 13:41 ` Jamie Lokier
2010-01-20 13:58 ` Artem Bityutskiy
2010-01-20 14:06 ` Liu Hui
2010-01-20 14:38 ` Artem Bityutskiy
2010-01-20 14:46 ` Artem Bityutskiy
2010-01-20 14:52 ` Liu Hui
2010-01-20 14:01 ` Liu Hui