* NAND ECC capabilities
@ 2015-01-08 3:10 Steve deRosier
2015-01-08 4:17 ` Ezequiel Garcia
2015-01-08 8:32 ` Ricard Wanderlof
0 siblings, 2 replies; 28+ messages in thread
From: Steve deRosier @ 2015-01-08 3:10 UTC (permalink / raw)
To: linux-mtd
So, doing further experiments and I wondered if someone could confirm
this finding.
With atmel_nand, we're setup for 4-bit ECC on 512 sectors with a 2k
page. I was thinking about this a bit and realized that there's 4 of
these sectors per page, and this implies then that we can detect and
correct 4 bad bits _per_ each sector. Assuming that they're evenly
spread, that's up to 16 bad bits per page. Obviously in practice,
that assumption wouldn't hold...
So, is my understanding correct?
I took it further and decided to play with this experimentally. On my
UBIFS rootfs, I flipped 3 bits in the first sector of a page and then
3 more in the second sector. From my kernel log I got this:
[ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
byte_pos: 98, bit_pos: 3, 0x31 -> 0x39
[ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
byte_pos: 98, bit_pos: 2, 0x39 -> 0x3d
[ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
byte_pos: 98, bit_pos: 1, 0x3d -> 0x3f
[ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
byte_pos: 530, bit_pos: 6, 0x8e -> 0xce
[ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
byte_pos: 530, bit_pos: 5, 0xce -> 0xee
[ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
byte_pos: 530, bit_pos: 4, 0xee -> 0xfe
[ 78.304687] UBI: fixable bit-flip detected at PEB 20
[ 78.304687] UBI: schedule PEB 20 for scrubbing
[ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
byte_pos: 98, bit_pos: 3, 0x31 -> 0x39
[ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
byte_pos: 98, bit_pos: 2, 0x39 -> 0x3d
[ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
byte_pos: 98, bit_pos: 1, 0x3d -> 0x3f
[ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
byte_pos: 530, bit_pos: 6, 0x8e -> 0xce
[ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
byte_pos: 530, bit_pos: 5, 0xce -> 0xee
[ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
byte_pos: 530, bit_pos: 4, 0xee -> 0xfe
[ 78.343750] UBI: fixable bit-flip detected at PEB 20
[ 78.382812] UBI: scrubbed PEB 20 (LEB 0:18), data moved to PEB 250
So, my takeaway from this is a couple of things:
1. Yes, it can correct more than 4 bits per page as long as those are
on different sectors of the page.
2. My test of 6 bits hit the 4 bit threshold setting and at that point
UBI decided that maybe something is wrong with that PEB.
3. When it did, UBI corrected the data and copied it elsewhere
4. Then UBI scrubbed. I assume it then did the torture test. Since I
manually made a flip, it found it was fine once it erased it, so it
didn't mark it as bad. I checked my BBT and it's not marked. So I
assume it's erased and ready for use again.
Is my general understanding correct?
Thanks,
- Steve
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: NAND ECC capabilities
2015-01-08 3:10 NAND ECC capabilities Steve deRosier
@ 2015-01-08 4:17 ` Ezequiel Garcia
2015-01-08 6:22 ` Steve deRosier
2015-01-08 8:32 ` Ricard Wanderlof
1 sibling, 1 reply; 28+ messages in thread
From: Ezequiel Garcia @ 2015-01-08 4:17 UTC (permalink / raw)
To: Steve deRosier, linux-mtd
On 01/08/2015 12:10 AM, Steve deRosier wrote:
> So, doing further experiments and I wondered if someone could confirm
> this finding.
>
> With atmel_nand, we're setup for 4-bit ECC on 512 sectors with a 2k
> page. I was thinking about this a bit and realized that there's 4 of
> these sectors per page, and this implies then that we can detect and
> correct 4 bad bits _per_ each sector. Assuming that they're evenly
> spread, that's up to 16 bad bits per page. Obviously in practice,
> that assumption wouldn't hold...
>
Not sure why you say that woulnd't hold.
> So, is my understanding correct?
>
I'm not familiar with atmel-nand, but as far as I know, you are right. A
4-bit ECC strength over 512 byte sectors, means exactly that.
Most likely your ECC hardware stores four ECC values (one for each
512-byte sector in your 2048-byte page) in the OOB of the page. Each ECC
value is used to correct up to 4-bit on each sector, so that's why you
can correct as much as that.
> I took it further and decided to play with this experimentally. On my
> UBIFS rootfs, I flipped 3 bits in the first sector of a page and then
> 3 more in the second sector. From my kernel log I got this:
>
> [ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
> byte_pos: 98, bit_pos: 3, 0x31 -> 0x39
> [ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
> byte_pos: 98, bit_pos: 2, 0x39 -> 0x3d
> [ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
> byte_pos: 98, bit_pos: 1, 0x3d -> 0x3f
> [ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
> byte_pos: 530, bit_pos: 6, 0x8e -> 0xce
> [ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
> byte_pos: 530, bit_pos: 5, 0xce -> 0xee
> [ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
> byte_pos: 530, bit_pos: 4, 0xee -> 0xfe
> [ 78.304687] UBI: fixable bit-flip detected at PEB 20
> [ 78.304687] UBI: schedule PEB 20 for scrubbing
> [ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
> byte_pos: 98, bit_pos: 3, 0x31 -> 0x39
> [ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
> byte_pos: 98, bit_pos: 2, 0x39 -> 0x3d
> [ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
> byte_pos: 98, bit_pos: 1, 0x3d -> 0x3f
> [ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
> byte_pos: 530, bit_pos: 6, 0x8e -> 0xce
> [ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
> byte_pos: 530, bit_pos: 5, 0xce -> 0xee
> [ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
> byte_pos: 530, bit_pos: 4, 0xee -> 0xfe
> [ 78.343750] UBI: fixable bit-flip detected at PEB 20
> [ 78.382812] UBI: scrubbed PEB 20 (LEB 0:18), data moved to PEB 250
>
> So, my takeaway from this is a couple of things:
>
> 1. Yes, it can correct more than 4 bits per page as long as those are
> on different sectors of the page.
Correct. It can correct as much as advertised: 4-bits per 512-byte sector.
> 2. My test of 6 bits hit the 4 bit threshold setting and at that point
> UBI decided that maybe something is wrong with that PEB.
Correct. Read/program disturb accumulates and that produces bitflips.
Given these bitflip can be eliminated by erasing the block, UBI will do
that before the block get worse.
> 3. When it did, UBI corrected the data and copied it elsewhere
Actually, your NAND controller (or MTD software ECC) corrected the data
and reported the number of bitflips to UBI.
> 4. Then UBI scrubbed. I assume it then did the torture test. Since I
> manually made a flip, it found it was fine once it erased it, so it
> didn't mark it as bad. I checked my BBT and it's not marked. So I
> assume it's erased and ready for use again.
>
Yes, UBI tortures the PEB on occassions. However, this does happen only
under certain circumstances (you'll have to dig the code for details). I
don't think it was tortured in your case (the block just had a few
artifitial bitflips, but other than that it was healthy).
Torture comes with a noisy message "run torture test for PEB %d", so you
would notice.
> Is my general understanding correct?
>
I think so, yes.
--
Ezequiel Garcia, VanguardiaSur
www.vanguardiasur.com.ar
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: NAND ECC capabilities
2015-01-08 4:17 ` Ezequiel Garcia
@ 2015-01-08 6:22 ` Steve deRosier
[not found] ` <0D23F1ECC880A74392D56535BCADD73526C0EA9A@NTXBOIMBX03.micron.com>
0 siblings, 1 reply; 28+ messages in thread
From: Steve deRosier @ 2015-01-08 6:22 UTC (permalink / raw)
To: Ezequiel Garcia; +Cc: linux-mtd
Thanks Ezequiel!
On Wed, Jan 7, 2015 at 8:17 PM, Ezequiel Garcia
<ezequiel@vanguardiasur.com.ar> wrote:
> On 01/08/2015 12:10 AM, Steve deRosier wrote:
>> So, doing further experiments and I wondered if someone could confirm
>> this finding.
>>
>> With atmel_nand, we're setup for 4-bit ECC on 512 sectors with a 2k
>> page. I was thinking about this a bit and realized that there's 4 of
>> these sectors per page, and this implies then that we can detect and
>> correct 4 bad bits _per_ each sector. Assuming that they're evenly
>> spread, that's up to 16 bad bits per page. Obviously in practice,
>> that assumption wouldn't hold...
>>
>
> Not sure why you say that woulnd't hold.
>
Ah, simple. We have a 2048 byte page (2112 if you count the OOB). 4
bits per 512. Which means 16 per page _assuming_ they're evenly
spread. That means 4 in the first 512, 4 in the next etc... But if
all 16 bit flips were in the first 512, boom, we're done (actually,
only need 5 in that first one). There's no reason to believe that the
bitflips would be evenly over the page (in fact, the opposite is often
true, they seem to cluster in my experience).
This NAND doesn't have subpages, it's a 2K page. The ECC hardware can
do the calculations in 1024 or 512 byte chunks. But those are all for
the one page which acts as a single entity.
It's fair to say we can only do 4 bit-flips per page in this case
because the worst case scenario is all the bits in question would be
in the same 512 sector.
Thanks for your reply!
- Steve
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: NAND ECC capabilities
2015-01-08 3:10 NAND ECC capabilities Steve deRosier
2015-01-08 4:17 ` Ezequiel Garcia
@ 2015-01-08 8:32 ` Ricard Wanderlof
2015-01-08 16:42 ` Ezequiel Garcia
2015-01-08 17:14 ` NAND ECC capabilities Steve deRosier
1 sibling, 2 replies; 28+ messages in thread
From: Ricard Wanderlof @ 2015-01-08 8:32 UTC (permalink / raw)
To: Steve deRosier; +Cc: linux-mtd@lists.infradead.org
On Thu, 8 Jan 2015, Steve deRosier wrote:
> So, doing further experiments and I wondered if someone could confirm
> this finding.
>
> With atmel_nand, we're setup for 4-bit ECC on 512 sectors with a 2k
> page. I was thinking about this a bit and realized that there's 4 of
> these sectors per page, and this implies then that we can detect and
> correct 4 bad bits _per_ each sector. Assuming that they're evenly
> spread, that's up to 16 bad bits per page. Obviously in practice,
> that assumption wouldn't hold...
>
> So, is my understanding correct?
>
> I took it further and decided to play with this experimentally. On my
> UBIFS rootfs, I flipped 3 bits in the first sector of a page and then
> 3 more in the second sector. From my kernel log I got this:
>
> [ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
> byte_pos: 98, bit_pos: 3, 0x31 -> 0x39
> [ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
> byte_pos: 98, bit_pos: 2, 0x39 -> 0x3d
> [ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
> byte_pos: 98, bit_pos: 1, 0x3d -> 0x3f
> [ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
> byte_pos: 530, bit_pos: 6, 0x8e -> 0xce
> [ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
> byte_pos: 530, bit_pos: 5, 0xce -> 0xee
> [ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
> byte_pos: 530, bit_pos: 4, 0xee -> 0xfe
> [ 78.304687] UBI: fixable bit-flip detected at PEB 20
> [ 78.304687] UBI: schedule PEB 20 for scrubbing
> [ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
> byte_pos: 98, bit_pos: 3, 0x31 -> 0x39
> [ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
> byte_pos: 98, bit_pos: 2, 0x39 -> 0x3d
> [ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
> byte_pos: 98, bit_pos: 1, 0x3d -> 0x3f
> [ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
> byte_pos: 530, bit_pos: 6, 0x8e -> 0xce
> [ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
> byte_pos: 530, bit_pos: 5, 0xce -> 0xee
> [ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
> byte_pos: 530, bit_pos: 4, 0xee -> 0xfe
> [ 78.343750] UBI: fixable bit-flip detected at PEB 20
> [ 78.382812] UBI: scrubbed PEB 20 (LEB 0:18), data moved to PEB 250
>
> So, my takeaway from this is a couple of things:
>
> 1. Yes, it can correct more than 4 bits per page as long as those are
> on different sectors of the page.
> 2. My test of 6 bits hit the 4 bit threshold setting and at that point
> UBI decided that maybe something is wrong with that PEB.
> 3. When it did, UBI corrected the data and copied it elsewhere
> 4. Then UBI scrubbed. I assume it then did the torture test. Since I
> manually made a flip, it found it was fine once it erased it, so it
> didn't mark it as bad. I checked my BBT and it's not marked. So I
> assume it's erased and ready for use again.
>
> Is my general understanding correct?
I'd say yes, but the ECC threshold should be per 512 byte ECC block (which
seems to be the correct term rather than 'sector'), rather than per page.
Are you sure that the threshold is set to 4 (see
/sys/devices/virtual/mtd/mtd<n>/bitflip_threshold )?
Normally the threshold is set below the ECC correction capability, so that
bit scrubbing has a chance to occur before the bits rot too far. Say you
have the threshold set at 4 bits, and you have 3 bits that have flipped.
If another bit flips, the block would be scrubbed, but say that two bits
flipped before you read the data the next time. You would have lost your
chance of recovery, so it makes sense to have the threshold lower than the
ECC capability. I would say 3/4 of the ECC capability would be a
reasonable value.
/Ricard
--
Ricard Wolf Wanderlöf ricardw(at)axis.com
Axis Communications AB, Lund, Sweden www.axis.com
Phone +46 46 272 2016 Fax +46 46 13 61 30
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: NAND ECC capabilities
2015-01-08 8:32 ` Ricard Wanderlof
@ 2015-01-08 16:42 ` Ezequiel Garcia
2015-01-08 17:26 ` Steve deRosier
2015-01-08 19:09 ` Brian Norris
2015-01-08 17:14 ` NAND ECC capabilities Steve deRosier
1 sibling, 2 replies; 28+ messages in thread
From: Ezequiel Garcia @ 2015-01-08 16:42 UTC (permalink / raw)
To: Ricard Wanderlof, Steve deRosier
Cc: Huang Shijie, Brian Norris, linux-mtd@lists.infradead.org,
Richard Weinberger
(Adding a few MTD folks)
On 01/08/2015 05:32 AM, Ricard Wanderlof wrote:
>
> On Thu, 8 Jan 2015, Steve deRosier wrote:
>
>> So, doing further experiments and I wondered if someone could confirm
>> this finding.
>>
>> With atmel_nand, we're setup for 4-bit ECC on 512 sectors with a 2k
>> page. I was thinking about this a bit and realized that there's 4 of
>> these sectors per page, and this implies then that we can detect and
>> correct 4 bad bits _per_ each sector. Assuming that they're evenly
>> spread, that's up to 16 bad bits per page. Obviously in practice,
>> that assumption wouldn't hold...
>>
>> So, is my understanding correct?
>>
>> I took it further and decided to play with this experimentally. On my
>> UBIFS rootfs, I flipped 3 bits in the first sector of a page and then
>> 3 more in the second sector. From my kernel log I got this:
>>
>> [ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
>> byte_pos: 98, bit_pos: 3, 0x31 -> 0x39
>> [ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
>> byte_pos: 98, bit_pos: 2, 0x39 -> 0x3d
>> [ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
>> byte_pos: 98, bit_pos: 1, 0x3d -> 0x3f
>> [ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
>> byte_pos: 530, bit_pos: 6, 0x8e -> 0xce
>> [ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
>> byte_pos: 530, bit_pos: 5, 0xce -> 0xee
>> [ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
>> byte_pos: 530, bit_pos: 4, 0xee -> 0xfe
>> [ 78.304687] UBI: fixable bit-flip detected at PEB 20
>> [ 78.304687] UBI: schedule PEB 20 for scrubbing
>> [ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
>> byte_pos: 98, bit_pos: 3, 0x31 -> 0x39
>> [ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
>> byte_pos: 98, bit_pos: 2, 0x39 -> 0x3d
>> [ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
>> byte_pos: 98, bit_pos: 1, 0x3d -> 0x3f
>> [ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
>> byte_pos: 530, bit_pos: 6, 0x8e -> 0xce
>> [ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
>> byte_pos: 530, bit_pos: 5, 0xce -> 0xee
>> [ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
>> byte_pos: 530, bit_pos: 4, 0xee -> 0xfe
>> [ 78.343750] UBI: fixable bit-flip detected at PEB 20
>> [ 78.382812] UBI: scrubbed PEB 20 (LEB 0:18), data moved to PEB 250
>>
>> So, my takeaway from this is a couple of things:
>>
>> 1. Yes, it can correct more than 4 bits per page as long as those are
>> on different sectors of the page.
>> 2. My test of 6 bits hit the 4 bit threshold setting and at that point
>> UBI decided that maybe something is wrong with that PEB.
>> 3. When it did, UBI corrected the data and copied it elsewhere
>> 4. Then UBI scrubbed. I assume it then did the torture test. Since I
>> manually made a flip, it found it was fine once it erased it, so it
>> didn't mark it as bad. I checked my BBT and it's not marked. So I
>> assume it's erased and ready for use again.
>>
>> Is my general understanding correct?
>
> I'd say yes, but the ECC threshold should be per 512 byte ECC block (which
> seems to be the correct term rather than 'sector'), rather than per page.
> Are you sure that the threshold is set to 4 (see
> /sys/devices/virtual/mtd/mtd<n>/bitflip_threshold )?
>
> Normally the threshold is set below the ECC correction capability, so that
> bit scrubbing has a chance to occur before the bits rot too far. Say you
> have the threshold set at 4 bits, and you have 3 bits that have flipped.
> If another bit flips, the block would be scrubbed, but say that two bits
> flipped before you read the data the next time. You would have lost your
> chance of recovery, so it makes sense to have the threshold lower than the
> ECC capability. I would say 3/4 of the ECC capability would be a
> reasonable value.
>
This makes a lot of sense. However, do we have any way of telling if the
bitflips where produced on the same ECC sector?
>From a cursory look to the code, I'd say there's no such feature with
the current MTD/NAND design. So, if an mtd_read reports 3 bitflips you
have no way of telling they happened on the same sector or not, so you
can't implement your idea.
I'm curious about what ECC threshold is typically used in production.
Obviously a too big or too small values have deadly consequences, so
this doesn't seem like a minor issue.
--
Ezequiel Garcia, VanguardiaSur
www.vanguardiasur.com.ar
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: NAND ECC capabilities
[not found] ` <0D23F1ECC880A74392D56535BCADD73526C0EA9A@NTXBOIMBX03.micron.com>
@ 2015-01-08 17:09 ` Steve deRosier
2015-01-08 18:57 ` Brian Norris
0 siblings, 1 reply; 28+ messages in thread
From: Steve deRosier @ 2015-01-08 17:09 UTC (permalink / raw)
To: Jeff Lauruhn (jlauruhn); +Cc: linux-mtd, Ezequiel Garcia
On Thu, Jan 8, 2015 at 8:38 AM, Jeff Lauruhn (jlauruhn)
<jlauruhn@micron.com> wrote:
> You might find this interesting http://www.cyclicdesign.com/index.php/ecc-trends-in-nand-flash.
>
>
Jeff,
Thanks for that. The diagram on page 7 actually illustrates exactly
what I was getting at. In my case, the ECC can handle 4-bits on each
512 sector, yet since the increment we handle in the higher levels is
per-page, we really need to set the threshold at a max of 4 (probably
3 as Richard points out) for the page.
- Steve
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: NAND ECC capabilities
2015-01-08 8:32 ` Ricard Wanderlof
2015-01-08 16:42 ` Ezequiel Garcia
@ 2015-01-08 17:14 ` Steve deRosier
1 sibling, 0 replies; 28+ messages in thread
From: Steve deRosier @ 2015-01-08 17:14 UTC (permalink / raw)
To: Ricard Wanderlof; +Cc: linux-mtd@lists.infradead.org
Hi Ricard,
On Thu, Jan 8, 2015 at 12:32 AM, Ricard Wanderlof
<ricard.wanderlof@axis.com> wrote:
>
> I'd say yes, but the ECC threshold should be per 512 byte ECC block (which
> seems to be the correct term rather than 'sector'), rather than per page.
> Are you sure that the threshold is set to 4 (see
> /sys/devices/virtual/mtd/mtd<n>/bitflip_threshold )?
>
Yep, bitflip_threashold is set to 4. I'm thinking of changing it to 3
anyway as you suggested.
Actually I'm beginning of thinking about changing it to 1024 size
sectors, with 12-bit /sector correction and setting the threshold at
10. The problem there is upgrading devices already out in the wild,
that doesn't sound like an easy process in our particular case. Maybe
later when I have my head wrapped around the problem a bit I'll ask a
question on that.
Thanks,
- Steve
PS sorry about the typo of your name in my response to Jeff. My
fingers flew over the "h" automatically and I hit send before I
noticed.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: NAND ECC capabilities
2015-01-08 16:42 ` Ezequiel Garcia
@ 2015-01-08 17:26 ` Steve deRosier
2015-01-08 19:09 ` Brian Norris
1 sibling, 0 replies; 28+ messages in thread
From: Steve deRosier @ 2015-01-08 17:26 UTC (permalink / raw)
To: Ezequiel Garcia
Cc: linux-mtd@lists.infradead.org, Brian Norris, Ricard Wanderlof,
Huang Shijie, Richard Weinberger
On Thu, Jan 8, 2015 at 8:42 AM, Ezequiel Garcia
<ezequiel@vanguardiasur.com.ar> wrote:
> (Adding a few MTD folks)
>
> On 01/08/2015 05:32 AM, Ricard Wanderlof wrote:
>>
>> On Thu, 8 Jan 2015, Steve deRosier wrote:
>>
>>> So, doing further experiments and I wondered if someone could confirm
>>> this finding.
>>>
>>> With atmel_nand, we're setup for 4-bit ECC on 512 sectors with a 2k
>>> page. I was thinking about this a bit and realized that there's 4 of
>>> these sectors per page, and this implies then that we can detect and
>>> correct 4 bad bits _per_ each sector. Assuming that they're evenly
>>> spread, that's up to 16 bad bits per page. Obviously in practice,
>>> that assumption wouldn't hold...
>>>
>>> So, is my understanding correct?
>>>
>>> I took it further and decided to play with this experimentally. On my
>>> UBIFS rootfs, I flipped 3 bits in the first sector of a page and then
>>> 3 more in the second sector. From my kernel log I got this:
>>>
>>> [ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
>>> byte_pos: 98, bit_pos: 3, 0x31 -> 0x39
>>> [ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
>>> byte_pos: 98, bit_pos: 2, 0x39 -> 0x3d
>>> [ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
>>> byte_pos: 98, bit_pos: 1, 0x3d -> 0x3f
>>> [ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
>>> byte_pos: 530, bit_pos: 6, 0x8e -> 0xce
>>> [ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
>>> byte_pos: 530, bit_pos: 5, 0xce -> 0xee
>>> [ 78.304687] atmel_nand 40000000.nand: Bit flip in data area,
>>> byte_pos: 530, bit_pos: 4, 0xee -> 0xfe
>>> [ 78.304687] UBI: fixable bit-flip detected at PEB 20
>>> [ 78.304687] UBI: schedule PEB 20 for scrubbing
>>> [ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
>>> byte_pos: 98, bit_pos: 3, 0x31 -> 0x39
>>> [ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
>>> byte_pos: 98, bit_pos: 2, 0x39 -> 0x3d
>>> [ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
>>> byte_pos: 98, bit_pos: 1, 0x3d -> 0x3f
>>> [ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
>>> byte_pos: 530, bit_pos: 6, 0x8e -> 0xce
>>> [ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
>>> byte_pos: 530, bit_pos: 5, 0xce -> 0xee
>>> [ 78.328125] atmel_nand 40000000.nand: Bit flip in data area,
>>> byte_pos: 530, bit_pos: 4, 0xee -> 0xfe
>>> [ 78.343750] UBI: fixable bit-flip detected at PEB 20
>>> [ 78.382812] UBI: scrubbed PEB 20 (LEB 0:18), data moved to PEB 250
>>>
>>> So, my takeaway from this is a couple of things:
>>>
>>> 1. Yes, it can correct more than 4 bits per page as long as those are
>>> on different sectors of the page.
>>> 2. My test of 6 bits hit the 4 bit threshold setting and at that point
>>> UBI decided that maybe something is wrong with that PEB.
>>> 3. When it did, UBI corrected the data and copied it elsewhere
>>> 4. Then UBI scrubbed. I assume it then did the torture test. Since I
>>> manually made a flip, it found it was fine once it erased it, so it
>>> didn't mark it as bad. I checked my BBT and it's not marked. So I
>>> assume it's erased and ready for use again.
>>>
>>> Is my general understanding correct?
>>
>> I'd say yes, but the ECC threshold should be per 512 byte ECC block (which
>> seems to be the correct term rather than 'sector'), rather than per page.
>> Are you sure that the threshold is set to 4 (see
>> /sys/devices/virtual/mtd/mtd<n>/bitflip_threshold )?
>>
>> Normally the threshold is set below the ECC correction capability, so that
>> bit scrubbing has a chance to occur before the bits rot too far. Say you
>> have the threshold set at 4 bits, and you have 3 bits that have flipped.
>> If another bit flips, the block would be scrubbed, but say that two bits
>> flipped before you read the data the next time. You would have lost your
>> chance of recovery, so it makes sense to have the threshold lower than the
>> ECC capability. I would say 3/4 of the ECC capability would be a
>> reasonable value.
>>
>
> This makes a lot of sense. However, do we have any way of telling if the
> bitflips where produced on the same ECC sector?
>
> From a cursory look to the code, I'd say there's no such feature with
> the current MTD/NAND design. So, if an mtd_read reports 3 bitflips you
> have no way of telling they happened on the same sector or not, so you
> can't implement your idea.
I'm not sure anyone proposed an idea, but to further an interesting
discussion, I wonder if we should implement a per-ECC-sector
threashold? On devices where the sector size == the page size the two
parameters would be identical, but I'd be interested in my case of a
bitflip-sector-threshold == 3 and bitflip-page-threshold == 12. In my
basic logic, we'd trigger the same threashold event we currently do if
either total across the page is > 12 or any single sector is > 3.
I know it seems possible to detect in the atmel-nand code specifically
which sector the flip is in (otherwise it wouldn't be possible to
correct it), so it should be possible to pull that higher up and have
the dual threasholds work. The problem I see is the ECC sector is a
lower level MTD device detail, and we'd want the higher level to deal
with an abstraction. I don't have time in the next two weeks due to
travel, but I'd be willing to give it a try after that if anyone is
actually interested in such a feature.
>
> I'm curious about what ECC threshold is typically used in production.
> Obviously a too big or too small values have deadly consequences, so
> this doesn't seem like a minor issue.
In my experience with embedded Linux in general, most people usually
go into production with the defaults that are in place already unless
they've explicitly encountered a problem and have developed the domain
knowledge. The default on my platform is 4. Which is the max setting
also for the ECC, so by the time bit 5 flips in the same sector and we
notice and run the scrub, it's already too late for the data. Ouch!
I'm thinking like Ricard, that I should set to 3 in my case. I'm
wondering if all defaults across the board should be set to either
max-1 or 3/4 as Ricard suggests.
- Steve
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: NAND ECC capabilities
2015-01-08 17:09 ` Steve deRosier
@ 2015-01-08 18:57 ` Brian Norris
0 siblings, 0 replies; 28+ messages in thread
From: Brian Norris @ 2015-01-08 18:57 UTC (permalink / raw)
To: Steve deRosier; +Cc: Jeff Lauruhn (jlauruhn), linux-mtd, Ezequiel Garcia
On Thu, Jan 08, 2015 at 09:09:20AM -0800, Steve deRosier wrote:
> On Thu, Jan 8, 2015 at 8:38 AM, Jeff Lauruhn (jlauruhn)
> <jlauruhn@micron.com> wrote:
> > You might find this interesting http://www.cyclicdesign.com/index.php/ecc-trends-in-nand-flash.
Either Jeff sent this privately, or his mail is getting bounced by the
mailing list.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: NAND ECC capabilities
2015-01-08 16:42 ` Ezequiel Garcia
2015-01-08 17:26 ` Steve deRosier
@ 2015-01-08 19:09 ` Brian Norris
2015-01-08 19:27 ` Ezequiel Garcia
2015-01-12 8:35 ` Josh Wu
1 sibling, 2 replies; 28+ messages in thread
From: Brian Norris @ 2015-01-08 19:09 UTC (permalink / raw)
To: Ezequiel Garcia
Cc: Steve deRosier, linux-mtd@lists.infradead.org, Ricard Wanderlof,
Huang Shijie, Richard Weinberger
On Thu, Jan 08, 2015 at 01:42:53PM -0300, Ezequiel Garcia wrote:
> On 01/08/2015 05:32 AM, Ricard Wanderlof wrote:
> > On Thu, 8 Jan 2015, Steve deRosier wrote:
> >> Is my general understanding correct?
> >
> > I'd say yes, but the ECC threshold should be per 512 byte ECC block (which
> > seems to be the correct term rather than 'sector'), rather than per page.
> > Are you sure that the threshold is set to 4 (see
> > /sys/devices/virtual/mtd/mtd<n>/bitflip_threshold )?
> >
> > Normally the threshold is set below the ECC correction capability, so that
> > bit scrubbing has a chance to occur before the bits rot too far. Say you
> > have the threshold set at 4 bits, and you have 3 bits that have flipped.
> > If another bit flips, the block would be scrubbed, but say that two bits
> > flipped before you read the data the next time. You would have lost your
> > chance of recovery, so it makes sense to have the threshold lower than the
> > ECC capability. I would say 3/4 of the ECC capability would be a
> > reasonable value.
> >
>
> This makes a lot of sense. However, do we have any way of telling if the
> bitflips where produced on the same ECC sector?
>
> From a cursory look to the code, I'd say there's no such feature with
> the current MTD/NAND design. So, if an mtd_read reports 3 bitflips you
> have no way of telling they happened on the same sector or not, so you
> can't implement your idea.
I'm not sure if I'm misunderstanding you or if you are misunderstanding
the code. Please review Documentation/ABI/testing/sysfs-class-mtd for
the 'bitflip_threshold' description. We only deal with the max # of
bitflips per sector (or block, or whatever you want to call it). No
ECC-related concept is handled on a per-page basis.
So I believe Ricard is accurately describing the current reality, not
"his idea."
Side note: I wonder if we want to change the nand_base defaults so
bitflip_threshold == 3/4 * ecc_strength
rather than
bitflip_threshold == ecc_strength
See in nand_scan_tail():
if (!mtd->bitflip_threshold)
mtd->bitflip_threshold = mtd->ecc_strength;
Brian
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: NAND ECC capabilities
2015-01-08 19:09 ` Brian Norris
@ 2015-01-08 19:27 ` Ezequiel Garcia
2015-01-12 8:35 ` Josh Wu
1 sibling, 0 replies; 28+ messages in thread
From: Ezequiel Garcia @ 2015-01-08 19:27 UTC (permalink / raw)
To: Brian Norris
Cc: Steve deRosier, linux-mtd@lists.infradead.org, Ricard Wanderlof,
Huang Shijie, Richard Weinberger
On 01/08/2015 04:09 PM, Brian Norris wrote:
> On Thu, Jan 08, 2015 at 01:42:53PM -0300, Ezequiel Garcia wrote:
>> On 01/08/2015 05:32 AM, Ricard Wanderlof wrote:
>>> On Thu, 8 Jan 2015, Steve deRosier wrote:
>>>> Is my general understanding correct?
>>>
>>> I'd say yes, but the ECC threshold should be per 512 byte ECC block (which
>>> seems to be the correct term rather than 'sector'), rather than per page.
>>> Are you sure that the threshold is set to 4 (see
>>> /sys/devices/virtual/mtd/mtd<n>/bitflip_threshold )?
>>>
>>> Normally the threshold is set below the ECC correction capability, so that
>>> bit scrubbing has a chance to occur before the bits rot too far. Say you
>>> have the threshold set at 4 bits, and you have 3 bits that have flipped.
>>> If another bit flips, the block would be scrubbed, but say that two bits
>>> flipped before you read the data the next time. You would have lost your
>>> chance of recovery, so it makes sense to have the threshold lower than the
>>> ECC capability. I would say 3/4 of the ECC capability would be a
>>> reasonable value.
>>>
>>
>> This makes a lot of sense. However, do we have any way of telling if the
>> bitflips where produced on the same ECC sector?
>>
>> From a cursory look to the code, I'd say there's no such feature with
>> the current MTD/NAND design. So, if an mtd_read reports 3 bitflips you
>> have no way of telling they happened on the same sector or not, so you
>> can't implement your idea.
>
> I'm not sure if I'm misunderstanding you or if you are misunderstanding
> the code. Please review Documentation/ABI/testing/sysfs-class-mtd for
> the 'bitflip_threshold' description. We only deal with the max # of
> bitflips per sector (or block, or whatever you want to call it). No
> ECC-related concept is handled on a per-page basis.
>
> So I believe Ricard is accurately describing the current reality, not
> "his idea."
>
My bad. I was completely sure that the "max_bitflips" returned by the
NAND read operations was a per-page value, but I see now it wouldn't
make any sense.
Thanks for the correction,
--
Ezequiel Garcia, VanguardiaSur
www.vanguardiasur.com.ar
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: NAND ECC capabilities
2015-01-08 19:09 ` Brian Norris
2015-01-08 19:27 ` Ezequiel Garcia
@ 2015-01-12 8:35 ` Josh Wu
2015-01-12 20:51 ` [PATCH] mtd: nand: default bitflip-reporting threshold to 75% of correction strength Brian Norris
1 sibling, 1 reply; 28+ messages in thread
From: Josh Wu @ 2015-01-12 8:35 UTC (permalink / raw)
To: Brian Norris, Ezequiel Garcia
Cc: Steve deRosier, Ricard Wanderlof, linux-mtd@lists.infradead.org,
Huang Shijie, Richard Weinberger
Hi, Brian
On 1/9/2015 3:09 AM, Brian Norris wrote:
> On Thu, Jan 08, 2015 at 01:42:53PM -0300, Ezequiel Garcia wrote:
>> On 01/08/2015 05:32 AM, Ricard Wanderlof wrote:
>>> On Thu, 8 Jan 2015, Steve deRosier wrote:
>>>> Is my general understanding correct?
>>> I'd say yes, but the ECC threshold should be per 512 byte ECC block (which
>>> seems to be the correct term rather than 'sector'), rather than per page.
>>> Are you sure that the threshold is set to 4 (see
>>> /sys/devices/virtual/mtd/mtd<n>/bitflip_threshold )?
>>>
>>> Normally the threshold is set below the ECC correction capability, so that
>>> bit scrubbing has a chance to occur before the bits rot too far. Say you
>>> have the threshold set at 4 bits, and you have 3 bits that have flipped.
>>> If another bit flips, the block would be scrubbed, but say that two bits
>>> flipped before you read the data the next time. You would have lost your
>>> chance of recovery, so it makes sense to have the threshold lower than the
>>> ECC capability. I would say 3/4 of the ECC capability would be a
>>> reasonable value.
>>>
>> This makes a lot of sense. However, do we have any way of telling if the
>> bitflips where produced on the same ECC sector?
>>
>> From a cursory look to the code, I'd say there's no such feature with
>> the current MTD/NAND design. So, if an mtd_read reports 3 bitflips you
>> have no way of telling they happened on the same sector or not, so you
>> can't implement your idea.
> I'm not sure if I'm misunderstanding you or if you are misunderstanding
> the code. Please review Documentation/ABI/testing/sysfs-class-mtd for
> the 'bitflip_threshold' description. We only deal with the max # of
> bitflips per sector (or block, or whatever you want to call it). No
> ECC-related concept is handled on a per-page basis.
>
> So I believe Ricard is accurately describing the current reality, not
> "his idea."
>
> Side note: I wonder if we want to change the nand_base defaults so
>
> bitflip_threshold == 3/4 * ecc_strength
>
> rather than
>
> bitflip_threshold == ecc_strength
>
> See in nand_scan_tail():
>
> if (!mtd->bitflip_threshold)
> mtd->bitflip_threshold = mtd->ecc_strength;
According to the discussion, this change really make sense.
Will you generate a patch for this? Thanks.
Best Regards,
Josh Wu
>
> Brian
>
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH] mtd: nand: default bitflip-reporting threshold to 75% of correction strength
2015-01-12 8:35 ` Josh Wu
@ 2015-01-12 20:51 ` Brian Norris
2015-01-13 2:01 ` Huang Shijie
` (3 more replies)
0 siblings, 4 replies; 28+ messages in thread
From: Brian Norris @ 2015-01-12 20:51 UTC (permalink / raw)
To: Brian Norris
Cc: Ricard Wanderlof, Richard Weinberger, Steve deRosier, Josh Wu,
linux-mtd@lists.infradead.org, Ezequiel Garcia, Huang Shijie
The MTD API reports -EUCLEAN only if the maximum number of bitflips
found in any ECC block exceeds a certain threshold. This is done to
avoid excessive -EUCLEAN reports to MTD users, which may induce
additional scrubbing of data, even when the ECC algorithm in use is
perfectly capable of handling the bitflips.
This threshold can be controlled by user-space (via sysfs), to allow
users to determine what they are willing to tolerate in their
application. But it still helps to have sane defaults.
In recent discussion [1], it was pointed out that our default threshold
is equal to the correction strength. That means that we won't actually
report any -EUCLEAN (i.e., "bitflips were corrected") errors until there
are almost too many to handle. It was determined that 3/4 of the
correction strength is probably a better default.
[1] http://lists.infradead.org/pipermail/linux-mtd/2015-January/057259.html
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
---
drivers/mtd/nand/nand_base.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
index 816b5c1fd416..3f24b587304f 100644
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -4171,7 +4171,7 @@ int nand_scan_tail(struct mtd_info *mtd)
* properly set.
*/
if (!mtd->bitflip_threshold)
- mtd->bitflip_threshold = mtd->ecc_strength;
+ mtd->bitflip_threshold = DIV_ROUND_UP(mtd->ecc_strength * 3, 4);
/* Check, if we should skip the bad block table scan */
if (chip->options & NAND_SKIP_BBTSCAN)
--
1.9.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH] mtd: nand: default bitflip-reporting threshold to 75% of correction strength
2015-01-12 20:51 ` [PATCH] mtd: nand: default bitflip-reporting threshold to 75% of correction strength Brian Norris
@ 2015-01-13 2:01 ` Huang Shijie
2015-01-13 2:38 ` Brian Norris
2015-01-13 13:25 ` Richard Weinberger
` (2 subsequent siblings)
3 siblings, 1 reply; 28+ messages in thread
From: Huang Shijie @ 2015-01-13 2:01 UTC (permalink / raw)
To: Brian Norris
Cc: Ricard Wanderlof, Richard Weinberger, Steve deRosier, Josh Wu,
linux-mtd@lists.infradead.org, Ezequiel Garcia, Huang Shijie
On Mon, Jan 12, 2015 at 12:51:29PM -0800, Brian Norris wrote:
> The MTD API reports -EUCLEAN only if the maximum number of bitflips
> found in any ECC block exceeds a certain threshold. This is done to
> avoid excessive -EUCLEAN reports to MTD users, which may induce
> additional scrubbing of data, even when the ECC algorithm in use is
> perfectly capable of handling the bitflips.
>
> This threshold can be controlled by user-space (via sysfs), to allow
> users to determine what they are willing to tolerate in their
> application. But it still helps to have sane defaults.
>
> In recent discussion [1], it was pointed out that our default threshold
> is equal to the correction strength. That means that we won't actually
> report any -EUCLEAN (i.e., "bitflips were corrected") errors until there
> are almost too many to handle. It was determined that 3/4 of the
> correction strength is probably a better default.
>
> [1] http://lists.infradead.org/pipermail/linux-mtd/2015-January/057259.html
>
> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
> ---
> drivers/mtd/nand/nand_base.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
> index 816b5c1fd416..3f24b587304f 100644
> --- a/drivers/mtd/nand/nand_base.c
> +++ b/drivers/mtd/nand/nand_base.c
> @@ -4171,7 +4171,7 @@ int nand_scan_tail(struct mtd_info *mtd)
> * properly set.
> */
> if (!mtd->bitflip_threshold)
> - mtd->bitflip_threshold = mtd->ecc_strength;
> + mtd->bitflip_threshold = DIV_ROUND_UP(mtd->ecc_strength * 3, 4);
After this patch, we have to change the bitflip_threshold to
ecc_strength manually when we do the mtd_biterrors.ko test.
Anyway, I think this patch makes sense.
>
> /* Check, if we should skip the bad block table scan */
> if (chip->options & NAND_SKIP_BBTSCAN)
> --
Acked-by: Huang Shijie <shijie.huang@intel.com>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] mtd: nand: default bitflip-reporting threshold to 75% of correction strength
2015-01-13 2:01 ` Huang Shijie
@ 2015-01-13 2:38 ` Brian Norris
2015-01-13 2:56 ` Huang Shijie
0 siblings, 1 reply; 28+ messages in thread
From: Brian Norris @ 2015-01-13 2:38 UTC (permalink / raw)
To: Huang Shijie
Cc: Ricard Wanderlof, Richard Weinberger, Steve deRosier, Josh Wu,
linux-mtd@lists.infradead.org, Ezequiel Garcia, Huang Shijie
On Mon, Jan 12, 2015 at 6:01 PM, Huang Shijie <shijie.huang@intel.com> wrote:
> On Mon, Jan 12, 2015 at 12:51:29PM -0800, Brian Norris wrote:
>> --- a/drivers/mtd/nand/nand_base.c
>> +++ b/drivers/mtd/nand/nand_base.c
>> @@ -4171,7 +4171,7 @@ int nand_scan_tail(struct mtd_info *mtd)
>> * properly set.
>> */
>> if (!mtd->bitflip_threshold)
>> - mtd->bitflip_threshold = mtd->ecc_strength;
>> + mtd->bitflip_threshold = DIV_ROUND_UP(mtd->ecc_strength * 3, 4);
> After this patch, we have to change the bitflip_threshold to
> ecc_strength manually when we do the mtd_biterrors.ko test.
Why?
Brian
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] mtd: nand: default bitflip-reporting threshold to 75% of correction strength
2015-01-13 2:38 ` Brian Norris
@ 2015-01-13 2:56 ` Huang Shijie
0 siblings, 0 replies; 28+ messages in thread
From: Huang Shijie @ 2015-01-13 2:56 UTC (permalink / raw)
To: Brian Norris
Cc: Ricard Wanderlof, Richard Weinberger, Steve deRosier, Josh Wu,
linux-mtd@lists.infradead.org, Ezequiel Garcia, Huang Shijie
On Mon, Jan 12, 2015 at 06:38:49PM -0800, Brian Norris wrote:
> On Mon, Jan 12, 2015 at 6:01 PM, Huang Shijie <shijie.huang@intel.com> wrote:
> > On Mon, Jan 12, 2015 at 12:51:29PM -0800, Brian Norris wrote:
> >> --- a/drivers/mtd/nand/nand_base.c
> >> +++ b/drivers/mtd/nand/nand_base.c
> >> @@ -4171,7 +4171,7 @@ int nand_scan_tail(struct mtd_info *mtd)
> >> * properly set.
> >> */
> >> if (!mtd->bitflip_threshold)
> >> - mtd->bitflip_threshold = mtd->ecc_strength;
> >> + mtd->bitflip_threshold = DIV_ROUND_UP(mtd->ecc_strength * 3, 4);
> > After this patch, we have to change the bitflip_threshold to
> > ecc_strength manually when we do the mtd_biterrors.ko test.
>
> Why?
sorry, I misunderstood the mtd_biterrors code.
It is fine after this patch.
thanks
Huang Shijie
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] mtd: nand: default bitflip-reporting threshold to 75% of correction strength
2015-01-12 20:51 ` [PATCH] mtd: nand: default bitflip-reporting threshold to 75% of correction strength Brian Norris
2015-01-13 2:01 ` Huang Shijie
@ 2015-01-13 13:25 ` Richard Weinberger
2015-01-13 18:48 ` Brian Norris
2015-01-17 19:01 ` Boris Brezillon
2015-01-21 7:45 ` Brian Norris
3 siblings, 1 reply; 28+ messages in thread
From: Richard Weinberger @ 2015-01-13 13:25 UTC (permalink / raw)
To: Brian Norris
Cc: Ricard Wanderlof, Steve deRosier, Josh Wu,
linux-mtd@lists.infradead.org, Ezequiel Garcia, Huang Shijie
Am 12.01.2015 um 21:51 schrieb Brian Norris:
> The MTD API reports -EUCLEAN only if the maximum number of bitflips
> found in any ECC block exceeds a certain threshold. This is done to
> avoid excessive -EUCLEAN reports to MTD users, which may induce
> additional scrubbing of data, even when the ECC algorithm in use is
> perfectly capable of handling the bitflips.
>
> This threshold can be controlled by user-space (via sysfs), to allow
> users to determine what they are willing to tolerate in their
> application. But it still helps to have sane defaults.
>
> In recent discussion [1], it was pointed out that our default threshold
> is equal to the correction strength. That means that we won't actually
> report any -EUCLEAN (i.e., "bitflips were corrected") errors until there
> are almost too many to handle. It was determined that 3/4 of the
> correction strength is probably a better default.
>
> [1] http://lists.infradead.org/pipermail/linux-mtd/2015-January/057259.html
I like this change but I have one question.
UBI will treat a block as bad if it shows bitflips (EUCLEAN) right
after erasure. For SLC NAND this works very well.
Does this also hold for MLC NAND? If one or two bit flips are okay
even for a freshly erased MLC NAND this change could cause UBI to
mark good blocks as bad depending on the ECC strength.
Thanks,
//richard
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] mtd: nand: default bitflip-reporting threshold to 75% of correction strength
2015-01-13 13:25 ` Richard Weinberger
@ 2015-01-13 18:48 ` Brian Norris
2015-01-13 18:51 ` Richard Weinberger
0 siblings, 1 reply; 28+ messages in thread
From: Brian Norris @ 2015-01-13 18:48 UTC (permalink / raw)
To: Richard Weinberger
Cc: Ricard Wanderlof, Steve deRosier, Josh Wu,
linux-mtd@lists.infradead.org, Ezequiel Garcia, Huang Shijie
Hi Richard,
On Tue, Jan 13, 2015 at 02:25:30PM +0100, Richard Weinberger wrote:
> Am 12.01.2015 um 21:51 schrieb Brian Norris:
> > The MTD API reports -EUCLEAN only if the maximum number of bitflips
> > found in any ECC block exceeds a certain threshold. This is done to
> > avoid excessive -EUCLEAN reports to MTD users, which may induce
> > additional scrubbing of data, even when the ECC algorithm in use is
> > perfectly capable of handling the bitflips.
> >
> > This threshold can be controlled by user-space (via sysfs), to allow
> > users to determine what they are willing to tolerate in their
> > application. But it still helps to have sane defaults.
> >
> > In recent discussion [1], it was pointed out that our default threshold
> > is equal to the correction strength. That means that we won't actually
> > report any -EUCLEAN (i.e., "bitflips were corrected") errors until there
> > are almost too many to handle. It was determined that 3/4 of the
> > correction strength is probably a better default.
> >
> > [1] http://lists.infradead.org/pipermail/linux-mtd/2015-January/057259.html
>
> I like this change but I have one question.
>
> UBI will treat a block as bad if it shows bitflips (EUCLEAN) right
> after erasure.
Can you elaborate? When "after erasure"? The closest I see is that UBI
will mark a block bad if it sees an -EIO failure from sync_erase() in
erase_worker(). If you have extra debug checks on, then
ubi_self_check_all_ff() could potentially give you bitflip problems
after the erase, but that's an odd corner case anyway, which many
drivers have been handling in hacked together ad-hoc ways anyway (search
for "bitflips in erase pages").
So I can't pinpoint what you're talking about, exactly.
> For SLC NAND this works very well.
> Does this also hold for MLC NAND? If one or two bit flips are okay
> even for a freshly erased MLC NAND this change could cause UBI to
> mark good blocks as bad depending on the ECC strength.
I would typically assume that MLC NAND users must be using significantly
stronger ECC (e.g., 12-bit / 512-byte, at least), so "one or two
bitflips" would still fall well under 75% of 12 bits.
Brian
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] mtd: nand: default bitflip-reporting threshold to 75% of correction strength
2015-01-13 18:48 ` Brian Norris
@ 2015-01-13 18:51 ` Richard Weinberger
2015-01-13 19:51 ` Brian Norris
0 siblings, 1 reply; 28+ messages in thread
From: Richard Weinberger @ 2015-01-13 18:51 UTC (permalink / raw)
To: Brian Norris
Cc: Ricard Wanderlof, Steve deRosier, Josh Wu,
linux-mtd@lists.infradead.org, Ezequiel Garcia, Huang Shijie
Brian,
Am 13.01.2015 um 19:48 schrieb Brian Norris:
> Hi Richard,
>
> On Tue, Jan 13, 2015 at 02:25:30PM +0100, Richard Weinberger wrote:
>> Am 12.01.2015 um 21:51 schrieb Brian Norris:
>>> The MTD API reports -EUCLEAN only if the maximum number of bitflips
>>> found in any ECC block exceeds a certain threshold. This is done to
>>> avoid excessive -EUCLEAN reports to MTD users, which may induce
>>> additional scrubbing of data, even when the ECC algorithm in use is
>>> perfectly capable of handling the bitflips.
>>>
>>> This threshold can be controlled by user-space (via sysfs), to allow
>>> users to determine what they are willing to tolerate in their
>>> application. But it still helps to have sane defaults.
>>>
>>> In recent discussion [1], it was pointed out that our default threshold
>>> is equal to the correction strength. That means that we won't actually
>>> report any -EUCLEAN (i.e., "bitflips were corrected") errors until there
>>> are almost too many to handle. It was determined that 3/4 of the
>>> correction strength is probably a better default.
>>>
>>> [1] http://lists.infradead.org/pipermail/linux-mtd/2015-January/057259.html
>>
>> I like this change but I have one question.
>>
>> UBI will treat a block as bad if it shows bitflips (EUCLEAN) right
>> after erasure.
>
> Can you elaborate? When "after erasure"? The closest I see is that UBI
> will mark a block bad if it sees an -EIO failure from sync_erase() in
> erase_worker(). If you have extra debug checks on, then
> ubi_self_check_all_ff() could potentially give you bitflip problems
> after the erase, but that's an odd corner case anyway, which many
> drivers have been handling in hacked together ad-hoc ways anyway (search
> for "bitflips in erase pages").
>
> So I can't pinpoint what you're talking about, exactly.
See torture_peb()
out:
mutex_unlock(&ubi->buf_mutex);
if (err == UBI_IO_BITFLIPS || mtd_is_eccerr(err)) {
/*
* If a bit-flip or data integrity error was detected, the test
* has not passed because it happened on a freshly erased
* physical eraseblock which means something is wrong with it.
*/
ubi_err(ubi, "read problems on freshly erased PEB %d, must be bad",
pnum);
err = -EIO;
}
>> For SLC NAND this works very well.
>> Does this also hold for MLC NAND? If one or two bit flips are okay
>> even for a freshly erased MLC NAND this change could cause UBI to
>> mark good blocks as bad depending on the ECC strength.
>
> I would typically assume that MLC NAND users must be using significantly
> stronger ECC (e.g., 12-bit / 512-byte, at least), so "one or two
> bitflips" would still fall well under 75% of 12 bits.
Same here. I just want to make sure that UBI does not assume a perfect NAND world. :)
Thanks,
//richard
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] mtd: nand: default bitflip-reporting threshold to 75% of correction strength
2015-01-13 18:51 ` Richard Weinberger
@ 2015-01-13 19:51 ` Brian Norris
0 siblings, 0 replies; 28+ messages in thread
From: Brian Norris @ 2015-01-13 19:51 UTC (permalink / raw)
To: Richard Weinberger
Cc: Ricard Wanderlof, Kamal Dasu, Artem Bityutskiy, Steve deRosier,
Josh Wu, linux-mtd@lists.infradead.org, Ezequiel Garcia,
Huang Shijie
Hi Richard,
On Tue, Jan 13, 2015 at 07:51:40PM +0100, Richard Weinberger wrote:
> Am 13.01.2015 um 19:48 schrieb Brian Norris:
> > On Tue, Jan 13, 2015 at 02:25:30PM +0100, Richard Weinberger wrote:
> >> Am 12.01.2015 um 21:51 schrieb Brian Norris:
> >>> The MTD API reports -EUCLEAN only if the maximum number of bitflips
> >>> found in any ECC block exceeds a certain threshold. This is done to
> >>> avoid excessive -EUCLEAN reports to MTD users, which may induce
> >>> additional scrubbing of data, even when the ECC algorithm in use is
> >>> perfectly capable of handling the bitflips.
> >>>
> >>> This threshold can be controlled by user-space (via sysfs), to allow
> >>> users to determine what they are willing to tolerate in their
> >>> application. But it still helps to have sane defaults.
> >>>
> >>> In recent discussion [1], it was pointed out that our default threshold
> >>> is equal to the correction strength. That means that we won't actually
> >>> report any -EUCLEAN (i.e., "bitflips were corrected") errors until there
> >>> are almost too many to handle. It was determined that 3/4 of the
> >>> correction strength is probably a better default.
> >>>
> >>> [1] http://lists.infradead.org/pipermail/linux-mtd/2015-January/057259.html
> >>
> >> I like this change but I have one question.
> >>
> >> UBI will treat a block as bad if it shows bitflips (EUCLEAN) right
> >> after erasure.
> >
> > Can you elaborate? When "after erasure"? The closest I see is that UBI
> > will mark a block bad if it sees an -EIO failure from sync_erase() in
> > erase_worker(). If you have extra debug checks on, then
> > ubi_self_check_all_ff() could potentially give you bitflip problems
> > after the erase, but that's an odd corner case anyway, which many
> > drivers have been handling in hacked together ad-hoc ways anyway (search
> > for "bitflips in erase pages").
> >
> > So I can't pinpoint what you're talking about, exactly.
>
> See torture_peb()
> out:
> mutex_unlock(&ubi->buf_mutex);
> if (err == UBI_IO_BITFLIPS || mtd_is_eccerr(err)) {
> /*
> * If a bit-flip or data integrity error was detected, the test
> * has not passed because it happened on a freshly erased
> * physical eraseblock which means something is wrong with it.
> */
> ubi_err(ubi, "read problems on freshly erased PEB %d, must be bad",
> pnum);
> err = -EIO;
> }
Thanks for refreshing my memory.
This is actually the same function that people have been tripping over
already, especially this:
/* Make sure the PEB contains only 0xFF bytes */
err = ubi_io_read(ubi, ubi->peb_buf, pnum, 0, ubi->peb_size);
if (err)
goto out;
err = ubi_check_pattern(ubi->peb_buf, 0xFF, ubi->peb_size);
if (err == 0) {
ubi_err(ubi, "erased PEB %d, but a non-0xFF byte found",
pnum);
err = -EIO;
goto out;
}
The problems here are mostly with drivers that don't (or do a bad job
of) correcting bitflips in blank pages. But there's never been an issue
of more than a few bitflips, I don't think.
So I guess we're focusing on this chunk?
/* Write a pattern and check it */
memset(ubi->peb_buf, patterns[i], ubi->peb_size);
err = ubi_io_write(ubi, ubi->peb_buf, pnum, 0, ubi->peb_size);
if (err)
goto out;
memset(ubi->peb_buf, ~patterns[i], ubi->peb_size);
err = ubi_io_read(ubi, ubi->peb_buf, pnum, 0, ubi->peb_size);
if (err)
goto out;
I wouldn't expect that even MLC NAND would see 75% of its spec'd
bitflips immediately after erase/program/read-verify. But I'll admit I
haven't heavily tested this.
Also interesting to consider: should we make use of the knowledge of the
manufacturer-specified minimum correction info, when available? See
nand_chip::ecc_strength_ds and nand_chip::ecc_step_ds. These are pulled
from the parmeter page (ONFI or JEDEC) or the full-ID table. Note that
some systems might overprovision their ECC strength, but it may still be
worth reporting based on the datasheet specifications. Not sure of all
the implications of doing that automatically, though.
> >> For SLC NAND this works very well.
> >> Does this also hold for MLC NAND? If one or two bit flips are okay
> >> even for a freshly erased MLC NAND this change could cause UBI to
> >> mark good blocks as bad depending on the ECC strength.
> >
> > I would typically assume that MLC NAND users must be using significantly
> > stronger ECC (e.g., 12-bit / 512-byte, at least), so "one or two
> > bitflips" would still fall well under 75% of 12 bits.
>
> Same here. I just want to make sure that UBI does not assume a perfect NAND world. :)
At some point the responsibility is on the user. Except for the open
problems on bitflips in erased pages [1], torture_peb() still seems to look
OK. But if there are further issues related to this threshold, we might
need to adjust UBI for a less-perfect world. e.g., notice the difference
between MTD_NANDFLASH and MTD_MLCNANDFLASH.
Recall this threshold parameter is already user-settable, and paranoid
users are likely to already set it to 75% or less anyway. So UBI should
not try to inflict too much unnecessary damage.
Brian
[1] Mostly a MTD/NAND layer problem, although it's arguably a UBI issue
that it cares about something flash manufacturers never guaranteed;
that an erased block contains all 0xff.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] mtd: nand: default bitflip-reporting threshold to 75% of correction strength
2015-01-12 20:51 ` [PATCH] mtd: nand: default bitflip-reporting threshold to 75% of correction strength Brian Norris
2015-01-13 2:01 ` Huang Shijie
2015-01-13 13:25 ` Richard Weinberger
@ 2015-01-17 19:01 ` Boris Brezillon
2015-01-17 19:26 ` Richard Weinberger
2015-01-21 8:22 ` Brian Norris
2015-01-21 7:45 ` Brian Norris
3 siblings, 2 replies; 28+ messages in thread
From: Boris Brezillon @ 2015-01-17 19:01 UTC (permalink / raw)
To: Brian Norris
Cc: Ricard Wanderlof, Richard Weinberger, Steve deRosier, Josh Wu,
linux-mtd@lists.infradead.org, Ezequiel Garcia, Huang Shijie
Hi Brian,
On Mon, 12 Jan 2015 12:51:29 -0800
Brian Norris <computersforpeace@gmail.com> wrote:
> The MTD API reports -EUCLEAN only if the maximum number of bitflips
> found in any ECC block exceeds a certain threshold. This is done to
> avoid excessive -EUCLEAN reports to MTD users, which may induce
> additional scrubbing of data, even when the ECC algorithm in use is
> perfectly capable of handling the bitflips.
>
> This threshold can be controlled by user-space (via sysfs), to allow
> users to determine what they are willing to tolerate in their
> application. But it still helps to have sane defaults.
>
> In recent discussion [1], it was pointed out that our default threshold
> is equal to the correction strength. That means that we won't actually
> report any -EUCLEAN (i.e., "bitflips were corrected") errors until there
> are almost too many to handle. It was determined that 3/4 of the
> correction strength is probably a better default.
>
> [1] http://lists.infradead.org/pipermail/linux-mtd/2015-January/057259.html
>
> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
> ---
> drivers/mtd/nand/nand_base.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
> index 816b5c1fd416..3f24b587304f 100644
> --- a/drivers/mtd/nand/nand_base.c
> +++ b/drivers/mtd/nand/nand_base.c
> @@ -4171,7 +4171,7 @@ int nand_scan_tail(struct mtd_info *mtd)
> * properly set.
> */
> if (!mtd->bitflip_threshold)
> - mtd->bitflip_threshold = mtd->ecc_strength;
> + mtd->bitflip_threshold = DIV_ROUND_UP(mtd->ecc_strength * 3, 4);
Just sharing my experience with MLC NANDs requiring read-retry: the
number of reported bitflips often raise ecc_strength value (at least
with the current read-retry approach).
This patch will definitely make UBI move NAND blocks over and over
again considering the threshold has been raised and the block is not
reliable anymore.
While I like the idea of limiting the threshold to something smaller
than what's recommended on the datasheet (or reported by ONFI) I wonder
if it won't make things worst in some cases.
Regarding the read-retry code, it currently stops retrying reading the
page once the page has been successfully retrieved (or in other terms
all bitflips have been fixed). But it might stop to soon, because by
changing the bit level threshold (in other term retrying one more time)
it might successfully read the page with less bitflips than the
previous attempt (these are just supposition, I haven't tested it yet).
If we can achieve that we could retry until we reach something below
the bitflips threshold value, and if we fail to find any, just consider
the lower number of bitflips found during those read-retry operations.
Best Regards,
Boris
--
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] mtd: nand: default bitflip-reporting threshold to 75% of correction strength
2015-01-17 19:01 ` Boris Brezillon
@ 2015-01-17 19:26 ` Richard Weinberger
2015-01-17 19:42 ` Boris Brezillon
2015-01-21 8:22 ` Brian Norris
1 sibling, 1 reply; 28+ messages in thread
From: Richard Weinberger @ 2015-01-17 19:26 UTC (permalink / raw)
To: Boris Brezillon, Brian Norris
Cc: Ricard Wanderlof, Steve deRosier, Josh Wu,
linux-mtd@lists.infradead.org, Ezequiel Garcia, Huang Shijie
Am 17.01.2015 um 20:01 schrieb Boris Brezillon:
> Just sharing my experience with MLC NANDs requiring read-retry: the
> number of reported bitflips often raise ecc_strength value (at least
> with the current read-retry approach).
> This patch will definitely make UBI move NAND blocks over and over
> again considering the threshold has been raised and the block is not
> reliable anymore.
Within the last 6 months I had to face a lot of strange UBI/MTD issues.
All showed one "flaw" in UBI, namely that it was designed with good SLC
NANDs in mind.
Even some modern SLC NANDs show bad behavior like read disturb after
less than 100000 reads.
I think it is time to bite the bullet and improve UBI wrt. MLC NAND.
This is not an easy task as it needs some hardware to play with and
time/budget. But I think it is worth the effort.
Thanks,
//richard
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] mtd: nand: default bitflip-reporting threshold to 75% of correction strength
2015-01-17 19:26 ` Richard Weinberger
@ 2015-01-17 19:42 ` Boris Brezillon
2015-01-17 19:54 ` Richard Weinberger
0 siblings, 1 reply; 28+ messages in thread
From: Boris Brezillon @ 2015-01-17 19:42 UTC (permalink / raw)
To: Richard Weinberger
Cc: Ricard Wanderlof, Steve deRosier, Josh Wu,
linux-mtd@lists.infradead.org, Ezequiel Garcia, Brian Norris,
Huang Shijie
On Sat, 17 Jan 2015 20:26:44 +0100
Richard Weinberger <richard@nod.at> wrote:
> Am 17.01.2015 um 20:01 schrieb Boris Brezillon:
> > Just sharing my experience with MLC NANDs requiring read-retry: the
> > number of reported bitflips often raise ecc_strength value (at least
> > with the current read-retry approach).
> > This patch will definitely make UBI move NAND blocks over and over
> > again considering the threshold has been raised and the block is not
> > reliable anymore.
>
> Within the last 6 months I had to face a lot of strange UBI/MTD issues.
> All showed one "flaw" in UBI, namely that it was designed with good SLC
> NANDs in mind.
> Even some modern SLC NANDs show bad behavior like read disturb after
> less than 100000 reads.
> I think it is time to bite the bullet and improve UBI wrt. MLC NAND.
> This is not an easy task as it needs some hardware to play with and
> time/budget. But I think it is worth the effort.
I do all my MLC tests on a cubietruck (embedding an Allwinner A20 SoC
and a Micron MLC NAND).
I already started to work on randomizer/scrambler support (which are
needed on some MLC chips), and added support for read-retry on a Micron
non-ONFI NAND (you can find my work here [1], but it's not ready to be
mainlined yet).
But these are all things we can handle in the NAND layer.
Then comes trickier parts, like improved bitflips robustness (as
you stated), paired pages handling (you cannot reliably write on one
page without risking to corrupt the page it is paired with, which
implies specific handling for such cases in upper layers: UBI/UBIFS ?),
and surely other things I don't remember :-).
Anyway, I'd be happy to help with any of these tasks.
Best Regards,
Boris
[1]https://github.com/bbrezillon/linux-sunxi/commits/sunxi-nand-next
--
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] mtd: nand: default bitflip-reporting threshold to 75% of correction strength
2015-01-17 19:42 ` Boris Brezillon
@ 2015-01-17 19:54 ` Richard Weinberger
0 siblings, 0 replies; 28+ messages in thread
From: Richard Weinberger @ 2015-01-17 19:54 UTC (permalink / raw)
To: Boris Brezillon
Cc: Ricard Wanderlof, Steve deRosier, Josh Wu,
linux-mtd@lists.infradead.org, Ezequiel Garcia, Brian Norris,
Huang Shijie
Am 17.01.2015 um 20:42 schrieb Boris Brezillon:
> On Sat, 17 Jan 2015 20:26:44 +0100
> Richard Weinberger <richard@nod.at> wrote:
>
>> Am 17.01.2015 um 20:01 schrieb Boris Brezillon:
>>> Just sharing my experience with MLC NANDs requiring read-retry: the
>>> number of reported bitflips often raise ecc_strength value (at least
>>> with the current read-retry approach).
>>> This patch will definitely make UBI move NAND blocks over and over
>>> again considering the threshold has been raised and the block is not
>>> reliable anymore.
>>
>> Within the last 6 months I had to face a lot of strange UBI/MTD issues.
>> All showed one "flaw" in UBI, namely that it was designed with good SLC
>> NANDs in mind.
>> Even some modern SLC NANDs show bad behavior like read disturb after
>> less than 100000 reads.
>> I think it is time to bite the bullet and improve UBI wrt. MLC NAND.
>> This is not an easy task as it needs some hardware to play with and
>> time/budget. But I think it is worth the effort.
>
> I do all my MLC tests on a cubietruck (embedding an Allwinner A20 SoC
> and a Micron MLC NAND).
Maybe I should get me one of these boards.
Despite I'm not really a fan of sunxi.
> I already started to work on randomizer/scrambler support (which are
> needed on some MLC chips), and added support for read-retry on a Micron
> non-ONFI NAND (you can find my work here [1], but it's not ready to be
> mainlined yet).
> But these are all things we can handle in the NAND layer.
Yep.
> Then comes trickier parts, like improved bitflips robustness (as
> you stated), paired pages handling (you cannot reliably write on one
> page without risking to corrupt the page it is paired with, which
> implies specific handling for such cases in upper layers: UBI/UBIFS ?),
> and surely other things I don't remember :-).
Unstable bits for example need also handling.
I really would like to get some input from NAND vendors what they want
us to solve in software.
I'm currently working on a solution for UBI to deal better with
read disturb. Within the next few week I hopefully have something sane
to share. :)
> Anyway, I'd be happy to help with any of these tasks.
Good to know!
Thanks,
//richard
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] mtd: nand: default bitflip-reporting threshold to 75% of correction strength
2015-01-12 20:51 ` [PATCH] mtd: nand: default bitflip-reporting threshold to 75% of correction strength Brian Norris
` (2 preceding siblings ...)
2015-01-17 19:01 ` Boris Brezillon
@ 2015-01-21 7:45 ` Brian Norris
3 siblings, 0 replies; 28+ messages in thread
From: Brian Norris @ 2015-01-21 7:45 UTC (permalink / raw)
To: Josh Wu, Steve deRosier, linux-mtd@lists.infradead.org,
Ricard Wanderlof, Huang Shijie, Richard Weinberger,
Ezequiel Garcia
On Mon, Jan 12, 2015 at 12:51:29PM -0800, Brian Norris wrote:
> The MTD API reports -EUCLEAN only if the maximum number of bitflips
> found in any ECC block exceeds a certain threshold. This is done to
> avoid excessive -EUCLEAN reports to MTD users, which may induce
> additional scrubbing of data, even when the ECC algorithm in use is
> perfectly capable of handling the bitflips.
>
> This threshold can be controlled by user-space (via sysfs), to allow
> users to determine what they are willing to tolerate in their
> application. But it still helps to have sane defaults.
>
> In recent discussion [1], it was pointed out that our default threshold
> is equal to the correction strength. That means that we won't actually
> report any -EUCLEAN (i.e., "bitflips were corrected") errors until there
> are almost too many to handle. It was determined that 3/4 of the
> correction strength is probably a better default.
>
> [1] http://lists.infradead.org/pipermail/linux-mtd/2015-January/057259.html
>
> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Pushed to l2-mtd.git.
Brian
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] mtd: nand: default bitflip-reporting threshold to 75% of correction strength
2015-01-17 19:01 ` Boris Brezillon
2015-01-17 19:26 ` Richard Weinberger
@ 2015-01-21 8:22 ` Brian Norris
2015-01-21 8:42 ` Boris Brezillon
1 sibling, 1 reply; 28+ messages in thread
From: Brian Norris @ 2015-01-21 8:22 UTC (permalink / raw)
To: Boris Brezillon
Cc: Ricard Wanderlof, Richard Weinberger, Steve deRosier, Josh Wu,
linux-mtd@lists.infradead.org, Ezequiel Garcia, Huang Shijie
I'm sorry, I just noticed your reply as I was applying the patch. I can
back it out if we find a real objection.
On Sat, Jan 17, 2015 at 08:01:37PM +0100, Boris Brezillon wrote:
> On Mon, 12 Jan 2015 12:51:29 -0800
> Brian Norris <computersforpeace@gmail.com> wrote:
> > The MTD API reports -EUCLEAN only if the maximum number of bitflips
> > found in any ECC block exceeds a certain threshold. This is done to
> > avoid excessive -EUCLEAN reports to MTD users, which may induce
> > additional scrubbing of data, even when the ECC algorithm in use is
> > perfectly capable of handling the bitflips.
> >
> > This threshold can be controlled by user-space (via sysfs), to allow
> > users to determine what they are willing to tolerate in their
> > application. But it still helps to have sane defaults.
> >
> > In recent discussion [1], it was pointed out that our default threshold
> > is equal to the correction strength. That means that we won't actually
> > report any -EUCLEAN (i.e., "bitflips were corrected") errors until there
> > are almost too many to handle. It was determined that 3/4 of the
> > correction strength is probably a better default.
> >
> > [1] http://lists.infradead.org/pipermail/linux-mtd/2015-January/057259.html
> >
> > Signed-off-by: Brian Norris <computersforpeace@gmail.com>
> > ---
> > drivers/mtd/nand/nand_base.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
> > index 816b5c1fd416..3f24b587304f 100644
> > --- a/drivers/mtd/nand/nand_base.c
> > +++ b/drivers/mtd/nand/nand_base.c
> > @@ -4171,7 +4171,7 @@ int nand_scan_tail(struct mtd_info *mtd)
> > * properly set.
> > */
> > if (!mtd->bitflip_threshold)
> > - mtd->bitflip_threshold = mtd->ecc_strength;
> > + mtd->bitflip_threshold = DIV_ROUND_UP(mtd->ecc_strength * 3, 4);
>
> Just sharing my experience with MLC NANDs requiring read-retry: the
> number of reported bitflips often raise ecc_strength value (at least
> with the current read-retry approach).
I did not have fun when testing a few MLC NAND which required read
retry. I ended up recommending that most of our customers move off of it
onto another solution where possible. But yes, I can understand your
issue.
> This patch will definitely make UBI move NAND blocks over and over
> again considering the threshold has been raised and the block is not
> reliable anymore.
Personally, we found that we just needed to lower the
MTD_UBI_WL_THRESHOLD and let UBI move away from blocks with high bitflip
counts. I suppose we essentially treat the block as bad.
> While I like the idea of limiting the threshold to something smaller
> than what's recommended on the datasheet (or reported by ONFI) I wonder
> if it won't make things worst in some cases.
I wouldn't exactly say that there is any threshold recommended on the
datasheet or in ONFI. They simply specify a required correction
strength, with no word about any intermediate handling -- what do they
expect software to do when bitflips exceed the ECC strength? We
immediately lose data. So we need to preemptively move such data.
So I don't think it's a good idea at all to say that
threshold == ecc_strength. That renders your ECC nearly useless w.r.t.
the original design of UBI. UBI intends to use -EUCLEAN as a signal that
high # of bitflips. I would suggest that 23 bitflips on a 24-bit
correction algorithm is a "high # of bitflips." So as I see it, this
patch is just restoring that UBI assumption.
> Regarding the read-retry code, it currently stops retrying reading the
> page once the page has been successfully retrieved (or in other terms
> all bitflips have been fixed). But it might stop to soon, because by
> changing the bit level threshold (in other term retrying one more time)
> it might successfully read the page with less bitflips than the
> previous attempt (these are just supposition, I haven't tested it yet).
> If we can achieve that we could retry until we reach something below
> the bitflips threshold value, and if we fail to find any, just consider
> the lower number of bitflips found during those read-retry operations.
I believe I suggested scenarios like this to some flash vendors when
speaking to reps in person, but they didn't seem to consider that
likely. I think they were implying that there would be only one read
retry mode that gives a reasonable result. I'm not sure if they were
really the experts on that particular topic, though, or if they were
just giving me an answer to make me happy.
Honestly, there's a lot of work that goes into MLC NAND for use by
serious applications. For instance, SSD controller vendors, and even
eMMC makers, use plenty of low-level knowledge that we simply do not
have. From whatever I've gleaned about Toshiba MLC NAND (I don't think
they even try to sell much raw NAND outside of eMMC and SSD solutions),
they actually expose fine-grained voltage threshold controls through
their (non-standard, obviously) command interface, and their clients
likely use this to chart a targeted plan on how to adjust thresholds
over the lifetime of a block, rather than just using a dumb sequential
search like we do.
So anyway, that drifted a little away from the topic at hand. Are you
suggesting that we should not change this default in nand_base, because
of potential issues with highly-unreliable NAND that need read-retry? I
can back out the patch while we finish this discussion, although I'm not
very convinced so far.
Brian
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] mtd: nand: default bitflip-reporting threshold to 75% of correction strength
2015-01-21 8:22 ` Brian Norris
@ 2015-01-21 8:42 ` Boris Brezillon
2015-02-10 13:50 ` Boris Brezillon
0 siblings, 1 reply; 28+ messages in thread
From: Boris Brezillon @ 2015-01-21 8:42 UTC (permalink / raw)
To: Brian Norris
Cc: Ricard Wanderlof, Richard Weinberger, Steve deRosier, Josh Wu,
linux-mtd@lists.infradead.org, Ezequiel Garcia, Huang Shijie
Hi Brian,
On Wed, 21 Jan 2015 00:22:57 -0800
Brian Norris <computersforpeace@gmail.com> wrote:
> I'm sorry, I just noticed your reply as I was applying the patch. I can
> back it out if we find a real objection.
No, I'm fine with this patch, I was just trying to draw attention on
several problems I noticed while working on MLC NANDs.
We'll need to sort this out at some point, but I won't prevent sane
SLC/MLC NANDs from having this ECC threshold changed for something that
is not well supported yet ;-).
>
> On Sat, Jan 17, 2015 at 08:01:37PM +0100, Boris Brezillon wrote:
> > On Mon, 12 Jan 2015 12:51:29 -0800
> > Brian Norris <computersforpeace@gmail.com> wrote:
> > > The MTD API reports -EUCLEAN only if the maximum number of bitflips
> > > found in any ECC block exceeds a certain threshold. This is done to
> > > avoid excessive -EUCLEAN reports to MTD users, which may induce
> > > additional scrubbing of data, even when the ECC algorithm in use is
> > > perfectly capable of handling the bitflips.
> > >
> > > This threshold can be controlled by user-space (via sysfs), to allow
> > > users to determine what they are willing to tolerate in their
> > > application. But it still helps to have sane defaults.
> > >
> > > In recent discussion [1], it was pointed out that our default threshold
> > > is equal to the correction strength. That means that we won't actually
> > > report any -EUCLEAN (i.e., "bitflips were corrected") errors until there
> > > are almost too many to handle. It was determined that 3/4 of the
> > > correction strength is probably a better default.
> > >
> > > [1] http://lists.infradead.org/pipermail/linux-mtd/2015-January/057259.html
> > >
> > > Signed-off-by: Brian Norris <computersforpeace@gmail.com>
> > > ---
> > > drivers/mtd/nand/nand_base.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
> > > index 816b5c1fd416..3f24b587304f 100644
> > > --- a/drivers/mtd/nand/nand_base.c
> > > +++ b/drivers/mtd/nand/nand_base.c
> > > @@ -4171,7 +4171,7 @@ int nand_scan_tail(struct mtd_info *mtd)
> > > * properly set.
> > > */
> > > if (!mtd->bitflip_threshold)
> > > - mtd->bitflip_threshold = mtd->ecc_strength;
> > > + mtd->bitflip_threshold = DIV_ROUND_UP(mtd->ecc_strength * 3, 4);
> >
> > Just sharing my experience with MLC NANDs requiring read-retry: the
> > number of reported bitflips often raise ecc_strength value (at least
> > with the current read-retry approach).
>
> I did not have fun when testing a few MLC NAND which required read
> retry. I ended up recommending that most of our customers move off of it
> onto another solution where possible. But yes, I can understand your
> issue.
Neither did I, but that's the world we live in, and I guess board
manufacturers will keep using crapy MLC chips so this is definitely
something we have to work on.
>
> > This patch will definitely make UBI move NAND blocks over and over
> > again considering the threshold has been raised and the block is not
> > reliable anymore.
>
> Personally, we found that we just needed to lower the
> MTD_UBI_WL_THRESHOLD and let UBI move away from blocks with high bitflip
> counts. I suppose we essentially treat the block as bad.
The problem I have is that a lot of them reach this ECC threshold limit
(at least with the current implementation).
>
> > While I like the idea of limiting the threshold to something smaller
> > than what's recommended on the datasheet (or reported by ONFI) I wonder
> > if it won't make things worst in some cases.
>
> I wouldn't exactly say that there is any threshold recommended on the
> datasheet or in ONFI. They simply specify a required correction
> strength,
Yes, that's what I was talking about.
> with no word about any intermediate handling -- what do they
> expect software to do when bitflips exceed the ECC strength? We
> immediately lose data. So we need to preemptively move such data.
Yep, I agree.
>
> So I don't think it's a good idea at all to say that
> threshold == ecc_strength. That renders your ECC nearly useless w.r.t.
> the original design of UBI. UBI intends to use -EUCLEAN as a signal that
> high # of bitflips. I would suggest that 23 bitflips on a 24-bit
> correction algorithm is a "high # of bitflips." So as I see it, this
> patch is just restoring that UBI assumption.
Yes.
>
> > Regarding the read-retry code, it currently stops retrying reading the
> > page once the page has been successfully retrieved (or in other terms
> > all bitflips have been fixed). But it might stop to soon, because by
> > changing the bit level threshold (in other term retrying one more time)
> > it might successfully read the page with less bitflips than the
> > previous attempt (these are just supposition, I haven't tested it yet).
> > If we can achieve that we could retry until we reach something below
> > the bitflips threshold value, and if we fail to find any, just consider
> > the lower number of bitflips found during those read-retry operations.
>
> I believe I suggested scenarios like this to some flash vendors when
> speaking to reps in person, but they didn't seem to consider that
> likely. I think they were implying that there would be only one read
> retry mode that gives a reasonable result. I'm not sure if they were
> really the experts on that particular topic, though, or if they were
> just giving me an answer to make me happy.
Okay, good to know. I'll try to do some more testing to verify that.
>
> Honestly, there's a lot of work that goes into MLC NAND for use by
> serious applications. For instance, SSD controller vendors, and even
> eMMC makers, use plenty of low-level knowledge that we simply do not
> have. From whatever I've gleaned about Toshiba MLC NAND (I don't think
> they even try to sell much raw NAND outside of eMMC and SSD solutions),
> they actually expose fine-grained voltage threshold controls through
> their (non-standard, obviously) command interface, and their clients
> likely use this to chart a targeted plan on how to adjust thresholds
> over the lifetime of a block, rather than just using a dumb sequential
> search like we do.
Yep, but they do sell these unreliable MLC chips (at least Micron and
Hynix do) to board manufacturers, and a lot of people would really like
to use a mainline kernel on these boards.
>
> So anyway, that drifted a little away from the topic at hand. Are you
> suggesting that we should not change this default in nand_base, because
> of potential issues with highly-unreliable NAND that need read-retry? I
> can back out the patch while we finish this discussion, although I'm not
> very convinced so far.
As I said I don't expect you to back it out, just discussing the MLC
chip problems in an almost unrelated thread ;-).
Best Regards,
Boris
--
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] mtd: nand: default bitflip-reporting threshold to 75% of correction strength
2015-01-21 8:42 ` Boris Brezillon
@ 2015-02-10 13:50 ` Boris Brezillon
0 siblings, 0 replies; 28+ messages in thread
From: Boris Brezillon @ 2015-02-10 13:50 UTC (permalink / raw)
To: Brian Norris
Cc: Boris Brezillon, Ricard Wanderlof, Richard Weinberger,
Steve deRosier, Josh Wu, linux-mtd@lists.infradead.org,
Ezequiel Garcia, Huang Shijie
On Wed, 21 Jan 2015 09:42:57 +0100
Boris Brezillon <boris.brezillon@free-electrons.com> wrote:
> Hi Brian,
>
> On Wed, 21 Jan 2015 00:22:57 -0800
> Brian Norris <computersforpeace@gmail.com> wrote:
[...]
>
> >
> > > Regarding the read-retry code, it currently stops retrying reading the
> > > page once the page has been successfully retrieved (or in other terms
> > > all bitflips have been fixed). But it might stop to soon, because by
> > > changing the bit level threshold (in other term retrying one more time)
> > > it might successfully read the page with less bitflips than the
> > > previous attempt (these are just supposition, I haven't tested it yet).
> > > If we can achieve that we could retry until we reach something below
> > > the bitflips threshold value, and if we fail to find any, just consider
> > > the lower number of bitflips found during those read-retry operations.
> >
> > I believe I suggested scenarios like this to some flash vendors when
> > speaking to reps in person, but they didn't seem to consider that
> > likely. I think they were implying that there would be only one read
> > retry mode that gives a reasonable result. I'm not sure if they were
> > really the experts on that particular topic, though, or if they were
> > just giving me an answer to make me happy.
>
> Okay, good to know. I'll try to do some more testing to verify that.
I did some more test on my cubietruck, trying other read-retry if the
threshold limit is reached (here is the code [1]), and it seems that
better read-retry mode are found in most cases (actually in all the
cases I encountered: see those traces [2]).
Note that I configured the bitflips_threshold to 3/4 of the
ecc-strength (exactly what you're doing in this patch).
Given these results I really think we should consider testing other
'read modes' if the succeeding one exceed the threshold value.
Best Regards,
Boris
[1]http://code.bulix.org/lvcs9x-87859
[2]http://code.bulix.org/xii8nw-87860
--
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads:[~2015-02-10 13:51 UTC | newest]
Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-01-08 3:10 NAND ECC capabilities Steve deRosier
2015-01-08 4:17 ` Ezequiel Garcia
2015-01-08 6:22 ` Steve deRosier
[not found] ` <0D23F1ECC880A74392D56535BCADD73526C0EA9A@NTXBOIMBX03.micron.com>
2015-01-08 17:09 ` Steve deRosier
2015-01-08 18:57 ` Brian Norris
2015-01-08 8:32 ` Ricard Wanderlof
2015-01-08 16:42 ` Ezequiel Garcia
2015-01-08 17:26 ` Steve deRosier
2015-01-08 19:09 ` Brian Norris
2015-01-08 19:27 ` Ezequiel Garcia
2015-01-12 8:35 ` Josh Wu
2015-01-12 20:51 ` [PATCH] mtd: nand: default bitflip-reporting threshold to 75% of correction strength Brian Norris
2015-01-13 2:01 ` Huang Shijie
2015-01-13 2:38 ` Brian Norris
2015-01-13 2:56 ` Huang Shijie
2015-01-13 13:25 ` Richard Weinberger
2015-01-13 18:48 ` Brian Norris
2015-01-13 18:51 ` Richard Weinberger
2015-01-13 19:51 ` Brian Norris
2015-01-17 19:01 ` Boris Brezillon
2015-01-17 19:26 ` Richard Weinberger
2015-01-17 19:42 ` Boris Brezillon
2015-01-17 19:54 ` Richard Weinberger
2015-01-21 8:22 ` Brian Norris
2015-01-21 8:42 ` Boris Brezillon
2015-02-10 13:50 ` Boris Brezillon
2015-01-21 7:45 ` Brian Norris
2015-01-08 17:14 ` NAND ECC capabilities Steve deRosier
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).