* enhance ONFI table reliability/stable
@ 2015-07-21 14:42 Bean Huo 霍斌斌 (beanhuo)
2015-11-18 2:50 ` Brian Norris
0 siblings, 1 reply; 6+ messages in thread
From: Bean Huo 霍斌斌 (beanhuo) @ 2015-07-21 14:42 UTC
To: linux-kernel@vger.kernel.org, linux-mtd@lists.infradead.org
Hi,
I recently ran into some cases concerning ONFI parameter table reliability, which currently relies on a CRC.
If there are bit flips in an ONFI parameter page, a backup copy of the parameter page is used instead.
In the latest Linux, up to three copies are read by default:
	chip->cmdfunc(mtd, NAND_CMD_PARAM, 0, -1);
	for (i = 0; i < 3; i++) {
		for (j = 0; j < sizeof(*p); j++)
			((uint8_t *)p)[j] = chip->read_byte(mtd);
		if (onfi_crc16(ONFI_CRC_BASE, (uint8_t *)p, 254) ==
				le16_to_cpu(p->crc)) {
			break;
		}
	}
However, with technology improvement, for TLC and new-generation MLC, I think three copies of
the parameter table are not robust enough. My question is whether there is a good method to protect and correct the parameter page. For example, we could use the Linux software BCH ECC.
Any suggestions and input are welcome; if you have any concerns about this, feel free to tell me.
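
For reference, the CRC test in the loop above can be tried outside the kernel. The following is a minimal sketch, not the kernel implementation: it assumes the three 256-byte copies have already been read into one contiguous buffer, and the helper names `onfi_copy_valid` and `onfi_first_valid_copy` are hypothetical. The CRC parameters (seed 0x4F4E, polynomial 0x8005, CRC stored little-endian in bytes 254-255) follow the ONFI parameter page definition that the loop above relies on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ONFI_CRC_BASE   0x4F4E	/* CRC seed used for the parameter page */
#define ONFI_PAGE_SIZE  256	/* size of one parameter page copy */
#define ONFI_CRC_OFFSET 254	/* the CRC covers bytes 0..253 */

/* Bitwise CRC-16, polynomial 0x8005, MSB first, no reflection. */
static uint16_t onfi_crc16(uint16_t crc, const uint8_t *p, size_t len)
{
	while (len--) {
		crc ^= (uint16_t)(*p++ << 8);
		for (int i = 0; i < 8; i++)
			crc = (uint16_t)((crc << 1) ^
					 ((crc & 0x8000) ? 0x8005 : 0));
	}
	return crc;
}

/* Hypothetical helper: 1 if the stored little-endian CRC matches. */
static int onfi_copy_valid(const uint8_t *copy)
{
	uint16_t stored = (uint16_t)(copy[254] | (copy[255] << 8));

	return onfi_crc16(ONFI_CRC_BASE, copy, ONFI_CRC_OFFSET) == stored;
}

/*
 * Hypothetical helper: index of the first copy that passes the CRC
 * check, or -1 if every copy is corrupted.
 */
static int onfi_first_valid_copy(const uint8_t *buf, int n_copies)
{
	assert(buf != NULL && n_copies > 0);

	for (int i = 0; i < n_copies; i++)
		if (onfi_copy_valid(buf + (size_t)i * ONFI_PAGE_SIZE))
			return i;
	return -1;
}
```

A return value of -1 is the case the question is about: every copy has at least one bit flip, and a CRC alone can detect this but cannot correct it.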

* Re: enhance ONFI table reliability/stable
From: Brian Norris @ 2015-11-18 2:50 UTC
To: Bean Huo 霍斌斌 (beanhuo)
Cc: linux-kernel@vger.kernel.org, linux-mtd@lists.infradead.org, Boris Brezillon

Hi Bean,

I was sorting through old email and I found this.

On Tue, Jul 21, 2015 at 02:42:34PM +0000, Bean Huo 霍斌斌 (beanhuo) wrote:
> Hi,
>
> I recently ran into some cases concerning ONFI parameter table
> reliability, which currently relies on a CRC.
> If there are bit flips in an ONFI parameter page, a backup copy of the
> parameter page is used instead.
> In the latest Linux, up to three copies are read by default:
>
> 	chip->cmdfunc(mtd, NAND_CMD_PARAM, 0, -1);
> 	for (i = 0; i < 3; i++) {
> 		for (j = 0; j < sizeof(*p); j++)
> 			((uint8_t *)p)[j] = chip->read_byte(mtd);
> 		if (onfi_crc16(ONFI_CRC_BASE, (uint8_t *)p, 254) ==
> 				le16_to_cpu(p->crc)) {
> 			break;
> 		}
> 	}
>
> However, with technology improvement, for TLC and new-generation MLC, I
> think three copies of

Ha, "improvement" :)

> the parameter table are not robust enough. My question is whether there
> is a good method to protect and correct the parameter page. For example,
> we could use the Linux software BCH ECC. Any suggestions and input are
> welcome; if you have any concerns about this, feel free to tell me.

I recall this being brought up at my old job, and all I can say is...
(please pardon my censored language)

...that is complete and utter bulls***. An ONFI standard that can't
guarantee "reliable enough" parameter pages is no standard at all.

To step back a bit: How would one expect to store and retrieve ECC parity
data? ...on the NAND flash? But to do that, we have to know the geometry
parameters of said NAND flash. How do we figure out the geometry? From the
ONFI parameter pages! Nice Catch 22 you have there.

Please encourage your employer never to produce "ONFI-compliant" flash
that is this bad.

Regards,
Brian

* RE: enhance ONFI table reliability/stable
From: Bean Huo 霍斌斌 (beanhuo) @ 2015-11-19 4:21 UTC
To: Brian Norris
Cc: linux-kernel@vger.kernel.org, linux-mtd@lists.infradead.org, Boris Brezillon

> Hi Bean,
>
> I was sorting through old email and I found this.
>
> On Tue, Jul 21, 2015 at 02:42:34PM +0000, Bean Huo 霍斌斌 (beanhuo) wrote:
> > Hi,
> >
> > I recently ran into some cases concerning ONFI parameter table
> > reliability, which currently relies on a CRC.
> > If there are bit flips in an ONFI parameter page, a backup copy of the
> > parameter page is used instead.
> > In the latest Linux, up to three copies are read by default:
> >
> > 	chip->cmdfunc(mtd, NAND_CMD_PARAM, 0, -1);
> > 	for (i = 0; i < 3; i++) {
> > 		for (j = 0; j < sizeof(*p); j++)
> > 			((uint8_t *)p)[j] = chip->read_byte(mtd);
> > 		if (onfi_crc16(ONFI_CRC_BASE, (uint8_t *)p, 254) ==
> > 				le16_to_cpu(p->crc)) {
> > 			break;
> > 		}
> > 	}
> >
> > However, with technology improvement, for TLC and new-generation MLC, I
> > think three copies of
>
> Ha, "improvement" :)
>
> > the parameter table are not robust enough. My question is whether there
> > is a good method to protect and correct the parameter page. For example,
> > we could use the Linux software BCH ECC. Any suggestions and input are
> > welcome; if you have any concerns about this, feel free to tell me.
>
> I recall this being brought up at my old job, and all I can say is...
> (please pardon my censored language)

Yes, you have mentioned this before; I was only following up.
Sorry if my follow-up was rude.
I only want to share one suggestion: using software ECC to protect the
ONFI table that is read from NAND. I would like to hear valuable feedback
from every MTD expert on this. If it is OK, I can do it.

> ...that is complete and utter bulls***. An ONFI standard that can't
> guarantee "reliable enough" parameter pages is no standard at all.
>
> To step back a bit: How would one expect to store and retrieve ECC parity
> data? ...on the NAND flash? But to do that, we have to know the geometry
> parameters of said NAND flash. How do we figure out the geometry? From the
> ONFI parameter pages! Nice Catch 22 you have there.
>
> Please encourage your employer never to produce "ONFI-compliant" flash
> that is this bad.
>
> Regards,
> Brian

* Re: enhance ONFI table reliability/stable
From: Brian Norris @ 2015-11-20 23:59 UTC
To: Bean Huo 霍斌斌 (beanhuo)
Cc: linux-kernel@vger.kernel.org, linux-mtd@lists.infradead.org, Boris Brezillon

On Thu, Nov 19, 2015 at 04:21:01AM +0000, Bean Huo 霍斌斌 (beanhuo) wrote:
> > On Tue, Jul 21, 2015 at 02:42:34PM +0000, Bean Huo 霍斌斌 (beanhuo) wrote:
> > > Hi,
> > >
> > > I recently ran into some cases concerning ONFI parameter table
> > > reliability, which currently relies on a CRC.
> > > If there are bit flips in an ONFI parameter page, a backup copy of the
> > > parameter page is used instead.
> > > In the latest Linux, up to three copies are read by default:
> > >
> > > 	chip->cmdfunc(mtd, NAND_CMD_PARAM, 0, -1);
> > > 	for (i = 0; i < 3; i++) {
> > > 		for (j = 0; j < sizeof(*p); j++)
> > > 			((uint8_t *)p)[j] = chip->read_byte(mtd);
> > > 		if (onfi_crc16(ONFI_CRC_BASE, (uint8_t *)p, 254) ==
> > > 				le16_to_cpu(p->crc)) {
> > > 			break;
> > > 		}
> > > 	}
> > >
> > > However, with technology improvement, for TLC and new-generation MLC, I
> > > think three copies of
> >
> > Ha, "improvement" :)
> >
> > > the parameter table are not robust enough. My question is whether there
> > > is a good method to protect and correct the parameter page. For example,
> > > we could use the Linux software BCH ECC. Any suggestions and input are
> > > welcome; if you have any concerns about this, feel free to tell me.
> >
> > I recall this being brought up at my old job, and all I can say is...
> > (please pardon my censored language)
>
> Yes, you have mentioned this before; I was only following up.
> Sorry if my follow-up was rude.
> I only want to share one suggestion: using software ECC to protect the
> ONFI table that is read from NAND. I would like to hear valuable feedback
> from every MTD expert on this. If it is OK, I can do it.

Perhaps I'm misunderstanding you, but I don't understand how you could
possibly "do it" if it is a circular dependency. You have nowhere to
store ECC/parity data for a parameter page, because you can't actually
read/write the NAND flash until after you know its geometry.

> > ...that is complete and utter bulls***. An ONFI standard that can't
> > guarantee "reliable enough" parameter pages is no standard at all.
> >
> > To step back a bit: How would one expect to store and retrieve ECC parity
> > data? ...on the NAND flash? But to do that, we have to know the geometry
> > parameters of said NAND flash. How do we figure out the geometry? From the
> > ONFI parameter pages! Nice Catch 22 you have there.

I realize a non-native English speaker might not understand the "Catch
22" reference. Wikipedia has a nice summary:

https://en.wikipedia.org/wiki/Catch-22_(logic)

Essentially, it's a circular argument, or a contradiction. An
impossibility.

> > Please encourage your employer never to produce "ONFI-compliant" flash
> > that is this bad.

I still stand by the above statement.

But now that I'm in a slightly more charitable mood, there are ways to
improve our ability to recover from slightly corrupted parameter pages
(ECC is not one of them).

For one, you could do some kind of bit majority. e.g.:

(1) try pages 1-3
(2) if none pass the CRC check, then compute bit majority of all 3; if
    the CRC of this combined page passes, then use it
(3) ???

Brian
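
The bit-majority fallback in steps (1) and (2) above can be sketched in a few lines. This is only an illustration under stated assumptions: the three copies are already in one contiguous buffer, the function name `onfi_recover_page` is hypothetical, and step (3) is left open exactly as it is in the message. The CRC routine uses the ONFI parameters (seed 0x4F4E, polynomial 0x8005).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONFI_CRC_BASE   0x4F4E	/* CRC seed used for the parameter page */
#define ONFI_PAGE_SIZE  256	/* size of one parameter page copy */
#define ONFI_CRC_OFFSET 254	/* the CRC covers bytes 0..253 */

/* Bitwise CRC-16, polynomial 0x8005, MSB first, no reflection. */
static uint16_t onfi_crc16(uint16_t crc, const uint8_t *p, size_t len)
{
	while (len--) {
		crc ^= (uint16_t)(*p++ << 8);
		for (int i = 0; i < 8; i++)
			crc = (uint16_t)((crc << 1) ^
					 ((crc & 0x8000) ? 0x8005 : 0));
	}
	return crc;
}

/* 1 if the little-endian CRC stored in bytes 254..255 matches. */
static int onfi_crc_ok(const uint8_t *page)
{
	uint16_t stored = (uint16_t)(page[254] | (page[255] << 8));

	return onfi_crc16(ONFI_CRC_BASE, page, ONFI_CRC_OFFSET) == stored;
}

/*
 * Hypothetical helper. buf holds three consecutive 256-byte copies.
 * Returns 0 and fills out[] on success, -1 if nothing passes the CRC.
 */
static int onfi_recover_page(const uint8_t *buf, uint8_t *out)
{
	assert(buf != NULL && out != NULL);

	/* Step (1): take the first copy whose CRC matches. */
	for (int i = 0; i < 3; i++) {
		const uint8_t *copy = buf + (size_t)i * ONFI_PAGE_SIZE;

		if (onfi_crc_ok(copy)) {
			memcpy(out, copy, ONFI_PAGE_SIZE);
			return 0;
		}
	}

	/* Step (2): each output bit is the value held by >= 2 copies. */
	for (size_t j = 0; j < ONFI_PAGE_SIZE; j++) {
		uint8_t a = buf[j];
		uint8_t b = buf[ONFI_PAGE_SIZE + j];
		uint8_t c = buf[2 * ONFI_PAGE_SIZE + j];

		out[j] = (uint8_t)((a & b) | (a & c) | (b & c));
	}

	/* The voted page is only trusted if its own CRC passes. */
	return onfi_crc_ok(out) ? 0 : -1;
}
```

The vote recovers the page whenever no bit position is flipped in more than one copy. If the same bit is wrong in two copies, the vote picks the wrong value and the final CRC check rejects the result.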

* Re: enhance ONFI table reliability/stable
From: Boris Brezillon @ 2015-11-21 7:46 UTC
To: Brian Norris
Cc: Bean Huo 霍斌斌 (beanhuo), linux-kernel@vger.kernel.org, linux-mtd@lists.infradead.org

On Fri, 20 Nov 2015 15:59:27 -0800
Brian Norris <computersforpeace@gmail.com> wrote:

> On Thu, Nov 19, 2015 at 04:21:01AM +0000, Bean Huo 霍斌斌 (beanhuo) wrote:
> > > On Tue, Jul 21, 2015 at 02:42:34PM +0000, Bean Huo 霍斌斌 (beanhuo) wrote:
> > > > Hi,
> > > >
> > > > I recently ran into some cases concerning ONFI parameter table
> > > > reliability, which currently relies on a CRC.
> > > > If there are bit flips in an ONFI parameter page, a backup copy of the
> > > > parameter page is used instead.
> > > > In the latest Linux, up to three copies are read by default:
> > > >
> > > > 	chip->cmdfunc(mtd, NAND_CMD_PARAM, 0, -1);
> > > > 	for (i = 0; i < 3; i++) {
> > > > 		for (j = 0; j < sizeof(*p); j++)
> > > > 			((uint8_t *)p)[j] = chip->read_byte(mtd);
> > > > 		if (onfi_crc16(ONFI_CRC_BASE, (uint8_t *)p, 254) ==
> > > > 				le16_to_cpu(p->crc)) {
> > > > 			break;
> > > > 		}
> > > > 	}
> > > >
> > > > However, with technology improvement, for TLC and new-generation MLC, I
> > > > think three copies of
> > >
> > > Ha, "improvement" :)
> > >
> > > > the parameter table are not robust enough. My question is whether there
> > > > is a good method to protect and correct the parameter page. For example,
> > > > we could use the Linux software BCH ECC. Any suggestions and input are
> > > > welcome; if you have any concerns about this, feel free to tell me.
> > >
> > > I recall this being brought up at my old job, and all I can say is...
> > > (please pardon my censored language)
> >
> > Yes, you have mentioned this before; I was only following up.
> > Sorry if my follow-up was rude.
> > I only want to share one suggestion: using software ECC to protect the
> > ONFI table that is read from NAND. I would like to hear valuable feedback
> > from every MTD expert on this. If it is OK, I can do it.
>
> Perhaps I'm misunderstanding you, but I don't understand how you could
> possibly "do it" if it is a circular dependency. You have nowhere to
> store ECC/parity data for a parameter page, because you can't actually
> read/write the NAND flash until after you know its geometry.

Well, while I agree with most of your answer (why the hell are NAND
vendors storing the ONFI parameter page and other sensitive information
in normal NAND pages, especially when we're talking about TLC/MLC
NANDs???), it's perfectly possible to have ECC in this case, as long as
the geometry is known in advance (at least this is true for BCH).

Say you have only 3 copies of the parameter page and the ECC bytes are
stored after them. You can define the following layout:

|3 x parameter page size|3 x ECC bytes|

Of course this implies reserving the space after the 3 parameter pages
for the ECC bytes, which the current ONFI spec does not guarantee (you
should have at least 3 copies, but you can have more).

And we would choose the ECC geometry with this logic:

ECC chunk size = sizeof(struct nand_onfi_params)
ECC strength = iteratively tested with different pre-defined values

This being said, I don't know how you would change the ONFI spec and
keep it compatible with the previous version. As I said, the current
version of the spec does not reserve any area after the mandatory
parameter pages...
You'll probably have to add a NAND_CMD_ALT_PARAM to support this kind
of thing.

> > > ...that is complete and utter bulls***. An ONFI standard that can't
> > > guarantee "reliable enough" parameter pages is no standard at all.
> > >
> > > To step back a bit: How would one expect to store and retrieve ECC parity
> > > data? ...on the NAND flash? But to do that, we have to know the geometry
> > > parameters of said NAND flash. How do we figure out the geometry? From the
> > > ONFI parameter pages! Nice Catch 22 you have there.
>
> I realize a non-native English speaker might not understand the "Catch
> 22" reference. Wikipedia has a nice summary:
>
> https://en.wikipedia.org/wiki/Catch-22_(logic)
>
> Essentially, it's a circular argument, or a contradiction. An
> impossibility.
>
> > > Please encourage your employer never to produce "ONFI-compliant" flash
> > > that is this bad.
>
> I still stand by the above statement.
>
> But now that I'm in a slightly more charitable mood, there are ways to
> improve our ability to recover from slightly corrupted parameter pages
> (ECC is not one of them).
>
> For one, you could do some kind of bit majority. e.g.:
>
> (1) try pages 1-3
> (2) if none pass the CRC check, then compute bit majority of all 3; if
>     the CRC of this combined page passes, then use it
> (3) ???

Should work too, but it's probably less reliable than BCH ECC (we only
have 3 copies :-/).

Best Regards,

Boris

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
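
The layout proposed above, |3 x parameter page size|3 x ECC bytes|, can be made concrete with a little arithmetic. The sketch below is hypothetical throughout: no ONFI revision discussed in this thread reserves such an area, and the function names are illustrative. It assumes a 256-byte chunk (the size of one parameter page copy) and sizes the parity with the usual binary BCH bound of ceil(m * t / 8) bytes, where t is the correction strength in bits and m is the smallest field order whose codeword length 2^m - 1 can hold the data and parity bits.

```c
#include <assert.h>
#include <stddef.h>

#define ONFI_PAGE_SIZE  256	/* ECC chunk size: one parameter page copy */
#define ONFI_NUM_COPIES 3	/* mandatory number of copies */

/*
 * Smallest Galois field order m such that a binary BCH codeword of
 * 2^m - 1 bits can hold the data bits plus up to m * t parity bits.
 * Returns 0 if no order in the 5..15 range fits.
 */
static unsigned bch_field_order(size_t chunk_bytes, unsigned t)
{
	for (unsigned m = 5; m <= 15; m++)
		if (((size_t)1 << m) - 1 >= chunk_bytes * 8 + (size_t)m * t)
			return m;
	return 0;
}

/* Parity bytes needed to correct t bit errors in one chunk. */
static size_t bch_ecc_bytes(size_t chunk_bytes, unsigned t)
{
	unsigned m = bch_field_order(chunk_bytes, t);

	return m ? ((size_t)m * t + 7) / 8 : 0;
}

/*
 * Byte offset, from the start of the first copy, of the parity bytes
 * protecting the given copy in the hypothetical layout.
 */
static size_t ecc_offset(unsigned copy, unsigned t)
{
	assert(copy < ONFI_NUM_COPIES);

	return ONFI_NUM_COPIES * ONFI_PAGE_SIZE +
	       copy * bch_ecc_bytes(ONFI_PAGE_SIZE, t);
}
```

With a 256-byte chunk the field order comes out as 12, so a strength of 4 bits costs 6 parity bytes per copy and a strength of 8 bits costs 12. A reader that does not know the strength in advance would have to try each pre-defined value in turn, since the parity size and therefore the offsets change with t.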

* Re: enhance ONFI table reliability/stable
From: Brian Norris @ 2015-11-21 8:27 UTC
To: Boris Brezillon
Cc: Bean Huo 霍斌斌 (beanhuo), linux-kernel@vger.kernel.org, linux-mtd@lists.infradead.org

On Sat, Nov 21, 2015 at 08:46:04AM +0100, Boris Brezillon wrote:
> This being said, I don't know how you would change the ONFI spec and
> keep it compatible with the previous version. As I said, the current
> version of the spec does not reserve any area after the mandatory
> parameter pages...
> You'll probably have to add a NAND_CMD_ALT_PARAM to support this kind
> of thing.

I was interpreting Bean's comments to mean only a change in software,
not in the actual NAND flash (HW) implementation. If he is suggesting a
change in the flash spec, then that's a completely different discussion.

Brian