[PATCH v3 0/6] NAND BBM + BBT updates

* [PATCH v3 0/6] NAND BBM + BBT updates
@ 2012-01-09 20:23 Brian Norris
  2012-01-09 20:23 ` [PATCH v3 1/6] mtd: nand: add NAND_NO_WRITE_OOB option Brian Norris
                   ` (6 more replies)
  0 siblings, 7 replies; 28+ messages in thread
From: Brian Norris @ 2012-01-09 20:23 UTC (permalink / raw)
  To: linux-mtd
  Cc: Dan Carpenter, Kulikov Vasiliy, Sebastian Andrzej Siewior,
	Nicolas Ferre, Dominik Brodowski, Peter Wippich, Gabor Juhos,
	Guillaume LECERF, Jonas Gorski, Jamie Iles, Ivan Djelic,
	Robert Jarzmik, David Woodhouse, Maxim Levitsky,
	Dmitry Eremin-Solenikov, Kevin Cernekee, Barry Song, Jim Quinlan,
	Andres Salomon, Axel Lin, Anatolij Gustschin, Mike Frysinger,
	Arnd Bergmann, Lei Wen, Sascha Hauer, Artem Bityutskiy,
	Florian Fainelli, Artem Bityutskiy, Adrian Hunter,
	Matthieu CASTET, Kyungmin Park, Shmulik Ladkani, Wolfram Sang,
	Chuanxiao Dong, Joe Perches, Brian Norris, Roman Tereshonkov

This patch series is an update to a previous patch (that has been split
into a few patches) with a few additional patches at the end. The important
segments of this series involve the default steps for marking new bad
blocks when using a flash-based BBT. The new default behavior will write
to the BBT as well as attempting to write a BBM to the OOB area of the
bad block. See the patch descriptions for details.

The first patch, regarding NAND_NO_WRITE_OOB, is a first attempt at
satisfying Sebastian's concerns that some systems utilize the entire OOB
area for ECC, and so we need an option to prevent writing markers to
OOB. My attempt to prevent other OOB writes may be misguided,
incomplete, flawed in some other way, or some combination of the three.
Please provide constructive criticism.

v3: writing to flash-based BBT and to BBM is still default, but
    there is a new option NAND_NO_WRITE_OOB that can prevent writing the
    BBM as well as prevent all other OOB writes.
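
As a rough illustration of the resulting policy (a user-space sketch, not
the kernel code: plan_markbad(), struct markbad_plan and
SKETCH_BBT_USE_FLASH are made-up names, and only the NAND_NO_WRITE_OOB
value is taken from patch 1), the choice of where a new bad block gets
recorded could be modelled like this:

```c
#include <assert.h>
#include <stdbool.h>

/* Value proposed in patch 1 of this series. */
#define NAND_NO_WRITE_OOB	0x00080000u
/* Hypothetical stand-in for NAND_BBT_USE_FLASH; the value is illustrative. */
#define SKETCH_BBT_USE_FLASH	0x1u

struct markbad_plan {
	bool write_bbt;	/* update the flash-based BBT */
	bool write_bbm;	/* write a bad block marker to the block's OOB */
};

static struct markbad_plan plan_markbad(unsigned int options,
					unsigned int bbt_options)
{
	struct markbad_plan plan;

	plan.write_bbt = (bbt_options & SKETCH_BBT_USE_FLASH) != 0;
	plan.write_bbm = (options & NAND_NO_WRITE_OOB) == 0;

	/* Same idea as the BUG_ON() in patch 2: the block must be recorded somewhere. */
	assert(plan.write_bbt || plan.write_bbm);
	return plan;
}
```

With a flash-based BBT and no NAND_NO_WRITE_OOB, both fields come out true,
which is the new default; setting NAND_NO_WRITE_OOB leaves only the BBT
update.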

Brian Norris (6):
  mtd: nand: add NAND_NO_WRITE_OOB option
  mtd: nand: write bad block marker by default even with BBT
  mtd: nand: erase block before marking bad
  mtd: nand: fix SCAN2NDPAGE check for BBM
  mtd: nand: differentiate 1- vs. 2-byte writes when marking bad blocks
  mtd: nand: correct comment on nand_chip badblockbits

 drivers/mtd/nand/nand_base.c |   79 ++++++++++++++++++++++++++++-------------
 include/linux/mtd/nand.h     |   11 +++++-
 2 files changed, 63 insertions(+), 27 deletions(-)

-- 
1.7.5.4

* [PATCH v3 1/6] mtd: nand: add NAND_NO_WRITE_OOB option
  2012-01-09 20:23 [PATCH v3 0/6] NAND BBM + BBT updates Brian Norris
@ 2012-01-09 20:23 ` Brian Norris
  2012-01-09 20:23 ` [PATCH v3 2/6] mtd: nand: write bad block marker by default even with BBT Brian Norris
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 28+ messages in thread
From: Brian Norris @ 2012-01-09 20:23 UTC (permalink / raw)
  To: linux-mtd
  Cc: Dan Carpenter, Kulikov Vasiliy, Sebastian Andrzej Siewior,
	Nicolas Ferre, Dominik Brodowski, Peter Wippich, Gabor Juhos,
	Guillaume LECERF, Jonas Gorski, Jamie Iles, Ivan Djelic,
	Robert Jarzmik, David Woodhouse, Maxim Levitsky,
	Dmitry Eremin-Solenikov, Kevin Cernekee, Barry Song, Jim Quinlan,
	Andres Salomon, Axel Lin, Anatolij Gustschin, Mike Frysinger,
	Arnd Bergmann, Lei Wen, Sascha Hauer, Artem Bityutskiy,
	Florian Fainelli, Artem Bityutskiy, Adrian Hunter,
	Matthieu CASTET, Kyungmin Park, Shmulik Ladkani, Wolfram Sang,
	Chuanxiao Dong, Joe Perches, Brian Norris, Roman Tereshonkov

Some systems cannot use the OOB area for storing any information, not
even bad block markers (for instance, if ECC uses the entire spare area).
Thus, we implement an option to prevent ever writing to OOB. This will
be useful for determining whether to record bad blocks by writing to the
flash-based BBT, the bad block's OOB region, or both.

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
---
 drivers/mtd/nand/nand_base.c |    8 ++++++--
 include/linux/mtd/nand.h     |    6 ++++++
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
index 35b4565..b9dbf0c 100644
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -2187,6 +2187,7 @@ static int nand_do_write_ops(struct mtd_info *mtd, loff_t to,
 	int ret, subpage;
 
 	ops->retlen = 0;
+	ops->oobretlen = 0;
 	if (!writelen)
 		return 0;
 
@@ -2238,7 +2239,7 @@ static int nand_do_write_ops(struct mtd_info *mtd, loff_t to,
 			wbuf = chip->buffers->databuf;
 		}
 
-		if (unlikely(oob)) {
+		if (unlikely(oob) && !(chip->options & NAND_NO_WRITE_OOB)) {
 			size_t len = min(oobwritelen, oobmaxlen);
 			oob = nand_fill_oob(mtd, oob, len, ops);
 			oobwritelen -= len;
@@ -2270,7 +2271,7 @@ static int nand_do_write_ops(struct mtd_info *mtd, loff_t to,
 	}
 
 	ops->retlen = ops->len - writelen;
-	if (unlikely(oob))
+	if (unlikely(oob) && !(chip->options & NAND_NO_WRITE_OOB))
 		ops->oobretlen = ops->ooblen;
 	return ret;
 }
@@ -2372,6 +2373,9 @@ static int nand_do_write_oob(struct mtd_info *mtd, loff_t to,
 	pr_debug("%s: to = 0x%08x, len = %i\n",
 			 __func__, (unsigned int)to, (int)ops->ooblen);
 
+	if (chip->options & NAND_NO_WRITE_OOB)
+		return 0;
+
 	if (ops->mode == MTD_OPS_AUTO_OOB)
 		len = chip->ecc.layout->oobavail;
 	else
diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h
index 63b5a8b..3547205 100644
--- a/include/linux/mtd/nand.h
+++ b/include/linux/mtd/nand.h
@@ -228,6 +228,12 @@ typedef enum {
 #define NAND_OWN_BUFFERS	0x00020000
 /* Chip may not exist, so silence any errors in scan */
 #define NAND_SCAN_SILENT_NODEV	0x00040000
+/*
+ * Do not write to OOB. Useful, e.g., when ECC takes up entire OOB. Should be
+ * used with flash-based BBT, since the bad block markers will no longer be
+ * reliable.
+ */
+#define NAND_NO_WRITE_OOB	0x00080000
 
 /* Options set by nand scan */
 /* Nand scan has allocated controller struct */
-- 
1.7.5.4

* [PATCH v3 2/6] mtd: nand: write bad block marker by default even with BBT
  2012-01-09 20:23 [PATCH v3 0/6] NAND BBM + BBT updates Brian Norris
  2012-01-09 20:23 ` [PATCH v3 1/6] mtd: nand: add NAND_NO_WRITE_OOB option Brian Norris
@ 2012-01-09 20:23 ` Brian Norris
  2012-01-09 20:23 ` [PATCH v3 3/6] mtd: nand: erase block before marking bad Brian Norris
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 28+ messages in thread
From: Brian Norris @ 2012-01-09 20:23 UTC (permalink / raw)
  To: linux-mtd
  Cc: Dan Carpenter, Kulikov Vasiliy, Sebastian Andrzej Siewior,
	Nicolas Ferre, Dominik Brodowski, Peter Wippich, Gabor Juhos,
	Guillaume LECERF, Jonas Gorski, Jamie Iles, Ivan Djelic,
	Robert Jarzmik, David Woodhouse, Maxim Levitsky,
	Dmitry Eremin-Solenikov, Kevin Cernekee, Barry Song, Jim Quinlan,
	Andres Salomon, Axel Lin, Anatolij Gustschin, Mike Frysinger,
	Arnd Bergmann, Lei Wen, Sascha Hauer, Artem Bityutskiy,
	Florian Fainelli, Artem Bityutskiy, Adrian Hunter,
	Matthieu CASTET, Kyungmin Park, Shmulik Ladkani, Wolfram Sang,
	Chuanxiao Dong, Joe Perches, Brian Norris, Roman Tereshonkov

Currently, the flash-based BBT implementation writes bad block data only
to its flash-based table and not to the OOB marker area. Then, as new
bad blocks are marked over time, the OOB markers become out of date and
the flash-based table becomes the only source of current bad block
information. This can be a problem when, for example:

 * bootloader cannot read the flash-based BBT format
 * BBT is corrupted and the flash must be rescanned for bad
   blocks; we want to remember bad blocks that were marked from Linux

In an attempt to keep the bad block markers in sync with the flash-based
BBT, this patch changes the default so that we write bad block markers
to the proper OOB area on each block in addition to flash-based BBT.

Theoretically, the bad block table and the OOB markers can still get out
of sync if the system experiences a power cut between writing the BBT to
flash and writing the OOB marker to a newly-marked bad block. However,
this is a relatively unlikely event, as new bad blocks shouldn't appear
frequently.

Note that this is a change from the previous default flash-based BBT
behavior. To restore old behavior (and to generally prevent writing to
OOB area), use the NAND_NO_WRITE_OOB option (in combination with
NAND_BBT_USE_FLASH and NAND_BBT_NO_OOB).

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
---
 drivers/mtd/nand/nand_base.c |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
index b9dbf0c..ead2a12 100644
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -392,7 +392,10 @@ static int nand_default_block_markbad(struct mtd_info *mtd, loff_t ofs)
 {
 	struct nand_chip *chip = mtd->priv;
 	uint8_t buf[2] = { 0, 0 };
-	int block, ret, i = 0;
+	int block, ret = 0, i = 0;
+
+	BUG_ON((chip->options & NAND_NO_WRITE_OOB) &&
+			!(chip->bbt_options & NAND_BBT_USE_FLASH));

 	if (chip->bbt_options & NAND_BBT_SCANLASTPAGE)
 		ofs += mtd->erasesize - mtd->writesize;
@@ -405,7 +408,9 @@ static int nand_default_block_markbad(struct mtd_info *mtd, loff_t ofs)
 	/* Do we have a flash based bad block table? */
 	if (chip->bbt_options & NAND_BBT_USE_FLASH)
 		ret = nand_update_bbt(mtd, ofs);
-	else {
+
+	/* Write bad block marker to OOB */
+	if (!(chip->options & NAND_NO_WRITE_OOB)) {
 		struct mtd_oob_ops ops;

 		nand_get_device(chip, mtd, FL_WRITING);
-- 
1.7.5.4

* [PATCH v3 3/6] mtd: nand: erase block before marking bad
  2012-01-09 20:23 [PATCH v3 0/6] NAND BBM + BBT updates Brian Norris
  2012-01-09 20:23 ` [PATCH v3 1/6] mtd: nand: add NAND_NO_WRITE_OOB option Brian Norris
  2012-01-09 20:23 ` [PATCH v3 2/6] mtd: nand: write bad block marker by default even with BBT Brian Norris
@ 2012-01-09 20:23 ` Brian Norris
  2012-01-13 22:42   ` Artem Bityutskiy
  2012-01-09 20:23 ` [PATCH v3 4/6] mtd: nand: fix SCAN2NDPAGE check for BBM Brian Norris
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 28+ messages in thread
From: Brian Norris @ 2012-01-09 20:23 UTC (permalink / raw)
  To: linux-mtd
  Cc: Dan Carpenter, Kulikov Vasiliy, Sebastian Andrzej Siewior,
	Nicolas Ferre, Dominik Brodowski, Peter Wippich, Gabor Juhos,
	Guillaume LECERF, Jonas Gorski, Jamie Iles, Ivan Djelic,
	Robert Jarzmik, David Woodhouse, Maxim Levitsky,
	Dmitry Eremin-Solenikov, Kevin Cernekee, Barry Song, Jim Quinlan,
	Andres Salomon, Axel Lin, Anatolij Gustschin, Mike Frysinger,
	Arnd Bergmann, Lei Wen, Sascha Hauer, Artem Bityutskiy,
	Florian Fainelli, Artem Bityutskiy, Adrian Hunter,
	Matthieu CASTET, Kyungmin Park, Shmulik Ladkani, Wolfram Sang,
	Chuanxiao Dong, Joe Perches, Brian Norris, Roman Tereshonkov

Many NAND flash systems (especially those with MLC NAND) cannot be
reliably written twice in a row. For instance, when marking a bad block,
the block may already have data written to it, and so we should attempt
to erase the block before writing a bad block marker to its OOB region.

We can ignore erase failures, since the block may be bad such that it
cannot be erased properly; we still attempt to write zeros to its spare
area.

Note that the erase must be performed before the BBT is updated, since
otherwise, nand_erase_nand() would not allow us to erase our "bad
block."
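
The ordering constraint can be shown with a toy model (a sketch only:
struct sketch_chip, sketch_erase() and sketch_markbad() are invented for
illustration and merely mimic the "refuse to erase a block the table calls
bad" behavior described above):

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NBLOCKS 4

struct sketch_chip {
	bool bbt_bad[SKETCH_NBLOCKS];	/* bad block table (toy version) */
	bool erased[SKETCH_NBLOCKS];	/* set once a block has been erased */
};

/* Returns 0 on success, -1 if the table already lists the block as bad. */
static int sketch_erase(struct sketch_chip *chip, int block)
{
	assert(block >= 0 && block < SKETCH_NBLOCKS);
	if (chip->bbt_bad[block])
		return -1;
	chip->erased[block] = true;
	return 0;
}

static void sketch_markbad(struct sketch_chip *chip, int block)
{
	/* Best-effort erase first; a failure here is deliberately ignored. */
	(void)sketch_erase(chip, block);
	/* Only now update the table; the OOB marker write would follow. */
	chip->bbt_bad[block] = true;
}
```

If the two steps in sketch_markbad() were swapped, sketch_erase() would
return -1 and the block would keep its old contents when the marker is
written.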

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
---
 drivers/mtd/nand/nand_base.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
index ead2a12..d5dbe0a 100644
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -397,6 +397,16 @@ static int nand_default_block_markbad(struct mtd_info *mtd, loff_t ofs)
 	BUG_ON((chip->options & NAND_NO_WRITE_OOB) &&
 			!(chip->bbt_options & NAND_BBT_USE_FLASH));
 
+	/* Erase before writing to OOB and before BBT is updated */
+	if (!(chip->options & NAND_NO_WRITE_OOB)) {
+		struct erase_info einfo;
+		memset(&einfo, 0, sizeof(einfo));
+		einfo.mtd = mtd;
+		einfo.addr = ofs;
+		einfo.len = 1 << chip->phys_erase_shift;
+		nand_erase_nand(mtd, &einfo, 0);
+	}
+
 	if (chip->bbt_options & NAND_BBT_SCANLASTPAGE)
 		ofs += mtd->erasesize - mtd->writesize;
 
-- 
1.7.5.4

* Re: [PATCH v3 3/6] mtd: nand: erase block before marking bad
  2012-01-09 20:23 ` [PATCH v3 3/6] mtd: nand: erase block before marking bad Brian Norris
@ 2012-01-13 22:42   ` Artem Bityutskiy
  2012-01-13 23:07     ` Brian Norris
  0 siblings, 1 reply; 28+ messages in thread
From: Artem Bityutskiy @ 2012-01-13 22:42 UTC (permalink / raw)
  To: Brian Norris
  Cc: Dan Carpenter, Kulikov Vasiliy, Sebastian Andrzej Siewior,
	Nicolas Ferre, Dominik Brodowski, Peter Wippich, Gabor Juhos,
	linux-mtd, Jonas Gorski, Jamie Iles, Ivan Djelic, Robert Jarzmik,
	David Woodhouse, Maxim Levitsky, Dmitry Eremin-Solenikov,
	Kevin Cernekee, Barry Song, Jim Quinlan, Andres Salomon, Axel Lin,
	Anatolij Gustschin, Mike Frysinger, Arnd Bergmann, Lei Wen,
	Sascha Hauer, Artem Bityutskiy, Florian Fainelli, Adrian Hunter,
	Matthieu CASTET, Kyungmin Park, Shmulik Ladkani, Wolfram Sang,
	Chuanxiao Dong, Joe Perches, Guillaume LECERF, Roman Tereshonkov

On Mon, 2012-01-09 at 12:23 -0800, Brian Norris wrote:
> Many NAND flash systems (especially those with MLC NAND) cannot be
> reliably written twice in a row. For instance, when marking a bad block,
> the block may already have data written to it, and so we should attempt
> to erase the block before writing a bad block marker to its OOB region.
> 
> We can ignore erase failures, since the block may be bad such that it
> cannot be erased properly; we still attempt to write zeros to its spare
> area.
> 
> Note that the erase must be performed before the BBT is updated, since
> otherwise, nand_erase_nand() would not allow us to erase our "bad
> block."
> 
> Signed-off-by: Brian Norris <computersforpeace@gmail.com>

This looks like an independent patch to me, is that right? If yes, you
can send it separately.

Artem.

* Re: [PATCH v3 3/6] mtd: nand: erase block before marking bad
  2012-01-13 22:42   ` Artem Bityutskiy
@ 2012-01-13 23:07     ` Brian Norris
  0 siblings, 0 replies; 28+ messages in thread
From: Brian Norris @ 2012-01-13 23:07 UTC (permalink / raw)
  To: dedekind1
  Cc: Dan Carpenter, Kulikov Vasiliy, Sebastian Andrzej Siewior,
	Nicolas Ferre, Dominik Brodowski, Peter Wippich, Gabor Juhos,
	linux-mtd, Jonas Gorski, Jamie Iles, Ivan Djelic, Robert Jarzmik,
	David Woodhouse, Maxim Levitsky, Dmitry Eremin-Solenikov,
	Kevin Cernekee, Barry Song, Jim Quinlan, Andres Salomon, Axel Lin,
	Anatolij Gustschin, Mike Frysinger, Arnd Bergmann, Lei Wen,
	Sascha Hauer, Artem Bityutskiy, Florian Fainelli, Adrian Hunter,
	Matthieu CASTET, Kyungmin Park, Shmulik Ladkani, Wolfram Sang,
	Chuanxiao Dong, Joe Perches, Guillaume LECERF, Roman Tereshonkov

On Fri, Jan 13, 2012 at 2:42 PM, Artem Bityutskiy <dedekind1@gmail.com> wrote:
> This looks like an independent patch to me, is that right? If yes, you
> can send it separately.

It's mostly independent, but its exact code placement will be
dependent on the previous two patches. If you'd prefer, I'll resubmit
this separately. Then, assuming it's applied, I'll fix and resend the
controversial stuff (BBT + OOB) as a separate series, on top of it.

Brian

* [PATCH v3 4/6] mtd: nand: fix SCAN2NDPAGE check for BBM
  2012-01-09 20:23 [PATCH v3 0/6] NAND BBM + BBT updates Brian Norris
                   ` (2 preceding siblings ...)
  2012-01-09 20:23 ` [PATCH v3 3/6] mtd: nand: erase block before marking bad Brian Norris
@ 2012-01-09 20:23 ` Brian Norris
  2012-01-09 20:23 ` [PATCH v3 5/6] mtd: nand: differentiate 1- vs. 2-byte writes when marking bad blocks Brian Norris
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 28+ messages in thread
From: Brian Norris @ 2012-01-09 20:23 UTC (permalink / raw)
  To: linux-mtd; +Cc: Brian Norris, Artem Bityutskiy

nand_block_bad() doesn't check the correct pages when
NAND_BBT_SCAN2NDPAGE is enabled. It should scan the OOB region of both
the 1st and 2nd page of each block.
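
The intended loop shape, stripped of all chip access, looks roughly like
the following sketch (sketch_block_bad() is a hypothetical helper; it takes
the two marker bytes as an array instead of reading OOB, and it assumes
badblockbits == 8, i.e. anything other than 0xFF means bad):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * markers[0] and markers[1] stand for the bad block marker bytes read from
 * the OOB of the first and second page of a block.
 */
static bool sketch_block_bad(const uint8_t markers[2], bool scan2ndpage)
{
	bool bad = false;
	int i = 0;

	assert(markers);
	do {
		bad = markers[i] != 0xFF;
		i++;
		/* Look at the second page only if the first one looked good. */
	} while (!bad && i < 2 && scan2ndpage);

	return bad;
}
```

A block whose first page reads 0xFF but whose second page reads 0x00 is
reported bad only when scan2ndpage is set.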

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
---
 drivers/mtd/nand/nand_base.c |   40 +++++++++++++++++++++++-----------------
 1 files changed, 23 insertions(+), 17 deletions(-)

diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
index d5dbe0a..8319242 100644
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -338,7 +338,7 @@ static int nand_verify_buf16(struct mtd_info *mtd, const uint8_t *buf, int len)
  */
 static int nand_block_bad(struct mtd_info *mtd, loff_t ofs, int getchip)
 {
-	int page, chipnr, res = 0;
+	int page, chipnr, res = 0, i = 0;
 	struct nand_chip *chip = mtd->priv;
 	u16 bad;
 
@@ -356,23 +356,29 @@ static int nand_block_bad(struct mtd_info *mtd, loff_t ofs, int getchip)
 		chip->select_chip(mtd, chipnr);
 	}
 
-	if (chip->options & NAND_BUSWIDTH_16) {
-		chip->cmdfunc(mtd, NAND_CMD_READOOB, chip->badblockpos & 0xFE,
-			      page);
-		bad = cpu_to_le16(chip->read_word(mtd));
-		if (chip->badblockpos & 0x1)
-			bad >>= 8;
-		else
-			bad &= 0xFF;
-	} else {
-		chip->cmdfunc(mtd, NAND_CMD_READOOB, chip->badblockpos, page);
-		bad = chip->read_byte(mtd);
-	}
+	do {
+		if (chip->options & NAND_BUSWIDTH_16) {
+			chip->cmdfunc(mtd, NAND_CMD_READOOB,
+					chip->badblockpos & 0xFE, page);
+			bad = cpu_to_le16(chip->read_word(mtd));
+			if (chip->badblockpos & 0x1)
+				bad >>= 8;
+			else
+				bad &= 0xFF;
+		} else {
+			chip->cmdfunc(mtd, NAND_CMD_READOOB, chip->badblockpos,
+					page);
+			bad = chip->read_byte(mtd);
+		}
 
-	if (likely(chip->badblockbits == 8))
-		res = bad != 0xFF;
-	else
-		res = hweight8(bad) < chip->badblockbits;
+		if (likely(chip->badblockbits == 8))
+			res = bad != 0xFF;
+		else
+			res = hweight8(bad) < chip->badblockbits;
+		ofs += mtd->writesize;
+		page = (int)(ofs >> chip->page_shift) & chip->pagemask;
+		i++;
+	} while (!res && i < 2 && (chip->bbt_options & NAND_BBT_SCAN2NDPAGE));
 
 	if (getchip)
 		nand_release_device(mtd);
-- 
1.7.5.4

* [PATCH v3 5/6] mtd: nand: differentiate 1- vs. 2-byte writes when marking bad blocks
  2012-01-09 20:23 [PATCH v3 0/6] NAND BBM + BBT updates Brian Norris
                   ` (3 preceding siblings ...)
  2012-01-09 20:23 ` [PATCH v3 4/6] mtd: nand: fix SCAN2NDPAGE check for BBM Brian Norris
@ 2012-01-09 20:23 ` Brian Norris
  2012-01-09 20:23 ` [PATCH v3 6/6] mtd: nand: correct comment on nand_chip badblockbits Brian Norris
  2012-01-10  9:44 ` [PATCH v3 0/6] NAND BBM + BBT updates Sebastian Andrzej Siewior
  6 siblings, 0 replies; 28+ messages in thread
From: Brian Norris @ 2012-01-09 20:23 UTC (permalink / raw)
  To: linux-mtd; +Cc: Brian Norris, Artem Bityutskiy

It seems that we have developed a bad-block-marking "feature" out of
pure laziness:

  "We write two bytes per location, so we dont have to mess with 16 bit
  access."

It's relatively simple to write 1 byte at a time on x8 devices and 2
bytes at a time on x16 devices, so let's do it.
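
In isolation, the offset/length selection amounts to something like this
sketch (struct sketch_bbm_write and sketch_bbm_span() are illustrative
names, not part of the patch; the bus width is passed as a plain bool
rather than read from chip->options):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_bbm_write {
	unsigned int ooboffs;	/* OOB offset at which the marker write starts */
	size_t len;		/* number of marker bytes to write */
};

static struct sketch_bbm_write sketch_bbm_span(unsigned int badblockpos,
					       bool buswidth16)
{
	struct sketch_bbm_write w;

	w.ooboffs = badblockpos;
	if (buswidth16) {
		/* x16: align down to a word boundary and write the whole word. */
		w.ooboffs &= ~0x01u;
		w.len = 2;
		assert((w.ooboffs & 0x01u) == 0);
	} else {
		/* x8: write exactly the marker byte. */
		w.len = 1;
	}
	return w;
}
```

For badblockpos == 5 this yields offset 5, length 1 on an x8 device and
offset 4, length 2 on an x16 device.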

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
---
 drivers/mtd/nand/nand_base.c |   12 ++++++++----
 1 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
index 8319242..030ffd3 100644
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -434,13 +434,17 @@ static int nand_default_block_markbad(struct mtd_info *mtd, loff_t ofs)
 		/*
 		 * Write to first two pages if necessary. If we write to more
 		 * than one location, the first error encountered quits the
-		 * procedure. We write two bytes per location, so we dont have
-		 * to mess with 16 bit access.
+		 * procedure.
 		 */
-		ops.len = ops.ooblen = 2;
 		ops.datbuf = NULL;
 		ops.oobbuf = buf;
-		ops.ooboffs = chip->badblockpos & ~0x01;
+		ops.ooboffs = chip->badblockpos;
+		if (chip->options & NAND_BUSWIDTH_16) {
+			ops.ooboffs &= ~0x01;
+			ops.len = ops.ooblen = 2;
+		} else {
+			ops.len = ops.ooblen = 1;
+		}
 		ops.mode = MTD_OPS_PLACE_OOB;
 		do {
 			ret = nand_do_write_oob(mtd, ofs, &ops);
-- 
1.7.5.4

* [PATCH v3 6/6] mtd: nand: correct comment on nand_chip badblockbits
  2012-01-09 20:23 [PATCH v3 0/6] NAND BBM + BBT updates Brian Norris
                   ` (4 preceding siblings ...)
  2012-01-09 20:23 ` [PATCH v3 5/6] mtd: nand: differentiate 1- vs. 2-byte writes when marking bad blocks Brian Norris
@ 2012-01-09 20:23 ` Brian Norris
  2012-01-10  9:44 ` [PATCH v3 0/6] NAND BBM + BBT updates Sebastian Andrzej Siewior
  6 siblings, 0 replies; 28+ messages in thread
From: Brian Norris @ 2012-01-09 20:23 UTC (permalink / raw)
  To: linux-mtd; +Cc: Brian Norris, Artem Bityutskiy

The description for badblockbits is incorrect. I think someone just made
up a false description on the spot to satisfy some kerneldoc warning.
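
For reference, the way nand_block_bad() uses this field can be sketched in
user space as follows (sketch_hweight8() and sketch_marker_is_bad() are
stand-ins written for this example; the kernel uses hweight8()):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Count the set bits in one byte (stand-in for the kernel's hweight8()). */
static int sketch_hweight8(uint8_t v)
{
	int n = 0;

	while (v) {
		n += v & 1;
		v >>= 1;
	}
	return n;
}

/* A marker is bad when it has fewer set bits than badblockbits. */
static bool sketch_marker_is_bad(uint8_t bbm, int badblockbits)
{
	assert(badblockbits >= 1 && badblockbits <= 8);
	if (badblockbits == 8)
		return bbm != 0xFF;
	return sketch_hweight8(bbm) < badblockbits;
}
```

So a marker of 11110111b (0xF7, seven set bits) is treated as good with
badblockbits == 7 but as bad with badblockbits == 8.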

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
---
 include/linux/mtd/nand.h |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h
index 3547205..a7d8f2e 100644
--- a/include/linux/mtd/nand.h
+++ b/include/linux/mtd/nand.h
@@ -454,8 +454,9 @@ struct nand_buffers {
  *			will be copied to the appropriate nand_bbt_descr's.
  * @badblockpos:	[INTERN] position of the bad block marker in the oob
  *			area.
- * @badblockbits:	[INTERN] number of bits to left-shift the bad block
- *			number
+ * @badblockbits:	[INTERN] minimum number of set bits in a good block's
+ *			bad block marker position; i.e., BBM == 11110111b is
+ *			not bad when badblockbits == 7
  * @cellinfo:		[INTERN] MLC/multichip data from chip ident
  * @numchips:		[INTERN] number of physical chips
  * @chipsize:		[INTERN] the size of one chip for multichip arrays
-- 
1.7.5.4

* Re: [PATCH v3 0/6] NAND BBM + BBT updates
  2012-01-09 20:23 [PATCH v3 0/6] NAND BBM + BBT updates Brian Norris
                   ` (5 preceding siblings ...)
  2012-01-09 20:23 ` [PATCH v3 6/6] mtd: nand: correct comment on nand_chip badblockbits Brian Norris
@ 2012-01-10  9:44 ` Sebastian Andrzej Siewior
  2012-01-10 18:54   ` Brian Norris
  2012-01-11 22:28   ` Artem Bityutskiy
  6 siblings, 2 replies; 28+ messages in thread
From: Sebastian Andrzej Siewior @ 2012-01-10  9:44 UTC (permalink / raw)
  To: Brian Norris
  Cc: Dan Carpenter, Kulikov Vasiliy, Nicolas Ferre, Dominik Brodowski,
	Peter Wippich, Gabor Juhos, linux-mtd, Jonas Gorski, Jamie Iles,
	Ivan Djelic, Robert Jarzmik, David Woodhouse, Maxim Levitsky,
	Dmitry Eremin-Solenikov, Kevin Cernekee, Barry Song, Jim Quinlan,
	Andres Salomon, Axel Lin, Anatolij Gustschin, Mike Frysinger,
	Arnd Bergmann, Lei Wen, Sascha Hauer, Artem Bityutskiy,
	Florian Fainelli, Artem Bityutskiy, Adrian Hunter,
	Matthieu CASTET, Kyungmin Park, Shmulik Ladkani, Wolfram Sang,
	Chuanxiao Dong, Joe Perches, Guillaume LECERF, Roman Tereshonkov

On 01/09/2012 09:23 PM, Brian Norris wrote:
> This patch series is an update to a previous patch (that has been split
> into a few patches) with a few additional patches at the end. The important
> segments of this series involve the default steps for marking new bad
> blocks when using a flash-based BBT. The new default behavior will write
> to the BBT as well as attempting to write a BBM to the OOB area of the
> bad block. See the patch descriptions for details.

Why do we update BBT and OOB and have the data in two places? One
argument was that the boot loader may not have support for BBT and uses
OOB instead. If so, why not update the boot loader and make sure both
users (OS and boot loader) use the same data?
Any other arguments why updating OOB is a good idea?

> The first patch, regarding NAND_NO_WRITE_OOB, is a first attempt at
So now the old-default behavior requires a flag. If I remember
correctly the OLPC used a different BBT layout and OOB was used for
some other purpose. I remember that we had a controller which wrote ECC
into OOB on its own and the driver could not write into OOB in ECC
mode. But then I don't know if this was simply not implemented in the
driver. So never mind.

> satisfying Sebastian's concerns that some systems utilize the entire OOB
> area for ECC, and so we need an option to prevent writing markers to
> OOB. My attempt to prevent other OOB writes may be misguided,
> incomplete, flawed in some other way, or some combination of the three.
> Please provide constructive criticism.

and I am still not convinced that it is a good idea to keep one piece
of information in two places. It seems to be redundant. If there are
other people supporting this, I am not in your way.

Sebastian

* Re: [PATCH v3 0/6] NAND BBM + BBT updates
  2012-01-10  9:44 ` [PATCH v3 0/6] NAND BBM + BBT updates Sebastian Andrzej Siewior
@ 2012-01-10 18:54   ` Brian Norris
  2012-01-11 22:28   ` Artem Bityutskiy
  1 sibling, 0 replies; 28+ messages in thread
From: Brian Norris @ 2012-01-10 18:54 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Dan Carpenter, Kulikov Vasiliy, Nicolas Ferre, Dominik Brodowski,
	Peter Wippich, Gabor Juhos, linux-mtd, Jonas Gorski, Jamie Iles,
	Ivan Djelic, Robert Jarzmik, David Woodhouse, Maxim Levitsky,
	Dmitry Eremin-Solenikov, Kevin Cernekee, Barry Song, Jim Quinlan,
	Andres Salomon, Axel Lin, Anatolij Gustschin, Mike Frysinger,
	Arnd Bergmann, Lei Wen, Sascha Hauer, Artem Bityutskiy,
	Florian Fainelli, Artem Bityutskiy, Adrian Hunter,
	Matthieu CASTET, Kyungmin Park, Shmulik Ladkani, Wolfram Sang,
	Chuanxiao Dong, Joe Perches, Guillaume LECERF, Roman Tereshonkov

Hi Sebastian,

On Tue, Jan 10, 2012 at 1:44 AM, Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
> On 01/09/2012 09:23 PM, Brian Norris wrote:
>> The important
>> segments of this series involve the default steps for marking new bad
>> blocks when using a flash-based BBT. The new default behavior will write
>> to the BBT as well as attempting to write a BBM to the OOB area of the
>> bad block. See the patch descriptions for details.
>
> Why do we update BBT and OOB and have the data in two places? One
> argument was that the boot loader may not have support for BBT and uses
> OOB instead. If so, why not update the boot loader and make sure both
> users (OS and boot loader) use the same data?

This is not possible to do very generically. The NAND BBT framework is
not a stable exportable framework and has too many options and
variability for that to make sense, IMO. I feel that, if the
bootloader really needs to read from NAND and detect bad blocks, it
should only have to rely on the "standards" set down in the
datasheets. Having a bootloader learn Linux would unnecessarily
complicate it as well as seemingly defy the purpose of the bootloader.

> Any other arguments why updating OOB is a good idea?

Yes. There were two stated reasons in patch 2. The second one:
   BBT is corrupted and the flash must be rescanned for bad blocks; we
want to remember bad blocks that were marked from Linux

Essentially, some developers have found that flash-based BBT isn't
100% reliable, and so we find ways to improve it so that when one or
two pages on a device have unexpected problems, the whole chip doesn't
become unusable. For one, perhaps you haven't followed a recent patch
(that was integrated into mainline) that provided a fallback mechanism
for the instance of ECC errors or excessive bitflips in the BBT:
   commit 623978de362a5faeb18d8395fa86089650642626
   mtd: nand: scrub BBT on ECC errors

This patch means that, for reliability reasons, "default" flash-based
BBT systems *already* may rely on the bad block markers in OOB. Now,
this also may not be desirable for your situation, but I didn't hear
complaints about this earlier. And I don't think I was the only one
requesting that feature.

>> The first patch, regarding NAND_NO_WRITE_OOB, is a first attempt at
>
> So now the old-default behavior requires a flag.

Yes. I hoped to make that clear.

>> satisfying Sebastian's concerns that some systems utilize the entire OOB
>> area for ECC, and so we need an option to prevent writing markers to
>> OOB. My attempt to prevent other OOB writes may be misguided,
>> incomplete, flawed in some other way, or some combination of the three.
>> Please provide constructive criticism.
>
> and I am still not convinced that it is a good idea to keep one piece
> of information in two places. It seems to be redundant.

It seems that overall, we have (at least) two different paradigms for
the flash-based BBT concept. For me, I use it primarily as a
performance convenience: I don't have to scan the entire flash at
every bootup, saving time. I don't rely on it 100%, as it has caused
some problems in practice; I wish to be able to fall back to the
"standard" bad block markers when needed. For you, you seem to use it
out of necessity: you cannot use OOB for both ECC and bad block
markers, so you must scan the device once, build a table, then rely on
the table 100%. Please correct me if this characterization is wrong.

Now, the question is: are these paradigms reconcilable? For instance,
I've recently built in the ability to rescan the NAND if/when ECC
problems arise (mentioned above); but this is undesirable in your
paradigm, I think. You just hope to prevent fatal ECC problems?
Similarly, the BBT may be accidentally overwritten somehow; I would
hope that we can (someday) provide a mechanism to erase the table and
rebuild it. There are probably other more significant points of
contention between the two views, but I'm not going further at the
moment.

> If there are other
> people supporting this, I am not in your way.

I believe at least Matthieu Castet was interested in this patch series
before, and I have seen confirmation from Artem that the concept is
reasonable (in fact, he wasn't sure why this wasn't already the
default). I don't intend to ignore your views, and at a minimum, would
like to provide an option that fits correctly into the entire
MTD/NAND/BBT system and fulfills the requirements of your systems.

To that end: is the NAND_NO_WRITE_OOB flag acceptable? Are there
fundamental problems with that approach, where MTD/NAND will never
write to the OOB region? How about smaller technical issues with the
corresponding patches (patch 1 and 2)?

Brian

* Re: [PATCH v3 0/6] NAND BBM + BBT updates
  2012-01-10  9:44 ` [PATCH v3 0/6] NAND BBM + BBT updates Sebastian Andrzej Siewior
  2012-01-10 18:54   ` Brian Norris
@ 2012-01-11 22:28   ` Artem Bityutskiy
  2012-01-12  7:58     ` Shmulik Ladkani
                       ` (2 more replies)
  1 sibling, 3 replies; 28+ messages in thread
From: Artem Bityutskiy @ 2012-01-11 22:28 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Brian Norris
  Cc: Dan Carpenter, Kulikov Vasiliy, Nicolas Ferre, Dominik Brodowski,
	Peter Wippich, Gabor Juhos, linux-mtd, Jonas Gorski, Jamie Iles,
	Ivan Djelic, Robert Jarzmik, David Woodhouse, Maxim Levitsky,
	Dmitry Eremin-Solenikov, Kevin Cernekee, Barry Song, Jim Quinlan,
	Andres Salomon, Axel Lin, Anatolij Gustschin, Mike Frysinger,
	Arnd Bergmann, Lei Wen, Sascha Hauer, Artem Bityutskiy,
	Florian Fainelli, Adrian Hunter, Matthieu CASTET, Kyungmin Park,
	Shmulik Ladkani, Wolfram Sang, Chuanxiao Dong, Joe Perches,
	Guillaume LECERF, Roman Tereshonkov

On Tue, 2012-01-10 at 10:44 +0100, Sebastian Andrzej Siewior wrote:
> and I am still not convinced that it is a good idea to provide one
> information in two places. It seems to be redundant. If there are other
> people supporting this, I am not in your way.

NANDs become less and less reliable - they suffer from all kinds of read
and write disturb issues, unstable bits, etc. Do you trust MTD's
on-flash BBT which was created for the old reliable flashes? I don't
really trust it. I have a feeling that it is very real to have the BBT
corrupted because of read/write disturb - we read it rarely.

In my view, OOB BB markers are the primary, reliable, and simple
mechanism. And BBT is just an additional optimization to speed up system
startup.

So in general I support Brian's efforts. However, I am not sure that
Brian's decision to first mark a block as bad in BBT and then in OOB is
the right one. I have a feeling that the opposite order is correct. And
it looks like this will almost automatically solve the possible issue of
getting BBT and OOB out-of-sync due to a power cut while marking a block
as bad. At least for the software I know: JFFS2, UBI, user-space tools
like ubiformat - I'll refer to it just as "SW".

Indeed, when do we mark a block as bad?

1. When we get an erase error. Well, if SW erases a block, it does not
care about the contents. This means that after the reboot SW will retry
erasing it. And if the block is bad, and previously the erasure failed,
it will fail again, and SW will mark it as bad again.

2. When we get a write error. The SW recovers useful data from the
eraseblock, then tries to mark it bad. Well, UBI will first try to
torture it, but this is not an essential detail. Anyway, if we get a
power cut - the situation is the same - SW will try to erase this block
and write to it, will get errors again and will mark it as bad.

I guess we also need to read oob before writing it when we are marking a
block as bad - just in case it is already marked as bad in OOB.
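
A minimal sketch of the ordering proposed above, written as a
self-contained C model and not as kernel code. The struct, the function
name and the power_cut_after_oob flag are all hypothetical; they only
stand in for the chip state and for losing power between the two steps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NBLOCKS 16

/* Hypothetical in-memory model of a chip; not the kernel's struct nand_chip. */
struct model_chip {
	bool oob_bbm[MODEL_NBLOCKS];   /* block carries a bad block marker in OOB */
	bool flash_bbt[MODEL_NBLOCKS]; /* on-flash BBT lists the block as bad */
	unsigned int oob_writes;       /* number of marker writes performed */
};

/*
 * Mark a block bad: OOB marker first, BBT second.
 * power_cut_after_oob simulates losing power between the two steps.
 * Returns 0 when both steps completed, -1 when interrupted.
 */
int model_block_markbad(struct model_chip *c, size_t blk,
			bool power_cut_after_oob)
{
	assert(blk < MODEL_NBLOCKS);

	/* Read the marker first; skip the write if it is already there. */
	if (!c->oob_bbm[blk]) {
		c->oob_bbm[blk] = true;
		c->oob_writes++;
	}

	if (power_cut_after_oob)
		return -1;

	c->flash_bbt[blk] = true;
	return 0;
}
```

In this model an interrupted call leaves the marker in OOB but not in
the BBT, and a second call for the same block completes the BBT update
without writing the OOB again.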

Comments? If this does not make sense - I have a good excuse - it is
late and I am very sleepy :-)

* Re: [PATCH v3 0/6] NAND BBM + BBT updates
  2012-01-11 22:28   ` Artem Bityutskiy
@ 2012-01-12  7:58     ` Shmulik Ladkani
  2012-01-13 22:12       ` Artem Bityutskiy
  2012-01-12  9:09     ` Sebastian Andrzej Siewior
  2012-01-17 10:22     ` Angus CLARK
  2 siblings, 1 reply; 28+ messages in thread
From: Shmulik Ladkani @ 2012-01-12  7:58 UTC (permalink / raw)
  To: dedekind1
  Cc: Dan Carpenter, Kulikov Vasiliy, Sebastian Andrzej Siewior,
	Nicolas Ferre, Dominik Brodowski, Peter Wippich, Gabor Juhos,
	linux-mtd, Jonas Gorski, Jamie Iles, Ivan Djelic, Robert Jarzmik,
	David Woodhouse, Maxim Levitsky, Dmitry Eremin-Solenikov,
	Kevin Cernekee, Barry Song, Jim Quinlan, Andres Salomon, Axel Lin,
	Anatolij Gustschin, Mike Frysinger, Arnd Bergmann, Lei Wen,
	Sascha Hauer, Artem Bityutskiy, Florian Fainelli, Adrian Hunter,
	Matthieu CASTET, Kyungmin Park, Wolfram Sang, Chuanxiao Dong,
	Joe Perches, Guillaume LECERF, Brian Norris, Roman Tereshonkov

On Thu, 12 Jan 2012 00:28:45 +0200 Artem Bityutskiy <dedekind1@gmail.com> wrote:
> In my view, OOB BB markers is the primary, reliable, and simple
> mechanism. And BBT is just an additional optimization to speed up system
> startup.
> 
> So in general I support Brian's efforts

I'm in favor of this approach as well.
However IMO it should (1) be 'bbt_options' configurable; (2) should
properly address OOB vs BBT out-of-sync issues.

> Indeed, when we mark a block as bad?
> 
> 1. When we get erase error. Well, if SW erases a block, it does not care
> of the contents. This means that if after the reboot SW will re-try
> erasing it. And if the block is bad, and previously the erasure failed,
> it will fail again, and SW will mark it as bad again.
> 
> 2. When we get a write error. The SW recovers useful data from the
> eraseblock, then tries to mark it bad. Well, UBI will first try to
> torture it, but this is a not essential detail. Anyway, if we get a
> power cut - the situation is the same - SW will try to erase this block
> and write to it, will get errors again and will mark it as bad.

So your new scheme for 'nand_default_block_markbad' is as follows:
  (1) mark BBM in OOB
  (2) update on-flash BBT.
Where existing scheme (for NAND_BBT_USE_FLASH devices) is:
  update on-flash BBT.

And hence, if a power-cut occurs between (1) and (2) in the new scheme,
it is equivalent to a power-cut that occurred just an instant prior to
actually performing the BBT update in the old scheme.

Meaning: the system, being NAND_BBT_USE_FLASH based, simply won't
be aware of the bad block (although already OOB marked).
Is that right?

> I guess we also need to read oob before writing it when we are marking a
> block as bad - just in case it is already marked as bad in OOB.

I assume you mean using 'chip->block_bad' within the new implementation
of 'nand_default_block_markbad' prior to executing (1). Is that right?

> Comments? If this does not make sense - I have a good excuse - it is
> late and I am very sleepy :-)

I guess it's reasonable :)

The only argument I have is that this scheme, although working,
contradicts your view of "OOB BB markers being the primary mechanism".
That's because 'nand_block_checkbad' prefers the info from the BBT
(for NAND_BBT_USE_FLASH devices).
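
To illustrate the point about which source wins, here is a small model
of that lookup. It is only a sketch: the struct and the function name
are hypothetical, and the real 'nand_block_checkbad' operates on kernel
structures that are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CB_NBLOCKS 8

/* Hypothetical model: a chip with an optional table plus per-block OOB state. */
struct cb_chip {
	bool has_bbt;             /* a bad block table is available */
	bool bbt_bad[CB_NBLOCKS]; /* what the table says */
	bool oob_bad[CB_NBLOCKS]; /* what the OOB marker says */
};

/* Returns true if the block is reported as bad to the upper layers. */
bool cb_block_checkbad(const struct cb_chip *c, size_t blk)
{
	assert(blk < CB_NBLOCKS);

	if (!c->has_bbt)
		return c->oob_bad[blk]; /* no table: fall back to the OOB marker */

	return c->bbt_bad[blk];         /* table present: OOB is not consulted */
}
```

With a table present, a block that is marked only in OOB is still
reported as good, which is the situation after a power cut between (1)
and (2).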

Regards,
Shmulik

* Re: [PATCH v3 0/6] NAND BBM + BBT updates
  2012-01-12  7:58     ` Shmulik Ladkani
@ 2012-01-13 22:12       ` Artem Bityutskiy
  2012-01-16 19:35         ` Shmulik Ladkani
  0 siblings, 1 reply; 28+ messages in thread
From: Artem Bityutskiy @ 2012-01-13 22:12 UTC (permalink / raw)
  To: Shmulik Ladkani
  Cc: Dan Carpenter, Kulikov Vasiliy, Sebastian Andrzej Siewior,
	Nicolas Ferre, Dominik Brodowski, Peter Wippich, Gabor Juhos,
	linux-mtd, Jonas Gorski, Jamie Iles, Ivan Djelic, Robert Jarzmik,
	David Woodhouse, Maxim Levitsky, Dmitry Eremin-Solenikov,
	Kevin Cernekee, Barry Song, Jim Quinlan, Andres Salomon, Axel Lin,
	Anatolij Gustschin, Mike Frysinger, Arnd Bergmann, Lei Wen,
	Sascha Hauer, Artem Bityutskiy, Florian Fainelli, Adrian Hunter,
	Matthieu CASTET, Kyungmin Park, Wolfram Sang, Chuanxiao Dong,
	Joe Perches, Guillaume LECERF, Brian Norris, Roman Tereshonkov

On Thu, 2012-01-12 at 09:58 +0200, Shmulik Ladkani wrote:
> On Thu, 12 Jan 2012 00:28:45 +0200 Artem Bityutskiy <dedekind1@gmail.com> wrote:
> > In my view, OOB BB markers is the primary, reliable, and simple
> > mechanism. And BBT is just an additional optimization to speed up system
> > startup.
> > 
> > So in general I support Brian's efforts
> 
> I'm in favor of this approach as well.
> However IMO it should (1) be 'bbt_options' configurable;

Why does it have to be configurable? Do you have some example in mind?

>  (2) should
> properly address OOB vs BBT out-of-sync issues.

This is reasonable.

> 
> > Indeed, when we mark a block as bad?
> > 
> > 1. When we get erase error. Well, if SW erases a block, it does not care
> > of the contents. This means that if after the reboot SW will re-try
> > erasing it. And if the block is bad, and previously the erasure failed,
> > it will fail again, and SW will mark it as bad again.
> > 
> > 2. When we get a write error. The SW recovers useful data from the
> > eraseblock, then tries to mark it bad. Well, UBI will first try to
> > torture it, but this is a not essential detail. Anyway, if we get a
> > power cut - the situation is the same - SW will try to erase this block
> > and write to it, will get errors again and will mark it as bad.
> 
> So your new scheme for 'nand_default_block_markbad' is as follows:
>   (1) mark BBM in OOB
>   (2) update on-flash BBT.
> Where existing scheme (for NAND_BBT_USE_FLASH devices) is:
>   update on-flash BBT.
> 
> And hence, if power-cut occurs between (1) and (2) in the new scheme,
> it is equivalent to a power-cut that occurred just an instant prior
> actually performing the BBT update in the old scheme.
> 
> Meaning: the system, being NAND_BBT_USE_FLASH based, will simply won't
> be aware of the bad block (although already OOB marked).
> Is that right?

Yes. And the idea is that it will discover it when it starts doing I/O
on this eraseblock. Indeed, if it found out that it is bad before the
power cut (it exhibited I/O errors), it should discover it again by
getting I/O errors.

> > I guess we also need to read oob before writing it when we are marking a
> > block as bad - just in case it is already marked as bad in OOB.
> 
> I assume you mean using 'chip->block_bad' within the new implementation
> of 'nand_default_block_markbad' prior executing (1). Is that right?

Probably yes.

> 
> > Comments? If this does not make sense - I have a good excuse - it is
> > late and I am very sleepy :-)
> 
> I guess it's reasonable :)
> 
> The only argument I have is that this scheme, although working,
> contradicts your view of "OOB BB markers being the primary mechanism".
> That's because 'nand_block_checkbad' prefers the info from the BBT
> (for NAND_BBT_USE_FLASH devices).

My point is that in case of a power cut between (1) and (2) the upper
layers will detect the bad block again and mark it as bad again, both in
OOB and BBT. So OOB and BBT will be in sync.

The other approach would be to have an additional bit per eraseblock in
the in-ram BBT for lazy checking. And actually compare the OOB bad block
marker with the BBT on the first erase or write operation, and bring OOB
and BBT in sync.
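
A rough sketch of the lazy checking idea, as a self-contained C model.
Everything here is hypothetical (names, layout, and in particular the
policy that a block counts as bad if either the OOB marker or the BBT
says so); it is not taken from any patch in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LAZY_NBLOCKS 16

/* Hypothetical model of an in-RAM BBT with one extra "checked" bit per block. */
struct lazy_bbt {
	bool oob_bad[LAZY_NBLOCKS]; /* model of the OOB bad block marker */
	bool bbt_bad[LAZY_NBLOCKS]; /* in-RAM BBT entry */
	bool checked[LAZY_NBLOCKS]; /* OOB already compared with the BBT */
	unsigned int oob_reads;     /* how many times OOB was consulted */
};

/*
 * Called before the first erase or write to a block. On the first call
 * the OOB marker is compared with the BBT and both are brought in sync
 * (assumed policy: bad if either side says bad). Later calls use the
 * table only.
 */
bool lazy_block_isbad(struct lazy_bbt *t, size_t blk)
{
	assert(blk < LAZY_NBLOCKS);

	if (!t->checked[blk]) {
		bool bad = t->oob_bad[blk] || t->bbt_bad[blk];

		t->oob_reads++;
		t->oob_bad[blk] = bad;
		t->bbt_bad[blk] = bad;
		t->checked[blk] = true;
	}

	return t->bbt_bad[blk];
}
```

The cost is one OOB read per eraseblock, paid on first use instead of
at boot.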

Artem.

* Re: [PATCH v3 0/6] NAND BBM + BBT updates
  2012-01-13 22:12       ` Artem Bityutskiy
@ 2012-01-16 19:35         ` Shmulik Ladkani
  0 siblings, 0 replies; 28+ messages in thread
From: Shmulik Ladkani @ 2012-01-16 19:35 UTC (permalink / raw)
  To: dedekind1
  Cc: Dan Carpenter, Kulikov Vasiliy, Sebastian Andrzej Siewior,
	Nicolas Ferre, Dominik Brodowski, Peter Wippich, Gabor Juhos,
	linux-mtd, Jonas Gorski, Jamie Iles, Ivan Djelic, Robert Jarzmik,
	David Woodhouse, Maxim Levitsky, Dmitry Eremin-Solenikov,
	Kevin Cernekee, Barry Song, Jim Quinlan, Andres Salomon, Axel Lin,
	Anatolij Gustschin, Mike Frysinger, Arnd Bergmann, Lei Wen,
	Sascha Hauer, Artem Bityutskiy, Florian Fainelli, Adrian Hunter,
	Matthieu CASTET, Kyungmin Park, Wolfram Sang, Chuanxiao Dong,
	Joe Perches, Guillaume LECERF, Brian Norris, Roman Tereshonkov

On Sat, 14 Jan 2012 00:12:11 +0200 Artem Bityutskiy <dedekind1@gmail.com> wrote:
> On Thu, 2012-01-12 at 09:58 +0200, Shmulik Ladkani wrote:
> > On Thu, 12 Jan 2012 00:28:45 +0200 Artem Bityutskiy <dedekind1@gmail.com> wrote:
> > > In my view, OOB BB markers is the primary, reliable, and simple
> > > mechanism. And BBT is just an additional optimization to speed up system
> > > startup.
> > > 
> > > So in general I support Brian's efforts
> > 
> > I'm in favor of this approach as well.
> > However IMO it should (1) be 'bbt_options' configurable;
> 
> Why does it have to be configurable? Do you have some example in mind?
> 

Can't tell if it's a good enough reason, but it looks like some are
happy with the existing behavior (do not expect on-flash BBT reliability
issues, or are pleased with the current handling of it, and have no
bootloader/kernel OOB-vs-BBT configuration clashes), and as such, they
are simply not interested in this change.
Must it be enforced?

Also, the change introduces some fine nuances (the OOB BBM test needed
within chip->block_markbad, or alternatively, lazy sync code).
Wouldn't it be better to have them scoped using a config?

Regards
Shmulik

* Re: [PATCH v3 0/6] NAND BBM + BBT updates
  2012-01-11 22:28   ` Artem Bityutskiy
  2012-01-12  7:58     ` Shmulik Ladkani
@ 2012-01-12  9:09     ` Sebastian Andrzej Siewior
  2012-01-13 22:36       ` Artem Bityutskiy
  2012-01-17 10:22     ` Angus CLARK
  2 siblings, 1 reply; 28+ messages in thread
From: Sebastian Andrzej Siewior @ 2012-01-12  9:09 UTC (permalink / raw)
  To: dedekind1
  Cc: Dan Carpenter, Kulikov Vasiliy, Nicolas Ferre, Dominik Brodowski,
	Peter Wippich, Gabor Juhos, linux-mtd, Jonas Gorski, Jamie Iles,
	Ivan Djelic, Robert Jarzmik, David Woodhouse, Maxim Levitsky,
	Dmitry Eremin-Solenikov, Kevin Cernekee, Barry Song, Jim Quinlan,
	Andres Salomon, Axel Lin, Anatolij Gustschin, Mike Frysinger,
	Arnd Bergmann, Lei Wen, Sascha Hauer, Artem Bityutskiy,
	Florian Fainelli, Adrian Hunter, Matthieu CASTET, Kyungmin Park,
	Shmulik Ladkani, Wolfram Sang, Chuanxiao Dong, Joe Perches,
	Guillaume LECERF, Brian Norris, Roman Tereshonkov

On 01/11/2012 11:28 PM, Artem Bityutskiy wrote:
> On Tue, 2012-01-10 at 10:44 +0100, Sebastian Andrzej Siewior wrote:
>> and I am still not convinced that it is a good idea to provide one
>> information in two places. It seems to be redundant. If there are other
>> people supporting this, I am not in your way.
>
> NANDs become less and less reliable - they suffer from all kinds of read
> and write disturb issues, unstable bits, etc. Do you trust MTD's
> on-flash BBT which was created for the old reliable flashes? I don't
> really trust it. I have a feeling that it is very real to have the BBT
> corrupted because of read/write disturb - we read it rarely.
>
> In my view, OOB BB markers is the primary, reliable, and simple
> mechanism. And BBT is just an additional optimization to speed up system
> startup.

so the OOB array is by design more reliable than the data area? So the
"less reliable" part of NAND does not apply to OOB, right? Because I
was thinking about putting it in UBI and dealing with it there, since it
should not lose data.

> So in general I support Brian's efforts. However, I am not sure that
> Brian's decision to first mark block as bad in BBT than in OOB is the
> right one. I have a feeling that the opposite way is correct. And it
> looks like this will almost automatically solve the possible issue of
> getting BBT and OOB out-of-sync due to a power cut while making a block
> as bad. At least for the software I know: JFFS2, UBI, user-space tools
> like ubiformat - I'll refer it just as "SW".
>
> Indeed, when we mark a block as bad?
>
> 1. When we get erase error. Well, if SW erases a block, it does not care
> of the contents. This means that if after the reboot SW will re-try
> erasing it. And if the block is bad, and previously the erasure failed,
> it will fail again, and SW will mark it as bad again.
>
> 2. When we get a write error. The SW recovers useful data from the
> eraseblock, then tries to mark it bad. Well, UBI will first try to
> torture it, but this is a not essential detail. Anyway, if we get a
> power cut - the situation is the same - SW will try to erase this block
> and write to it, will get errors again and will mark it as bad.
>
> I guess we also need to read oob before writing it when we are marking a
> block as bad - just in case it is already marked as bad in OOB.

why should it have been marked bad when we, the system (aka the one
that gave the order), do not know about it? It would make sense to
verify OOB vs BBT during boot-up. So we read BBT and would then sync the
content with OOB async so we don't block the boot process.

> Comments? If this does not make sense - I have a good excuse - it is
> late and I am very sleepy :-)

Do we lose the BBT table completely or just a few entries? If it is
just a matter of an entry or two what is the worst thing that can
happen? We run into the bad block again and mark it (again).

Sebastian

* Re: [PATCH v3 0/6] NAND BBM + BBT updates
  2012-01-12  9:09     ` Sebastian Andrzej Siewior
@ 2012-01-13 22:36       ` Artem Bityutskiy
  2012-01-16 20:59         ` Woodhouse, David
                           ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Artem Bityutskiy @ 2012-01-13 22:36 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Dan Carpenter, Barry Song, Nicolas Ferre, Dominik Brodowski,
	Adrian Hunter, Gabor Juhos, linux-mtd, Jonas Gorski, Jamie Iles,
	Ivan Djelic, Robert Jarzmik, David Woodhouse, Maxim Levitsky,
	Dmitry Eremin-Solenikov, Kevin Cernekee, Kulikov Vasiliy,
	Jim Quinlan, Andres Salomon, Axel Lin, Anatolij Gustschin,
	Mike Frysinger, Arnd Bergmann, Lei Wen, Sascha Hauer,
	Artem Bityutskiy, Florian Fainelli, Peter Wippich,
	Matthieu CASTET, Kyungmin Park, Shmulik Ladkani, Wolfram Sang,
	Chuanxiao Dong, Joe Perches, Guillaume LECERF, Brian Norris,
	Roman Tereshonkov

On Thu, 2012-01-12 at 10:09 +0100, Sebastian Andrzej Siewior wrote:
> On 01/11/2012 11:28 PM, Artem Bityutskiy wrote:
> > On Tue, 2012-01-10 at 10:44 +0100, Sebastian Andrzej Siewior wrote:
> >> and I am still not convinced that it is a good idea to provide one
> >> information in two places. It seems to be redundant. If there are other
> >> people supporting this, I am not in your way.
> >
> > NANDs become less and less reliable - they suffer from all kinds of read
> > and write disturb issues, unstable bits, etc. Do you trust MTD's
> > on-flash BBT which was created for the old reliable flashes? I don't
> > really trust it. I have a feeling that it is very real to have the BBT
> > corrupted because of read/write disturb - we read it rarely.
> >
> > In my view, OOB BB markers is the primary, reliable, and simple
> > mechanism. And BBT is just an additional optimization to speed up system
> > startup.
> 
> so the OOB array is by design more reliable than the data area?

I think so, because it is distributed, and it is historically the way
blocks had been marked as bad, and I think vendors make sure this
mechanism works.

>  So the
> "less reliable" part of NAND does not apply to OOB, right?

My idea is that when all the bad block information is in one place, and
this place becomes corrupted for whatever reason - we are in big
trouble.

And then I make an argument that modern NANDs tend to be unreliable and
start bit-flipping when you do I/O on adjacent eraseblocks. And because
the BBT is very static and MTD does not refresh it very often, it may
become corrupted.

But again, I did not make experiments.

Also, I think Brian's argument about bootloaders supporting OOB bad
block markers well and BBT not very well is rather strong.

>  Because I
> was thinking about putting in UBI and deal with it there sice it should
> not lose data.

:-) BTW, with the current unresolved unstable bits problem I do not
recommend using UBI/UBIFS if you need high power cut tolerance.

Anyway, would you recap why you are opposed to Brian's idea?

> > I guess we also need to read oob before writing it when we are marking a
> > block as bad - just in case it is already marked as bad in OOB.
> 
> why should it been marked bad and we as the system aka do one that made
> the order do not know about it?

Sorry, did not understand the question. As I explained, I _think_ the SW
I am aware of will be fine. Let's take the ubiformat tool.

1. ubiformat erases PEB 7
2. ubiformat gets I/O errors.
3. ubiformat decides to mark the PEB 7 as bad
4. We get a power cut after we have put the BB marker to the OOB, but
before we have updated the BBT.
5. We reboot, we run ubiformat again.
6. MTD reports that PEB 7 is good.
7. ubiformat erases PEB 7
8. ubiformat gets I/O error, and marks PEB 7 as bad.

Similar in UBI.

1. UBI writes to PEB 9 and gets an I/O error.
2. UBI recovers data from PEB 9 to PEB 137
3. UBI marks PEB 9 as bad and we have a power cut
4. After the reboot UBI sees PEB 9 as good, but it will recognize it as
old, because there is a newer version in PEB 137.
5. UBI erases PEB 9. This may fail, or may succeed. Assume the latter.
6. Later UBI writes data to PEB 9, gets I/O error, and marks it as bad.
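
The ubiformat sequence above can be replayed in a small C model. This is
only a sketch under stated assumptions: the names are hypothetical, a
worn block is assumed to fail every erase (the UBI case notes that an
erase may also succeed), and "MTD reports good" is modeled as a lookup
in the BBT only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_NPEBS 16

/* Hypothetical flash model for replaying the power-cut scenario. */
struct sim_flash {
	bool worn[SIM_NPEBS];    /* erase on this PEB fails (assumed permanent) */
	bool oob_bbm[SIM_NPEBS]; /* bad block marker present in OOB */
	bool bbt[SIM_NPEBS];     /* BBT says the PEB is bad */
};

/* Erase one PEB: 0 on success, -1 on I/O error. */
static int sim_erase(const struct sim_flash *f, size_t peb)
{
	return f->worn[peb] ? -1 : 0;
}

/*
 * One pass of a formatting tool over all PEBs. cut_power_at names a PEB
 * at which power is lost after the OOB marker is written but before the
 * BBT is updated; pass -1 for an uninterrupted run.
 * Returns 0 when the pass completed, -1 when it was cut short.
 */
int sim_format_pass(struct sim_flash *f, int cut_power_at)
{
	for (size_t peb = 0; peb < SIM_NPEBS; peb++) {
		if (f->bbt[peb])
			continue;               /* reported bad: skip */
		if (sim_erase(f, peb) == 0)
			continue;               /* erased fine */

		f->oob_bbm[peb] = true;         /* step (1): marker in OOB */
		if ((int)peb == cut_power_at)
			return -1;              /* power cut before step (2) */
		f->bbt[peb] = true;             /* step (2): update the BBT */
	}
	return 0;
}
```

Running one interrupted pass followed by one normal pass ends with the
PEB marked in both places, which matches steps 4 to 8 above.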

>  It would make sense to verify OOB vs
> BBT during boot-up. So we read BBT and would then sync the content with
> OOB async so we don't block the boot process.

Well, yes, we can have lazy checking, I guess, I am just not sure it is
necessary to complicate things.

> > Comments? If this does not make sense - I have a good excuse - it is
> > late and I am very sleepy :-)
> 
> Do we lose the BBT table completely or just a few entries? If it is
> just a matter of an entry or two what is the worst thing that can
> happen? We run into the bad block again and mark it (again).

I do not remember, but I just glanced at the code and I see that the BBT
is not protected by CRC at all. So we only rely on ECC protection, which
is not good enough to detect many-bit corruptions. But in most cases it
will detect the corruption, so we lose a whole NAND page of bad block
data. But there is the second copy, and to lose the data completely we
need to have the same NAND page corrupted in the second copy. But the
current code is not very smart in recovering, it will require the second
copy to be completely uncorrupted to recover the first copy.
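
For illustration, a page-wise recovery between the two copies could look
roughly like the sketch below. This is not what the current code does;
the structure, the sizes and the ecc_ok flags (standing in for the
result of the ECC check on each page read) are all assumptions made for
the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BBT_PAGES      4
#define BBT_PAGE_BYTES 8

/* Hypothetical model of one on-flash BBT copy, split into NAND pages. */
struct bbt_copy {
	unsigned char data[BBT_PAGES][BBT_PAGE_BYTES];
	bool ecc_ok[BBT_PAGES]; /* page was read without uncorrectable errors */
};

/*
 * Build a table from two copies, page by page: take each page from
 * whichever copy read cleanly. A page is lost only if it is corrupted
 * in both copies; such pages are filled with 0xff as a placeholder.
 * Returns the number of lost pages.
 */
int bbt_merge_pages(const struct bbt_copy *a, const struct bbt_copy *b,
		    unsigned char out[BBT_PAGES][BBT_PAGE_BYTES])
{
	int lost = 0;

	for (size_t p = 0; p < BBT_PAGES; p++) {
		if (a->ecc_ok[p]) {
			memcpy(out[p], a->data[p], BBT_PAGE_BYTES);
		} else if (b->ecc_ok[p]) {
			memcpy(out[p], b->data[p], BBT_PAGE_BYTES);
		} else {
			memset(out[p], 0xff, BBT_PAGE_BYTES);
			lost++;
		}
	}
	return lost;
}
```

In this model, different pages being corrupted in the two copies still
yields a complete table; only the same page failing in both loses data.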

Artem.

* Re: [PATCH v3 0/6] NAND BBM + BBT updates
  2012-01-13 22:36       ` Artem Bityutskiy
@ 2012-01-16 20:59         ` Woodhouse, David
  2012-01-17  8:23           ` Artem Bityutskiy
  2012-01-17 11:19         ` Angus CLARK
  2012-01-18 22:18         ` Brian Norris
  2 siblings, 1 reply; 28+ messages in thread
From: Woodhouse, David @ 2012-01-16 20:59 UTC (permalink / raw)
  To: dedekind1@gmail.com
  Cc: Dan Carpenter, Barry Song, Sebastian Andrzej Siewior,
	Nicolas Ferre, Dominik Brodowski, Hunter, Adrian, Gabor Juhos,
	linux-mtd@lists.infradead.org, Jonas Gorski, Jamie Iles,
	Ivan Djelic, Robert Jarzmik, Maxim Levitsky,
	Dmitry Eremin-Solenikov, Kevin Cernekee, Kulikov Vasiliy,
	Jim Quinlan, Andres Salomon, Axel Lin, Anatolij Gustschin,
	Mike Frysinger, Arnd Bergmann, Lei Wen, Sascha Hauer,
	Bityutskiy, Artem, Florian Fainelli, Peter Wippich,
	Matthieu CASTET, Kyungmin Park, Shmulik Ladkani, Wolfram Sang,
	Dong, Chuanxiao, Joe Perches, Guillaume LECERF, Brian Norris,
	Roman Tereshonkov

On Sat, 2012-01-14 at 00:36 +0200, Artem Bityutskiy wrote:
> I think so, because it is distributed, and it is historically the way
> blocks had been marked as bad, and I thing vendors make sure this
> mechanism works. 

They make sure it works for *them* at manufacturing time, sure. But what
makes you so sure it'll work for *us*?

They may have special ways to clear or even fuse out the appropriate
bits during manufacture, that we can't do from software. Or maybe they
just throw away any chip where a bad block is *so* bad that they can't
even clear the bad block marker? That might not affect their yield so
much when it's used only for factory-bad blocks, but if we do it for
*all* blocks that go bad at runtime, it's a different calculation. And a
more 'interesting' failure mode because when we find it out, it's
already in production.

So I wouldn't necessarily assume that what works for them, will work for
us.

-- 
                   Sent with MeeGo's ActiveSync support.

David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation



* Re: [PATCH v3 0/6] NAND BBM + BBT updates
  2012-01-16 20:59         ` Woodhouse, David
@ 2012-01-17  8:23           ` Artem Bityutskiy
  2012-01-17  8:27             ` Artem Bityutskiy
  0 siblings, 1 reply; 28+ messages in thread
From: Artem Bityutskiy @ 2012-01-17  8:23 UTC (permalink / raw)
  To: Woodhouse, David
  Cc: Dan Carpenter, Barry Song, Sebastian Andrzej Siewior,
	Nicolas Ferre, Dominik Brodowski, Hunter, Adrian, Gabor Juhos,
	linux-mtd@lists.infradead.org, Jonas Gorski, Jamie Iles,
	Ivan Djelic, Robert Jarzmik, Maxim Levitsky,
	Dmitry Eremin-Solenikov, Kevin Cernekee, Kulikov Vasiliy,
	Jim Quinlan, Andres Salomon, Axel Lin, Anatolij Gustschin,
	Mike Frysinger, Arnd Bergmann, Lei Wen, Sascha Hauer,
	Florian Fainelli, Peter Wippich, Matthieu CASTET, Kyungmin Park,
	Shmulik Ladkani, Wolfram Sang, Dong, Chuanxiao, Joe Perches,
	Guillaume LECERF, Brian Norris, Roman Tereshonkov

On Mon, 2012-01-16 at 20:59 +0000, Woodhouse, David wrote:
> On Sat, 2012-01-14 at 00:36 +0200, Artem Bityutskiy wrote:
> > I think so, because it is distributed, and it is historically the way
> > blocks had been marked as bad, and I thing vendors make sure this
> > mechanism works. 
> 
> They make sure it works for *them* at manufacturing time, sure. But what
> makes you so sure it'll work for *us*?

Well, I am not 100% sure of course. But I think that because marking
blocks as bad using OOB is the standard way, and vendors know about
this, and they know that flash blocks become bad, they will probably try
to make this mechanism work for the users. But even if this is not true
for a specific chip, then the users should have BBT, and it will be
used. But I do not believe that 

-- 
Best Regards,
Artem Bityutskiy


* Re: [PATCH v3 0/6] NAND BBM + BBT updates
  2012-01-17  8:23           ` Artem Bityutskiy
@ 2012-01-17  8:27             ` Artem Bityutskiy
  0 siblings, 0 replies; 28+ messages in thread
From: Artem Bityutskiy @ 2012-01-17  8:27 UTC (permalink / raw)
  To: Woodhouse, David
  Cc: Dan Carpenter, Barry Song, Sebastian Andrzej Siewior,
	Nicolas Ferre, Dominik Brodowski, Hunter, Adrian, Gabor Juhos,
	linux-mtd@lists.infradead.org, Jonas Gorski, Jamie Iles,
	Ivan Djelic, Robert Jarzmik, Maxim Levitsky,
	Dmitry Eremin-Solenikov, Kevin Cernekee, Kulikov Vasiliy,
	Jim Quinlan, Andres Salomon, Axel Lin, Anatolij Gustschin,
	Mike Frysinger, Arnd Bergmann, Lei Wen, Sascha Hauer,
	Florian Fainelli, Peter Wippich, Matthieu CASTET, Kyungmin Park,
	Shmulik Ladkani, Wolfram Sang, Dong, Chuanxiao, Joe Perches,
	Guillaume LECERF, Brian Norris, Roman Tereshonkov

[Sorry, did not finish the e-mail :-)]

On Tue, 2012-01-17 at 10:23 +0200, Artem Bityutskiy wrote:
> Well, I am 100% not sure of course. But think that because marking blocks as bad using OOB is
> the standard way, and vendors know about this, and they know that flash
> bad blocks become bad, they will probably make try to make this
> mechanism work for the users. But even if this is not true for a
> specific chip, then the users should have BBT, and it will be used. But
> I do not believe that 

that the current MTD BBT is reliable enough for modern flashes. For such
a hypothetical chip it would have to be carefully assessed.

-- 
Best Regards,
Artem Bityutskiy

* Re: [PATCH v3 0/6] NAND BBM + BBT updates
  2012-01-13 22:36       ` Artem Bityutskiy
  2012-01-16 20:59         ` Woodhouse, David
@ 2012-01-17 11:19         ` Angus CLARK
  2012-01-17 13:06           ` Ivan Djelic
  2012-01-18 22:18         ` Brian Norris
  2 siblings, 1 reply; 28+ messages in thread
From: Angus CLARK @ 2012-01-17 11:19 UTC (permalink / raw)
  To: dedekind1
  Cc: Dan Carpenter, Kulikov Vasiliy, Sebastian Andrzej Siewior,
	Nicolas Ferre, Dominik Brodowski, Peter Wippich, Gabor Juhos,
	linux-mtd, Jonas Gorski, Jamie Iles, Ivan Djelic, Robert Jarzmik,
	David Woodhouse, Maxim Levitsky, Dmitry Eremin-Solenikov,
	Kevin Cernekee, Barry Song, Jim Quinlan, Andres Salomon, Axel Lin,
	Anatolij Gustschin, Mike Frysinger, Arnd Bergmann, Lei Wen,
	Sascha Hauer, Artem Bityutskiy, Florian Fainelli, Adrian Hunter,
	Matthieu CASTET, Kyungmin Park, Shmulik Ladkani, Wolfram Sang,
	Chuanxiao Dong, Joe Perches, Guillaume LECERF, Brian Norris,
	Roman Tereshonkov

On 01/13/2012 10:36 PM, Artem Bityutskiy wrote:
> On Thu, 2012-01-12 at 10:09 +0100, Sebastian Andrzej Siewior wrote:
>>
>> so the OOB array is by design more reliable than the data area?
> 
> I think so, because it is distributed, and it is historically the way
> blocks had been marked as bad, and I thing vendors make sure this
> mechanism works.
> 

Is this really true?  I was under the impression that the OOB area was the same
as the data area, as far as reliability is concerned, and is subject to the same
ECC requirements.

As far as I am aware, NAND manufacturers only guarantee that the
factory-programmed OOB BB markers are valid.  Nothing is mentioned in the
datasheets about using OOB BB markers to track worn blocks - they all tend to
recommend BBTs.

Cheers,

Angus

* Re: [PATCH v3 0/6] NAND BBM + BBT updates
  2012-01-17 11:19         ` Angus CLARK
@ 2012-01-17 13:06           ` Ivan Djelic
  0 siblings, 0 replies; 28+ messages in thread
From: Ivan Djelic @ 2012-01-17 13:06 UTC (permalink / raw)
  To: Angus CLARK
  Cc: Dan Carpenter, Kulikov Vasiliy, Sebastian Andrzej Siewior,
	Nicolas Ferre, Dominik Brodowski, Peter Wippich, Gabor Juhos,
	linux-mtd@lists.infradead.org, Jonas Gorski, Jamie Iles,
	Robert Jarzmik, David Woodhouse, Maxim Levitsky,
	Dmitry Eremin-Solenikov, Kevin Cernekee, Barry Song, Jim Quinlan,
	Andres Salomon, Axel Lin, Anatolij Gustschin, Mike Frysinger,
	Arnd Bergmann, Lei Wen, Sascha Hauer, Artem Bityutskiy,
	Florian Fainelli, dedekind1@gmail.com, Adrian Hunter,
	Matthieu Castet, Kyungmin Park, Shmulik Ladkani, Wolfram Sang,
	Chuanxiao Dong, Joe Perches, Guillaume LECERF, Brian Norris,
	Roman Tereshonkov

On Tue, Jan 17, 2012 at 11:19:19AM +0000, Angus CLARK wrote:
> On 01/13/2012 10:36 PM, Artem Bityutskiy wrote:
> > On Thu, 2012-01-12 at 10:09 +0100, Sebastian Andrzej Siewior wrote:
> >>
> >> so the OOB array is by design more reliable than the data area?
> > 
> > I think so, because it is distributed, and it is historically the way
> > blocks had been marked as bad, and I thing vendors make sure this
> > mechanism works.
> > 
> 
> Is this really true?  I was under the impression that the OOB area was the same
> as the data area, as far as reliability is concerned, and is subject to the same
> ECC requirements.
> 
> As far as I am aware, NAND manufacturers only guarantee that the
> factory-programmed OOB BB markers are valid.  Nothing is mentioned in the
> datasheets about using OOB BB markers to track worn blocks - they all tend to
> recommend BBTs.
> 

Hello,
FWIW, my experience with NAND manufacturers totally confirms what you are saying;
i.e. OOB is no different technology, and factory bad block markers are not
always even implemented in OOB: hardwired tables -- probably efuses -- are
sometimes used to systematically return 0x00 bytes when a bad block is read;
which has the advantage of preventing SW from accidentally erasing a factory
bad block.
BR,

Ivan

* Re: [PATCH v3 0/6] NAND BBM + BBT updates
  2012-01-13 22:36       ` Artem Bityutskiy
  2012-01-16 20:59         ` Woodhouse, David
  2012-01-17 11:19         ` Angus CLARK
@ 2012-01-18 22:18         ` Brian Norris
  2 siblings, 0 replies; 28+ messages in thread
From: Brian Norris @ 2012-01-18 22:18 UTC (permalink / raw)
  To: dedekind1
  Cc: Dan Carpenter, Barry Song, Sebastian Andrzej Siewior,
	Nicolas Ferre, Dominik Brodowski, Adrian Hunter, Gabor Juhos,
	linux-mtd, Jonas Gorski, Jamie Iles, Ivan Djelic, Robert Jarzmik,
	David Woodhouse, Maxim Levitsky, Dmitry Eremin-Solenikov,
	Kevin Cernekee, Kulikov Vasiliy, Jim Quinlan, Andres Salomon,
	Axel Lin, Anatolij Gustschin, Mike Frysinger, Arnd Bergmann,
	Lei Wen, Sascha Hauer, Artem Bityutskiy, Florian Fainelli,
	Peter Wippich, Matthieu CASTET, Kyungmin Park, Shmulik Ladkani,
	Wolfram Sang, Chuanxiao Dong, Joe Perches, Guillaume LECERF,
	Roman Tereshonkov

On Fri, Jan 13, 2012 at 2:36 PM, Artem Bityutskiy <dedekind1@gmail.com> wrote:
> On Thu, 2012-01-12 at 10:09 +0100, Sebastian Andrzej Siewior wrote:
>>  Because I
>> was thinking about putting in UBI and deal with it there since it should
>> not lose data.
>
> :-) BTW, with the current unresolved unstable bits problem I do not
> recommend using UBI/UBIFS if you need high power-cut tolerance.

Also, we cannot reasonably offload all of bad block marking to
UBI(FS), since there are other filesystems/modules/etc. that rely on
the MTD/NAND layer to properly handle bad blocks. If this is moved up
to the filesystem, we miss out on benefits for other layers.

Brian

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 0/6] NAND BBM + BBT updates
  2012-01-11 22:28   ` Artem Bityutskiy
  2012-01-12  7:58     ` Shmulik Ladkani
  2012-01-12  9:09     ` Sebastian Andrzej Siewior
@ 2012-01-17 10:22     ` Angus CLARK
  2012-01-17 13:33       ` Artem Bityutskiy
  2012-01-18 22:04       ` Brian Norris
  2 siblings, 2 replies; 28+ messages in thread
From: Angus CLARK @ 2012-01-17 10:22 UTC (permalink / raw)
  To: dedekind1
  Cc: Dan Carpenter, Kulikov Vasiliy, Sebastian Andrzej Siewior,
	Nicolas Ferre, Dominik Brodowski, Adrian Hunter, Gabor Juhos,
	linux-mtd, Jonas Gorski, Jamie Iles, Ivan Djelic, Robert Jarzmik,
	David Woodhouse, Maxim Levitsky, Dmitry Eremin-Solenikov,
	Kevin Cernekee, Barry Song, Jim Quinlan, Andres Salomon, Axel Lin,
	Anatolij Gustschin, Mike Frysinger, Arnd Bergmann, Lei Wen,
	Sascha Hauer, Artem Bityutskiy, Florian Fainelli, Peter Wippich,
	Matthieu CASTET, Kyungmin Park, Shmulik Ladkani, Wolfram Sang,
	Chuanxiao Dong, Joe Perches, Guillaume LECERF, Brian Norris,
	Roman Tereshonkov

On 01/11/2012 10:28 PM, Artem Bityutskiy wrote:
> In my view, OOB BB markers are the primary, reliable, and simple
> mechanism. And BBT is just an additional optimization to speed up system
> startup.

This seems to be contrary to the advice given by the various NAND manufacturers
(with a quite unusual show of consensus!)  Once a block has been deemed to have
gone bad, one cannot rely on *any* operations being successful, and that
includes writing a bad block marker to the OOB area.  The recommended approach
has for some time been to use a Flash-resident bad block table, with an initial
scan for the manufacturer-programmed bad-block markers.

(Indeed, this issue was raised recently in a meeting with one of the major NAND
manufacturers, and the design engineer was horrified at the thought of relying
on the OOB for tracking worn blocks.)

The use of OOB BB markers certainly has some benefits (as already mentioned in
previous posts), and I like the idea of being able to use OOB markers in
conjunction with BBTs.  However, IMHO, I believe the BBT should be regarded as
the primary source of information, especially when considering inconsistencies
between the OOB markers and the BBTs.

> 1. When we get an erase error. Well, if SW erases a block, it does not care
> about the contents. This means that after the reboot SW will re-try
> erasing it. And if the block is bad, and previously the erasure failed,
> it will fail again, and SW will mark it as bad again.
> 

This raises another point.  It is entirely possible that an erase operation will
succeed on a block where it previously failed.  However, that does not mean to
say the block has now become good.  On first erase failure, the block should be
considered bad and steps taken to ensure the block is not used.

In other words, we cannot rely on erase failures as a way of recovering bad
block status, although I accept in some circumstances, it is probably the best
we can do!
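
A minimal sketch of the "bad on first erase failure" rule, assuming a
hypothetical in-memory table (struct bbt_state, record_erase_result and
block_is_usable are illustrative names, not kernel APIs). The point it
shows is that a later successful erase never clears the bad status:

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_BLOCKS 8	/* hypothetical device size, in erase blocks */

/* Hypothetical in-memory view of the bad block table. */
struct bbt_state {
	bool bad[NUM_BLOCKS];
};

/*
 * Record the outcome of an erase attempt. A failure marks the block
 * bad; a success never clears an existing bad status.
 */
static void record_erase_result(struct bbt_state *bbt, size_t block,
				bool erase_ok)
{
	if (block >= NUM_BLOCKS)
		return;
	if (!erase_ok)
		bbt->bad[block] = true;
}

/* A block is usable only if it is in range and was never marked bad. */
static bool block_is_usable(const struct bbt_state *bbt, size_t block)
{
	return block < NUM_BLOCKS && !bbt->bad[block];
}
```

With this rule the bad status has to be stored somewhere persistent,
because a retried erase after reboot cannot be trusted to reproduce it.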

Cheers,

Angus

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 0/6] NAND BBM + BBT updates
  2012-01-17 10:22     ` Angus CLARK
@ 2012-01-17 13:33       ` Artem Bityutskiy
  2012-01-18 22:04       ` Brian Norris
  1 sibling, 0 replies; 28+ messages in thread
From: Artem Bityutskiy @ 2012-01-17 13:33 UTC (permalink / raw)
  To: Angus CLARK
  Cc: Dan Carpenter, Kulikov Vasiliy, Sebastian Andrzej Siewior,
	Nicolas Ferre, Dominik Brodowski, Adrian Hunter, Gabor Juhos,
	linux-mtd, Jonas Gorski, Jamie Iles, Ivan Djelic, Robert Jarzmik,
	David Woodhouse, Maxim Levitsky, Dmitry Eremin-Solenikov,
	Kevin Cernekee, Barry Song, Jim Quinlan, Andres Salomon, Axel Lin,
	Anatolij Gustschin, Mike Frysinger, Arnd Bergmann, Lei Wen,
	Sascha Hauer, Florian Fainelli, Peter Wippich, Matthieu CASTET,
	Kyungmin Park, Shmulik Ladkani, Wolfram Sang, Chuanxiao Dong,
	Joe Perches, Guillaume LECERF, Brian Norris, Roman Tereshonkov

On Tue, 2012-01-17 at 10:22 +0000, Angus CLARK wrote:
> On 01/11/2012 10:28 PM, Artem Bityutskiy wrote:
> > In my view, OOB BB markers are the primary, reliable, and simple
> > mechanism. And BBT is just an additional optimization to speed up system
> > startup.
> 
> This seems to be contrary to the advice given by the various NAND manufacturers
> (with a quite unusual show of consensus!)  Once a block has been deemed to have
> gone bad, one cannot rely on *any* operations being successful, and that
> includes writing a bad block marker to the OOB area.  The recommended approach
> has for some time been to use a Flash-resident bad block table, with an initial
> scan for the manufacturer-programmed bad-block markers.

OK, thanks for correction and information.

-- 
Best Regards,
Artem Bityutskiy


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 0/6] NAND BBM + BBT updates
  2012-01-17 10:22     ` Angus CLARK
  2012-01-17 13:33       ` Artem Bityutskiy
@ 2012-01-18 22:04       ` Brian Norris
  2012-01-19  9:30         ` Angus CLARK
  1 sibling, 1 reply; 28+ messages in thread
From: Brian Norris @ 2012-01-18 22:04 UTC (permalink / raw)
  To: Angus CLARK
  Cc: Dan Carpenter, Barry Song, Sebastian Andrzej Siewior,
	Nicolas Ferre, Dominik Brodowski, Adrian Hunter, Gabor Juhos,
	linux-mtd, Jonas Gorski, Jamie Iles, Ivan Djelic, Robert Jarzmik,
	David Woodhouse, Maxim Levitsky, Dmitry Eremin-Solenikov,
	Kevin Cernekee, Kulikov Vasiliy, Jim Quinlan, Andres Salomon,
	Axel Lin, Anatolij Gustschin, Mike Frysinger, Arnd Bergmann,
	Lei Wen, Sascha Hauer, Artem Bityutskiy, Florian Fainelli,
	dedekind1, Peter Wippich, Matthieu CASTET, Kyungmin Park,
	Shmulik Ladkani, Wolfram Sang, Chuanxiao Dong, Joe Perches,
	Guillaume LECERF, Roman Tereshonkov

On Tue, Jan 17, 2012 at 2:22 AM, Angus CLARK <angus.clark@st.com> wrote:
> (Indeed, this issue was raised recently in a meeting with one of the major NAND
> manufacturers, and the design engineer was horrified at the thought of relying
> on the OOB for tracking worn blocks.)

That's interesting. I never had this impression, but perhaps the topic
just never came up.

> The use of OOB BB markers certainly has some benefits (as already mentioned in
> previous posts), and I like the idea of being able to use OOB markers in
> conjunction with BBTs.  However, IMHO, I believe the BBT should be regarded as
> the primary source of information, especially when considering inconsistencies
> between the OOB markers and the BBTs.

It looks like the facts are leaning toward flash-based BBT being the
preferable source of info, at least. But due to some practical
concerns (over reliability of BBT, resistance to corruption, and
non-Linux interaction with flash), I feel like we can't say 100% that
BBT is the primary source of bad block info. Now, if we can mitigate
the reliability/corruption issues, that leaves non-Linux (e.g.,
bootloader) interaction with flash.

Anyway, the important question is: how does this impact the current
solution I am developing? IMO, this seems primarily a matter of
perspective, which would drive future development but does not
fundamentally alter my proposed patch(es). The choice of "primary
source" may affect the order in which we update them and the handling
of power cuts, but otherwise, we want the same result regardless of
the "primary."

Another note regarding the primary source: if the BBT is sufficiently
corrupted (according to ECC), we fall back to the OOB markers. That
doesn't make the flash-based BBT the 100% primary source, but I think
it makes sense. This feature was pulled into the 3.2 release, BTW.
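
As a rough sketch of that fallback (not the actual nand_bbt.c code; the
enum, function and parameter names here are hypothetical), assuming one
marker byte per block where 0xFF means "good":

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result of reading the flash-based table. */
enum bbt_read_status {
	BBT_READ_OK,
	BBT_READ_UNCORRECTABLE,	/* ECC could not repair the table */
};

/*
 * Build the in-memory bad block map. Returns true when the OOB
 * fallback scan had to be used.
 */
static bool load_bad_block_map(enum bbt_read_status status,
			       const bool *flash_bbt,
			       const uint8_t *oob_markers,
			       bool *bad, size_t nblocks)
{
	size_t i;

	if (status == BBT_READ_OK) {
		for (i = 0; i < nblocks; i++)
			bad[i] = flash_bbt[i];
		return false;
	}

	/* Table unusable: any marker other than 0xFF counts as bad. */
	for (i = 0; i < nblocks; i++)
		bad[i] = oob_markers[i] != 0xFF;
	return true;
}
```

Note that in the fallback path a block recorded only in the table, with
no OOB marker, is seen as good again, which is one reason to write both.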

> On 01/11/2012 10:28 PM, Artem Bityutskiy wrote:
>> 1. When we get an erase error. Well, if SW erases a block, it does not care
>> about the contents. This means that after the reboot SW will re-try
>> erasing it. And if the block is bad, and previously the erasure failed,
>> it will fail again, and SW will mark it as bad again.
...
> In other words, we cannot rely on erase failures as a way of recovering bad
> block status, although I accept in some circumstances, it is probably the best
> we can do!

I think that if there are really power-cut issues while marking a bad
block, we will often have to resort to the imperfect "best we can do".
If we don't have any more fundamental objections, I will resend soon,
where we will write to OOB first, then to BBT. There will be an option
to simply disable writing markers to OOB.
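
Roughly, the ordering would look like the sketch below. This is a
simulation under stated assumptions, not the patch itself: struct
sim_nand, SIM_NO_WRITE_OOB and the helper names are hypothetical, and the
choice to update the table even when the OOB write fails is my
assumption about the desired behavior.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_BLOCKS		4
#define SIM_NO_WRITE_OOB	0x1u	/* hypothetical option flag */

/* Hypothetical simulated chip state. */
struct sim_nand {
	unsigned int options;
	uint8_t oob_marker[SIM_BLOCKS];		/* 0xFF means "good" */
	bool oob_write_fails[SIM_BLOCKS];	/* simulate a worn block */
	bool bbt_bad[SIM_BLOCKS];
};

/* Try to program the marker byte; a worn block may refuse the write. */
static int sim_write_oob_marker(struct sim_nand *n, size_t block)
{
	if (n->oob_write_fails[block])
		return -1;
	n->oob_marker[block] = 0x00;
	return 0;
}

/*
 * Mark a block bad: OOB marker first (unless disabled), then the
 * table. The table is updated even if the OOB write failed, and the
 * OOB error, if any, is still reported to the caller.
 */
static int sim_mark_bad(struct sim_nand *n, size_t block)
{
	int ret = 0;

	if (block >= SIM_BLOCKS)
		return -1;

	if (!(n->options & SIM_NO_WRITE_OOB))
		ret = sim_write_oob_marker(n, block);

	n->bbt_bad[block] = true;

	return ret;
}
```

A power cut between the two steps leaves at worst an OOB marker without a
table entry, which a later OOB scan could still pick up.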

Thanks,
Brian

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 0/6] NAND BBM + BBT updates
  2012-01-18 22:04       ` Brian Norris
@ 2012-01-19  9:30         ` Angus CLARK
  2012-01-19  9:59           ` Ricard Wanderlof
  0 siblings, 1 reply; 28+ messages in thread
From: Angus CLARK @ 2012-01-19  9:30 UTC (permalink / raw)
  To: Brian Norris
  Cc: Dan Carpenter, Barry Song, Sebastian Andrzej Siewior,
	Nicolas Ferre, Dominik Brodowski, Adrian Hunter, Gabor Juhos,
	linux-mtd, Jonas Gorski, Jamie Iles, Ivan Djelic, Robert Jarzmik,
	David Woodhouse, Maxim Levitsky, Dmitry Eremin-Solenikov,
	Kevin Cernekee, Kulikov Vasiliy, Jim Quinlan, Andres Salomon,
	Axel Lin, Anatolij Gustschin, Mike Frysinger, Arnd Bergmann,
	Lei Wen, Sascha Hauer, Artem Bityutskiy, Florian Fainelli,
	dedekind1, Peter Wippich, Matthieu CASTET, Kyungmin Park,
	Shmulik Ladkani, Wolfram Sang, Chuanxiao Dong, Joe Perches,
	Guillaume LECERF, Roman Tereshonkov

On 01/18/2012 10:04 PM, Brian Norris wrote:
> On Tue, Jan 17, 2012 at 2:22 AM, Angus CLARK <angus.clark@st.com> wrote:
>> (Indeed, this issue was raised recently in a meeting with one of the major NAND
>> manufacturers, and the design engineer was horrified at the thought of relying
>> on the OOB for tracking worn blocks.)
> 
> That's interesting. I never had this impression, but perhaps the topic
> just never came up.
> 

Since it was first brought to our attention, we have sought clarification from a
number of sources.  The general consensus seems to be that if a block has gone
bad, then one cannot rely on any further operations succeeding, including
writing BB markers to the OOB area.  However, the extent to which this is a
problem in practice is less clear.  Many of us have been using OOB BB markers
for years without any issue, although perhaps we just haven't noticed!

> Anyway, the important question is: how does this impact the current
> solution I am developing? IMO, this seems primarily a matter of
> perspective, which would drive future development but does not
> fundamentally alter my proposed patch(es). 

Yes, I fully agree.  The patches add functionality that many of us would find
useful and should be regarded as a step in the right direction.

> Another note regarding the primary source: if the BBT is sufficiently
> corrupted (according to ECC), we fall back to the OOB markers. That
> doesn't make the flash-based BBT the 100% primary source, but I think
> it makes sense. This feature was pulled into the 3.2 release, BTW.

If the BBT becomes corrupted, then the best we can do is rely on OOB markers,
and your patch at least gives us a chance to recover information about blocks
that have gone bad through use.  However, it does concern me that the BBTs can
become corrupted in the first place.  Some systems have no other choice but to
rely on BBTs (e.g. no space or access to OOB).  For SLC NAND at least, the
pattern of use for the BBT blocks is well within the reliability/endurance
specifications.  Do you have some experience of BBTs becoming corrupted, other
than through our own development practices, of course? :-)

Cheers,

Angus

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 0/6] NAND BBM + BBT updates
  2012-01-19  9:30         ` Angus CLARK
@ 2012-01-19  9:59           ` Ricard Wanderlof
  0 siblings, 0 replies; 28+ messages in thread
From: Ricard Wanderlof @ 2012-01-19  9:59 UTC (permalink / raw)
  To: Angus CLARK
  Cc: Dan Carpenter, Kulikov Vasiliy, Sebastian Andrzej Siewior,
	Nicolas Ferre, Dominik Brodowski, Peter Wippich, Gabor Juhos,
	linux-mtd@lists.infradead.org, Jonas Gorski, Jamie Iles,
	Ivan Djelic, Robert Jarzmik, David Woodhouse, Maxim Levitsky,
	Dmitry Eremin-Solenikov, Kevin Cernekee, Barry Song, Jim Quinlan,
	Andres Salomon, Axel Lin, Anatolij Gustschin, Mike Frysinger,
	Arnd Bergmann, Lei Wen, Sascha Hauer, Artem Bityutskiy,
	Florian Fainelli, dedekind1@gmail.com, Adrian Hunter,
	Matthieu CASTET, Kyungmin Park, Shmulik Ladkani, Wolfram Sang,
	Chuanxiao Dong, Joe Perches, Guillaume LECERF, Brian Norris,
	Roman Tereshonkov

On Thu, 19 Jan 2012, Angus CLARK wrote:

> On 01/18/2012 10:04 PM, Brian Norris wrote:
>> On Tue, Jan 17, 2012 at 2:22 AM, Angus CLARK <angus.clark@st.com> wrote:
>>> (Indeed, this issue was raised recently in a meeting with one of the major NAND
>>> manufacturers, and the design engineer was horrified at the thought of relying
>>> on the OOB for tracking worn blocks.)
>>
>> That's interesting. I never had this impression, but perhaps the topic
>> just never came up.
>>
> Since it was first brought to our attention, we have sought clarification from a
> number of sources.  The general consensus seems to be that if a block has gone
> bad, then one cannot rely on any further operations succeeding, including
> writing BB markers to the OOB area.  However, the extent to which this is a
> problem in practice is less clear.  Many of us have been using OOB BB markers
> for years without any issue, although perhaps we just haven't noticed!

As far as I understand, a block going bad during ordinary operation 
basically means it is worn out to the point that the on-flash write 
algorithm fails and responds with an error. So yes, that would mean that 
in principle writes will fail so that the oob cannot be written. On the 
other hand, that the chip reports a write error really means that it has 
not managed to reliably write all bits; it would seem unlikely that all 
8 bits of the bad block marker byte in the oob would fail to get written 
with zeros.
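
To illustrate why a partially written marker is still useful, here is a
small sketch of a bit-counting check, loosely modelled on the
badblockbits idea named in patch 6/6 of this series. The function names
and the threshold parameter are my own, not the kernel's code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Count the bits that still read as 1 in the marker byte. */
static unsigned int count_set_bits(uint8_t v)
{
	unsigned int n = 0;

	while (v) {
		n += v & 1u;
		v >>= 1;
	}
	return n;
}

/*
 * Treat the block as bad when fewer than min_set_bits bits of the
 * marker still read as 1. With min_set_bits == 8, a single bit
 * programmed to zero is enough to flag the block.
 */
static bool marker_indicates_bad(uint8_t marker, unsigned int min_set_bits)
{
	return count_set_bits(marker) < min_set_bits;
}
```

With a threshold of 8, marker_indicates_bad(0xFE, 8) is true, so even a
write that managed to clear only one bit is detected. A lower threshold
trades that sensitivity for tolerance of bitflips in good blocks.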

We ran a test on a 32 MB flash many years ago to get some sort of an idea 
of what happens when a block 'wears out'. In that test, it was the erase 
operation that failed first, and even then it was not an either-or 
situation; a block where the chip reported an error during erase could 
very well be erased successfully later. Furthermore, the number of erase 
cycles was way above (by a factor of 20 or so) the endurance spec for 
the chip - not surprising, since the specs by nature must be conservative.

What was more interesting was that the data retention at this state was 
utterly lousy; writing to another block that had been written as many 
times as the one that had 'failed' would have a retention time in the 
region of hours before bits started flipping on subsequent reads.

What it all boiled down to was that for the failure mode we were seeing, 
it was apparent that one cannot rely on the error status from the flash 
to determine when a block is 'bad', one must have some form of erase 
counter and proactively set a block as bad once the number of erase cycles 
has reached a predetermined value (i.e. the endurance spec for the chip). 
Furthermore, it was apparent that a block going bad is primarily a 
wearing-out process, not something where a block suddenly 'goes bad' and is 
unusable after that.
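
A minimal sketch of such a counter-based policy, assuming per-block
erase counts are kept somewhere persistent (struct wear_state,
note_erase and the limit field are hypothetical names, and the limit
would come from the chip's datasheet):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WEAR_BLOCKS 4	/* hypothetical device size, in erase blocks */

/* Hypothetical per-device wear bookkeeping. */
struct wear_state {
	uint32_t endurance_limit;	/* erase cycles allowed per block */
	uint32_t erase_count[WEAR_BLOCKS];
	bool retired[WEAR_BLOCKS];
};

/*
 * Account for one erase of 'block'. The block is retired either when
 * the chip reports an erase error or when the counter reaches the
 * endurance limit, whichever comes first. Returns true if the block
 * may still be used afterwards.
 */
static bool note_erase(struct wear_state *w, size_t block, bool erase_ok)
{
	if (block >= WEAR_BLOCKS || w->retired[block])
		return false;

	w->erase_count[block]++;
	if (!erase_ok || w->erase_count[block] >= w->endurance_limit)
		w->retired[block] = true;

	return !w->retired[block];
}
```

The counter catches the case described above where the chip still
reports success but retention has already degraded.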

Of course, there are most likely other failure modes which we have not 
observed, and also this was a couple of years ago; shrinking geometries 
have affected the behavior of flash chips since then.

It would be interesting if anyone else has any hard data on this; the 
flash manufacturers are usually not very forthcoming.

/Ricard
-- 
Ricard Wolf Wanderlöf                           ricardw(at)axis.com
Axis Communications AB, Lund, Sweden            www.axis.com
Phone +46 46 272 2016                           Fax +46 46 13 61 30

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2012-01-19  9:59 UTC | newest]

Thread overview: 28+ messages
2012-01-09 20:23 [PATCH v3 0/6] NAND BBM + BBT updates Brian Norris
2012-01-09 20:23 ` [PATCH v3 1/6] mtd: nand: add NAND_NO_WRITE_OOB option Brian Norris
2012-01-09 20:23 ` [PATCH v3 2/6] mtd: nand: write bad block marker by default even with BBT Brian Norris
2012-01-09 20:23 ` [PATCH v3 3/6] mtd: nand: erase block before marking bad Brian Norris
2012-01-13 22:42   ` Artem Bityutskiy
2012-01-13 23:07     ` Brian Norris
2012-01-09 20:23 ` [PATCH v3 4/6] mtd: nand: fix SCAN2NDPAGE check for BBM Brian Norris
2012-01-09 20:23 ` [PATCH v3 5/6] mtd: nand: differentiate 1- vs. 2-byte writes when marking bad blocks Brian Norris
2012-01-09 20:23 ` [PATCH v3 6/6] mtd: nand: correct comment on nand_chip badblockbits Brian Norris
2012-01-10  9:44 ` [PATCH v3 0/6] NAND BBM + BBT updates Sebastian Andrzej Siewior
2012-01-10 18:54   ` Brian Norris
2012-01-11 22:28   ` Artem Bityutskiy
2012-01-12  7:58     ` Shmulik Ladkani
2012-01-13 22:12       ` Artem Bityutskiy
2012-01-16 19:35         ` Shmulik Ladkani
2012-01-12  9:09     ` Sebastian Andrzej Siewior
2012-01-13 22:36       ` Artem Bityutskiy
2012-01-16 20:59         ` Woodhouse, David
2012-01-17  8:23           ` Artem Bityutskiy
2012-01-17  8:27             ` Artem Bityutskiy
2012-01-17 11:19         ` Angus CLARK
2012-01-17 13:06           ` Ivan Djelic
2012-01-18 22:18         ` Brian Norris
2012-01-17 10:22     ` Angus CLARK
2012-01-17 13:33       ` Artem Bityutskiy
2012-01-18 22:04       ` Brian Norris
2012-01-19  9:30         ` Angus CLARK
2012-01-19  9:59           ` Ricard Wanderlof
