[PATCH v2 1/4] mtd: nand: localize ECC failures per page

linux-mtd.lists.infradead.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v2 1/4] mtd: nand: localize ECC failures per page
@ 2013-12-09 20:09 Brian Norris
  2013-12-09 20:09 ` [PATCH v2 2/4] mtd: nand: add ONFI vendor block for Micron Brian Norris
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Brian Norris @ 2013-12-09 20:09 UTC (permalink / raw)
  To: linux-mtd; +Cc: Huang Shijie, Brian Norris

ECC failures can be tracked at the page level, not the do_read_ops level
(i.e., a potentially multi-page transaction).

This helps prepare for READ RETRY support.

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
---
v1 -> v2: no change

 drivers/mtd/nand/nand_base.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
index 9b3bb3c519e9..e85b07f4293d 100644
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -1422,7 +1422,6 @@ static int nand_do_read_ops(struct mtd_info *mtd, loff_t from,
 {
 	int chipnr, page, realpage, col, bytes, aligned, oob_required;
 	struct nand_chip *chip = mtd->priv;
-	struct mtd_ecc_stats stats;
 	int ret = 0;
 	uint32_t readlen = ops->len;
 	uint32_t oobreadlen = ops->ooblen;
@@ -1432,7 +1431,7 @@ static int nand_do_read_ops(struct mtd_info *mtd, loff_t from,
 	uint8_t *bufpoi, *oob, *buf;
 	unsigned int max_bitflips = 0;
 
-	stats = mtd->ecc_stats;
+	bool ecc_fail = false;
 
 	chipnr = (int)(from >> chip->chip_shift);
 	chip->select_chip(mtd, chipnr);
@@ -1447,6 +1446,7 @@ static int nand_do_read_ops(struct mtd_info *mtd, loff_t from,
 	oob_required = oob ? 1 : 0;
 
 	while (1) {
+		unsigned int ecc_failures = mtd->ecc_stats.failed;
 		bytes = min(mtd->writesize - col, readlen);
 		aligned = (bytes == mtd->writesize);
 
@@ -1483,7 +1483,7 @@ static int nand_do_read_ops(struct mtd_info *mtd, loff_t from,
 			/* Transfer not aligned data */
 			if (!aligned) {
 				if (!NAND_HAS_SUBPAGE_READ(chip) && !oob &&
-				    !(mtd->ecc_stats.failed - stats.failed) &&
+				    !(mtd->ecc_stats.failed - ecc_failures) &&
 				    (ops->mode != MTD_OPS_RAW)) {
 					chip->pagebuf = realpage;
 					chip->pagebuf_bitflips = ret;
@@ -1513,6 +1513,9 @@ static int nand_do_read_ops(struct mtd_info *mtd, loff_t from,
 				else
 					nand_wait_ready(mtd);
 			}
+
+			if (mtd->ecc_stats.failed - ecc_failures)
+				ecc_fail = true;
 		} else {
 			memcpy(buf, chip->buffers->databuf + col, bytes);
 			buf += bytes;
@@ -1547,7 +1550,7 @@ static int nand_do_read_ops(struct mtd_info *mtd, loff_t from,
 	if (ret < 0)
 		return ret;
 
-	if (mtd->ecc_stats.failed - stats.failed)
+	if (ecc_fail)
 		return -EBADMSG;
 
 	return max_bitflips;
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 2/4] mtd: nand: add ONFI vendor block for Micron
  2013-12-09 20:09 [PATCH v2 1/4] mtd: nand: localize ECC failures per page Brian Norris
@ 2013-12-09 20:09 ` Brian Norris
  2013-12-09 20:09 ` [PATCH v2 3/4] mtd: nand: support Micron READ RETRY Brian Norris
  2013-12-09 20:09 ` [PATCH v2 4/4] mtd: nand: use __packed shorthand Brian Norris
  2 siblings, 0 replies; 18+ messages in thread
From: Brian Norris @ 2013-12-09 20:09 UTC (permalink / raw)
  To: linux-mtd; +Cc: Huang Shijie, Brian Norris

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
---
v1 -> v2: no change

 include/linux/mtd/nand.h | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h
index f3ea8daf08ee..029fe5948dc4 100644
--- a/include/linux/mtd/nand.h
+++ b/include/linux/mtd/nand.h
@@ -285,7 +285,8 @@ struct nand_onfi_params {
 	u8 reserved4[7];
 
 	/* vendor */
-	u8 reserved5[90];
+	__le16 vendor_revision;
+	u8 vendor[88];
 
 	__le16 crc;
 } __attribute__((packed));
@@ -326,6 +327,26 @@ struct onfi_ext_param_page {
 	 */
 } __packed;
 
+struct nand_onfi_vendor_micron {
+	u8 two_plane_read;
+	u8 read_cache;
+	u8 read_unique_id;
+	u8 dq_imped;
+	u8 dq_imped_num_settings;
+	u8 dq_imped_feat_addr;
+	u8 rb_pulldown_strength;
+	u8 rb_pulldown_strength_feat_addr;
+	u8 rb_pulldown_strength_num_settings;
+	u8 otp_mode;
+	u8 otp_page_start;
+	u8 otp_data_prot_addr;
+	u8 otp_num_pages;
+	u8 otp_feat_addr;
+	u8 read_retry_options;
+	u8 reserved[72];
+	u8 param_revision;
+} __packed;
+
 /**
  * struct nand_hw_control - Control structure for hardware controller (e.g ECC generator) shared among independent devices
  * @lock:               protection lock
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 3/4] mtd: nand: support Micron READ RETRY
  2013-12-09 20:09 [PATCH v2 1/4] mtd: nand: localize ECC failures per page Brian Norris
  2013-12-09 20:09 ` [PATCH v2 2/4] mtd: nand: add ONFI vendor block for Micron Brian Norris
@ 2013-12-09 20:09 ` Brian Norris
  2013-12-11 13:54   ` Huang Shijie
  2013-12-09 20:09 ` [PATCH v2 4/4] mtd: nand: use __packed shorthand Brian Norris
  2 siblings, 1 reply; 18+ messages in thread
From: Brian Norris @ 2013-12-09 20:09 UTC (permalink / raw)
  To: linux-mtd; +Cc: Huang Shijie, Brian Norris

MLC NAND can experience a large number of bitflips (beyond the
recommended correctability capacity) due to drifts in the voltage
threshold (Vt). These bitflips can cause ECC errors to occur well within
the expected lifetime of the flash. To account for this, some
manufacturers provide a mechanism for shifting the Vt threshold after a
corrupted read. Micron provides the necessary information via the ONFI
vendor-specific parameter block (to indicate how many read-retry modes
are available) and the ONFI {GET,SET}_FEATURES commands with a
vendor-specific feature address (to support reading/switching the
current read-retry mode).

The recommmended sequence is as follows:

  1. Perform PAGE_READ operation
  2. If no ECC error, we are done
  3. Run SET_FEATURES with feature address 89h, mode 1
  4. Retry PAGE_READ operation
  5. If ECC error and there are remaining supported modes, increment the
     mode and return to step 3. Otherwise, this is a true ECC error.
  6. Run SET_FEATURES with feature address 89h, mode 0, to return to the
     default state.

Tested on Micron MT29F32G08CBADA, suppors 8 read-retry modes.

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
---
v1 -> v2: fix a logic error when incrementing retry_mode, which caused -EINVAL
          failures on flash that didn't need READ RETRY

One thing I'm not sure about: do all relevant (i.e., ONFI-capable) NAND drivers
support SET_FEATURES properly? If not, then it's possible that this could break
nand_do_read_ops for such drivers. Not sure what the best method of handling
that would be.

 drivers/mtd/nand/nand_base.c | 71 +++++++++++++++++++++++++++++++++++++++++---
 include/linux/mtd/nand.h     |  6 ++++
 2 files changed, 73 insertions(+), 4 deletions(-)

diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
index e85b07f4293d..f7ac2bfb3445 100644
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -1410,6 +1410,30 @@ static uint8_t *nand_transfer_oob(struct nand_chip *chip, uint8_t *oob,
 }
 
 /**
+ * nand_set_read_retry - [INTERN] Set the READ RETRY mode
+ * @mtd: MTD device structure
+ * @retry_mode: the retry mode to use
+ *
+ * Some vendors supply a special command to shift the Vt threshold, to be used
+ * when there are too many bitflips in a page (i.e., ECC error). After setting
+ * a new threshold, the host should retry reading the page.
+ */
+static int nand_set_read_retry(struct mtd_info *mtd, int retry_mode)
+{
+	struct nand_chip *chip = mtd->priv;
+	uint8_t feature[ONFI_SUBFEATURE_PARAM_LEN] = {retry_mode};
+
+	if (retry_mode >= chip->read_retries)
+		return -EINVAL;
+
+	if (chip->onfi_params.jedec_id == NAND_MFR_MICRON)
+		return chip->onfi_set_features(mtd, chip,
+				ONFI_FEATURE_ADDR_READ_RETRY, feature);
+
+	return -EOPNOTSUPP;
+}
+
+/**
  * nand_do_read_ops - [INTERN] Read data with ECC
  * @mtd: MTD device structure
  * @from: offset to read from
@@ -1431,6 +1455,7 @@ static int nand_do_read_ops(struct mtd_info *mtd, loff_t from,
 	uint8_t *bufpoi, *oob, *buf;
 	unsigned int max_bitflips = 0;
 
+	int retry_mode = 0;
 	bool ecc_fail = false;
 
 	chipnr = (int)(from >> chip->chip_shift);
@@ -1494,8 +1519,6 @@ static int nand_do_read_ops(struct mtd_info *mtd, loff_t from,
 				memcpy(buf, chip->buffers->databuf + col, bytes);
 			}
 
-			buf += bytes;
-
 			if (unlikely(oob)) {
 				int toread = min(oobreadlen, max_oobsize);
 
@@ -1514,8 +1537,27 @@ static int nand_do_read_ops(struct mtd_info *mtd, loff_t from,
 					nand_wait_ready(mtd);
 			}
 
-			if (mtd->ecc_stats.failed - ecc_failures)
-				ecc_fail = true;
+			if (mtd->ecc_stats.failed - ecc_failures) {
+				if (retry_mode + 1 < chip->read_retries) {
+					retry_mode++;
+					pr_debug("ECC error; performing READ RETRY %d\n",
+							retry_mode);
+
+					ret = nand_set_read_retry(mtd,
+							retry_mode);
+					if (ret < 0)
+						break;
+
+					/* Reset failures */
+					mtd->ecc_stats.failed = ecc_failures;
+					continue;
+				} else {
+					/* No more retry modes; real failure */
+					ecc_fail = true;
+				}
+			}
+
+			buf += bytes;
 		} else {
 			memcpy(buf, chip->buffers->databuf + col, bytes);
 			buf += bytes;
@@ -1525,6 +1567,14 @@ static int nand_do_read_ops(struct mtd_info *mtd, loff_t from,
 
 		readlen -= bytes;
 
+		/* Reset to retry mode 0 */
+		if (retry_mode) {
+			ret = nand_set_read_retry(mtd, 0);
+			if (ret < 0)
+				break;
+			retry_mode = 0;
+		}
+
 		if (!readlen)
 			break;
 
@@ -2932,6 +2982,16 @@ ext_out:
 }
 
 /*
+ * Configure chip properties from Micron vendor-specific ONFI table
+ */
+static void nand_onfi_detect_micron(struct nand_chip *chip,
+		struct nand_onfi_params *p)
+{
+	struct nand_onfi_vendor_micron *micron = (void *)p->vendor;
+	chip->read_retries = micron->read_retry_options;
+}
+
+/*
  * Check if the NAND chip is ONFI compliant, returns 1 if it is, 0 otherwise.
  */
 static int nand_flash_detect_onfi(struct mtd_info *mtd, struct nand_chip *chip,
@@ -3037,6 +3097,9 @@ static int nand_flash_detect_onfi(struct mtd_info *mtd, struct nand_chip *chip,
 		pr_warn("Could not retrieve ONFI ECC requirements\n");
 	}
 
+	if (p->jedec_id == NAND_MFR_MICRON)
+		nand_onfi_detect_micron(chip, p);
+
 	return 1;
 }
 
diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h
index 029fe5948dc4..6e579e90955e 100644
--- a/include/linux/mtd/nand.h
+++ b/include/linux/mtd/nand.h
@@ -219,6 +219,9 @@ struct nand_chip;
 /* ONFI feature address */
 #define ONFI_FEATURE_ADDR_TIMING_MODE	0x1
 
+/* Vendor-specific feature address (Micron) */
+#define ONFI_FEATURE_ADDR_READ_RETRY	0x89
+
 /* ONFI subfeature parameters length */
 #define ONFI_SUBFEATURE_PARAM_LEN	4
 
@@ -518,6 +521,7 @@ struct nand_buffers {
  *			non 0 if ONFI supported.
  * @onfi_params:	[INTERN] holds the ONFI page parameter when ONFI is
  *			supported, 0 otherwise.
+ * @read_retries:	[INTERN] the number of read retry modes supported
  * @onfi_set_features:	[REPLACEABLE] set the features for ONFI nand
  * @onfi_get_features:	[REPLACEABLE] get the features for ONFI nand
  * @bbt:		[INTERN] bad block table pointer
@@ -589,6 +593,8 @@ struct nand_chip {
 	int onfi_version;
 	struct nand_onfi_params	onfi_params;
 
+	int read_retries;
+
 	flstate_t state;
 
 	uint8_t *oob_poi;
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 4/4] mtd: nand: use __packed shorthand
  2013-12-09 20:09 [PATCH v2 1/4] mtd: nand: localize ECC failures per page Brian Norris
  2013-12-09 20:09 ` [PATCH v2 2/4] mtd: nand: add ONFI vendor block for Micron Brian Norris
  2013-12-09 20:09 ` [PATCH v2 3/4] mtd: nand: support Micron READ RETRY Brian Norris
@ 2013-12-09 20:09 ` Brian Norris
  2 siblings, 0 replies; 18+ messages in thread
From: Brian Norris @ 2013-12-09 20:09 UTC (permalink / raw)
  To: linux-mtd; +Cc: Huang Shijie, Brian Norris

To be consistent with the rest of include/linux/mtd/nand.h, we should
use the __packed shorthand instead of __attribute__((packed)).

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
---
v1 -> v2: no change

 include/linux/mtd/nand.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h
index 6e579e90955e..32213a5a6705 100644
--- a/include/linux/mtd/nand.h
+++ b/include/linux/mtd/nand.h
@@ -292,7 +292,7 @@ struct nand_onfi_params {
 	u8 vendor[88];
 
 	__le16 crc;
-} __attribute__((packed));
+} __packed;
 
 #define ONFI_CRC_BASE	0x4F4E
 
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] mtd: nand: support Micron READ RETRY
  2013-12-09 20:09 ` [PATCH v2 3/4] mtd: nand: support Micron READ RETRY Brian Norris
@ 2013-12-11 13:54   ` Huang Shijie
  2013-12-11 19:03     ` Brian Norris
  0 siblings, 1 reply; 18+ messages in thread
From: Huang Shijie @ 2013-12-11 13:54 UTC (permalink / raw)
  To: Brian Norris; +Cc: Huang Shijie, linux-mtd

On Mon, Dec 09, 2013 at 12:09:12PM -0800, Brian Norris wrote:
> + * nand_set_read_retry - [INTERN] Set the READ RETRY mode
> + * @mtd: MTD device structure
> + * @retry_mode: the retry mode to use
> + *
> + * Some vendors supply a special command to shift the Vt threshold, to be used
> + * when there are too many bitflips in a page (i.e., ECC error). After setting
> + * a new threshold, the host should retry reading the page.
> + */
> +static int nand_set_read_retry(struct mtd_info *mtd, int retry_mode)
> +{
> +	struct nand_chip *chip = mtd->priv;
> +	uint8_t feature[ONFI_SUBFEATURE_PARAM_LEN] = {retry_mode};
> +
This can cause a DMA warning.

> +	if (retry_mode >= chip->read_retries)
> +		return -EINVAL;
> +
> +	if (chip->onfi_params.jedec_id == NAND_MFR_MICRON)
> +		return chip->onfi_set_features(mtd, chip,
> +				ONFI_FEATURE_ADDR_READ_RETRY, feature);
I suggest to add a hook such as for nand_chip{}:
	chip->read_retry(..)

Different nand chips fill different hook.

For Micron, fill it with micron_read_retry();
for Toshiba, fill it with a toshiab_read_retry();
For Hynix, fill it with hynix_read_retry().

I am wondar if we should add a file for the read-retry in the
drivers/mtd/nand folder.


thanks
Huang Shijie

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] mtd: nand: support Micron READ RETRY
       [not found] <20980858CB6D3A4BAE95CA194937D5E73EA53CF1@DBDE04.ent.ti.com>
@ 2013-12-11 18:37 ` Brian Norris
  2013-12-11 20:54   ` Gupta, Pekon
  0 siblings, 1 reply; 18+ messages in thread
From: Brian Norris @ 2013-12-11 18:37 UTC (permalink / raw)
  To: Gupta, Pekon; +Cc: Huang Shijie, linux-mtd@lists.infradead.org

On Wed, Dec 11, 2013 at 07:49:21AM +0000, Pekon Gupta wrote:
> >From: Brian Norris
>  [...]
> > /*
> >+ * Configure chip properties from Micron vendor-specific ONFI table
> >+ */
> >+static void nand_onfi_detect_micron(struct nand_chip *chip,
> >+		struct nand_onfi_params *p)
> >+{
> >+	struct nand_onfi_vendor_micron *micron = (void *)p->vendor;
> >+	chip->read_retries = micron->read_retry_options;
> >+}
> >
> This may not be needed. Refer comments below..
> 
> [...]
> 
> > static int nand_flash_detect_onfi(struct mtd_info *mtd, struct nand_chip *chip,
> >@@ -3037,6 +3097,9 @@ static int nand_flash_detect_onfi(struct mtd_info *mtd, struct nand_chip *chip,
> > 		pr_warn("Could not retrieve ONFI ECC requirements\n");
> > 	}
> >
> >+	if (p->jedec_id == NAND_MFR_MICRON)
> >+		nand_onfi_detect_micron(chip, p);
> >+
> >
> Do all Micron devices support this vendor-specific ONFI information ?
> - legacy device ?
> - SLC NAND ?
> If not then we might need more filters here ..

We can look at the vendor_revision (as of the previous patch, this is in
p->vendor_revision). But the vendor block is really
backwards-compatible, since byte 180 (the micron->read_retries field)
defaulted to 0 on legacy and SLC Micron ONFI NAND. I'll summarize the
datasheets I'm referring to:

Needs retry   Cell-type    Part number          Vendor revision    Byte 180
-----------   ---------    ----------------     ---------------    ------------
No            SLC          MT29F16G08ABABA      01h, 00h           Reserved (0)
No            MLC          MT29F32G08CBABA      01h, 00h           Reserved (0)
No            SLC          MT29F1G08AACWP       01h, 00h           0
Yes           MLC          MT29F32G08CBADA      01h, 00h           08h
Yes           MLC          MT29F64G08CBABA      02h, 00h           08h

These datasheets range from 2008 to 2012, and they cover the range of
ONFI 1.0 to ONFI 2.3, so I think it's safe to say that all Micron
revisions support this vendor block "number of read retry modes" byte.
At most, we chould check for:

	if (le16_to_cpu(p->vendor_revision) < 0x0100)
		return;

> [...]
> 
> >diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h
> >@@ -518,6 +521,7 @@ struct nand_buffers {
> >  *			non 0 if ONFI supported.
> >  * @onfi_params:	[INTERN] holds the ONFI page parameter when ONFI is
> >  *			supported, 0 otherwise.
> >+ * @read_retries:	[INTERN] the number of read retry modes supported
> >  * @onfi_set_features:	[REPLACEABLE] set the features for ONFI nand
> >  * @onfi_get_features:	[REPLACEABLE] get the features for ONFI nand
> >  * @bbt:		[INTERN] bad block table pointer
> >@@ -589,6 +593,8 @@ struct nand_chip {
> > 	int onfi_version;
> > 	struct nand_onfi_params	onfi_params;
> >
> >+	int read_retries;
> >+
> >
> I think we should not add new field 'read_retries' to 'struct nand_chip' bcoz
> - It's a vendor specific feature,

I don't know that much about other vendor's read-retry support, but I
expect that they would have a similar parameter to control how many
times we retry. This is not intended to be vendor-specific, and we can
adapt if other vendors' approach is too different. (However, I suspect
that we can't do this very well for some vendors; they don't
standardize, and they are getting more and more difficult to support
reliably without deep knowledge of their quirks.)

> - And also specifically for MLC NAND.

Your point? Where do you propose to put MLC information?

> Alternatively you can indirectly access this as part of onfi_params ..
> (struct nand_onfi_vendor_micron *) micron = chip->onfi_param->vendor
> And then micron->read_retries.

We shouldn't be probing our vendor-specific properties on every
invocation of nand_do_read_ops(); it should either be in a field (like
my nand_chip.read_retries) or in a callback that we assign per-vendor,
like Huang suggested.

Brian

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] mtd: nand: support Micron READ RETRY
  2013-12-11 13:54   ` Huang Shijie
@ 2013-12-11 19:03     ` Brian Norris
  2013-12-11 20:31       ` Ezequiel Garcia
  2013-12-12  3:47       ` Huang Shijie
  0 siblings, 2 replies; 18+ messages in thread
From: Brian Norris @ 2013-12-11 19:03 UTC (permalink / raw)
  To: Huang Shijie; +Cc: Huang Shijie, linux-mtd, Pekon Gupta, Ezequiel Garcia

On Wed, Dec 11, 2013 at 09:54:54PM +0800, Huang Shijie wrote:
> On Mon, Dec 09, 2013 at 12:09:12PM -0800, Brian Norris wrote:
> > + * nand_set_read_retry - [INTERN] Set the READ RETRY mode
> > + * @mtd: MTD device structure
> > + * @retry_mode: the retry mode to use
> > + *
> > + * Some vendors supply a special command to shift the Vt threshold, to be used
> > + * when there are too many bitflips in a page (i.e., ECC error). After setting
> > + * a new threshold, the host should retry reading the page.
> > + */
> > +static int nand_set_read_retry(struct mtd_info *mtd, int retry_mode)
> > +{
> > +	struct nand_chip *chip = mtd->priv;
> > +	uint8_t feature[ONFI_SUBFEATURE_PARAM_LEN] = {retry_mode};
> > +
> This can cause a DMA warning.

...on GPMI NAND, but not on most (any?) other drivers. Why does GPMI try
to use DMA on *every* operation? That doesn't even make sense for a 4 or
5 byte transfer. Plus, we don't give a guarantee that buffers will be
DMA-able in MTD (UBI uses vmalloc() buffers, for instance), so I'm sure
you'll hit problems other places. I can fix this one (use
chip->buffers->databuf instead?) but I think GPMI is a bad citizen in
this regard, given that you have no guarantee. You need to fix MTD
systematically if you expect this guarantee.

(For one, nand_default_block_markbad() uses stack-allocated buffers; but
I see that GPMI overrides this callback.)

> > +	if (retry_mode >= chip->read_retries)
> > +		return -EINVAL;
> > +
> > +	if (chip->onfi_params.jedec_id == NAND_MFR_MICRON)
> > +		return chip->onfi_set_features(mtd, chip,
> > +				ONFI_FEATURE_ADDR_READ_RETRY, feature);
> I suggest to add a hook such as for nand_chip{}:
> 	chip->read_retry(..)
> 
> Different nand chips fill different hook.

OK, that can make sense.

> For Micron, fill it with micron_read_retry();
> for Toshiba, fill it with a toshiab_read_retry();
> For Hynix, fill it with hynix_read_retry().

Do you have info on Toshiba/Hynix read retry? I'd be interested to know
what their "features" look like. Do they have a similar set of supported
retry modes, where we have to cycle through each mode and retry? I'm
interested in whether the nand_chip.read_retries field is useful for the
others.

Also, what would a 'read_retry()' callback look like? What are its
parameters? Right now, it's just the struct mtd_info + an incrementing
integer, from 0 to chip->read_retries-1.

> I am wondar if we should add a file for the read-retry in the
> drivers/mtd/nand folder.

TBH, I'm not really very happy to be opening the read-retry can of
worms. From what I hear, read-retry is just the first step down a path
of vendor-specific detail, which ends with you re-implementing the FTL
that these vendors develop for their SSDs.

But anyway, no I don't think read-retry warrants its own file, at least
not yet. I'd need to understand where it's heading before we decide that
it needs this kind of separation.

I do think, though, that it may be worth moving some more of the
vendor-specific stuff (like the identification code,
nand_get_flash_type(), + read-retry callbacks) into a new file. Maybe
nand_vendors.c? nand_ident.c? But I think it needs a bit of cleanup
first; untangling the auto-buswidth stuff that Pekon, Ezequiel and I
have been dealing with might be a start.

Brian

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] mtd: nand: support Micron READ RETRY
  2013-12-11 19:03     ` Brian Norris
@ 2013-12-11 20:31       ` Ezequiel Garcia
  2013-12-12  3:49         ` Huang Shijie
  2013-12-12  3:47       ` Huang Shijie
  1 sibling, 1 reply; 18+ messages in thread
From: Ezequiel Garcia @ 2013-12-11 20:31 UTC (permalink / raw)
  To: Brian Norris; +Cc: linux-mtd, Huang Shijie, Pekon Gupta, Huang Shijie

On Wed, Dec 11, 2013 at 11:03:13AM -0800, Brian Norris wrote:
> On Wed, Dec 11, 2013 at 09:54:54PM +0800, Huang Shijie wrote:
> > On Mon, Dec 09, 2013 at 12:09:12PM -0800, Brian Norris wrote:
> > > + * nand_set_read_retry - [INTERN] Set the READ RETRY mode
> > > + * @mtd: MTD device structure
> > > + * @retry_mode: the retry mode to use
> > > + *
> > > + * Some vendors supply a special command to shift the Vt threshold, to be used
> > > + * when there are too many bitflips in a page (i.e., ECC error). After setting
> > > + * a new threshold, the host should retry reading the page.
> > > + */
> > > +static int nand_set_read_retry(struct mtd_info *mtd, int retry_mode)
> > > +{
> > > +	struct nand_chip *chip = mtd->priv;
> > > +	uint8_t feature[ONFI_SUBFEATURE_PARAM_LEN] = {retry_mode};
> > > +
> > This can cause a DMA warning.
> 
> ...on GPMI NAND, but not on most (any?) other drivers. Why does GPMI try
> to use DMA on *every* operation? That doesn't even make sense for a 4 or
> 5 byte transfer. Plus, we don't give a guarantee that buffers will be
> DMA-able in MTD (UBI uses vmalloc() buffers, for instance), so I'm sure
> you'll hit problems other places. I can fix this one (use
> chip->buffers->databuf instead?) but I think GPMI is a bad citizen in
> this regard, given that you have no guarantee. You need to fix MTD
> systematically if you expect this guarantee.
> 
> (For one, nand_default_block_markbad() uses stack-allocated buffers; but
> I see that GPMI overrides this callback.)
> 

Isn't completely bloated to setup a DMA operation just to transfer a
bunch of bytes? Wouldn't that seriously hurt performance?

Or maybe it doesn't matter because those commands are scarcely
invoked...
-- 
Ezequiel García, Free Electrons
Embedded Linux, Kernel and Android Engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH v2 3/4] mtd: nand: support Micron READ RETRY
  2013-12-11 18:37 ` [PATCH v2 3/4] mtd: nand: support Micron READ RETRY Brian Norris
@ 2013-12-11 20:54   ` Gupta, Pekon
  2013-12-17  5:22     ` Brian Norris
  0 siblings, 1 reply; 18+ messages in thread
From: Gupta, Pekon @ 2013-12-11 20:54 UTC (permalink / raw)
  To: Brian Norris; +Cc: Huang Shijie, linux-mtd@lists.infradead.org

Hi Brian,

>From: Brian Norris
>>On Wed, Dec 11, 2013 at 07:49:21AM +0000, Pekon Gupta wrote:
>> >From: Brian Norris
[...]

>> Do all Micron devices support this vendor-specific ONFI information ?
>> - legacy device ?
>> - SLC NAND ?
>> If not then we might need more filters here ..
>
>We can look at the vendor_revision (as of the previous patch, this is in
>p->vendor_revision). But the vendor block is really
>backwards-compatible, since byte 180 (the micron->read_retries field)
>defaulted to 0 on legacy and SLC Micron ONFI NAND. I'll summarize the
>datasheets I'm referring to:
>
>Needs retry   Cell-type    Part number          Vendor revision    Byte 180
>-----------   ---------    ----------------     ---------------    ------------
>No            SLC          MT29F16G08ABABA      01h, 00h           Reserved (0)
>No            MLC          MT29F32G08CBABA      01h, 00h           Reserved (0)
>No            SLC          MT29F1G08AACWP       01h, 00h           0
>Yes           MLC          MT29F32G08CBADA      01h, 00h           08h
>Yes           MLC          MT29F64G08CBABA      02h, 00h           08h
>
>These datasheets range from 2008 to 2012, and they cover the range of
>ONFI 1.0 to ONFI 2.3, so I think it's safe to say that all Micron
>revisions support this vendor block "number of read retry modes" byte.
>
Yes, I have a Micron MLC data-sheet [1] which marks 'read_retry_options'
Byte 180 as reserved (0x0). So asked this question..


>At most, we chould check for:
>
>	if (le16_to_cpu(p->vendor_revision) < 0x0100)
>		return;
>
But, your above table shows that there are MLC NAND devices with
Vendor_revision == 01h, 00h there are both types of 'read_retry_options'
"Reserved(0)" and "08h". So, I think it would be safe to check for below:
@@nand_do_read_ops (..)
+		if (mtd->ecc_stats.failed - ecc_failures) {
+				if ((retry_mode + 1 < chip->read_retries) && 
+ ----->>>>				(!micron->read_retry_options)) {
+			
+		/* Then proceed with retry sequence */


 [...]


>>
>> >diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h
>> >@@ -518,6 +521,7 @@ struct nand_buffers {
>> >  *			non 0 if ONFI supported.
>> >  * @onfi_params:	[INTERN] holds the ONFI page parameter when ONFI is
>> >  *			supported, 0 otherwise.
>> >+ * @read_retries:	[INTERN] the number of read retry modes supported
>> >  * @onfi_set_features:	[REPLACEABLE] set the features for ONFI nand
>> >  * @onfi_get_features:	[REPLACEABLE] get the features for ONFI nand
>> >  * @bbt:		[INTERN] bad block table pointer
>> >@@ -589,6 +593,8 @@ struct nand_chip {
>> > 	int onfi_version;
>> > 	struct nand_onfi_params	onfi_params;
>> >
>> >+	int read_retries;
>> >+
>> >
>> I think we should not add new field 'read_retries' to 'struct nand_chip' bcoz
>> - It's a vendor specific feature,
>
>I don't know that much about other vendor's read-retry support, but I
>expect that they would have a similar parameter to control how many
>times we retry. This is not intended to be vendor-specific, and we can
>adapt if other vendors' approach is too different. (However, I suspect
>that we can't do this very well for some vendors; they don't
>standardize, and they are getting more and more difficult to support
>reliably without deep knowledge of their quirks.)
>
>> - And also specifically for MLC NAND.
>
>Your point? Where do you propose to put MLC information?
>
>> Alternatively you can indirectly access this as part of onfi_params ..
>> (struct nand_onfi_vendor_micron *) micron = chip->onfi_param->vendor
>> And then micron->read_retries.
>
>We shouldn't be probing our vendor-specific properties on every
>invocation of nand_do_read_ops(); it should either be in a field (like
>my nand_chip.read_retries) or in a callback that we assign per-vendor,
>like Huang suggested.
>
No, I'm not saying to probe it every time.

But I'm not getting one thing here that..
(1) 'struct nand_onfi_param' is itself is packed into 'struct nand_chip'.
  @@nand.h 	struct nand_onfi_params	onfi_params;
(2) And 'vendor block' information is already part of 
    'struct nand_onfi_params' marked as vendor[88].
[Patch 2/4]
-	u8 reserved5[90];
+	__le16 vendor_revision;
+	u8 vendor[88];

Then why do we need another field 'int read_retries' in nand_chip ?
As 'read_retry_options' information is already packed in
'struct nand_chip' as part of 'onfi_param.vendor[]'. So all we just 
need is to re-cast the u8 vendor[] array into meaningful fields.
.. correct ? Am I missing something here ?
So information in vendor[] is same as any other field in nand_chip,
Then why do we need to duplicate this data.


with regards, pekon

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] mtd: nand: support Micron READ RETRY
  2013-12-11 19:03     ` Brian Norris
  2013-12-11 20:31       ` Ezequiel Garcia
@ 2013-12-12  3:47       ` Huang Shijie
  2013-12-17  4:43         ` Brian Norris
  1 sibling, 1 reply; 18+ messages in thread
From: Huang Shijie @ 2013-12-12  3:47 UTC (permalink / raw)
  To: Brian Norris; +Cc: Huang Shijie, linux-mtd, Pekon Gupta, Ezequiel Garcia

On Wed, Dec 11, 2013 at 11:03:13AM -0800, Brian Norris wrote:
> On Wed, Dec 11, 2013 at 09:54:54PM +0800, Huang Shijie wrote:
> > On Mon, Dec 09, 2013 at 12:09:12PM -0800, Brian Norris wrote:
> > > + * nand_set_read_retry - [INTERN] Set the READ RETRY mode
> > > + * @mtd: MTD device structure
> > > + * @retry_mode: the retry mode to use
> > > + *
> > > + * Some vendors supply a special command to shift the Vt threshold, to be used
> > > + * when there are too many bitflips in a page (i.e., ECC error). After setting
> > > + * a new threshold, the host should retry reading the page.
> > > + */
> > > +static int nand_set_read_retry(struct mtd_info *mtd, int retry_mode)
> > > +{
> > > +	struct nand_chip *chip = mtd->priv;
> > > +	uint8_t feature[ONFI_SUBFEATURE_PARAM_LEN] = {retry_mode};
> > > +
> > This can cause a DMA warning.
> 
> ...on GPMI NAND, but not on most (any?) other drivers. Why does GPMI try
> to use DMA on *every* operation? That doesn't even make sense for a 4 or
> 5 byte transfer. Plus, we don't give a guarantee that buffers will be
> DMA-able in MTD (UBI uses vmalloc() buffers, for instance), so I'm sure
> you'll hit problems other places. I can fix this one (use
> chip->buffers->databuf instead?) but I think GPMI is a bad citizen in
> this regard, given that you have no guarantee. You need to fix MTD
> systematically if you expect this guarantee.
You do not need to change this code, if you like it.

The DMA warning only occurs when we enable the CONFIG_DMA_API_DEBUG.


> 
> (For one, nand_default_block_markbad() uses stack-allocated buffers; but
> I see that GPMI overrides this callback.)
> 
> > > +	if (retry_mode >= chip->read_retries)
> > > +		return -EINVAL;
> > > +
> > > +	if (chip->onfi_params.jedec_id == NAND_MFR_MICRON)
> > > +		return chip->onfi_set_features(mtd, chip,
> > > +				ONFI_FEATURE_ADDR_READ_RETRY, feature);
> > I suggest to add a hook such as for nand_chip{}:
> > 	chip->read_retry(..)
> > 
> > Different nand chips fill different hook.
> 
> OK, that can make sense.
> 
> > For Micron, fill it with micron_read_retry();
> > for Toshiba, fill it with a toshiab_read_retry();
> > For Hynix, fill it with hynix_read_retry().
> 
> Do you have info on Toshiba/Hynix read retry? I'd be interested to know
> what their "features" look like. Do they have a similar set of supported
> retry modes, where we have to cycle through each mode and retry? I'm
> interested in whether the nand_chip.read_retries field is useful for the
> others.

i will send you the documents when i back office.
Different nand chips use different read-retry methods. A headache to us.

The common idea between the read-retry methods is:
  "we have to cycle through each mode and retry"

So the chip.read_retries is used by others too.

I ever implemented the Micron's read-retry in my local code.
I planned to implement others, but i was intterrupted by high priority
jobs.

> 
> Also, what would a 'read_retry()' callback look like? What are its
> parameters? Right now, it's just the struct mtd_info + an incrementing
> integer, from 0 to chip->read_retries-1.

I think the parameters are good now.

> 
> > I am wondar if we should add a file for the read-retry in the
> > drivers/mtd/nand folder.
> 
> TBH, I'm not really very happy to be opening the read-retry can of
> worms. From what I hear, read-retry is just the first step down a path

it is ok to me if you do not like to create a new file, that is just a
suggestion.


After the nand chip use <19nm, afaik, all the NAND chips needs the
read-retry. We have to implement the read-retry for all of them.
The nightmare is coming.

thanks
Huang Shijie

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] mtd: nand: support Micron READ RETRY
  2013-12-11 20:31       ` Ezequiel Garcia
@ 2013-12-12  3:49         ` Huang Shijie
  0 siblings, 0 replies; 18+ messages in thread
From: Huang Shijie @ 2013-12-12  3:49 UTC (permalink / raw)
  To: Ezequiel Garcia; +Cc: Huang Shijie, Brian Norris, linux-mtd, Pekon Gupta

On Wed, Dec 11, 2013 at 05:31:12PM -0300, Ezequiel Garcia wrote:
> Isn't completely bloated to setup a DMA operation just to transfer a
> bunch of bytes? Wouldn't that seriously hurt performance?
> 
> Or maybe it doesn't matter because those commands are scarcely
> invoked...
As you said, those commands are scarcely invoked.

thanks
Huang Shijie

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] mtd: nand: support Micron READ RETRY
  2013-12-17  4:43         ` Brian Norris
@ 2013-12-17  4:23           ` Huang Shijie
  2013-12-17  5:06             ` Brian Norris
  0 siblings, 1 reply; 18+ messages in thread
From: Huang Shijie @ 2013-12-17  4:23 UTC (permalink / raw)
  To: Brian Norris; +Cc: linux-mtd, Huang Shijie, Pekon Gupta, Ezequiel Garcia

On Mon, Dec 16, 2013 at 08:43:42PM -0800, Brian Norris wrote:
> > 
> > The DMA warning only occurs when we enable the CONFIG_DMA_API_DEBUG.
> 
> But the warning has a significant purpose: it is *not* legal to perform
> DMA on stack memory, and no driver should do this. Either nand_base or
> gpmi-nand need to be fixed here.
> 
> But I think it's the GPMI driver is the one that needs fixing, not this
> patch. Are there any scenarios where it makes sense for gpmi-nand to use
> DMA for its read_buf/write_buf? Shouldn't the only DMA-necessary
> operations come through the ecc.write_page and ecc.read_page callbacks?
> If I'm correct, then gpmi-nand should not be using DMA at all in
> read_buf/write_buf.
Currently, all the operations, including the read_buf/write_buf, use
the DMA.

Is there any ABI tells us we should not use the DMA for read_buf/write_buf? :)


> 
> > Different nand chips use different read-retry methods. A headache to us.
> 
> Any chance we can push these vendors to implement things through a
> standard, like JEDEC? I believe there's a JEDEC parameter page standard
we do not have such influence to push these vendors.
they will not listen to us. :(

> now, though I haven't looked at it closely. Micron's support is nice and
> simple because it properly uses the ONFI vendor block. Other vendors
> should take note...
yes. Micron is good example.


thanks
Huang Shijie

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] mtd: nand: support Micron READ RETRY
  2013-12-12  3:47       ` Huang Shijie
@ 2013-12-17  4:43         ` Brian Norris
  2013-12-17  4:23           ` Huang Shijie
  0 siblings, 1 reply; 18+ messages in thread
From: Brian Norris @ 2013-12-17  4:43 UTC (permalink / raw)
  To: Huang Shijie; +Cc: Huang Shijie, linux-mtd, Pekon Gupta, Ezequiel Garcia

On Thu, Dec 12, 2013 at 11:47:52AM +0800, Huang Shijie wrote:
> On Wed, Dec 11, 2013 at 11:03:13AM -0800, Brian Norris wrote:
> > On Wed, Dec 11, 2013 at 09:54:54PM +0800, Huang Shijie wrote:
> > > On Mon, Dec 09, 2013 at 12:09:12PM -0800, Brian Norris wrote:
> > > > +	uint8_t feature[ONFI_SUBFEATURE_PARAM_LEN] = {retry_mode};
> > > > +
> > > This can cause a DMA warning.
> > 
> > ...on GPMI NAND, but not on most (any?) other drivers. Why does GPMI try
> > to use DMA on *every* operation? That doesn't even make sense for a 4 or
> > 5 byte transfer. Plus, we don't give a guarantee that buffers will be
> > DMA-able in MTD (UBI uses vmalloc() buffers, for instance), so I'm sure
> > you'll hit problems other places. I can fix this one (use
> > chip->buffers->databuf instead?) but I think GPMI is a bad citizen in
> > this regard, given that you have no guarantee. You need to fix MTD
> > systematically if you expect this guarantee.
> You do not need to change this code, if you like it.
> 
> The DMA warning only occurs when we enable the CONFIG_DMA_API_DEBUG.

But the warning has a significant purpose: it is *not* legal to perform
DMA on stack memory, and no driver should do this. Either nand_base or
gpmi-nand need to be fixed here.

But I think it's the GPMI driver is the one that needs fixing, not this
patch. Are there any scenarios where it makes sense for gpmi-nand to use
DMA for its read_buf/write_buf? Shouldn't the only DMA-necessary
operations come through the ecc.write_page and ecc.read_page callbacks?
If I'm correct, then gpmi-nand should not be using DMA at all in
read_buf/write_buf.

> Different nand chips use different read-retry methods. A headache to us.

Any chance we can push these vendors to implement things through a
standard, like JEDEC? I believe there's a JEDEC parameter page standard
now, though I haven't looked at it closely. Micron's support is nice and
simple because it properly uses the ONFI vendor block. Other vendors
should take note...

> The common idea between the read-retry methods is:
>   "we have to cycle through each mode and retry"
> 
> So the chip.read_retries is used by others too.

Good to know.

> After the nand chip use <19nm, afaik, all the NAND chips needs the
> read-retry. We have to implement the read-retry for all of them.
> The nightmare is coming.

*cue the dark organ music*

Sigh. I might have to abandon MTD for eMMC some day ;)

Brian

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] mtd: nand: support Micron READ RETRY
  2013-12-17  4:23           ` Huang Shijie
@ 2013-12-17  5:06             ` Brian Norris
  2013-12-17  5:11               ` Huang Shijie
  0 siblings, 1 reply; 18+ messages in thread
From: Brian Norris @ 2013-12-17  5:06 UTC (permalink / raw)
  To: Huang Shijie; +Cc: linux-mtd, Huang Shijie, Pekon Gupta, Ezequiel Garcia

On Tue, Dec 17, 2013 at 12:23:08PM +0800, Huang Shijie wrote:
> On Mon, Dec 16, 2013 at 08:43:42PM -0800, Brian Norris wrote:
> > > 
> > > The DMA warning only occurs when we enable the CONFIG_DMA_API_DEBUG.
> > 
> > But the warning has a significant purpose: it is *not* legal to perform
> > DMA on stack memory, and no driver should do this. Either nand_base or
> > gpmi-nand need to be fixed here.
> > 
> > But I think it's the GPMI driver is the one that needs fixing, not this
> > patch. Are there any scenarios where it makes sense for gpmi-nand to use
> > DMA for its read_buf/write_buf? Shouldn't the only DMA-necessary
> > operations come through the ecc.write_page and ecc.read_page callbacks?
> > If I'm correct, then gpmi-nand should not be using DMA at all in
> > read_buf/write_buf.
> Currently, all the operations, including the read_buf/write_buf, use
> the DMA.
> 
> Is there any ABI tells us we should not use the DMA for read_buf/write_buf? :)

No... but this is not about ABI, it's about designing a reasonable API
and abstraction between users (NAND drivers) and the helper code
(nand_base). MTD never had a good design for DMA-able operations; but I
think that it's at least an approachable problem if we provide
limitations on when and where we might expect to use DMA.

Particularly, I think it is quite reasonable to restrict DMA to
mtd_read()/mtd_write()-like operations -- so this trickles down to
read_page/write_page, but not to device configuration operations like
ONFI set_features/get_features/read parameter tables, etc.

Do you disagree? Do you see some need to perform DMA on all operations?

> > > Different nand chips use different read-retry methods. A headache to us.
> > 
> > Any chance we can push these vendors to implement things through a
> > standard, like JEDEC? I believe there's a JEDEC parameter page standard
> we do not have such influence to push these vendors.
> they will not listen to us. :(

We can try! Strength in numbers! BTW, last time I brought up ONFI w/ a
Toshiba representative, I think I offended them :) But at least I had
their ear for a moment.

Anyway, it may be more reasonable to suggest a JEDEC standard, as I
think they felt JEDEC was more vendor-independent, and they were at
least open to entertaining the idea.

Brian

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] mtd: nand: support Micron READ RETRY
  2013-12-17  5:06             ` Brian Norris
@ 2013-12-17  5:11               ` Huang Shijie
  2013-12-17  7:01                 ` Brian Norris
  0 siblings, 1 reply; 18+ messages in thread
From: Huang Shijie @ 2013-12-17  5:11 UTC (permalink / raw)
  To: Brian Norris; +Cc: linux-mtd, Huang Shijie, Pekon Gupta, Ezequiel Garcia

On Mon, Dec 16, 2013 at 09:06:13PM -0800, Brian Norris wrote:
> On Tue, Dec 17, 2013 at 12:23:08PM +0800, Huang Shijie wrote:
> > On Mon, Dec 16, 2013 at 08:43:42PM -0800, Brian Norris wrote:
> > > > 
> > > > The DMA warning only occurs when we enable the CONFIG_DMA_API_DEBUG.
> Particularly, I think it is quite reasonable to restrict DMA to
> mtd_read()/mtd_write()-like operations -- so this trickles down to
> read_page/write_page, but not to device configuration operations like
> ONFI set_features/get_features/read parameter tables, etc.
> 
> Do you disagree? Do you see some need to perform DMA on all operations?

The gpmi is much coupled with the MXS-DMA engine. the gpmi can uses the
mxs-dma to set the gpmi registers.

Use the DMA makes the GPMI code very simple, it really makes the code very
_ugly_ if we do not use the DMA for read_buf/write_buf.

If you insist that it is okay to use the local array in the nand_base.c,
I can use kmalloc/memcpy in the gpmi's read_buf/write_buf hooks. In such a 
way, we can make the gpmi looks more normal.

sorry, I really think it is no need to force the gpmi do not use the DMA
for read_buf/write_buf.

> 
> > > > Different nand chips use different read-retry methods. A headache to us.
> > > 
> > > Any chance we can push these vendors to implement things through a
> > > standard, like JEDEC? I believe there's a JEDEC parameter page standard
> > we do not have such influence to push these vendors.
> > they will not listen to us. :(
> 
> We can try! Strength in numbers! BTW, last time I brought up ONFI w/ a
> Toshiba representative, I think I offended them :) But at least I had
> their ear for a moment.
> 
look good to me. :)

I will suggest their FAE/AE when i meet them next time.

thanks
Huang Shijie

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] mtd: nand: support Micron READ RETRY
  2013-12-11 20:54   ` Gupta, Pekon
@ 2013-12-17  5:22     ` Brian Norris
  0 siblings, 0 replies; 18+ messages in thread
From: Brian Norris @ 2013-12-17  5:22 UTC (permalink / raw)
  To: Gupta, Pekon; +Cc: Huang Shijie, linux-mtd@lists.infradead.org

Hi Pekon,

On Wed, Dec 11, 2013 at 08:54:03PM +0000, Pekon Gupta wrote:
> >From: Brian Norris
> >>On Wed, Dec 11, 2013 at 07:49:21AM +0000, Pekon Gupta wrote:
> >> >From: Brian Norris
> [...]
> 
> >> Do all Micron devices support this vendor-specific ONFI information ?
> >> - legacy device ?
> >> - SLC NAND ?
> >> If not then we might need more filters here ..
> >
> >We can look at the vendor_revision (as of the previous patch, this is in
> >p->vendor_revision). But the vendor block is really
> >backwards-compatible, since byte 180 (the micron->read_retries field)
> >defaulted to 0 on legacy and SLC Micron ONFI NAND. I'll summarize the
> >datasheets I'm referring to:
> >
> >Needs retry   Cell-type    Part number          Vendor revision    Byte 180
> >-----------   ---------    ----------------     ---------------    ------------
> >No            SLC          MT29F16G08ABABA      01h, 00h           Reserved (0)
> >No            MLC          MT29F32G08CBABA      01h, 00h           Reserved (0)
> >No            SLC          MT29F1G08AACWP       01h, 00h           0
> >Yes           MLC          MT29F32G08CBADA      01h, 00h           08h
> >Yes           MLC          MT29F64G08CBABA      02h, 00h           08h
> >
> >These datasheets range from 2008 to 2012, and they cover the range of
> >ONFI 1.0 to ONFI 2.3, so I think it's safe to say that all Micron
> >revisions support this vendor block "number of read retry modes" byte.
> >
> Yes, I have a Micron MLC data-sheet [1] which marks 'read_retry_options'
> Byte 180 as reserved (0x0). So asked this question..
> 
> 
> >At most, we chould check for:
> >
> >	if (le16_to_cpu(p->vendor_revision) < 0x0100)

BTW, I was doing the endian swapping wrong. This should be:

	if (le16_to_cpu(p->vendor_revision) < 1)

> >		return;
> >
> But, your above table shows that there are MLC NAND devices with
> Vendor_revision == 01h, 00h there are both types of 'read_retry_options'
> "Reserved(0)" and "08h".

The above check is probably not needed at all, but it's really only
there to rule out the possibility that somewhere there is an
incompatible vendor_revision 00h for Micron.

> So, I think it would be safe to check for below:
> @@nand_do_read_ops (..)
> +		if (mtd->ecc_stats.failed - ecc_failures) {
> +				if ((retry_mode + 1 < chip->read_retries) && 
> + ----->>>>				(!micron->read_retry_options)) {
> +			
> +		/* Then proceed with retry sequence */

No, this removes the abstraction that I intentionally placed there. See
my reasoning below.

> >> >diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h
> >> >@@ -518,6 +521,7 @@ struct nand_buffers {
> >> >  *			non 0 if ONFI supported.
> >> >  * @onfi_params:	[INTERN] holds the ONFI page parameter when ONFI is
> >> >  *			supported, 0 otherwise.
> >> >+ * @read_retries:	[INTERN] the number of read retry modes supported
> >> >  * @onfi_set_features:	[REPLACEABLE] set the features for ONFI nand
> >> >  * @onfi_get_features:	[REPLACEABLE] get the features for ONFI nand
> >> >  * @bbt:		[INTERN] bad block table pointer
> >> >@@ -589,6 +593,8 @@ struct nand_chip {
> >> > 	int onfi_version;
> >> > 	struct nand_onfi_params	onfi_params;
> >> >
> >> >+	int read_retries;
> >> >+
> >> >
> >> I think we should not add new field 'read_retries' to 'struct nand_chip' bcoz
> >> - It's a vendor specific feature,
> >
> >I don't know that much about other vendor's read-retry support, but I
> >expect that they would have a similar parameter to control how many
> >times we retry. This is not intended to be vendor-specific, and we can
> >adapt if other vendors' approach is too different. (However, I suspect
> >that we can't do this very well for some vendors; they don't
> >standardize, and they are getting more and more difficult to support
> >reliably without deep knowledge of their quirks.)
> >
> >> - And also specifically for MLC NAND.
> >
> >Your point? Where do you propose to put MLC information?
> >
> >> Alternatively you can indirectly access this as part of onfi_params ..
> >> (struct nand_onfi_vendor_micron *) micron = chip->onfi_param->vendor
> >> And then micron->read_retries.
> >
> >We shouldn't be probing our vendor-specific properties on every
> >invocation of nand_do_read_ops(); it should either be in a field (like
> >my nand_chip.read_retries) or in a callback that we assign per-vendor,
> >like Huang suggested.
> >
> No, I'm not saying to probe it every time.
> 
> But I'm not getting one thing here that..
> (1) 'struct nand_onfi_param' is itself is packed into 'struct nand_chip'.
>   @@nand.h 	struct nand_onfi_params	onfi_params;
> (2) And 'vendor block' information is already part of 
>     'struct nand_onfi_params' marked as vendor[88].
> [Patch 2/4]
> -	u8 reserved5[90];
> +	__le16 vendor_revision;
> +	u8 vendor[88];
> 
> Then why do we need another field 'int read_retries' in nand_chip ?

It's called an abstraction: the property 'read_retries' is a general
property that could apply to many different chips, some of which are not
ONFI compatible.

> As 'read_retry_options' information is already packed in
> 'struct nand_chip' as part of 'onfi_param.vendor[]'. So all we just 
> need is to re-cast the u8 vendor[] array into meaningful fields.
> .. correct ? Am I missing something here ?

As Huang answered in this thread, other vendors indeed have a similar
property that (if someone wants to add support) can be represented in
the same field. Thus, the generic code (nand_do_read_ops) only has to
handle the abstract property (nand_chip.read_retries) rather than the
specifics for each vendor (i.e., casting the ONFI structure for Micron;
parsing from ID for Toshiba; etc.).

> So information in vendor[] is same as any other field in nand_chip,
> Then why do we need to duplicate this data.

Try to apply this question to the "minimum required ECC strength"
property. It is present in nand_onfi_params.ecc_bits (or in the EPP);
but because there is more than way to retrieve this info, we established
nand_chip.ecc_strength_ds as an abstraction.

Brian

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] mtd: nand: support Micron READ RETRY
  2013-12-17  5:11               ` Huang Shijie
@ 2013-12-17  7:01                 ` Brian Norris
  2013-12-17  7:12                   ` Huang Shijie
  0 siblings, 1 reply; 18+ messages in thread
From: Brian Norris @ 2013-12-17  7:01 UTC (permalink / raw)
  To: Huang Shijie; +Cc: linux-mtd, Huang Shijie, Pekon Gupta, Ezequiel Garcia

On Tue, Dec 17, 2013 at 01:11:25PM +0800, Huang Shijie wrote:
> On Mon, Dec 16, 2013 at 09:06:13PM -0800, Brian Norris wrote:
> > On Tue, Dec 17, 2013 at 12:23:08PM +0800, Huang Shijie wrote:
> > > On Mon, Dec 16, 2013 at 08:43:42PM -0800, Brian Norris wrote:
> > > > > 
> > > > > The DMA warning only occurs when we enable the CONFIG_DMA_API_DEBUG.
> > Particularly, I think it is quite reasonable to restrict DMA to
> > mtd_read()/mtd_write()-like operations -- so this trickles down to
> > read_page/write_page, but not to device configuration operations like
> > ONFI set_features/get_features/read parameter tables, etc.
> > 
> > Do you disagree? Do you see some need to perform DMA on all operations?
> 
> The gpmi is much coupled with the MXS-DMA engine. the gpmi can uses the
> mxs-dma to set the gpmi registers.
> 
> Use the DMA makes the GPMI code very simple, it really makes the code very
> _ugly_ if we do not use the DMA for read_buf/write_buf.

It's not just about ugliness, it's about maintainability. I can clean up
a few cases here and there that don't conform to your driver's needs now
(notice I pointed out another instance of this in Uwe's patch), but
unless we establish better guidelines for NAND, this is fragile.

As a diversion, let's look at atmel_nand for an example. It uses DMA as
well, but it dodges some of your problems using a heuristic, so that it
only uses DMA for larger-than-OOB-size transactions (see
atmel_read_buf). Additionally, it has a hack-y check for highmem, to try
to confirm that the region it's trying to map is, in fact, DMA-able.

So, atmel_nand does not have an ideal solution, but it at least attempts
to solve parts of this problem. GPMI, on the other hand, ignores it
entirely.

> If you insist that it is okay to use the local array in the nand_base.c,
> I can use kmalloc/memcpy in the gpmi's read_buf/write_buf hooks. In such a 
> way, we can make the gpmi looks more normal.

I think a bounce buffer is necessary, if you really want to use DMA all
the time; some transactions are just not worth initializing DMA-able
buffers in the high-level code. But even better, can GPMI easily utilize
some sort of PIO mode for small/slow-path transactions like this, or is
it inextricably tied to the MXS-DMA engine?

Brian

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] mtd: nand: support Micron READ RETRY
  2013-12-17  7:01                 ` Brian Norris
@ 2013-12-17  7:12                   ` Huang Shijie
  0 siblings, 0 replies; 18+ messages in thread
From: Huang Shijie @ 2013-12-17  7:12 UTC (permalink / raw)
  To: Brian Norris; +Cc: linux-mtd, Huang Shijie, Pekon Gupta, Ezequiel Garcia

On Mon, Dec 16, 2013 at 11:01:03PM -0800, Brian Norris wrote:
> On Tue, Dec 17, 2013 at 01:11:25PM +0800, Huang Shijie wrote:
> > On Mon, Dec 16, 2013 at 09:06:13PM -0800, Brian Norris wrote:
> > > On Tue, Dec 17, 2013 at 12:23:08PM +0800, Huang Shijie wrote:
> > > > On Mon, Dec 16, 2013 at 08:43:42PM -0800, Brian Norris wrote:
> > > > > > 
> > > > > > The DMA warning only occurs when we enable the CONFIG_DMA_API_DEBUG.
> > > Particularly, I think it is quite reasonable to restrict DMA to
> > > mtd_read()/mtd_write()-like operations -- so this trickles down to
> > > read_page/write_page, but not to device configuration operations like
> > > ONFI set_features/get_features/read parameter tables, etc.
> > > 
> > > Do you disagree? Do you see some need to perform DMA on all operations?
> > 
> > The gpmi is much coupled with the MXS-DMA engine. the gpmi can uses the
> > mxs-dma to set the gpmi registers.
> > 
> > Use the DMA makes the GPMI code very simple, it really makes the code very
> > _ugly_ if we do not use the DMA for read_buf/write_buf.
> 
> It's not just about ugliness, it's about maintainability. I can clean up
> a few cases here and there that don't conform to your driver's needs now
> (notice I pointed out another instance of this in Uwe's patch), but
> unless we establish better guidelines for NAND, this is fragile.
> 
> As a diversion, let's look at atmel_nand for an example. It uses DMA as
> well, but it dodges some of your problems using a heuristic, so that it
> only uses DMA for larger-than-OOB-size transactions (see
> atmel_read_buf). Additionally, it has a hack-y check for highmem, to try
> to confirm that the region it's trying to map is, in fact, DMA-able.
> 
> So, atmel_nand does not have an ideal solution, but it at least attempts
> to solve parts of this problem. GPMI, on the other hand, ignores it
> entirely.

ok.

i will fix it by a new patch to add more checks for the read_buf/write_buf.


> 
> > If you insist that it is okay to use the local array in the nand_base.c,
> > I can use kmalloc/memcpy in the gpmi's read_buf/write_buf hooks. In such a 
> > way, we can make the gpmi looks more normal.
> 
> I think a bounce buffer is necessary, if you really want to use DMA all
> the time; some transactions are just not worth initializing DMA-able
> buffers in the high-level code. But even better, can GPMI easily utilize
> some sort of PIO mode for small/slow-path transactions like this, or is
yes, the gpmi can utilize the PIO mode for the transactions.


> it inextricably tied to the MXS-DMA engine?

You can think the mxs-dma is designed for GPMI.


thanks
Huang Shijie

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2013-12-17  7:42 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-12-09 20:09 [PATCH v2 1/4] mtd: nand: localize ECC failures per page Brian Norris
2013-12-09 20:09 ` [PATCH v2 2/4] mtd: nand: add ONFI vendor block for Micron Brian Norris
2013-12-09 20:09 ` [PATCH v2 3/4] mtd: nand: support Micron READ RETRY Brian Norris
2013-12-11 13:54   ` Huang Shijie
2013-12-11 19:03     ` Brian Norris
2013-12-11 20:31       ` Ezequiel Garcia
2013-12-12  3:49         ` Huang Shijie
2013-12-12  3:47       ` Huang Shijie
2013-12-17  4:43         ` Brian Norris
2013-12-17  4:23           ` Huang Shijie
2013-12-17  5:06             ` Brian Norris
2013-12-17  5:11               ` Huang Shijie
2013-12-17  7:01                 ` Brian Norris
2013-12-17  7:12                   ` Huang Shijie
2013-12-09 20:09 ` [PATCH v2 4/4] mtd: nand: use __packed shorthand Brian Norris
     [not found] <20980858CB6D3A4BAE95CA194937D5E73EA53CF1@DBDE04.ent.ti.com>
2013-12-11 18:37 ` [PATCH v2 3/4] mtd: nand: support Micron READ RETRY Brian Norris
2013-12-11 20:54   ` Gupta, Pekon
2013-12-17  5:22     ` Brian Norris

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).