From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mx1.redhat.com ([66.187.233.31]) by canuck.infradead.org with esmtps (Exim 4.43 #1 (Red Hat Linux)) id 1Crkor-0000i0-8I for linux-mtd@lists.infradead.org; Thu, 20 Jan 2005 17:30:27 -0500 Message-ID: <41F03217.3070300@redhat.com> Date: Thu, 20 Jan 2005 16:35:03 -0600 From: "David A. Marlin" MIME-Version: 1.0 To: tglx@linutronix.de References: <41ED2EB3.1070203@redhat.com> <1106068748.16877.109.camel@tglx.tec.linutronix.de> <41ED50B7.9000107@redhat.com> <1106082905.16877.141.camel@tglx.tec.linutronix.de> Content-Type: multipart/mixed; boundary="------------010202070704020408020205" Cc: MTD List Subject: Re: additional error checks for AG-AND erase/write List-Id: Linux MTD discussion mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , This is a multi-part message in MIME format. --------------010202070704020408020205 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Thomas Gleixner wrote: : > > Rename nand_read_ecc to do_nand_read_ecc and add an flag argument. Look > at the erase function, where we have the additional argument for > allowbbt. > flag bits: 0-7 tolerable errors > 8 do not get chip > or something like that. > Create a new wrapper function nand_read_ecc which calls do_nand_read_ecc > with flags = 0xff. Depending on this argument we grab the device and > release it on the end and react on the error check. > The same call can be done from erase in the failure path. > > To make this actually work we should add a member like > int (*ext_errchk) (.....) > which is checked in the error path > if (this->ext_errchk) { > res = this->ext_errchk(....); > } > This is a board supplied function, as the board driver knows how much > errors are acceptable. We return the number of corrected bits anyway in > the correct_data() function, but we check only for -1 at the moment. > An additional check does not hurt here, as we are in the slow path > again. We check the return value of correct_data() for > (res == -1 || res > flags & 0xff) which ensures in the normal case this > tolerance will never bite us, as error correction of 256 bits is unreal. > > Hope that helps. I think this will make the code not uglier than it is > anyway and keeps it nicely dependend on the board drivers featurelist. Attached is a patch that I think implements what you described. Please let me know if I've missed anything or gone astray in the details. Note: I changed a few literals to defined symbols in 'nand_base.c'. Please let me know if you would prefer this in a separate patch (or not at all). Thank you, d.marlin ========= --------------010202070704020408020205 Content-Type: text/plain; name="ag-and-05.patch" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="ag-and-05.patch" Index: include/linux/mtd/nand.h =================================================================== RCS file: /home/cvs/mtd/include/linux/mtd/nand.h,v retrieving revision 1.69 diff -u -r1.69 nand.h --- include/linux/mtd/nand.h 17 Jan 2005 18:29:18 -0000 1.69 +++ include/linux/mtd/nand.h 20 Jan 2005 20:48:12 -0000 @@ -50,6 +50,8 @@ * update of nand_chip structure description * 01-17-2005 dmarlin added extended commands for AG-AND device and added option * for BBT_AUTO_REFRESH. + * 01-20-2005 dmarlin added optional pointer to hardware specific callback for + * extra error status checks. */ #ifndef __LINUX_MTD_NAND_H #define __LINUX_MTD_NAND_H @@ -164,7 +166,7 @@ /* * Constants for Hardware ECC -*/ + */ /* Reset Hardware ECC for read */ #define NAND_ECC_READ 0 /* Reset Hardware ECC for write */ @@ -172,6 +174,10 @@ /* Enable Hardware ECC before syndrom is read back from flash */ #define NAND_ECC_READSYN 2 +/* Bit mask for flags passed to do_nand_read_ecc */ +#define NAND_GET_DEVICE 0x80 + + /* Option constants for bizarre disfunctionality and real * features */ @@ -308,6 +314,8 @@ * @badblock_pattern: [REPLACEABLE] bad block scan pattern used for initial bad block scan * @controller: [OPTIONAL] a pointer to a hardware controller structure which is shared among multiple independend devices * @priv: [OPTIONAL] pointer to private chip date + * @errstat: [OPTIONAL] hardware specific function to perform additional error status checks + * (determine if errors are correctable) */ struct nand_chip { @@ -363,6 +371,7 @@ struct nand_bbt_descr *badblock_pattern; struct nand_hw_control *controller; void *priv; + int (*errstat)(struct mtd_info *mtd, struct nand_chip *this, int state, int status, int page); }; /* @@ -484,6 +493,9 @@ extern int nand_default_bbt (struct mtd_info *mtd); extern int nand_isbad_bbt (struct mtd_info *mtd, loff_t offs, int allowbbt); extern int nand_erase_nand (struct mtd_info *mtd, struct erase_info *instr, int allowbbt); +extern int do_nand_read_ecc (struct mtd_info *mtd, loff_t from, size_t len, + size_t * retlen, u_char * buf, u_char * oob_buf, + struct nand_oobinfo *oobsel, int flags); /* * Constants for oob configuration Index: drivers/mtd/nand/nand_base.c =================================================================== RCS file: /home/cvs/mtd/drivers/mtd/nand/nand_base.c,v retrieving revision 1.128 diff -u -r1.128 nand_base.c --- drivers/mtd/nand/nand_base.c 18 Jan 2005 16:14:56 -0000 1.128 +++ drivers/mtd/nand/nand_base.c 20 Jan 2005 20:49:37 -0000 @@ -42,6 +42,10 @@ * a "device recovery" operation must be performed when power is restored * to ensure correct operation. * + * 01-20-2005 dmarlin: added support for optional hardware specific callback routine to + * perform extra error status checks on erase and write failures. This required + * adding a wrapper function for nand_read_ecc. + * * Credits: * David Woodhouse for adding multichip support * @@ -480,7 +484,7 @@ struct nand_chip *this = mtd->priv; /* Check the WP bit */ this->cmdfunc (mtd, NAND_CMD_STATUS, -1, -1); - return (this->read_byte(mtd) & 0x80) ? 0 : 1; + return (this->read_byte(mtd) & NAND_STATUS_WP) ? 0 : 1; } /** @@ -585,7 +589,7 @@ this->hwcontrol(mtd, NAND_CTL_SETCLE); this->write_byte(mtd, NAND_CMD_STATUS); this->hwcontrol(mtd, NAND_CTL_CLRCLE); - while ( !(this->read_byte(mtd) & 0x40)); + while ( !(this->read_byte(mtd) & NAND_STATUS_READY)); return; /* This applies to read commands */ @@ -692,7 +696,7 @@ this->hwcontrol(mtd, NAND_CTL_SETCLE); this->write_byte(mtd, NAND_CMD_STATUS); this->hwcontrol(mtd, NAND_CTL_CLRCLE); - while ( !(this->read_byte(mtd) & 0x40)); + while ( !(this->read_byte(mtd) & NAND_STATUS_READY)); return; case NAND_CMD_READ0: @@ -896,8 +900,14 @@ if (!cached) { /* call wait ready function */ status = this->waitfunc (mtd, this, FL_WRITING); + + /* See if operation failed and additional status checks are available */ + if ((status & NAND_STATUS_FAIL) && (this->errstat)) { + status = this->errstat(mtd, this, FL_WRITING, status, page); + } + /* See if device thinks it succeeded */ - if (status & 0x01) { + if (status & NAND_STATUS_FAIL) { DEBUG (MTD_DEBUG_LEVEL0, "%s: " "Failed write, page 0x%08x, ", __FUNCTION__, page); return -EIO; } @@ -1052,6 +1062,30 @@ static int nand_read_ecc (struct mtd_info *mtd, loff_t from, size_t len, size_t * retlen, u_char * buf, u_char * oob_buf, struct nand_oobinfo *oobsel) { + return do_nand_read_ecc(mtd, from, len, retlen, buf, oob_buf, oobsel, 0xff); +} + + +/** + * do_nand_read_ecc - [MTD Interface] Read data with ECC + * @mtd: MTD device structure + * @from: offset to read from + * @len: number of bytes to read + * @retlen: pointer to variable to store the number of read bytes + * @buf: the databuffer to put data + * @oob_buf: filesystem supplied oob data buffer + * @oobsel: oob selection structure + * @flags: flag to indicate if nand_get_device/nand_release_device should be preformed + * and how many corrected error bits are acceptable: + * bits 0..7 - number of tolerable errors + * bit 8 - 0 == do not get/release chip, 1 == get/release chip + * + * NAND read with ECC + */ +int do_nand_read_ecc (struct mtd_info *mtd, loff_t from, size_t len, + size_t * retlen, u_char * buf, u_char * oob_buf, + struct nand_oobinfo *oobsel, int flags) +{ int i, j, col, realpage, page, end, ecc, chipnr, sndcmd = 1; int read = 0, oob = 0, ecc_status = 0, ecc_failed = 0; struct nand_chip *this = mtd->priv; @@ -1076,7 +1110,8 @@ } /* Grab the lock and see if the device is available */ - nand_get_device (this, mtd, FL_READING); + if (flags & NAND_GET_DEVICE) + nand_get_device (this, mtd, FL_READING); /* use userspace supplied oobinfo, if zero */ if (oobsel == NULL) @@ -1180,7 +1215,8 @@ /* We calc error correction directly, it checks the hw * generator for an error, reads back the syndrome and * does the error correction on the fly */ - if (this->correct_data(mtd, &data_poi[datidx], &oob_data[i], &ecc_code[i]) == -1) { + ecc_status = this->correct_data(mtd, &data_poi[datidx], &oob_data[i], &ecc_code[i]); + if ((ecc_status == -1) || (ecc_status > (flags && 0xff))) { DEBUG (MTD_DEBUG_LEVEL0, "nand_read_ecc: " "Failed ECC read, page 0x%08x on chip %d\n", page, chipnr); ecc_failed++; @@ -1219,7 +1255,7 @@ p[i] = ecc_status; } - if (ecc_status == -1) { + if ((ecc_status == -1) || (ecc_status > (flags && 0xff))) { DEBUG (MTD_DEBUG_LEVEL0, "nand_read_ecc: " "Failed ECC read, page 0x%08x\n", page); ecc_failed++; } @@ -1289,7 +1325,8 @@ } /* Deselect and wake up anyone waiting on the device */ - nand_release_device(mtd); + if (flags & NAND_GET_DEVICE) + nand_release_device(mtd); /* * Return success, if no ECC failures, else -EBADMSG @@ -1758,7 +1795,7 @@ status = this->waitfunc (mtd, this, FL_WRITING); /* See if device thinks it succeeded */ - if (status & 0x01) { + if (status & NAND_STATUS_FAIL) { DEBUG (MTD_DEBUG_LEVEL0, "nand_write_oob: " "Failed write, page 0x%08x\n", page); ret = -EIO; goto out; @@ -2103,8 +2140,13 @@ status = this->waitfunc (mtd, this, FL_ERASING); + /* See if operation failed and additional status checks are available */ + if ((status & NAND_STATUS_FAIL) && (this->errstat)) { + status = this->errstat(mtd, this, FL_ERASING, status, page); + } + /* See if block erase succeeded */ - if (status & 0x01) { + if (status & NAND_STATUS_FAIL) { DEBUG (MTD_DEBUG_LEVEL0, "nand_erase: " "Failed erase, page 0x%08x\n", page); instr->state = MTD_ERASE_FAILED; instr->fail_addr = (page << this->page_shift); --------------010202070704020408020205--