From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mail-pb0-x22a.google.com ([2607:f8b0:400e:c01::22a])
	by merlin.infradead.org with esmtps (Exim 4.80.1 #2 (Red Hat Linux))
	id 1WUuBx-0002qi-An
	for linux-mtd@lists.infradead.org; Tue, 01 Apr 2014 08:37:09 +0000
Received: by mail-pb0-f42.google.com with SMTP id rr13so9535259pbb.29
	for ; Tue, 01 Apr 2014 01:36:47 -0700 (PDT)
Date: Tue, 1 Apr 2014 01:36:44 -0700
From: Brian Norris
To: David Mosberger
Subject: Re: [PATCH] mtd: nand: Add support for Micron on-die ECC controller (rev2).
Message-ID: <20140401083644.GG6400@brian-ubuntu>
References: <1395875121-9320-1-git-send-email-davidm@egauge.net>
	<20980858CB6D3A4BAE95CA194937D5E73EAB5F84@DBDE04.ent.ti.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Cc: Gerhard Sittig, "linux-mtd@lists.infradead.org", "Gupta, Pekon",
	"dedekind1@gmail.com"
List-Id: Linux MTD discussion mailing list

On Fri, Mar 28, 2014 at 09:52:37AM -0600, David Mosberger wrote:
> On Thu, Mar 27, 2014 at 12:56 AM, Gupta, Pekon wrote:
>
> >>+	set_on_die_ecc(mtd, chip, 1);
> >>+
> >>+	chkoob = chkbuf + mtd->writesize;
> >>+	rawoob = rawbuf + mtd->writesize;
> >>+	eccpos = chip->ecc.layout->eccpos;
> >>+	for (i = 0; i < chip->ecc.steps; ++i) {
> >>+		/* Count bit flips in the actual data area: */
> >>+		flips = bitdiff(chkbuf, rawbuf, chip->ecc.size);
> >>+		/* Count bit flips in the ECC bytes: */
> >>+		for (j = 0; j < chip->ecc.bytes; ++j) {
> >
> > You should check bit-flips in complete OOB region (mtd->oobsize) not
> > just ecc.bytes.
>
> I was under the impression that OOB data bytes cannot be assumed to be
> ECC protected. As it happens, when using Internal ECC on those Micron
> chips, *some* of the OOB databytes are ECC protected, but I didn't
> think it was necessary to count those for bitflips, since OOB users
> won't assume ECC protection anyhow.
> Am I wrong about that?

This is a touchy subject. Most of your comments are correct;
traditionally, ECC did not protect OOB, and some of the main users of
OOB (like JFFS2) assume that it isn't protected. That assumption is
patently false on some modern systems, which do protect it.

But for this case, the max_bitflips count is used for determining when
the error rate is unacceptably high, not just to see whether your data
is currently corrupt. So if extra bitflips in this (otherwise
unimportant) spare area might eventually cause the ECC hardware to
report an error, then we need to count them.

Brian
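
For illustration only (this is not code from the patch): a minimal
userspace sketch of the counting rule described above, where each ECC
step's total includes its share of the spare area. The buffer layout
(all data steps first, then an equal OOB slice per step) and the helper
name max_bitflips_per_step() are assumptions made for the example.
bitdiff() is a stand-in for the patch's helper of the same name, whose
real implementation is not shown in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the bits that differ between two buffers of len bytes. */
static unsigned int bitdiff(const uint8_t *a, const uint8_t *b, size_t len)
{
	unsigned int flips = 0;
	size_t i;

	for (i = 0; i < len; i++) {
		uint8_t x = a[i] ^ b[i];

		while (x) {
			x &= x - 1;	/* clear the lowest set bit */
			flips++;
		}
	}
	return flips;
}

/*
 * Hypothetical helper. ASSUMED layout of both buffers:
 *   [steps * step_size data bytes][steps * oob_per_step OOB bytes]
 * chk is the ECC-corrected read, raw is the uncorrected read.
 * Returns the worst per-step flip count, counting data and OOB together.
 */
static unsigned int max_bitflips_per_step(const uint8_t *chk,
					  const uint8_t *raw,
					  size_t steps, size_t step_size,
					  size_t oob_per_step)
{
	const uint8_t *chkoob = chk + steps * step_size;
	const uint8_t *rawoob = raw + steps * step_size;
	unsigned int max_flips = 0;
	size_t i;

	for (i = 0; i < steps; i++) {
		unsigned int flips;

		/* Bit flips in this step's data area. */
		flips = bitdiff(chk + i * step_size, raw + i * step_size,
				step_size);
		/* Bit flips in this step's whole OOB slice, not just ECC bytes. */
		flips += bitdiff(chkoob + i * oob_per_step,
				 rawoob + i * oob_per_step, oob_per_step);

		if (flips > max_flips)
			max_flips = flips;
	}
	return max_flips;
}
```

Passing oob_per_step = 0 gives the data-only count, which makes it easy
to compare the two policies on the same pair of buffers.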