Date: Wed, 25 Jun 2014 13:31:29 +0200
From: David Jander
To: "Gupta, Pekon"
Cc: "linux-mtd@lists.infradead.org", Ted Juan, "sjhill@realitydiluted.com", "tglx@linutronix.de", Brian Norris, David Woodhouse
Subject: Re: [FRC] [PATCH] MTD: nand_base.c: Enable support for Samsung E-die SLC NAND
Message-ID: <20140625133129.060cd535@archvile>
In-Reply-To: <20980858CB6D3A4BAE95CA194937D5E73EAF7560@DBDE04.ent.ti.com>
References: <1403259137-22171-1-git-send-email-david@protonic.nl> <20980858CB6D3A4BAE95CA194937D5E73EAF6A08@DBDE04.ent.ti.com> <20980858CB6D3A4BAE95CA194937D5E73EAF7560@DBDE04.ent.ti.com>
List-Id: Linux MTD discussion mailing list

Dear Pekon,

On Wed, 25 Jun 2014 10:04:11 +0000
"Gupta, Pekon" wrote:

> Hi Ted,
>
> >From: Ted Juan [mailto:ted.juan@gmail.com]
> >Dear Pekon,
> >
> >I back up the raw data to data2[] before doing elm_decode_bch_error_page();
> >Dump code is as below. The raw data is the same as the corrected
> >data, all with more than 8 bit-flips.
> >
> (a) In that case you should contact the Flash vendor here.
> A fresh NAND device from the factory should not violate the spec.
> I don't suspect a driver issue here, because the raw data read itself
> has random bit-flips.

Sorry to interrupt, but this does sound serious. Are you absolutely sure your
hardware is OK? Is the power-supply clean and well enough decoupled? Timings
within specs? If electrical specifications are not met, this could explain the
bit-flips.
It is possible that Samsung is at fault here (they screwed up the specs for
this version anyway), but double checking the hardware looks like a good idea
here...

> (b) Also, it may be the case that there are a few particular blocks which
> have gone bad and those are showing up again and again at each boot.
> However, if only a handful of blocks on the NAND device had gone
> bad, then the UBI torture test [1] should have detected them and marked them
> bad. And those should not re-appear the next time.
> - You can check (b) by scrubbing all bad-blocks from u-boot
>   #u-boot> nand scrub.chip all
>   #u-boot> nand bad (should report 0 bad blocks)
> - Then, re-boot and let UBI detect bad-blocks on its own using torture-test
> - And then again reset the system a 2nd time and check newly detected
>   bad-blocks
>   #u-boot> nand bad (should report [n] bad blocks)
>
> (c) You can also check whether you are seeing bit-flips only in
> erased-pages. You can identify this by adding prints in u-boot.
> There is a slight difference between the u-boot and kernel omap-gpmc NAND
> drivers,
> - u-boot: simply ignores erased-pages and does not check for bit-flips in
>   them.
> - kernel: counts the number of bit-flips in erased-pages also.
>
>
> >The full data log is put as below but includes some useless dump data.
> >https://drive.google.com/file/d/0BwVGpNFs7l22RmZXTHhJWXFYYWs/edit?usp=sharing
> >
> There will be no correction done if the 'un-correctable error' flag is raised
> by ELM. Therefore pre-correction and post-correction data match in the dump
> below. Bit-flip correction will _only_ happen if the number of bit-flips is
> within the correctable range (that is <=8 for the BCH8 ECC scheme).
>
>
> [1] $kernel/drivers/mtd/ubi/io.c @@ torture_peb()

Best regards,

-- 
David Jander
Protonic Holland.
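
P.S. For anyone wanting to try check (c) or to compare the raw and corrected
dumps by hand, the bit counting can be sketched roughly as below. This is only
an illustration in plain C, not code from the omap-gpmc driver in either
u-boot or the kernel. The function names and the BCH8_MAX_BITFLIPS macro are
made up for this sketch; the limit of 8 is the BCH8 figure quoted above. An
erased page is assumed to read back as all 0xFF, so every 0 bit counts as one
bit-flip.

```c
#include <stddef.h>
#include <stdint.h>

/* BCH8 corrects at most 8 bit-flips, as stated in the thread (hypothetical macro name). */
#define BCH8_MAX_BITFLIPS 8

/* Count the set bits in one byte by clearing the lowest set bit repeatedly. */
unsigned int count_set_bits(uint8_t v)
{
    unsigned int n = 0;

    while (v) {
        v &= (uint8_t)(v - 1);
        n++;
    }
    return n;
}

/*
 * Count bit-flips in a buffer that is expected to be erased (all 0xFF).
 * Every bit that reads back as 0 is treated as one flipped bit.
 */
unsigned int count_erased_bitflips(const uint8_t *buf, size_t len)
{
    unsigned int flips = 0;

    for (size_t i = 0; i < len; i++)
        flips += count_set_bits((uint8_t)~buf[i]);
    return flips;
}

/* Count the bits that differ between two dumps, e.g. raw vs. corrected data. */
unsigned int count_bit_differences(const uint8_t *a, const uint8_t *b, size_t len)
{
    unsigned int diff = 0;

    for (size_t i = 0; i < len; i++)
        diff += count_set_bits((uint8_t)(a[i] ^ b[i]));
    return diff;
}

/* Return 1 if the erased chunk has few enough flips for BCH8, 0 otherwise. */
int erased_chunk_is_correctable(const uint8_t *buf, size_t len)
{
    return count_erased_bitflips(buf, len) <= BCH8_MAX_BITFLIPS;
}
```

Calling count_bit_differences() on the data2[] backup and the post-correction
buffer should return 0 whenever ELM has flagged the chunk as un-correctable,
which would match what the dump shows.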