public inbox for linux-mtd@lists.infradead.org
* Erasing NAND bad blocks?
From: Steven Hein @ 2005-08-09 16:41 UTC
  To: linux-mtd

(Yes, I do know that erasing NAND flash blocks that are marked bad
is a VERY BAD IDEA.....I'm asking the question regarding a
specific HW/SW debug situation.......)

In the course of bringing up new hardware with NAND flash attached
I have had occasions where a software bug will cause a NAND-based
filesystem (such as YAFFS) to mark *all* of the blocks in a filesystem
as bad.  In the past, I have hacked the nand_erase() function to
allow erasing of bad blocks, then wrote a custom app to scan the OOB
data, doing a MEM_ERASE for blocks that had been marked bad by the FS.
Just wondering.....has anyone else
run into this situation, and is there a more graceful way of doing
this (i.e. without hacking the MTD NAND driver)?

Thanks!
Steve

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Steve Hein (ssh@sgi.com)              Engineering Diagnostics/Software
Silicon Graphics, Inc.                          
1168 Industrial Blvd.                 Phone: (715) 726-8410
Chippewa Falls, WI 54729              Fax:   (715) 726-6715
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~	


* Re: Erasing NAND bad blocks?
From: Thomas Gleixner @ 2005-08-09 23:33 UTC
  To: Steven Hein; +Cc: linux-mtd

On Tue, 2005-08-09 at 11:41 -0500, Steven Hein wrote:
> (Yes, I do know that erasing NAND flash blocks that are marked bad
> is a VERY BAD IDEA.....I'm asking the question regarding a
> specific HW/SW debug situation.......)
> 
> In the course of bringing up new hardware with NAND flash attached
> I have had occasions where a software bug will cause a NAND-based
> filesystem (such as YAFFS) to mark *all* of the blocks in a filesystem
> as bad.  In the past, I have hacked the nand_erase() function to
> allow erasing of bad blocks, then wrote a custom app to scan the OOB
> data, doing a MEM_ERASE for blocks that had been marked bad by the FS.
> Just wondering.....has anyone else
> run into this situation, and is there a more graceful way of doing
> this (i.e. without hacking the MTD NAND driver)?

This topic comes around on a regular basis.

I was more than once tempted to provide a user space interface to do
that. There are pros and cons.

The con which holds me off is the experience that >90% of the bad block
complaints are related to buggy hardware and board drivers. Most of the
people who fall into this trap spend plenty of time tinkering around the
real problem.

Adding this functionality (it's simple) would just add some more
confusing reports like "Hey, I erased all the bad blocks with the
--force-erase-bad-block option, but it still shows me those annoying
messages".

OTOH, we do enough witchcraft based help already, as obviously a lot of
those people
- ignore documentation
- are highly resistant to advice
- ...

So it might just add another shade of nerve-racking, but be helpful for
those who know what they are doing.

I'm not going to implement it myself, but patches are welcome. :)

Hint: 

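	case MEMERASEFORCED:
		/* placeholder: reject callers without root privileges */
		check_root_priviledges();
		force = 1;
		/* fall through to the normal erase path */
	case MEMERASE:
		......
		erase->state = force ? ...FORCE... : 0;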

erase.state seems to be a nice transport mechanism and it will uncover
all un/half initialized instances of struct erase_info by a simple check
for ...FORCE... || 0
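
The idea in the hint can be modelled in user space as a minimal sketch.
Everything named here is hypothetical: ERASE_STATE_FORCE, struct
erase_request and erase_check() are illustration-only stand-ins, not
existing MTD symbols, and the errno values are assumptions. The point it
shows is that a state field which must be exactly "force" or 0 rejects
half-initialized requests, and that the forced path is gated on privilege.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical values; the hint above leaves the real name elided. */
#define ERASE_STATE_NORMAL 0u
#define ERASE_STATE_FORCE  0x464f5243u

/* Hypothetical stand-in for the erase request passed down to the driver. */
struct erase_request {
	unsigned int block;  /* index of the block to erase */
	unsigned int state;  /* ERASE_STATE_NORMAL or ERASE_STATE_FORCE */
};

/*
 * Decide whether an erase may proceed.
 * Returns 0 if it may, or a negative errno value if it must be refused.
 */
int erase_check(const struct erase_request *req, const bool *is_bad,
		unsigned int nblocks, bool is_root)
{
	/* Anything other than the two known values means the caller
	 * never initialized the field properly. */
	if (req->state != ERASE_STATE_NORMAL && req->state != ERASE_STATE_FORCE)
		return -EINVAL;
	if (req->block >= nblocks)
		return -EINVAL;
	/* Forcing is a privileged operation. */
	if (req->state == ERASE_STATE_FORCE && !is_root)
		return -EPERM;
	/* Bad blocks stay protected unless the erase is forced. */
	if (is_bad[req->block] && req->state != ERASE_STATE_FORCE)
		return -EIO;
	return 0;
}
```

A request with garbage in state fails with -EINVAL before anything is
touched, which is the "uncover un/half initialized instances" effect.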


tglx


* Re: Erasing NAND bad blocks?
From: Charles Manning @ 2005-08-10 20:08 UTC
  To: linux-mtd, tglx; +Cc: Steven Hein

On Wednesday 10 August 2005 11:33, Thomas Gleixner wrote:
> On Tue, 2005-08-09 at 11:41 -0500, Steven Hein wrote:
> > (Yes, I do know that erasing NAND flash blocks that are marked bad
> > is a VERY BAD IDEA.....I'm asking the question regarding a
> > specific HW/SW debug situation.......)
> >
> > In the course of bringing up new hardware with NAND flash attached
> > I have had occasions where a software bug will cause a NAND-based
> > filesystem (such as YAFFS) to mark *all* of the blocks in a filesystem
> > as bad.  In the past, I have hacked the nand_erase() function to
> > allow erasing of bad blocks, then wrote a custom app to scan the OOB
> > data, doing a MEM_ERASE for blocks that had been marked bad by the FS.
> > Just wondering.....has anyone else
> > run into this situation, and is there a more graceful way of doing
> > this (i.e. without hacking the MTD NAND driver)?
>

As this comes around often wrt YAFFS1, I think there are two things that can 
be done in YAFFS to help address the issue:
1) Make the bad block marker used in YAFFS1 something that is easy to 
recognise (eg. 'Y' ) instead of 0x00. That way it should be easy to recognise 
yaffs -vs- factory marked bad blocks.
2) Add a config in yaffs to not do any bad block marking during board bring 
up.
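
Point 1 could look roughly like the sketch below. It assumes the usual
NAND convention that a good block reads 0xFF in its bad block marker
byte, and it takes the proposed 'Y' value at face value. The enum and
function names are made up for illustration and are not yaffs code. A
real tool would also have to allow for bit flips in the marker byte,
which this sketch ignores.

```c
/* Hypothetical classification of a block's bad block marker byte. */
enum marker_kind {
	MARKER_GOOD,   /* 0xFF: block is not marked bad */
	MARKER_YAFFS,  /* 'Y': marked bad by yaffs (proposed marker) */
	MARKER_OTHER   /* anything else: factory or other marking */
};

/* Classify one marker byte read from the block's OOB area. */
enum marker_kind classify_marker(unsigned char status)
{
	if (status == 0xFF)
		return MARKER_GOOD;
	if (status == 'Y')
		return MARKER_YAFFS;
	return MARKER_OTHER;
}
```

With the current 0x00 marker a yaffs marking lands in MARKER_OTHER and
cannot be told apart from a factory marking that also reads 0x00.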

-- CHarles


* Re: Erasing NAND bad blocks?
From: Sergei Sharonov @ 2005-08-10 21:34 UTC
  To: linux-mtd

Charles,


> 2) Add a config in yaffs to not do any bad block marking during board bring 
> up.

I suggest disabling the retirement of blocks that fail ECC on read. I see it
happening during power cycling. YAFFS leaks good blocks, e.g. a disrupted
write/erase creates bad ECC and then GC retires a perfectly good block. JFFS2
does not do that. AFAIK, manufacturers suggest discarding only blocks that fail
on write.
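
As a rough sketch of the policy I mean (all names are hypothetical, this
is not YAFFS or JFFS2 code): a block is retired on a failed write, and a
read ECC error alone only retires it if the old behaviour is explicitly
switched on. Treating a failed erase like a failed write is my own
assumption here.

```c
#include <stdbool.h>

/* Hypothetical events a filesystem can see for a NAND block. */
enum nand_event {
	EV_READ_ECC_ERROR,  /* uncorrectable ECC error on read */
	EV_WRITE_FAILED,    /* chip reported a failed page program */
	EV_ERASE_FAILED     /* chip reported a failed block erase */
};

/*
 * Return true if the block should be retired (marked bad).
 * retire_on_read_ecc selects the old behaviour of also retiring
 * blocks after a read ECC error.
 */
bool should_retire(enum nand_event ev, bool retire_on_read_ecc)
{
	switch (ev) {
	case EV_WRITE_FAILED:
	case EV_ERASE_FAILED:  /* assumption: handled like a write failure */
		return true;
	case EV_READ_ECC_ERROR:
		return retire_on_read_ecc;
	}
	return false;
}
```

With retire_on_read_ecc off, a block whose write was cut short by power
loss is erased and reused instead of being leaked.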

Sergei 


* Re: Erasing NAND bad blocks?
From: Thomas Gleixner @ 2005-08-10 21:38 UTC
  To: Charles Manning; +Cc: linux-mtd, Steven Hein

On Thu, 2005-08-11 at 08:08 +1200, Charles Manning wrote:
> As this comes around often wrt YAFFS1, I think there are two things that can 
> be done in YAFFS to help address the issue:
> 1) Make the bad block marker used in YAFFS1 something that is easy to 
> recognise (eg. 'Y' ) instead of 0x00. That way it should be easy to recognise 
> yaffs -vs- factory marked bad blocks.
> 2) Add a config in yaffs to not do any bad block marking during board bring 
> up.

2) is the safe way, but it should be blocked in the MTD/NAND layer to
protect all fs developers from the usual complaints resulting from "trial
and error bring-up".

tglx
