From: Miquel Raynal <miquel.raynal@bootlin.com>
To: "Lothar Waßmann" <LW@KARO-electronics.de>
Cc: "Daniel Glöckner" <dg@emlix.com>, "Han Xu" <han.xu@nxp.com>,
linux-mtd@lists.infradead.org,
"Brian Norris" <computersforpeace@gmail.com>
Subject: Re: Make NAND_BBT_NO_OOB_BBM configurable or let the gpmi driver decide?
Date: Tue, 15 Mar 2022 09:34:37 +0100
Message-ID: <20220315093437.54737f0d@xps13>
In-Reply-To: <20220315080602.7a2e250d@ipc1.ka-ro>

Hi Lothar,

LW@KARO-electronics.de wrote on Tue, 15 Mar 2022 08:06:02 +0100:

> Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>
> > Hi Daniel,
> >
> > Sorry for the delay.
> >
> > dg@emlix.com wrote on Thu, 24 Feb 2022 19:17:43 +0100:
> >
> > > Hi Miquel,
> > >
> > > On 24.02.22 at 17:03, Miquel Raynal wrote:
> > > > dg@emlix.com wrote on Thu, 24 Feb 2022 16:55:27 +0100:
> > > >> On 24.02.22 at 16:29, Miquel Raynal wrote:
> > > >>> dg@emlix.com wrote on Wed, 23 Feb 2022 11:59:02 +0100:
> > > >>>> On 22.02.22 at 23:02, Han Xu wrote:
> > > >>>>> Could you please describe more details about what kind of
> > > >>>>> error, how to reproduce it and on which kernel version?
> > > >>>>
> > > >>>> You need a flash that has one bad block where programming the
> > > >>>> BBM sets NAND_STATUS_FAIL in its status register. The latest
> > > >>>> kernels should still have problems when this happens in a
> > > >>>> UBI.
> > > >>>
> > > >>> I believe we should try to tackle "why" this happens rather
> > > >>> than try to work around its consequences. Can you give more
> > > >>> details about why we get this status?
> > > >>
> > > >> Uhm, the block is bad, broken. It shows the same behavior even
> > > >> after power cycling. The other blocks are ok. I don't think it
> > > >> is our fault that it died so early.
> > > >
> > > > But why after a power cycle are we trying to write the BBM?
> > >
> > > I did not want to imply that Linux tries to write the block after
> > > every power cycle. UBI notices that the block is broken once and
> > > manages to mark it as bad in the BBT, so after power cycle it will
> > > not try to write to that block again. What I wanted to say is that
> > > manual testing of the block after power cycling shows that the
> > > block remains unusable.
> > >
> > > The problem is that UBI switches to read-only mode after it marked
> > > the block as bad in the BBT because the redundant BBM in the OOB of
> > > the block could not be written.
> >
> > I think I understand your situation better now.
> >
> > So here is our problem: why can't we write the OOB? If there is a
> > good reason this cannot happen, then we can provide the
> > NAND_BBT_NO_OOB_BBM flag. Otherwise we should find the root cause.
> >
> > > And we don't want to get into a situation
> > > where we have to reboot the system, especially if it is because of
> > > something we don't need.
> > >
> > > We could change nand_block_markbad_lowlevel to return success as
> > > long as updating the BBT succeeds, if you think that this is the
> > > correct approach.
> >
> > That is not a correct approach if we did not explicitly ask to
> > bypass writing BBMs.
> >
> The BBM in the OOB area is a "Factory Bad Block Marker" where the
> manufacturer marks initially bad blocks. There is no guarantee that the
> BBM can be written on a block that turned bad later on.

Writing a BBM means programming one byte to 0. We don't care about the
other bytes in the entire page, so we don't care if other bits flip
during this operation. Worst case scenario: none of the bits in the BBM
are programmed. That is quite unlikely, given that it's probably the
"data" which triggered the errors in the first place, and even less
likely knowing that only the first page of the block will receive the
marker, while it may not be this page which showed errors in the first
place.

Anyway, let's assume the bad block marker cannot be programmed. Why
would the raw PROGRAM PAGE operation fail? There is no read back
happening automatically. We need to understand why the NAND op failed
in the first place. I don't think it is related to the page being bad,
but rather to the specific 1-byte write that the driver tries to do. I
believe this issue is gpmi-specific.
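
To make the "one byte to 0" point concrete, here is a minimal user-space
sketch. It is not kernel code and not what the gpmi driver does. It
assumes a hypothetical 64-byte OOB area with the marker at offset 0, and
it treats any marker value other than 0xFF as "bad". The names
sim_program(), sim_mark_bad(), sim_block_is_bad(), OOB_SIZE and
BBM_OFFSET are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OOB_SIZE   64	/* hypothetical OOB size, for illustration only */
#define BBM_OFFSET 0	/* hypothetical marker position in the OOB */

/* NAND programming can only clear bits (1 -> 0): model it as an AND. */
void sim_program(uint8_t *cells, const uint8_t *data, size_t len)
{
	for (size_t i = 0; i < len; i++)
		cells[i] &= data[i];
}

/* Assumption: a block counts as bad when its marker byte is not 0xFF. */
int sim_block_is_bad(const uint8_t *oob)
{
	return oob[BBM_OFFSET] != 0xFF;
}

/* Program a single byte to 0; 0xFF everywhere else leaves cells as-is. */
void sim_mark_bad(uint8_t *oob)
{
	uint8_t buf[OOB_SIZE];

	memset(buf, 0xFF, sizeof(buf));
	buf[BBM_OFFSET] = 0x00;
	sim_program(oob, buf, sizeof(buf));
}
```

In this model, a marker byte in which only some bits got cleared still
reads as something other than 0xFF, so the block is still seen as bad.
Only the case where no bit at all is programmed leaves the block
unmarked.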

> If a block turned BAD during use it is completely useless to try writing
> anything to it.

Not necessarily, in particular if UBI decided to turn it bad. It does
not mean the block has worn out completely, it just means that the
block is about to wear out.

> Depending on the nature of the NAND error that turned
> the block bad, trying to write that block may also affect random other
> blocks.

I don't think this can happen on SLC. And on MLC I believe it is
'correctly' handled thanks to the known pairing scheme.

Thanks,
Miquèl
______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/