Re: UBIFS question - Boris Brezillon

From: Boris Brezillon <boris.brezillon@free-electrons.com>
To: Martin Townsend <mtownsend1973@gmail.com>
Cc: Ricard Wanderlof <ricard.wanderlof@axis.com>,
	Richard Weinberger <richard@nod.at>,
	"linux-mtd@lists.infradead.org" <linux-mtd@lists.infradead.org>
Subject: Re: UBIFS question
Date: Thu, 17 Mar 2016 15:55:44 +0100
Message-ID: <20160317155544.3b43bbb9@bbrezillon>
In-Reply-To: <CABatt_zxp-6s++zLeWfijdTT4fevgTPk97sue-3GGj_6wGgF0w@mail.gmail.com>

Hi Martin,

On Thu, 17 Mar 2016 12:54:43 +0000
Martin Townsend <mtownsend1973@gmail.com> wrote:

> Hi Ricard, Richard
> 
> On Thu, Mar 17, 2016 at 11:43 AM, Ricard Wanderlof
> <ricard.wanderlof@axis.com> wrote:
> >
> >> > We expect the flash devices to start failing quicker than normally
> >> > expected due to the environment in which they will be operating in, so
> >> > sudden NAND blocks turning bad will eventually happen and what we
> >> > would like to do is try and capture this as soon as possible.
> >> > The boards are not accessible as they will be located in very remote
> >> > locations so detecting these failures before the system locks up would
> >> > be an advantage so we can report home with the information and fail
> >> > over to the other filesystem (providing that hasn't also been
> >> > corrupted).
> >>
> >> Dealing with sudden bad NAND blocks is almost impossible.
> >> Unless you have a copy of each block.
> >> NAND is not expected to gain bad blocks without an indication like
> >> correctable bitflips.
> 
> I'm not interested in dealing with sudden bad NAND blocks, I accept
> this will more than likely happen at some point but what I am
> interested in is early detection.  Once the system has booted most
> files will be cached to memory and the product that the flash devices
> are in is designed to run for many months without being power cycled
> so what I'm looking to do is monitor the health of the flash devices.
> Ideally I would like to know FEC counts but I doubt I will get this
> information :) But checking LEBs, pages etc for bad checksums would be
> great.
> 
> >
> > Yes, although the NAND flash documentation sometimes reads like blocks can
> > suddenly 'go bad' for no special reason, in practice it is due to
> > excessive erase/write cycles, i.e. it's a wear problem.
> >
> > However, I don't know, if you are operating the flash in an environment
> > where there is cosmic radiation that can actually damage the chip for
> > instance, then of course any part of the chip could fail randomly with a
> > fairly high probability. But NAND bad block management is not designed to
> > take care of that case, which is why bad block detection is only done
> > during block erasure (i.e. when a block fails to erase).
> >
> I'm not sure how much I can say I'm afraid as I'm under NDA but assume
> that it is going to be operating in an environment where it's
> receiving more cosmic radiation than expected. So I could look at the
> bad block detection code to get some ideas?  I don't necessarily want to
> mark blocks as bad I just want to detect them so I have an idea that
> the flash is failing.

I guess you're more worried about bitflips than about blocks becoming bad
(which, AFAIK, can only happen when writing or erasing a block, not
when reading it).
If bitflip detection/prevention is what you're looking for, I guess
ubihealthd (developed by Richard) could help.

[1] https://lwn.net/Articles/663751/
[2] https://lkml.org/lkml/2015/3/29/31
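
To illustrate the kind of early-warning monitoring discussed above, here is
a minimal sketch of a userspace poller. It is not ubihealthd and not taken
from this thread. It assumes the kernel exposes the per-device counters
corrected_bits, ecc_failures and bad_blocks under /sys/class/mtd/<mtdX>/.
Not every kernel or NAND driver provides them, so missing files are skipped.
The device name "mtd0" and the helper names are hypothetical.

```python
from pathlib import Path

# Counter files assumed to exist per MTD device (see the note above).
COUNTERS = ("corrected_bits", "ecc_failures", "bad_blocks")


def read_mtd_counters(mtd_name, sysfs_root="/sys/class/mtd"):
    """Return {counter: int} for one MTD device, skipping missing files."""
    stats = {}
    for name in COUNTERS:
        path = Path(sysfs_root) / mtd_name / name
        try:
            stats[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            # Attribute absent on this kernel/driver, or not a number.
            continue
    return stats


def counters_increased(previous, current):
    """Return the counters that grew between two samples, with the delta."""
    return {
        name: current[name] - previous[name]
        for name in current
        if name in previous and current[name] > previous[name]
    }


if __name__ == "__main__":
    # Hypothetical usage: take one sample of mtd0 and print it.
    print(read_mtd_counters("mtd0"))
```

A daemon could call read_mtd_counters() periodically and report home when
counters_increased() returns anything. Note that these counters only move
when flash pages are actually read, so on a system that serves most files
from the page cache something still has to read the flash now and then.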


-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
