From: Ivan Djelic <ivan.djelic@parrot.com>
To: Cliff Brake <cliff.brake@gmail.com>
Cc: "linux-mtd@lists.infradead.org" <linux-mtd@lists.infradead.org>,
"dedekind1@gmail.com" <dedekind1@gmail.com>
Subject: Re: JFFS2 loss of power expectations
Date: Wed, 4 May 2011 00:03:42 +0200
Message-ID: <20110503220342.GA3862@parrot.com>
In-Reply-To: <BANLkTinscG0a3mOwvCwQOLKRNoabStkOYg@mail.gmail.com>
On Tue, May 03, 2011 at 09:08:26PM +0100, Cliff Brake wrote:
> >> 2) any suggestions for debugging this?
> >
> > Some kind of device which may cut power is needed. Then you may write a
> > test program or script, cut power at random point, boot up, make sure
> > the FS look ok.
>
> Yes, we have a programmable PS set up to cut power during boot, and we
> can reproduce JFFS2 file system corruption with a day or so of
> testing. We are using a fairly old CPU board with a small SLC flash
> (128MB).
>
> Now, the question is how do we prevent it?
>
> We are looking into mounting the root file system in RO and sync
> modes, etc, but don't have test results yet.
>
> So, just looking for general ideas how to improve this situation.
Hi Cliff,
Just a few debugging ideas that helped me a lot in the past:
1. Try to focus your random power cuts so that they happen precisely during a
NAND write/erase operation; this will help reproduce bugs much faster.
Ideally you could use a hw timer or watchdog to trigger a software
reset with µs precision.
2. Using instrumentation and targeted power cuts as described above, you
should be able to isolate the last interrupted NAND operation that caused a
corruption: was it an interrupted page program, or a partially erased block?
3. During reboot after a power cut, look for NAND read failures. Are they
located as expected in the last page/block that was programmed/erased? Do
they appear in unrelated locations? Or do they not appear at all?
4. If the above steps do not lead to an obvious explanation, they may still
provide you with a way to dump NAND contents (before and after corruption) and
systematically reproduce the bug on a Linux PC running nandsim. This makes
debugging much easier.
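
As a rough illustration of steps 3 and 4, here is a minimal sketch of how two
raw dumps (taken before and after a corruption) could be compared page by
page, and the changed pages split into those inside the block touched by the
interrupted operation and those elsewhere. The function names, the 2048-byte
page size and the 64 pages per block are assumptions for illustration only;
adjust them to your chip. If the dumps include OOB data, page_size has to be
the data size plus the OOB size.

```python
from typing import List, Tuple

PAGE_SIZE = 2048       # assumed bytes per page in the dump; adjust to your chip
PAGES_PER_BLOCK = 64   # assumed pages per erase block; adjust to your chip


def differing_pages(before: bytes, after: bytes,
                    page_size: int = PAGE_SIZE) -> List[int]:
    """Return the indices of pages whose contents differ between two dumps."""
    if len(before) != len(after):
        raise ValueError("dumps must have the same length")
    if len(before) % page_size != 0:
        raise ValueError("dump length is not a multiple of the page size")
    return [
        offset // page_size
        for offset in range(0, len(before), page_size)
        if before[offset:offset + page_size] != after[offset:offset + page_size]
    ]


def split_by_block(pages: List[int], last_block: int,
                   pages_per_block: int = PAGES_PER_BLOCK
                   ) -> Tuple[List[int], List[int]]:
    """Split changed pages into (in last_block, outside last_block).

    last_block is the erase block that the interrupted program/erase
    operation was working on, as found through instrumentation.
    """
    expected = [p for p in pages if p // pages_per_block == last_block]
    unrelated = [p for p in pages if p // pages_per_block != last_block]
    return expected, unrelated
```

The dumps themselves could be produced with nanddump from mtd-utils and read
with open(path, "rb").read(). Changed pages outside the last block are the
interesting ones, since an interrupted operation should not have touched them.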
On the improvement side, I was going to suggest mounting as much as possible
read-only (RO), but you mentioned that already.
Hope that helps,
Regards,
Ivan