From: pieterg@gmx.com (pieterg)
To: linux-arm-kernel@lists.infradead.org
Subject: pxa3xx_nand issues
Date: Mon, 27 Sep 2010 13:38:20 +0200
Message-ID: <201009271338.21084.pieterg@gmx.com>
In-Reply-To: <AANLkTik8N+ATdhygBsatnrYJ0MOLd+L5ujB47Xrf_O_D@mail.gmail.com>

On Saturday 25 September 2010 04:50:04 Haojian Zhuang wrote:
> On Thu, Sep 23, 2010 at 11:29 PM, pieterg <pieterg@gmx.com> wrote:
> > On Thursday 23 September 2010 13:32:26 pieterg wrote:
> >> On Thursday 23 September 2010 08:05:56 Eric Miao wrote:
> >> > On Thu, Sep 23, 2010 at 1:12 AM, pieterg <pieterg@gmx.com> wrote:
> >> > > In my search for the cause of the huge number of single/double bit
> >> > > errors I'm experiencing on colibri pxa320/310 devices, I've come
> >> > > across this commit
> >
> > http://git.kernel.org/?p=linux/kernel/git/ycmiao/pxa-linux-2.6.git;a=commit;h=7f9938d0fd6c778bd0ce296a3e3b50266de2b892
> >
> >> > > According to the commitlog, it attempts to work around an issue
> >> > > regarding non-page-aligned reads.
> >> > > The workaround seems to force page-aligned access, by dropping the
> >> > > offset within the page (column address bytes).
> >> > > However, in my setup (with a jffs2 filesystem on nand),
> >> > > non-page-aligned reads never occur, but non-page-aligned writes
> >> > > occur very frequently. (during the jffs2 gc).
> >> > > These are also affected by this commit, while the commitlog does
> >> > > not state whether or not the same issue would occur for the
> >> > > program command, and in that case, whether or not the same
> >> > > workaround would apply.
> >> > >
> >> > > I've tried to revert the commit, but unfortunately this doesn't
> >> > > reduce the huge number of single/double bit errors (and jffs2 crc
> >> > > errors as a result) I'm getting.
> >> > >
> >> > > But having these non-aligned writes during GC, would that indicate
> >> > > a problem with my jffs2 image parameters perhaps?
> >> > > (though I cannot imagine this could actually cause double bit
> >> > > errors)
> >> >
> >> > It might not be related to the commit above. The NAND controller
> >> > will always read the whole page and ignoring the column address,
> >> > that patch tries to make less confusion. The offset is actually
> >> > handled completely by software (memorized).
> >>
> >> I can see how the read offset works, but I do not quite see how this
> >> would work for writes (which call the same prepare_read_prog_cmd, and
> >> have their column address stripped as well).
> >> Found out that this happens when writing oob data by the way; these
> >> are writes with offset 2048 within the page. Jffs2 does this when
> >> writing cleanmarkers.
> >
> > Tested this, and found out that this commit is actually quite essential
> > for writes as well.
> > Without it, the OOB data doesn't get written.
> > So we can close this part of the topic, commit 7f9938d0 is perfectly
> > fine.
> >
> >> I could identify about 10 eraseblocks with pages which produce
> >> single/double bit errors.
> >> After I marked them bad (manually), I've seen no more bit errors, and
> >> the jffs2 rootfs has remained perfectly healthy.
> >
> > Turned out to be a short-term solution.
> > After a while I got more double-bit errors, and ended up bad-marking a
> > dozen or so other eraseblocks, and it does not seem to stop.
> >
> > Strangest thing is that when I write a new jffs2 image with uboot (nand
> > erase, nand write) or with the kernel (flash_eraseall, nandwrite), it
> > never contains any biterrors when I mount it.
> > Only after the filesystem has been mounted, gets modified, and then
> > after the first reboot, the biterrors are there.
>
> Could you verify whether these "wrong" blocks are truly bad blocks?
> Maybe you can erase/write them repeatedly in XDB.

Unfortunately I don't have XDB.
However, I can erase/write/read them with u-boot and with the kernel
(flash_eraseall / nandwrite), several times, without ever getting an
NDSR_CS0_BBD status.
I do get many NDSR_DBERR and NDSR_SBERR interrupts, though.

But because these occur during a read, the kernel never takes any action,
and the blocks are not marked bad.
(And I find it hard to believe that such a huge number of blocks on a brand 
new chip would actually be bad)
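
To make the distinction between the three status bits concrete, here is a
minimal sketch of how a raw NDSR value could be decoded. The bit positions
are assumed from the NDSR_* defines in pxa3xx_nand.c and should be checked
against the driver source; decode_ndsr() and looks_like_bad_block() are
hypothetical helpers for illustration, not driver functions.

```python
# Bit positions are an assumption taken from the NDSR_* defines in
# pxa3xx_nand.c; verify them against your kernel tree.
NDSR_CS0_BBD = 1 << 6  # bad block detect on chip select 0
NDSR_DBERR = 1 << 4    # double-bit ECC error (reported on read)
NDSR_SBERR = 1 << 3    # single-bit ECC error (reported on read)

FLAGS = (
    ("NDSR_CS0_BBD", NDSR_CS0_BBD),
    ("NDSR_DBERR", NDSR_DBERR),
    ("NDSR_SBERR", NDSR_SBERR),
)


def decode_ndsr(status):
    """Hypothetical helper: names of the known flags set in a raw NDSR value."""
    return [name for name, bit in FLAGS if status & bit]


def looks_like_bad_block(status):
    """Hypothetical helper: true only if the BBD flag is set.

    ECC error flags alone do not count, which mirrors the behaviour
    described above: ECC errors on read never get a block marked bad.
    """
    return bool(status & NDSR_CS0_BBD)
```

With these assumptions, a status of 0x18 decodes to both ECC error flags
while looks_like_bad_block(0x18) stays false, which is the situation
described above: ECC errors on read, but no bad block indication.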

Rgds, Pieter
