From: Ira W. Snyder <iws@ovro.caltech.edu>
To: u-boot@lists.denx.de
Subject: [U-Boot] [PATCH 3/7] 83xx/85xx/86xx: Add ECC support
Date: Tue, 10 Nov 2009 09:53:50 -0800
Message-ID: <20091110175350.GC1549@ovro.caltech.edu>
In-Reply-To: <1257874604.10661.189.camel@localhost.localdomain>

On Tue, Nov 10, 2009 at 11:36:44AM -0600, Peter Tyser wrote:
> > Ok, here are my results, this is on an 8349EMDS-derived board. My
> > 8349EMDS eval board doesn't have ECC memory.
> > 
> > 1) It might be nice to have something to print the current injection
> > registers. It is not a big deal, anyone using this should be an expert
> > anyway.
> 
> Thanks for the feedback.  I can print the current injection values
> when "ecc inject" is run, if others would like.
> 
> > 2) ecc inject off didn't seem to work, see the following capture:
> > 
> > => ecc info
> > No ECC errors have occurred
> > => ecc inject low 0x1
> > => ecc info
> > 
> > WARNING: ECC error in DDR Controller 0
> >         Addr:   0x0_0ff7ae40
> >         Data:   0x0fffdf9c_0ff7aed1     ECC:    0x81
> >         Expect: 0x0fffdf9c_0ff7aed0     ECC:    0x81
> >         Net:    DATA0
> >         Syndrome: 0x3b
> >         Single-Bit errors: 0x1e
> >         Attrib: 0x01002001
> >         Detect: 0x80000004 (MME, SBE) 
> > 
> > => ecc inject off
> > 
> > # Ok, now error injection is off, I still expect some errors to be
> > # present in the error registers
> > 
> > => ecc info
> > 
> > WARNING: ECC error in DDR Controller 0
> >         Addr:   0x0_0ff7ae1c
> >         Data:   0x0fffdf9c_0ff7d2a1     ECC:    0xe4
> >         Expect: 0x0fffdf9c_0ff7d2a0     ECC:    0xe4
> >         Net:    DATA0
> >         Syndrome: 0x3b
> >         Single-Bit errors: 0xd1
> >         Attrib: 0x01003001
> >         Detect: 0x80000004 (MME, SBE) 
> > 
> > # And there was the error. Now, I don't expect any more errors to
> > # be present, after all, injection is disabled.
> > #
> > # But there is one! Why?
> 
> I believe what's happening is:
> 1. You turn error injection on
> 2. Every time you perform a DRAM write, the value written has an ECC
> error
> 3. You write to DRAM lots of times, in lots of locations
> 4. You turn error injection off
> 5. There are still lots of ECC errors residing in DRAM that you discover
> later when you read from "corrupted" memory locations
> 
> So in theory, unless you scrub your memory, you might uncover lots more
> ECC errors later.
> 
> As an easily reproducible example try:
> > ecc inject low 1; mw.l 0x100000 0xbeefba11 0x800000; ecc inject off
> > ecc info
> > ecc info
> > md 0x100000
> > ecc info
> > ecc info
> > md 0x200000
> ...
> 
> The majority of the above ecc errors could be cleared by running the
> following command with ecc injection off:
> mw.l 0x100000 0xbeefba11 0x800000
> 
> 
> > => ecc info
> > 
> > WARNING: ECC error in DDR Controller 0
> >         Addr:   0x0_0fff8a0c
> >         Data:   0x0fff8a00_0fff8a01     ECC:    0xff
> >         Expect: 0x0fff8a00_0fff8a00     ECC:    0xff
> >         Net:    DATA0
> >         Syndrome: 0x3b
> >         Single-Bit errors: 0x04
> >         Attrib: 0x01003001
> >         Detect: 0x00000000
> > => 
> > 
> > # Note that I keep seeing ecc errors until I run the command:
> > # ecc inject low 0
> 
> Hmm...  "ecc inject off" should have the same effect as "ecc inject low
> 0".  Is there a chance some of the ECC errors still remaining in DRAM
> are the culprit?
> 
> > # Why did it take two runs of ecc info to clear all of the errors?
> 
> This is probably the same issue as above - lots of errors are injected
> and there's no telling exactly when they'll turn up.
> 
> > Other than the above strangeness, everything is working great on my 83xx
> > board. I think the new output is pretty nice. It serves my purposes
> > just as well as the old code.
> 
> Thanks for trying the changes out,

Ok, this makes perfect sense. I didn't think about the possibility of
latent memory errors. :)
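
To make the latent-error explanation concrete, here is a toy Python model
of it. This is only a sketch under stated assumptions: the EccDram class,
fill() and count_errors() are hypothetical names invented for this
illustration, not part of U-Boot or the DDR controller, and the model
treats each 32-bit word independently instead of the real 64-bit data
path. The point it shows is that a word written while injection is on
stays bad until it is rewritten, regardless of when injection is turned
off.

```python
class EccDram:
    """Toy model of ECC DRAM with write-path error injection (hypothetical)."""

    def __init__(self):
        self.cells = {}        # address -> (stored word, intended word)
        self.inject_mask = 0   # bits flipped on every write while non-zero

    def write(self, addr, value):
        # While injection is on, the stored word no longer matches the word
        # the ECC code was computed for. The cell stays bad until rewritten.
        self.cells[addr] = (value ^ self.inject_mask, value)

    def read(self, addr):
        stored, intended = self.cells[addr]
        # A single-bit error is corrected on the fly, so the reader still
        # sees the intended value, but the error is flagged.
        return intended, stored != intended


def fill(dram, start, value, count):
    """Write `count` 32-bit words, roughly like 'mw.l start value count'."""
    for i in range(count):
        dram.write(start + 4 * i, value)


def count_errors(dram, start, count):
    """Read `count` words and count how many reads flag an ECC error."""
    return sum(dram.read(start + 4 * i)[1] for i in range(count))


dram = EccDram()
dram.inject_mask = 0x1                        # like "ecc inject low 1"
fill(dram, 0x100000, 0xBEEFBA11, 16)
dram.inject_mask = 0                          # like "ecc inject off"
print(count_errors(dram, 0x100000, 16))       # prints 16: errors are latent

fill(dram, 0x100000, 0xBEEFBA11, 16)          # rewrite with injection off
print(count_errors(dram, 0x100000, 16))       # prints 0: region is clean
```

In this model, as on the board, turning injection off does nothing for
words that were already written. Only rewriting them clears the errors.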

Here is a run using your instructions above. Keeping the possibility of
latent memory errors in mind, the behavior seems correct to me. You're
free to add my Tested-by if you'd like.

=> ecc inject low 1
=> mw.l 0x100000 0xbeefba11 0x800000
=> ecc inject off
=> ecc info

WARNING: ECC error in DDR Controller 0
        Addr:   0x0_0ff7ae40
        Data:   0x0fffdf9c_0ff7aed1     ECC:    0x81
        Expect: 0x0fffdf9c_0ff7aed0     ECC:    0x81
        Net:    DATA0
        Syndrome: 0x3b
        Single-Bit errors: 0x56
        Attrib: 0x01002001
        Detect: 0x80000004 (MME, SBE) 

=> ecc info

WARNING: ECC error in DDR Controller 0
        Addr:   0x0_0ff7ad08
        Data:   0x0ffd594c_0000087f     ECC:    0x91
        Expect: 0x0ffd594c_0000087e     ECC:    0x91
        Net:    DATA0
        Syndrome: 0x3b
        Single-Bit errors: 0x01
        Attrib: 0x01003001
        Detect: 0x00000000
=> ecc info
No ECC errors have occurred
=> ecc info
No ECC errors have occurred
=> ecc info
No ECC errors have occurred
=> md 0x100000 10
00100000: beefba11 beefba11 beefba11 beefba11    ................
00100010: beefba11 beefba11 beefba11 beefba11    ................
00100020: beefba11 beefba11 beefba11 beefba11    ................
00100030: beefba11 beefba11 beefba11 beefba11    ................
=> ecc info      

WARNING: ECC error in DDR Controller 0
        Addr:   0x0_0010003c
        Data:   0xbeefba11_beefba10     ECC:    0x7b
        Expect: 0xbeefba11_beefba11     ECC:    0x7b
        Net:    DATA0
        Syndrome: 0x3b
        Single-Bit errors: 0x13
        Attrib: 0x01002001
        Detect: 0x00000000
=> ecc info
No ECC errors have occurred
=> md 0x200000 10
00200000: beefba11 beefba11 beefba11 beefba11    ................
00200010: beefba11 beefba11 beefba11 beefba11    ................
00200020: beefba11 beefba11 beefba11 beefba11    ................
00200030: beefba11 beefba11 beefba11 beefba11    ................
=> ecc info

WARNING: ECC error in DDR Controller 0
        Addr:   0x0_0020003c
        Data:   0xbeefba11_beefba10     ECC:    0x7b
        Expect: 0xbeefba11_beefba11     ECC:    0x7b
        Net:    DATA0
        Syndrome: 0x3b
        Single-Bit errors: 0x10
        Attrib: 0x01002001
        Detect: 0x00000000
=> ecc info
No ECC errors have occurred
=> mw.l 0x100000 0xbeefba11 0x800000
=> ecc info

WARNING: ECC error in DDR Controller 0
        Addr:   0x0_001007c8
        Data:   0xbeefba11_beefba10     ECC:    0x7b
        Expect: 0xbeefba11_beefba11     ECC:    0x7b
        Net:    DATA0
        Syndrome: 0x3b
        Single-Bit errors: 0x06
        Attrib: 0x01003001
        Detect: 0x80000004 (MME, SBE) 

=> ecc info
No ECC errors have occurred
=> ecc info
No ECC errors have occurred
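
As a sanity check on the captures above, XORing each "Data" value with
its "Expect" value should give back the injected mask. A small Python
sketch of that arithmetic follows. The values are copied from the
"ecc info" output in this message; the flipped_bits() helper is a
hypothetical name used only for this illustration.

```python
# (Data, Expect) pairs copied from the "ecc info" captures in this message.
CAPTURES = [
    (0x0fffdf9c_0ff7aed1, 0x0fffdf9c_0ff7aed0),
    (0x0ffd594c_0000087f, 0x0ffd594c_0000087e),
    (0xbeefba11_beefba10, 0xbeefba11_beefba11),
]


def flipped_bits(data, expect):
    """Return (high, low) 32-bit masks of the bits that differ."""
    diff = data ^ expect
    return diff >> 32, diff & 0xFFFFFFFF


for data, expect in CAPTURES:
    high, low = flipped_bits(data, expect)
    print(f"high=0x{high:08x} low=0x{low:08x}")
# each line prints: high=0x00000000 low=0x00000001
```

Every capture differs from its expected value in bit 0 of the low word
only, which matches "ecc inject low 1".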

Ira
