public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* bug in drivers/edac/mpc85xx_edac.c:mpc85xx_mc_check() ?
@ 2009-04-08 21:57 Jeff Haran
  2009-04-10 21:47 ` Andrew Morton
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff Haran @ 2009-04-08 21:57 UTC (permalink / raw)
  To: linux-kernel

Hi,

Recent versions of this function start off with:

static void mpc85xx_mc_check(struct mem_ctl_info *mci)
{
    struct mpc85xx_mc_pdata *pdata = mci->pvt_info;
    ...

    err_detect = in_be32(pdata->mc_vbase + MPC85XX_MC_ERR_DETECT);
    if (err_detect)
        return;

    ...
}

My reading of the Freescale 8548E Manual leads me to conclude that the
Memory Error Detect register (ERR_DETECT) will have various bits set if
the memory controller has detected an error since the last time it was
cleared. If no memory error has occurred, the register will contain 0.

Perhaps I am missing something very basic, but it seem to me that the
above "if" should be:

    if (!err_detect)
        return;

as the existing code would seem to read "if any errors have occurred,
ignore them", though perhaps testing has demonstrated that the Freescale
manual is in error.

Please include this email address in responses as I do not subscribe.

Thanks,

Jeff Haran
Brocade

^ permalink raw reply	[flat|nested] 7+ messages in thread
* bug in drivers/edac/mpc85xx_edac.c:mpc85xx_mc_check()
@ 2009-04-29  1:23 Jeff Haran
  2009-04-29  7:37 ` Andrew Morton
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff Haran @ 2009-04-29  1:23 UTC (permalink / raw)
  To: linux-kernel

Hi,

Recent versions of this function contain the following snippets:

    if (err_detect & DDR_EDE_SBE)
        edac_mc_handle_ce(mci, pfn, err_addr & PAGE_MASK,
                  syndrome, row_index, 0, mci->ctl_name);

    if (err_detect & DDR_EDE_MBE)
        edac_mc_handle_ue(mci, pfn, err_addr & PAGE_MASK,
                  row_index, mci->ctl_name);

I am pretty sure the references to PAGE_MASK should be proceeded by a
tilda, as in:

    if (err_detect & DDR_EDE_SBE)
        edac_mc_handle_ce(mci, pfn, err_addr & ~PAGE_MASK,
                  syndrome, row_index, 0, mci->ctl_name);

    if (err_detect & DDR_EDE_MBE)
        edac_mc_handle_ue(mci, pfn, err_addr & ~PAGE_MASK,
                  row_index, mci->ctl_name);


Much as I would like to submit a tested patch like the rest of the
world, I find myself in the situation where the only Freescale target
system I have to test on is running a 3 year old kernel (2.6.14), which
preceeds the introduction of EDAC driver support, at least for
Freescale. So the best I can do is borrow from the new EDAC driver and
backport it to the old kernel.

But I have learned a few things in this process and can thus share what
I've learned as it may be of help to the EDAC driver developers:

1) Before you read the Freescale 8548 CAPTURE_ADDRESS register, you want
to read CAPTURE_ATTRIBUTES first and make sure the VLD bit (least
significant bit in the register) is set or else the data in
CAPTURE_ADDRESS may not be yet valid.

2) When you are done scrubbing the memory with the single bit error, you
want to write 0 to CAPTURE_ATTRIBUTES so as to clear VLD and thus setup
the ECC capture logic to capture the next single bit error.

Please include this email address in responses as I do not subscribe.

Thanks,

Jeff Haran
Brocade

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: bug in drivers/edac/mpc85xx_edac.c:mpc85xx_mc_check()
@ 2009-04-29  4:12 Doug Thompson
  0 siblings, 0 replies; 7+ messages in thread
From: Doug Thompson @ 2009-04-29  4:12 UTC (permalink / raw)
  To: Dave Jiang; +Cc: linux-kernel, Jeff Haran


Dave, can you look at this

doug thompson

--- On Tue, 4/28/09, Jeff Haran <jharan@Brocade.COM> wrote:

> From: Jeff Haran <jharan@Brocade.COM>
> Subject: bug in drivers/edac/mpc85xx_edac.c:mpc85xx_mc_check()
> To: linux-kernel@vger.kernel.org
> Date: Tuesday, April 28, 2009, 7:23 PM
> Hi,
> 
> Recent versions of this function contain the following
> snippets:
> 
>     if (err_detect & DDR_EDE_SBE)
>         edac_mc_handle_ce(mci, pfn,
> err_addr & PAGE_MASK,
>                
>   syndrome, row_index, 0, mci->ctl_name);
> 
>     if (err_detect & DDR_EDE_MBE)
>         edac_mc_handle_ue(mci, pfn,
> err_addr & PAGE_MASK,
>                
>   row_index, mci->ctl_name);
> 
> I am pretty sure the references to PAGE_MASK should be
> proceeded by a
> tilda, as in:
> 
>     if (err_detect & DDR_EDE_SBE)
>         edac_mc_handle_ce(mci, pfn,
> err_addr & ~PAGE_MASK,
>                
>   syndrome, row_index, 0, mci->ctl_name);
> 
>     if (err_detect & DDR_EDE_MBE)
>         edac_mc_handle_ue(mci, pfn,
> err_addr & ~PAGE_MASK,
>                
>   row_index, mci->ctl_name);
> 
> 
> Much as I would like to submit a tested patch like the rest
> of the
> world, I find myself in the situation where the only
> Freescale target
> system I have to test on is running a 3 year old kernel
> (2.6.14), which
> preceeds the introduction of EDAC driver support, at least
> for
> Freescale. So the best I can do is borrow from the new EDAC
> driver and
> backport it to the old kernel.
> 
> But I have learned a few things in this process and can
> thus share what
> I've learned as it may be of help to the EDAC driver
> developers:
> 
> 1) Before you read the Freescale 8548 CAPTURE_ADDRESS
> register, you want
> to read CAPTURE_ATTRIBUTES first and make sure the VLD bit
> (least
> significant bit in the register) is set or else the data
> in
> CAPTURE_ADDRESS may not be yet valid.
> 
> 2) When you are done scrubbing the memory with the single
> bit error, you
> want to write 0 to CAPTURE_ATTRIBUTES so as to clear VLD
> and thus setup
> the ECC capture logic to capture the next single bit
> error.
> 
> Please include this email address in responses as I do not
> subscribe.
> 
> Thanks,
> 
> Jeff Haran
> Brocade
> --
> To unsubscribe from this list: send the line "unsubscribe
> linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2009-04-29 12:48 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-04-08 21:57 bug in drivers/edac/mpc85xx_edac.c:mpc85xx_mc_check() ? Jeff Haran
2009-04-10 21:47 ` Andrew Morton
2009-04-13 17:10   ` Dave Jiang
  -- strict thread matches above, loose matches on Subject: below --
2009-04-29  1:23 bug in drivers/edac/mpc85xx_edac.c:mpc85xx_mc_check() Jeff Haran
2009-04-29  7:37 ` Andrew Morton
2009-04-29 12:46   ` Kumar Gala
2009-04-29  4:12 Doug Thompson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox