* Re: bug in drivers/edac/mpc85xx_edac.c:mpc85xx_mc_check() ?
[not found] <57AC2FA1761300418C7AB8F3EA493C9702C5F200@HQ-EXCH-5.corp.brocade.com>
@ 2009-04-10 21:47 ` Andrew Morton
2009-04-13 17:10 ` Dave Jiang
0 siblings, 1 reply; 4+ messages in thread
From: Andrew Morton @ 2009-04-10 21:47 UTC (permalink / raw)
To: Jeff Haran; +Cc: linuxppc-dev, Dave Jiang, linux-kernel, Doug Thompson
(cc's added)
On Wed, 8 Apr 2009 14:57:42 -0700
"Jeff Haran" <jharan@Brocade.COM> wrote:
> Hi,
>
> Recent versions of this function start off with:
>
> static void mpc85xx_mc_check(struct mem_ctl_info *mci)
> {
> struct mpc85xx_mc_pdata *pdata = mci->pvt_info;
> ...
>
> err_detect = in_be32(pdata->mc_vbase + MPC85XX_MC_ERR_DETECT);
> if (err_detect)
> return;
>
> ...
> }
>
> My reading of the Freescale 8548E Manual leads me to conclude that the
> Memory Error Detect register (ERR_DETECT) will have various bits set if
> the memory controller has detected an error since the last time it was
> cleared. If no memory error has occurred, the register will contain 0.
>
> Perhaps I am missing something very basic, but it seem to me that the
> above "if" should be:
>
> if (!err_detect)
> return;
>
> as the existing code would seem to read "if any errors have occurred,
> ignore them", though perhaps testing has demonstrated that the Freescale
> manual is in error.
>
> Please include this email address in responses as I do not subscribe.
>
> Thanks,
>
> Jeff Haran
> Brocade
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: bug in drivers/edac/mpc85xx_edac.c:mpc85xx_mc_check() ?
2009-04-10 21:47 ` bug in drivers/edac/mpc85xx_edac.c:mpc85xx_mc_check() ? Andrew Morton
@ 2009-04-13 17:10 ` Dave Jiang
0 siblings, 0 replies; 4+ messages in thread
From: Dave Jiang @ 2009-04-13 17:10 UTC (permalink / raw)
To: Jeff Haran; +Cc: linuxppc-dev, Andrew Morton, linux-kernel, Doug Thompson
Jeff, you are correct. I will submit a patch to correct that.
Andrew Morton wrote:
> (cc's added)
>
> On Wed, 8 Apr 2009 14:57:42 -0700
> "Jeff Haran" <jharan@Brocade.COM> wrote:
>
>> Hi,
>>
>> Recent versions of this function start off with:
>>
>> static void mpc85xx_mc_check(struct mem_ctl_info *mci)
>> {
>> struct mpc85xx_mc_pdata *pdata = mci->pvt_info;
>> ...
>>
>> err_detect = in_be32(pdata->mc_vbase + MPC85XX_MC_ERR_DETECT);
>> if (err_detect)
>> return;
>>
>> ...
>> }
>>
>> My reading of the Freescale 8548E Manual leads me to conclude that the
>> Memory Error Detect register (ERR_DETECT) will have various bits set if
>> the memory controller has detected an error since the last time it was
>> cleared. If no memory error has occurred, the register will contain 0.
>>
>> Perhaps I am missing something very basic, but it seem to me that the
>> above "if" should be:
>>
>> if (!err_detect)
>> return;
>>
>> as the existing code would seem to read "if any errors have occurred,
>> ignore them", though perhaps testing has demonstrated that the Freescale
>> manual is in error.
>>
>> Please include this email address in responses as I do not subscribe.
>>
>> Thanks,
>>
>> Jeff Haran
>> Brocade
>
--
------------------------------------------------------
Dave Jiang
Software Engineer
MontaVista Software, Inc.
http://www.mvista.com
------------------------------------------------------
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: bug in drivers/edac/mpc85xx_edac.c:mpc85xx_mc_check()
[not found] <57AC2FA1761300418C7AB8F3EA493C9702E319DB@HQ-EXCH-5.corp.brocade.com>
@ 2009-04-29 7:37 ` Andrew Morton
2009-04-29 12:46 ` Kumar Gala
0 siblings, 1 reply; 4+ messages in thread
From: Andrew Morton @ 2009-04-29 7:37 UTC (permalink / raw)
To: Jeff Haran
Cc: Doug, linux-kernel, Dave Jiang, linuxppc-dev, Thompson,
Kumar Gala
Let's cc the suitable people.
On Tue, 28 Apr 2009 18:23:42 -0700 "Jeff Haran" <jharan@Brocade.COM> wrote:
> Hi,
>
> Recent versions of this function contain the following snippets:
>
> if (err_detect & DDR_EDE_SBE)
> edac_mc_handle_ce(mci, pfn, err_addr & PAGE_MASK,
> syndrome, row_index, 0, mci->ctl_name);
>
> if (err_detect & DDR_EDE_MBE)
> edac_mc_handle_ue(mci, pfn, err_addr & PAGE_MASK,
> row_index, mci->ctl_name);
>
> I am pretty sure the references to PAGE_MASK should be proceeded by a
> tilda, as in:
>
> if (err_detect & DDR_EDE_SBE)
> edac_mc_handle_ce(mci, pfn, err_addr & ~PAGE_MASK,
> syndrome, row_index, 0, mci->ctl_name);
>
> if (err_detect & DDR_EDE_MBE)
> edac_mc_handle_ue(mci, pfn, err_addr & ~PAGE_MASK,
> row_index, mci->ctl_name);
>
Could well be. PAGE_MASK is very easy to get wrong. I've _never_
trusted my own memory of it and I always have to go back to the
definition when reviewing code :(
> Much as I would like to submit a tested patch like the rest of the
> world, I find myself in the situation where the only Freescale target
> system I have to test on is running a 3 year old kernel (2.6.14), which
> preceeds the introduction of EDAC driver support, at least for
> Freescale. So the best I can do is borrow from the new EDAC driver and
> backport it to the old kernel.
>
> But I have learned a few things in this process and can thus share what
> I've learned as it may be of help to the EDAC driver developers:
>
> 1) Before you read the Freescale 8548 CAPTURE_ADDRESS register, you want
> to read CAPTURE_ATTRIBUTES first and make sure the VLD bit (least
> significant bit in the register) is set or else the data in
> CAPTURE_ADDRESS may not be yet valid.
>
> 2) When you are done scrubbing the memory with the single bit error, you
> want to write 0 to CAPTURE_ATTRIBUTES so as to clear VLD and thus setup
> the ECC capture logic to capture the next single bit error.
>
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: bug in drivers/edac/mpc85xx_edac.c:mpc85xx_mc_check()
2009-04-29 7:37 ` bug in drivers/edac/mpc85xx_edac.c:mpc85xx_mc_check() Andrew Morton
@ 2009-04-29 12:46 ` Kumar Gala
0 siblings, 0 replies; 4+ messages in thread
From: Kumar Gala @ 2009-04-29 12:46 UTC (permalink / raw)
To: Andrew Morton
Cc: Jeff Haran, linux-kernel, Dave Jiang, linuxppc-dev, Doug Thompson,
Kumar Gala
On Apr 29, 2009, at 2:37 AM, Andrew Morton wrote:
> Let's cc the suitable people.
>
> On Tue, 28 Apr 2009 18:23:42 -0700 "Jeff Haran" <jharan@Brocade.COM>
> wrote:
>
>> Hi,
>>
>> Recent versions of this function contain the following snippets:
>>
>> if (err_detect & DDR_EDE_SBE)
>> edac_mc_handle_ce(mci, pfn, err_addr & PAGE_MASK,
>> syndrome, row_index, 0, mci->ctl_name);
>>
>> if (err_detect & DDR_EDE_MBE)
>> edac_mc_handle_ue(mci, pfn, err_addr & PAGE_MASK,
>> row_index, mci->ctl_name);
>>
>> I am pretty sure the references to PAGE_MASK should be proceeded by a
>> tilda, as in:
>>
>> if (err_detect & DDR_EDE_SBE)
>> edac_mc_handle_ce(mci, pfn, err_addr & ~PAGE_MASK,
>> syndrome, row_index, 0, mci->ctl_name);
>>
>> if (err_detect & DDR_EDE_MBE)
>> edac_mc_handle_ue(mci, pfn, err_addr & ~PAGE_MASK,
>> row_index, mci->ctl_name);
>>
>
> Could well be. PAGE_MASK is very easy to get wrong. I've _never_
> trusted my own memory of it and I always have to go back to the
> definition when reviewing code :(
This should ~PAGE_MASK to get the offset into the page.
>> Much as I would like to submit a tested patch like the rest of the
>> world, I find myself in the situation where the only Freescale target
>> system I have to test on is running a 3 year old kernel (2.6.14),
>> which
>> preceeds the introduction of EDAC driver support, at least for
>> Freescale. So the best I can do is borrow from the new EDAC driver
>> and
>> backport it to the old kernel.
>>
>> But I have learned a few things in this process and can thus share
>> what
>> I've learned as it may be of help to the EDAC driver developers:
>>
>> 1) Before you read the Freescale 8548 CAPTURE_ADDRESS register, you
>> want
>> to read CAPTURE_ATTRIBUTES first and make sure the VLD bit (least
>> significant bit in the register) is set or else the data in
>> CAPTURE_ADDRESS may not be yet valid.
>>
>> 2) When you are done scrubbing the memory with the single bit
>> error, you
>> want to write 0 to CAPTURE_ATTRIBUTES so as to clear VLD and thus
>> setup
>> the ECC capture logic to capture the next single bit error.
This is a correct description based on how FSL error HW works.
- k
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2009-04-29 12:47 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <57AC2FA1761300418C7AB8F3EA493C9702E319DB@HQ-EXCH-5.corp.brocade.com>
2009-04-29 7:37 ` bug in drivers/edac/mpc85xx_edac.c:mpc85xx_mc_check() Andrew Morton
2009-04-29 12:46 ` Kumar Gala
[not found] <57AC2FA1761300418C7AB8F3EA493C9702C5F200@HQ-EXCH-5.corp.brocade.com>
2009-04-10 21:47 ` bug in drivers/edac/mpc85xx_edac.c:mpc85xx_mc_check() ? Andrew Morton
2009-04-13 17:10 ` Dave Jiang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).