LinuxPPC-Dev Archive on lore.kernel.org
* Re: PROBLEM: memory corrupting bug, bisected to 6dda9d55
From: Benjamin Herrenschmidt @ 2010-10-19 10:16 UTC
  To: pacman; +Cc: Mel Gorman, linux-mm, Andrew Morton, linuxppc-dev, linux-kernel
In-Reply-To: <20101018213348.10281.qmail@kosh.dhis.org>


> > From there, you might be able to close in on the culprit a bit more, for
> > example, try using the DABR register to set data access breakpoints
> > shortly before the corruption spot. AFAIK, on those old 32-bit CPUs, you
> > can set whether you want it to break on a real or a virtual address.
> 
> I thought of that, but as far as I can tell, this CPU doesn't have DABR.
> /proc/cpuinfo
> processor	: 0
> cpu		: 7447/7457
> clock		: 999.999990MHz
> revision	: 1.1 (pvr 8002 0101)
> bogomips	: 66.66
> timebase	: 33333333
> platform	: CHRP
> model		: Pegasos2
> machine		: CHRP Pegasos2
> Memory		: 512 MB

AFAIK, the 7447 is just a derivative of the 7450 design which -does-
have a DABR ... Unless it's broken :-)

> My next thought was: right after the correct value appears in memory, unmap
> the page from the kernel and let it Oops when it tries to write there. Then I
> found out that the kernel is using BATs instead of page tables for its own
> view of memory. Booting with "nobats" completely changes the memory usage
> pattern (probably because it's allocating a lot of pages to hold PTEs that it
> didn't need before)

Right. And that hides the problem, I suppose?

> > You can also sprinkle tests for the page content through the code if
> > that doesn't work to try to "close in" on the culprit (for example if
> > it's a case of stray DMA, like a network driver bug or such).
> 
> No network drivers are loaded when this happens.

Ok.

Cheers,
Ben.


* Re: [PATCH 1/2] P4080/eLBC: Make Freescale elbc interrupt common to elbc devices
From: Kumar Gala @ 2010-10-19 13:18 UTC
  To: Roy Zang
  Cc: B07421, dedekind1, B25806, linuxppc-dev, linux-mtd, akpm, dwmw2,
	B11780
In-Reply-To: <1287386552-10647-1-git-send-email-tie-fei.zang@freescale.com>


On Oct 18, 2010, at 2:22 AM, Roy Zang wrote:

> Move Freescale elbc interrupt from nand driver to elbc driver.
> Then all elbc devices can use the interrupt instead of ONLY nand.
>
> For former nand driver, it had the two functions:
>
> 1. detecting nand flash partitions;
> 2. registering elbc interrupt.
>
> Now, second function is moved to fsl_lbc.c.
>
> Signed-off-by: Lan Chunhe-B25806 <b25806@freescale.com>
> Signed-off-by: Roy Zang <tie-fei.zang@freescale.com>
> Reviewed-by: Anton Vorontsov <cbouatmailru@gmail.com>
> Cc: Wood Scott-B07421 <B07421@freescale.com>
> ---

Roy, this is a nit, but are these really p4080 specific?  Just wondering
why the subject is P4080/eLBC:...

- k


* Re: [PATCH] mxc_udc: add workaround for ENGcm09152 for i.MX35
From: Greg KH @ 2010-10-19 16:10 UTC
  To: Eric Bénard
  Cc: dbrownell, Dinh.Nguyen, linux-usb, linux-kernel, linuxppc-dev,
	gregkh, linux-arm-kernel
In-Reply-To: <4CB8AA76.2020309@eukrea.com>

On Fri, Oct 15, 2010 at 09:24:38PM +0200, Eric Bénard wrote:
> Hi Greg,
> 
> Le 15/10/2010 21:10, Greg KH a écrit :
> >On Fri, Oct 15, 2010 at 02:30:58PM +0200, Eric Bénard wrote:
> >>this patch gives the possibility to workaround bug ENGcm09152
> >>on i.MX35 when the hardware workaround is also implemented on
> >>the board.
> >>It covers the workaround described on page 25 of the following Errata :
> >>http://cache.freescale.com/files/dsp/doc/errata/IMX35CE.pdf
> >>
> >>Signed-off-by: Eric Bénard<eric@eukrea.com>
> >>---
> >>  arch/arm/mach-mx3/mach-cpuimx35.c |    1 +
> >>  drivers/usb/gadget/fsl_mxc_udc.c  |   15 +++++++++++++++
> >>  include/linux/fsl_devices.h       |    3 +++
> >>  3 files changed, 19 insertions(+), 0 deletions(-)
> >
> >Do you want me to take this through my usb tree, or will it go through
> >some other developer's tree?
> >
> As most of the changes are in drivers/usb, it would be great if you could take it.

No problem, now queued up.

greg k-h


* Re: PROBLEM: memory corrupting bug, bisected to 6dda9d55
From: Thomas Gleixner @ 2010-10-19 16:42 UTC
  To: Helmut Grohne
  Cc: Mel Gorman, LKML, linux-mm, pacman, Andrew Morton, linuxppc-dev
In-Reply-To: <20101019162407.GB10148@alf.mars>

On Tue, 19 Oct 2010, Helmut Grohne wrote:

> On Mon, Oct 18, 2010 at 11:55:44PM +0200, Thomas Gleixner wrote:
> > I might be completely one off as usual, but this thing reminds me of a
> > bug I stared at yesterday night:
> 
> This problem is completely unrelated. My problem was caused by using
> binutils-gold.

Ok, thanks for the update. One thing less to worry about :)

Thanks,

	tglx


* Re: PROBLEM: memory corrupting bug, bisected to 6dda9d55
From: Helmut Grohne @ 2010-10-19 16:24 UTC
  To: Thomas Gleixner
  Cc: Mel Gorman, LKML, linux-mm, pacman, Andrew Morton, linuxppc-dev
In-Reply-To: <alpine.LFD.2.00.1010182342490.6815@localhost6.localdomain6>

On Mon, Oct 18, 2010 at 11:55:44PM +0200, Thomas Gleixner wrote:
> I might be completely one off as usual, but this thing reminds me of a
> bug I stared at yesterday night:

This problem is completely unrelated. My problem was caused by using
binutils-gold.

Helmut


* Re: Freescale P2020 CPU Freeze over PCIe abort signal
From: Eran Liberty @ 2010-10-19 16:53 UTC
  To: Eran Liberty; +Cc: linuxppc-dev, linux-pci
In-Reply-To: <4CBC8B40.4060706@extricom.com>

Eran Liberty wrote:
> Eran Liberty wrote:
>> This should probably go to the Freescale support, as it feels like a 
>> hardware issue yet the end result is a very frozen Linux kernel so I 
>> post here first...
>>
>> I have a programmable FPGA PCIe device connected to a Freescale's 
>> P2020 PCIe port. As part of the bring-up tests, we are testing two 
>> faulty scenarios:
>> 1. The FPGA totally ignores the PCIe transaction.
>> 2. The FPGA returns a transaction abort.
>>
>> Both are plausible PCIe behaviors and their expected outcome is
>> documented in the PCIe spec. The first should be terminated by the
>> transaction requestor timeout mechanism and raise an error; the
>> second should abort the transaction and raise an error.
>>
>> In P2020 if I do any of those the CPU is left hung over the transaction.
>>
>> something like:
>> in_le32(addr)
>>
>> is turned into:
>> 7c 00 04 ac     sync
>> 7c 00 4c 2c     lwbrx   r0,0,r9
>> 0c 00 00 00     twi     0,r0,0
>> 4c 00 01 2c     isync
>>
>> assembly code, where r9 (in this example) holds an address which is
>> physically mapped into the PCIe resource space.
>>
>> The CPU will hang over the load instruction.
>>
>> Just for the fun of it, I wrote my own assembly function,
>> omitting everything but the load instruction; it still freezes.
>> Replacing "lwbrx" with a simple "lwz": it still freezes.
>>
>> It looks like the CPU snoozes till the PCIe transaction is done with 
>> no timeouts, ignoring any abort signal.
>>
>> I am going to:
>> A. Try to reach the Freescale support.
>> B. Ask the FPGA designer to give me a new behavior that will stall
>> the PCIe transaction reply for 10 sec, but return OK after that.
>> C. Report back here with either A or B.
>>
>> If you have any ideas I would love to hear them.
>>
>> -- Liberty
>>
> Some more info:
>
> As I said, the FPGA designer provided me with a PCIe device that will
> stall its response for a variable amount of time. The CPU became
> un-frozen after this amount of time. Moreover, we found that during
> the period until it un-froze, the PCIe core retried that
> transaction over and over, every 40 ms. This gave me the bright idea to
> look for the word "retry" in the Freescale documentation, which
> rewarded me with these registers:
>
> ---------------------------- snip ----------------------------
> 16.3.2.3  PCI Express Outbound Completion Timeout Register
>           (PEX_OTB_CPL_TOR)
>
> The PCI Express outbound completion timeout register, shown in Figure
> 16-4, contains the maximum wait time for a response to come back as a
> result of an outbound non-posted request before a timeout condition
> occurs.
>
> Offset 0x00C                                    Access: Read/Write
>
>   Bits    0     1–7        8–31
>   R/W     TD    —          TC
>   Reset   0     000_0000   0001_0000_1111_1111_1111_1111
>
> Figure 16-4. PCI Express Outbound Completion Timeout Register
>              (PEX_OTB_CPL_TOR)
>
> Table 16-6 describes the PCI Express outbound completion timeout
> register fields.
>
> Table 16-6. PEX_OTB_CPL_TOR Field Descriptions
>
>  Bits   Name  Description
>  0      TD    Timeout disable. This bit controls the enabling/disabling
>               of the timeout function.
>               0 Enable completion timeout
>               1 Disable completion timeout
>  1–7    —     Reserved
>  8–31   TC    Timeout counter. This is the value that is used to load
>               the response counter of the completion timeout.
>               One TC unit is 8× the PCI Express controller clock period;
>               that is, one TC unit is 20 ns at 400 MHz, and 30 ns at
>               266.66 MHz.
>               The following are examples of timeout periods based on
>               different TC settings:
>               0x00_0000  Reserved
>               0x10_FFFF  22.28 ms at 400 MHz controller clock;
>                          33.34 ms at 266.66 MHz controller clock
>               0xFF_FFFF  335.54 ms at 400 MHz controller clock;
>                          503.31 ms at 266.66 MHz controller clock
>
>
> 16.3.2.4  PCI Express Configuration Retry Timeout Register
>           (PEX_CONF_RTY_TOR)
>
> The PCI Express configuration retry timeout register, shown in Figure
> 16-5, contains the maximum time period during which retries of
> configuration transactions which resulted in a CRS response occur.
>
> Offset 0x010                                    Access: Read/Write
>
>   Bits    0     1–3    4–31
>   R/W     RD    —      TC
>   Reset   0     000    0100_0000_0000_1111_1111_1111_1111
>
> Figure 16-5. PCI Express Configuration Retry Timeout Register
>              (PEX_CONF_RTY_TOR)
>
> [QorIQ P2020 Integrated Processor Reference Manual, Rev. 0, Freescale
>  Semiconductor, page 16-12, PCI Express Interface Controller]
>
> Table 16-7 describes the PCI Express configuration retry timeout
> register fields.
>
> Table 16-7. PEX_CONF_RTY_TOR Field Descriptions
>
>  Bits   Name  Description
>  0      RD    Retry disable. This bit disables the retry of a
>               configuration transaction that receives a CRS status
>               response packet.
>               0 Enable retry of a configuration transaction in response
>                 to receiving a CRS status response until the timeout
>                 counter (defined by the PEX_CONF_RTY_TOR[TC] field) has
>                 expired.
>               1 Disable retry of a configuration transaction regardless
>                 of receiving a CRS status response.
>  1–3    —     Reserved
>  4–31   TC    Timeout counter. This is the value that is used to load
>               the CRS response counter.
>               One TC unit is 8× the PCI Express controller clock period;
>               that is, one TC unit is 20 ns at 400 MHz and 30 ns at
>               266.66 MHz.
>               Timeout period based on different TC settings:
>               0x000_0000  Reserved
>               0x400_FFFF  1.34 s at 400 MHz controller clock,
>                           2.02 s at 266.66 MHz controller clock
>               0xFFF_FFFF  5.37 s at 400 MHz controller clock,
>                           8.05 s at 266.66 MHz controller clock
> ---------------------------- snap ----------------------------
>
> Now this is all nice on paper, but what the P2020 seems to be
> doing in reality is:
> 1. never expiring
> 2. retrying even for non-configuration accesses
>
> I am going to try to disable completion timeout and see if I get
> better behavior.
>
> -- Liberty
>
>
Disabling PEX_OTB_CPL_TOR, PEX_CONF_RTY_TOR, or both yields the same
behavior. The kernel freezes on the load instruction while the underlying
hardware retries the PCIe transaction to infinity and beyond.

-- Liberty
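
For reference, the timeout arithmetic in the quoted register descriptions can be checked with a small sketch. The bit positions (bit 0 is the most significant bit) and the rule that one TC unit is 8 controller clock periods come from the manual excerpt. The function names are made up here, and the reset values 0x0010FFFF and 0x0400FFFF are simply the reset rows of Figures 16-4 and 16-5 written as hex. This is not Freescale code and it does not touch any hardware.

```python
# Field positions follow the manual excerpt, which numbers bits from the
# most significant end (bit 0 = MSB of the 32-bit register).
# All names below are illustrative; none come from Freescale code.

TC_UNIT_CLOCKS = 8  # "One TC unit is 8x the PCI Express controller clock period"


def field(value, first_bit, last_bit):
    """Return bits first_bit..last_bit of a 32-bit value (MSB-0 numbering)."""
    width = last_bit - first_bit + 1
    return (value >> (31 - last_bit)) & ((1 << width) - 1)


def tc_to_seconds(tc, controller_hz):
    """Convert a TC count into seconds for the given controller clock."""
    return tc * TC_UNIT_CLOCKS / controller_hz


def decode_otb_cpl_tor(value, controller_hz):
    """PEX_OTB_CPL_TOR: TD is bit 0, TC is bits 8-31."""
    tc = field(value, 8, 31)
    return {"disabled": bool(field(value, 0, 0)),
            "tc": tc,
            "timeout_s": tc_to_seconds(tc, controller_hz)}


def decode_conf_rty_tor(value, controller_hz):
    """PEX_CONF_RTY_TOR: RD is bit 0, TC is bits 4-31."""
    tc = field(value, 4, 31)
    return {"disabled": bool(field(value, 0, 0)),
            "tc": tc,
            "timeout_s": tc_to_seconds(tc, controller_hz)}


if __name__ == "__main__":
    # Reset values as read from the figures, at a 400 MHz controller clock.
    print(decode_otb_cpl_tor(0x0010FFFF, 400e6))
    print(decode_conf_rty_tor(0x0400FFFF, 400e6))
```

With these assumptions the reset values decode to roughly 22.28 ms and 1.34 s at 400 MHz, matching the table entries. If the documented timeouts were honoured, a stalled read should therefore give up long before the 10 second stall used in the test.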


* Re: PROBLEM: memory corrupting bug, bisected to 6dda9d55
From: pacman @ 2010-10-19 18:10 UTC
  To: Benjamin Herrenschmidt
  Cc: Mel Gorman, linux-mm, Andrew Morton, linuxppc-dev, linux-kernel
In-Reply-To: <1287483410.2341.66.camel@pasglop>

Benjamin Herrenschmidt writes:
> > 
> > I thought of that, but as far as I can tell, this CPU doesn't have DABR.
> 
> AFAIK, the 7447 is just a derivative of the 7450 design which -does-
> have a DABR ... Unless it's broken :-)

Hmm. gdb resorts to single-stepping when I set a watchpoint while debugging
some userspace program, which I assumed was caused by lack of hardware
watchpoint support. But that's not important right now.

I made a new discovery. During a test boot while looking at the usual symptom
of a corrupted page cache, I ran md5sum /sbin/e2fsck twice and got 2
different results, neither one of them correct. The third time, yet another
different result. A few dozen more times, a few dozen more unique results. I
had somehow managed to get a usable interactive shell while corruption was
ongoing.

So then I ran
  dd if=/dev/mem bs=4 count=1 skip=$((0xfc5c080/4)) | od -t x4
a few times very fast, plucking the first affected word directly out of
memory by its physical address. The result:

The low 16 bits are always zero as before. The high 16 bits are a counter,
being incremented at about 1000Hz (as close as I could measure with a crude
shell script. 1024Hz would also be within the margin of error). And it's
little-endian.

While I was watching this happen, there were only 5 or 6 userspace processes
running, and 3 of them were shells. So I doubt that anything in userspace was
doing it. It went on for a few minutes before I exited the interactive shell
and allowed the boot to continue, while keeping an extra shell running on
tty2 to continue making observations. It stopped incrementing almost
immediately.

So what type of driver, firmware, or hardware bug puts a 16-bit 1000Hz timer
in memory, and does it in little-endian instead of the CPU's native byte
order? And why does it stop doing it some time during the early init scripts,
shortly after the root filesystem fsck?

I have not yet attempted to repeat the experiment. If it is repeatable, I'll
probe more deeply into those init scripts later. I'm looking hard at
/etc/rcS.d/S11hwclock.sh

-- 
Alan Curry


* Re: PROBLEM: memory corrupting bug, bisected to 6dda9d55
From: Segher Boessenkool @ 2010-10-19 20:47 UTC
  To: pacman; +Cc: Mel Gorman, linux-kernel, linux-mm, Andrew Morton, linuxppc-dev
In-Reply-To: <20101019181021.22456.qmail@kosh.dhis.org>

> I made a new discovery.

And this nails it :-)

> So then I ran
>   dd if=/dev/mem bs=4 count=1 skip=$((0xfc5c080/4)) | od -t x4
> a few times very fast, plucking the first affected word directly out of
> memory by its physical address. The result:
>
> The low 16 bits are always zero as before. The high 16 bits are a counter,
> being incremented at about 1000Hz (as close as I could measure with a
> crude
> shell script. 1024Hz would also be within the margin of error). And it's
> little-endian.

> So what type of driver, firmware, or hardware bug puts a 16-bit 1000Hz
> timer
> in memory, and does it in little-endian instead of the CPU's native byte
> order? And why does it stop doing it some time during the early init
> scripts,
> shortly after the root filesystem fsck?

It looks like it is the frame counter in a USB OHCI HCCA.
16-bit, 1kHz update, offset x'80 in a page.

So either the kernel forgot to call quiesce on it, or the firmware
doesn't implement that, or the firmware messed up some other way.


Segher
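
The match between the observed word and the HCCA frame counter can be sketched as follows. The address 0xfc5c080, the dd block size of 4 and the "low 16 bits zero, high 16 bits counting, little-endian" observation come from the thread. The 4096-byte page size, the adjacent 16-bit pad field and all function names are assumptions of this sketch, and the sample value 0x34120000 is invented for illustration rather than taken from a real dump.

```python
import struct

PAGE_SIZE = 4096                   # assumed page size on this 32-bit machine
HCCA_FRAME_NUMBER_OFFSET = 0x80    # offset named in the message above


def page_offset(phys_addr):
    """Offset of a physical address within its page."""
    return phys_addr % PAGE_SIZE


def dd_skip(phys_addr, block_size=4):
    """Block count to pass as skip= when reading /dev/mem with bs=block_size."""
    if phys_addr % block_size:
        raise ValueError("address is not aligned to the block size")
    return phys_addr // block_size


def decode_hcca_word(raw):
    """Split the 4 bytes at offset 0x80 into (frame_number, pad).

    Both halves are read as little-endian 16-bit values, which is how a
    little-endian bus-master device would have written them.
    """
    frame_number, pad = struct.unpack("<HH", raw)
    return frame_number, pad


def decode_od_word(word):
    """Decode a word as printed by `od -t x4` on a big-endian CPU."""
    return decode_hcca_word(word.to_bytes(4, "big"))


if __name__ == "__main__":
    addr = 0xfc5c080
    print(hex(page_offset(addr)), hex(dd_skip(addr)))
    print(decode_od_word(0x34120000))   # invented sample value
```

Read this way, a big-endian dump whose low half is always zero is just a little-endian 16-bit counter followed by a zero pad field. A 16-bit counter stepping once per millisecond would wrap about every 65.5 seconds.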


* Re: PROBLEM: memory corrupting bug, bisected to 6dda9d55
From: Benjamin Herrenschmidt @ 2010-10-19 20:58 UTC
  To: pacman; +Cc: Mel Gorman, linux-mm, Andrew Morton, linuxppc-dev, linux-kernel
In-Reply-To: <20101019181021.22456.qmail@kosh.dhis.org>

On Tue, 2010-10-19 at 13:10 -0500, pacman@kosh.dhis.org wrote:
> 
> So what type of driver, firmware, or hardware bug puts a 16-bit 1000Hz
> timer
> in memory, and does it in little-endian instead of the CPU's native
> byte
> order? And why does it stop doing it some time during the early init
> scripts,
> shortly after the root filesystem fsck?
> 
> I have not yet attempted to repeat the experiment. If it is
> repeatable, I'll
> probe more deeply into those init scripts later. I'm looking hard at
> /etc/rcS.d/S11hwclock.sh 

Stinks of USB...

Ben.


* Re: PROBLEM: memory corrupting bug, bisected to 6dda9d55
From: Benjamin Herrenschmidt @ 2010-10-19 21:02 UTC
  To: Segher Boessenkool
  Cc: Mel Gorman, linux-kernel, linux-mm, pacman, Andrew Morton,
	linuxppc-dev
In-Reply-To: <56111.84.105.60.153.1287521237.squirrel@gate.crashing.org>

On Tue, 2010-10-19 at 22:47 +0200, Segher Boessenkool wrote:
> 
> It looks like it is the frame counter in a USB OHCI HCCA.
> 16-bit, 1kHz update, offset x'80 in a page.
> 
> So either the kernel forgot to call quiesce on it, or the firmware
> doesn't implement that, or the firmware messed up some other way.

I vote for the FW being on crack. Wouldn't be the first time with
Pegasos.

Is it an OHCI or a UHCI in there?

Can you try in prom_init.c changing the prom_close_stdin() function to
also close "stdout" ? 

         if (prom_getprop(_prom->chosen, "stdin", &val, sizeof(val)) > 0)
                 call_prom("close", 1, 0, val);
+        if (prom_getprop(_prom->chosen, "stdout", &val, sizeof(val)) > 0)
+                call_prom("close", 1, 0, val);

See if that makes a difference ?

Last option would be to manually turn the thing off with MMIO in yet-another
pegasos workaround in prom_init.c.

Cheers,
Ben.


* Re: PROBLEM: memory corrupting bug, bisected to 6dda9d55
From: pacman @ 2010-10-20  3:23 UTC
  To: Benjamin Herrenschmidt
  Cc: Mel Gorman, linux-kernel, linux-mm, Andrew Morton, linuxppc-dev
In-Reply-To: <1287522168.2198.5.camel@pasglop>

Benjamin Herrenschmidt writes:
> 
> On Tue, 2010-10-19 at 22:47 +0200, Segher Boessenkool wrote:
> > 
> > It looks like it is the frame counter in a USB OHCI HCCA.
> > 16-bit, 1kHz update, offset x'80 in a page.
> > 
> > So either the kernel forgot to call quiesce on it, or the firmware
> > doesn't implement that, or the firmware messed up some other way.
> 
> I vote for the FW being on crack. Wouldn't be the first time with
> Pegasos.
> 
> Is it an OHCI or a UHCI in there?

There's one of each... UHCI on the motherboard, OHCI on a card in a PCI
expansion slot. They shipped the ODW with the extra controller on an
expansion card since the on-board UHCI doesn't do USB2.0.

And that OHCI controller does appear to be the culprit. The 2 affected
addresses tick at 1000Hz until ohci-hcd is modprobe'd, then they stop.

I think the mm people can consider this closed. 6dda9d55 didn't do anything
but expose a problem which has been here all along. Will drop them from Cc
list in any further messages.

> 
> Can you try in prom_init.c changing the prom_close_stdin() function to
> also close "stdout" ? 
> 
>          if (prom_getprop(_prom->chosen, "stdin", &val, sizeof(val)) > 0)
>                  call_prom("close", 1, 0, val);
> +        if (prom_getprop(_prom->chosen, "stdout", &val, sizeof(val)) > 0)
> +                call_prom("close", 1, 0, val);
> 
> See if that makes a difference ?

Huge difference. With no stdout to print to, the kernel seems to freeze up.
Or at least it loses the console. The last message it prints is "Device tree
struct 0x00933000 -> 0x00957000" then there's just nothing. I waited a while
for the console to come on but it didn't.

The diff fragment above applied inside prom_close_stdin, but there are some
prom_printf calls after prom_close_stdin. Calling prom_printf after closing
stdout sounds like it could be bad. If I moved it down below all the
prom_printf's, it would be after the "quiesce" call. Would that be acceptable
(or even interesting as an experiment)? Does a close need a quiesce after it?

-- 
Alan Curry


* [PATCH v3] add icswx support
From: Tseng-Hui (Frank) Lin @ 2010-10-20  4:02 UTC
  To: linuxppc-dev; +Cc: tsenglin

icswx is a PowerPC co-processor instruction to send data to a
co-processor. On Book-S processors the LPAR_ID and process ID (PID) of
the owning process are registered in the window context of the
co-processor at initial time. When the icswx instruction is executed,
the L2 generates a cop-reg transaction on PowerBus. The transaction has
no address and the processor does not perform an MMU access to
authenticate the transaction. The coprocessor compares the LPAR_ID and
the PID included in the transaction and the LPAR_ID and PID held in the
window context to determine if the process is authorized to generate the
transaction.

The OS needs to assign a 16-bit PID for the process. This cop-PID needs
to be updated during context switch. The cop-PID needs to be destroyed
when the context is destroyed.

Change log from v2:
- Make the code a CPU feature and return -ENODEV if CPU doesn't have
  icswx co-processor instruction.
- Change the goto loop in use_cop() into a do-while loop.
- Change context destroy code into a new destroy_context_acop() function
  and #define it based on CONFIG_ICSWX.
- Remove mmput() from drop_cop().
- Fix some TAB/space problems.

Signed-off-by: Sonny Rao <sonnyrao@linux.vnet.ibm.com>
Signed-off-by: Tseng-Hui (Frank) Lin <thlin@linux.vnet.ibm.com>

---
 arch/powerpc/include/asm/cputable.h    |    4 +-
 arch/powerpc/include/asm/mmu-hash64.h  |    5 ++
 arch/powerpc/include/asm/mmu_context.h |    6 ++
 arch/powerpc/include/asm/reg.h         |   11 +++
 arch/powerpc/include/asm/reg_booke.h   |    3 -
 arch/powerpc/mm/mmu_context_hash64.c   |  109 ++++++++++++++++++++++++++++++++
 arch/powerpc/platforms/Kconfig.cputype |   17 +++++
 7 files changed, 151 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
index 3a40a99..bbb4e2c 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -198,6 +198,7 @@ extern const char *powerpc_base_platform;
 #define CPU_FTR_CP_USE_DCBTZ		LONG_ASM_CONST(0x0040000000000000)
 #define CPU_FTR_UNALIGNED_LD_STD	LONG_ASM_CONST(0x0080000000000000)
 #define CPU_FTR_ASYM_SMT		LONG_ASM_CONST(0x0100000000000000)
+#define CPU_FTR_ICSWX			LONG_ASM_CONST(0x0200000000000000)
 
 #ifndef __ASSEMBLY__
 
@@ -413,7 +414,8 @@ extern const char *powerpc_base_platform;
 	    CPU_FTR_MMCRA | CPU_FTR_SMT | \
 	    CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE | \
 	    CPU_FTR_PURR | CPU_FTR_SPURR | CPU_FTR_REAL_LE | \
-	    CPU_FTR_DSCR | CPU_FTR_SAO  | CPU_FTR_ASYM_SMT)
+	    CPU_FTR_DSCR | CPU_FTR_SAO  | CPU_FTR_ASYM_SMT | \
+	    CPU_FTR_ICSWX)
 #define CPU_FTRS_CELL	(CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
 	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
 	    CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | \
diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
index acac35d..6c1ab90 100644
--- a/arch/powerpc/include/asm/mmu-hash64.h
+++ b/arch/powerpc/include/asm/mmu-hash64.h
@@ -423,6 +423,11 @@ typedef struct {
 #ifdef CONFIG_PPC_SUBPAGE_PROT
 	struct subpage_prot_table spt;
 #endif /* CONFIG_PPC_SUBPAGE_PROT */
+#ifdef CONFIG_ICSWX
+	unsigned long acop;	/* mask of enabled coprocessor types */
+#define HASH64_MAX_PID (0xFFFF)
+	unsigned int acop_pid;	/* pid value used with coprocessors */
+#endif /* CONFIG_ICSWX */
 } mm_context_t;
 
 
diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 81fb412..88118de 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -80,6 +80,12 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
 
 #define deactivate_mm(tsk,mm)	do { } while (0)
 
+#ifdef CONFIG_ICSWX
+extern void switch_cop(struct mm_struct *next);
+extern int use_cop(unsigned long acop, struct mm_struct *mm);
+extern void drop_cop(unsigned long acop, struct mm_struct *mm);
+#endif /* CONFIG_ICSWX */
+
 /*
  * After we have set current->mm to a new value, this activates
  * the context for the new mm so we see the new mappings.
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index ff0005eec..b86d876 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -170,8 +170,19 @@
 #define SPEFSCR_FRMC 	0x00000003	/* Embedded FP rounding mode control */
 
 /* Special Purpose Registers (SPRNs)*/
+
+#ifdef CONFIG_40x
+#define SPRN_PID	0x3B1	/* Process ID */
+#else
+#define SPRN_PID	0x030	/* Process ID */
+#ifdef CONFIG_BOOKE
+#define SPRN_PID0	SPRN_PID/* Process ID Register 0 */
+#endif
+#endif
+
 #define SPRN_CTR	0x009	/* Count Register */
 #define SPRN_DSCR	0x11
+#define SPRN_ACOP	0x1F	/* Available Coprocessor Register */
 #define SPRN_CTRLF	0x088
 #define SPRN_CTRLT	0x098
 #define   CTRL_CT	0xc0000000	/* current thread */
diff --git a/arch/powerpc/include/asm/reg_booke.h b/arch/powerpc/include/asm/reg_booke.h
index 667a498..5b0c781 100644
--- a/arch/powerpc/include/asm/reg_booke.h
+++ b/arch/powerpc/include/asm/reg_booke.h
@@ -150,8 +150,6 @@
  * or IBM 40x.
  */
 #ifdef CONFIG_BOOKE
-#define SPRN_PID	0x030	/* Process ID */
-#define SPRN_PID0	SPRN_PID/* Process ID Register 0 */
 #define SPRN_CSRR0	0x03A	/* Critical Save and Restore Register 0 */
 #define SPRN_CSRR1	0x03B	/* Critical Save and Restore Register 1 */
 #define SPRN_DEAR	0x03D	/* Data Error Address Register */
@@ -168,7 +166,6 @@
 #define SPRN_TCR	0x154	/* Timer Control Register */
 #endif /* Book E */
 #ifdef CONFIG_40x
-#define SPRN_PID	0x3B1	/* Process ID */
 #define SPRN_DBCR1	0x3BD	/* Debug Control Register 1 */		
 #define SPRN_ESR	0x3D4	/* Exception Syndrome Register */
 #define SPRN_DEAR	0x3D5	/* Data Error Address Register */
diff --git a/arch/powerpc/mm/mmu_context_hash64.c b/arch/powerpc/mm/mmu_context_hash64.c
index 2535828..6ef6ce2 100644
--- a/arch/powerpc/mm/mmu_context_hash64.c
+++ b/arch/powerpc/mm/mmu_context_hash64.c
@@ -18,6 +18,7 @@
 #include <linux/mm.h>
 #include <linux/spinlock.h>
 #include <linux/idr.h>
+#include <linux/percpu.h>
 #include <linux/module.h>
 #include <linux/gfp.h>
 
@@ -26,6 +27,113 @@
 static DEFINE_SPINLOCK(mmu_context_lock);
 static DEFINE_IDA(mmu_context_ida);
 
+#ifdef CONFIG_ICSWX
+static DEFINE_SPINLOCK(mmu_context_acop_lock);
+static DEFINE_IDA(cop_ida);
+
+/* Lazy switch the ACOP register */
+static DEFINE_PER_CPU(unsigned long, acop_reg);
+
+void switch_cop(struct mm_struct *next)
+{
+	if (!cpu_has_feature(CPU_FTR_ICSWX))
+		return;
+
+	mtspr(SPRN_PID, next->context.acop_pid);
+	if (next->context.acop_pid &&
+	    __get_cpu_var(acop_reg) != next->context.acop) {
+		mtspr(SPRN_ACOP, next->context.acop);
+		__get_cpu_var(acop_reg) = next->context.acop;
+	}
+}
+EXPORT_SYMBOL(switch_cop);
+
+int use_cop(unsigned long acop, struct mm_struct *mm)
+{
+	int acop_pid;
+	int err;
+
+	if (!cpu_has_feature(CPU_FTR_ICSWX))
+		return -ENODEV;
+
+	if (!mm)
+		return -EINVAL;
+
+	if (!mm->context.acop_pid) {
+		if (!ida_pre_get(&cop_ida, GFP_KERNEL))
+			return -ENOMEM;
+		do {
+			spin_lock(&mmu_context_acop_lock);
+			err = ida_get_new_above(&cop_ida, 1, &acop_pid);
+			spin_unlock(&mmu_context_acop_lock);
+		} while (err == -EAGAIN);
+
+		if (err)
+			return err;
+
+		if (acop_pid > HASH64_MAX_PID) {
+			spin_lock(&mmu_context_acop_lock);
+			ida_remove(&cop_ida, acop_pid);
+			spin_unlock(&mmu_context_acop_lock);
+			return -EBUSY;
+		}
+		mm->context.acop_pid = acop_pid;
+		if (mm == current->active_mm)
+			mtspr(SPRN_PID,  mm->context.acop_pid);
+	}
+	spin_lock(&mmu_context_acop_lock);
+	mm->context.acop |= acop;
+	spin_unlock(&mmu_context_acop_lock);
+
+	get_cpu_var(acop_reg) = mm->context.acop;
+	if (mm == current->active_mm)
+		mtspr(SPRN_ACOP, mm->context.acop);
+	put_cpu_var(acop_reg);
+
+	return mm->context.acop_pid;
+}
+EXPORT_SYMBOL(use_cop);
+
+void drop_cop(unsigned long acop, struct mm_struct *mm)
+{
+	if (!cpu_has_feature(CPU_FTR_ICSWX))
+		return;
+
+	if (WARN_ON(!mm))
+		return;
+
+	spin_lock(&mmu_context_acop_lock);
+	mm->context.acop &= ~acop;
+	spin_unlock(&mmu_context_acop_lock);
+	if (!mm->context.acop) {
+		spin_lock(&mmu_context_acop_lock);
+		ida_remove(&cop_ida, mm->context.acop_pid);
+		spin_unlock(&mmu_context_acop_lock);
+		mm->context.acop_pid = 0;
+		if (mm == current->active_mm)
+			mtspr(SPRN_PID, mm->context.acop_pid);
+	} else {
+		get_cpu_var(acop_reg) = mm->context.acop;
+		if (mm == current->active_mm)
+			mtspr(SPRN_ACOP, mm->context.acop);
+		put_cpu_var(acop_reg);
+	}
+}
+EXPORT_SYMBOL(drop_cop);
+
+static void destroy_context_acop(struct mm_struct *mm)
+{
+	if (mm->context.acop_pid) {
+		spin_lock(&mmu_context_acop_lock);
+		ida_remove(&cop_ida, mm->context.acop_pid);
+		spin_unlock(&mmu_context_acop_lock);
+	}
+}
+
+#else
+#define destroy_context_acop(mm)
+#endif /* CONFIG_ICSWX */
+
 /*
  * The proto-VSID space has 2^35 - 1 segments available for user mappings.
  * Each segment contains 2^28 bytes.  Each context maps 2^44 bytes,
@@ -93,6 +201,7 @@ EXPORT_SYMBOL_GPL(__destroy_context);
 
 void destroy_context(struct mm_struct *mm)
 {
+	destroy_context_acop(mm);
 	__destroy_context(mm->context.id);
 	subpage_prot_free(mm);
 	mm->context.id = NO_CONTEXT;
diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
index d361f81..7678e29 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -220,6 +220,23 @@ config VSX
 
 	  If in doubt, say Y here.
 
+config ICSWX
+	bool "Support for PowerPC icswx co-processor instruction"
+	depends on POWER4
+	default n
+	---help---
+
+	  Enabling this option to turn on the PowerPC icswx co-processor
+	  instruction support for POWER7 or newer processors.
+	  This option is only useful if you have a processor that supports
+	  icswx co-processor instruction. It does not have any effect on
+	  processors without icswx co-processor instruction.
+
+	  This support slightly increases kernel memory usage.
+
+	  Say N if you do not have a PowerPC processor supporting icswx
+	  instruction and a PowerPC co-processor.
+
 config SPE
 	bool "SPE Support"
 	depends on E200 || (E500 && !PPC_E500MC)
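
The coprocessor PID lifecycle described in the patch (assign a 16-bit PID on first use, keep it while any coprocessor type is enabled, release it when the last type is dropped or the context is destroyed) can be modelled in a few lines. This is only a userspace sketch of the bookkeeping. It has no locking and no SPR writes, the linear scan stands in for the kernel IDA, and apart from HASH64_MAX_PID, acop and acop_pid, which mirror names in the patch, every name is hypothetical.

```python
from dataclasses import dataclass

HASH64_MAX_PID = 0xFFFF  # 16-bit limit, same value as in the patch


class CopPidAllocator:
    """Hands out the lowest free PID in 1..max_pid; 0 means 'no PID'."""

    def __init__(self, max_pid=HASH64_MAX_PID):
        self.max_pid = max_pid
        self.in_use = set()

    def alloc(self):
        for pid in range(1, self.max_pid + 1):
            if pid not in self.in_use:
                self.in_use.add(pid)
                return pid
        raise RuntimeError("no free coprocessor PID")

    def free(self, pid):
        self.in_use.discard(pid)


@dataclass
class Context:
    acop: int = 0      # mask of enabled coprocessor types
    acop_pid: int = 0  # PID used with coprocessors, 0 if none


def use_cop(allocator, ctx, acop):
    """Enable coprocessor types for a context, allocating a PID if needed."""
    if ctx.acop_pid == 0:
        ctx.acop_pid = allocator.alloc()
    ctx.acop |= acop
    return ctx.acop_pid


def drop_cop(allocator, ctx, acop):
    """Disable coprocessor types; release the PID once none remain."""
    ctx.acop &= ~acop
    if ctx.acop == 0 and ctx.acop_pid != 0:
        allocator.free(ctx.acop_pid)
        ctx.acop_pid = 0


def destroy_context(allocator, ctx):
    """Release the PID when the context goes away (also clears the fields)."""
    if ctx.acop_pid != 0:
        allocator.free(ctx.acop_pid)
    ctx.acop = 0
    ctx.acop_pid = 0
```

One difference from the patch is deliberate: this model raises an error when the PID space is exhausted, where the kernel code returns -EBUSY.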


* RE: [PATCH 1/2] P4080/eLBC: Make Freescale elbc interrupt common to elbc devices
From: Zang Roy-R61911 @ 2010-10-20  5:12 UTC
  To: Kumar Gala
  Cc: Wood Scott-B07421, dedekind1, Lan Chunhe-B25806, linuxppc-dev,
	linux-mtd, akpm, dwmw2, Gala Kumar-B11780
In-Reply-To: <2D2A6B88-DA17-467D-BB43-919C1CAAB894@kernel.crashing.org>



> -----Original Message-----
> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> Sent: Tuesday, October 19, 2010 21:19 PM
> To: Zang Roy-R61911
> Cc: linux-mtd@lists.infradead.org; Wood Scott-B07421; dedekind1@gmail.com;
> Lan Chunhe-B25806; linuxppc-dev@ozlabs.org; akpm@linux-foundation.org;
> dwmw2@infradead.org; Gala Kumar-B11780
> Subject: Re: [PATCH 1/2] P4080/eLBC: Make Freescale elbc interrupt common
> to elbc devices
>
>
> On Oct 18, 2010, at 2:22 AM, Roy Zang wrote:
>
> > Move Freescale elbc interrupt from nand driver to elbc driver.
> > Then all elbc devices can use the interrupt instead of ONLY nand.
> >
> > For former nand driver, it had the two functions:
> >
> > 1. detecting nand flash partitions;
> > 2. registering elbc interrupt.
> >
> > Now, second function is removed to fsl_lbc.c.
> >
> > Signed-off-by: Lan Chunhe-B25806 <b25806@freescale.com>
> > Signed-off-by: Roy Zang <tie-fei.zang@freescale.com>
> > Reviewed-by: Anton Vorontsov <cbouatmailru@gmail.com>
> > Cc: Wood Scott-B07421 <B07421@freescale.com>
> > ---
>
> Roy, this is a nit, but are these really p4080 specific?  just wondering
> why the subject is P4080/eLBC:...
We started this code in the P4080 project. Some customers want to track eLBC
errors on P4080, but some of the code is limited to the nand driver only ...
That is why the subject is P4080/eLBC ...
Roy

^ permalink raw reply

* Re:
From: Michal Simek @ 2010-10-20  5:31 UTC (permalink / raw)
  To: microblaze-uclinux; +Cc: nacc, linuxppc-dev, miltonm
In-Reply-To: <1287422825-14999-2-git-send-email-nacc@us.ibm.com>

Nishanth Aravamudan wrote:
> Use set_dma_ops and remove now used-once oddly named temp pointer sd.
> 
> Signed-off-by: Milton Miller <miltonm@bga.com>
> Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
> Cc: benh@kernel.crashing.org
> Cc: linuxppc-dev@lists.ozlabs.org
> ---

Maybe I forgot to tell you that this patch has already been applied.
http://git.monstr.eu/git/gitweb.cgi?p=linux-2.6-microblaze.git;a=commit;h=9a6df6cbfd903b6d9b4b1021f46d78601adfac77


Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel 2.6 Microblaze Linux - http://www.monstr.eu/fdt/
Microblaze U-BOOT custodian

^ permalink raw reply

* Re: [PATCH 1/2] P4080/eLBC: Make Freescale elbc interrupt common to elbc devices
From: Kumar Gala @ 2010-10-20  6:54 UTC (permalink / raw)
  To: Zang Roy-R61911
  Cc: Wood Scott-B07421, dedekind1, Lan Chunhe-B25806, linuxppc-dev,
	linux-mtd, akpm, dwmw2, Gala Kumar-B11780
In-Reply-To: <3850A844E6A3854C827AC5C0BEC7B60A2B08A5@zch01exm23.fsl.freescale.net>


On Oct 20, 2010, at 12:12 AM, Zang Roy-R61911 wrote:

> 
> 
>> -----Original Message-----
>> From: Kumar Gala [mailto:galak@kernel.crashing.org]
>> Sent: Tuesday, October 19, 2010 21:19 PM
>> To: Zang Roy-R61911
>> Cc: linux-mtd@lists.infradead.org; Wood Scott-B07421; dedekind1@gmail.com;
>> Lan Chunhe-B25806; linuxppc-dev@ozlabs.org; akpm@linux-foundation.org;
>> dwmw2@infradead.org; Gala Kumar-B11780
>> Subject: Re: [PATCH 1/2] P4080/eLBC: Make Freescale elbc interrupt common
>> to elbc devices
>> 
>> 
>> On Oct 18, 2010, at 2:22 AM, Roy Zang wrote:
>> 
>>> Move Freescale elbc interrupt from nand driver to elbc driver.
>>> Then all elbc devices can use the interrupt instead of ONLY nand.
>>> 
>>> For former nand driver, it had the two functions:
>>> 
>>> 1. detecting nand flash partitions;
>>> 2. registering elbc interrupt.
>>> 
>>> Now, second function is removed to fsl_lbc.c.
>>> 
>>> Signed-off-by: Lan Chunhe-B25806 <b25806@freescale.com>
>>> Signed-off-by: Roy Zang <tie-fei.zang@freescale.com>
>>> Reviewed-by: Anton Vorontsov <cbouatmailru@gmail.com>
>>> Cc: Wood Scott-B07421 <B07421@freescale.com>
>>> ---
>> 
>> Roy, this is a nit, but are these really p4080 specific?  just wondering
>> why the subject is P4080/eLBC:...
> We start these code in P4080 project. Some customer want to track eLBC
> error on P4080, but some of the code is limited in nand driver only ...
> That is why P4080/eLBC ...
> Roy

sure, but is anything about these patches p4080 specific?

- k

^ permalink raw reply

* Re: CONFIG_FEC is not good for mpc8xx ethernet?
From: Shawn Jin @ 2010-10-20  7:03 UTC (permalink / raw)
  To: tiejun.chen; +Cc: Scott Wood, ppcdev
In-Reply-To: <4CBCFC5D.3010403@windriver.com>

>> On MPC8xx you want drivers/net/fs_enet/mii-fec.c.  This is just the
>> MDIO driver; it doesn't handle any particular PHY.  I don't know if
>> there is a driver specifically for AM79C874, though the generic PHY
>> support may be good enough.
>
> Maybe.
>
> I can found one related patch for supporting PHY AM79C874 on 2.6.15,
> ------
> http://lists.ozlabs.org/pipermail/linuxppc-embedded/2005-November/021043.html
>
> But I don't see that on the latest kernel, and also I don't know the history
> completely for that. Maybe its already merged into one generic PHY driver but
> I'm not sure.

Thanks to Scott & Tiejun for the valuable information.

The problem for me is that the PHY fails to probe. The related
error messages are shown below. I even tried the patch Tiejun pointed
out, but that doesn't help. The PHY ID read from the bus was all Fs.

FEC MII Bus: probed
mdio_bus fa200e00: error probing PHY at address 0

I don't know if AM79C874 requires any special handling. But from the
comment in mdiobb_cmd() there seems to be something special.
        /*
         * Send a 32 bit preamble ('1's) with an extra '1' bit for good
         * measure.  The IEEE spec says this is a PHY optional
         * requirement.  The AMD 79C874 requires one after power up and
         * one after a MII communications error.  This means that we are
         * doing more preambles than we need, but it is safer and will be
         * much more robust.
         */

If there is any network action in u-boot, e.g., tftp or ping, the PHY
can be successfully probed after that. Any hints what went wrong with
the PHY?

Thanks,
-Shawn.

^ permalink raw reply

* Re: CONFIG_FEC is not good for mpc8xx ethernet?
From: tiejun.chen @ 2010-10-20  7:46 UTC (permalink / raw)
  To: Shawn Jin; +Cc: Scott Wood, ppcdev
In-Reply-To: <AANLkTinN-xOecVynPYxmLoT9HJKFs0iPo3pLXF05Tudv@mail.gmail.com>

Shawn Jin wrote:
>>> On MPC8xx you want drivers/net/fs_enet/mii-fec.c.  This is just the
>>> MDIO driver; it doesn't handle any particular PHY.  I don't know if
>>> there is a driver specifically for AM79C874, though the generic PHY
>>> support may be good enough.
>> Maybe.
>>
>> I can found one related patch for supporting PHY AM79C874 on 2.6.15,
>> ------
>> http://lists.ozlabs.org/pipermail/linuxppc-embedded/2005-November/021043.html
>>
>> But I don't see that on the latest kernel, and also I don't know the history
>> completely for that. Maybe its already merged into one generic PHY driver but
>> I'm not sure.
> 
> Thank Scott & Tiejun for valuable information.
> 
> The problem for me is that the PHY failed to be probed. The related
> error messages are shown below. I even tried the patch Tiejun pointed
> out. But that doesn't help. The phy ID read from the bus was all Fs.
> 
> FEC MII Bus: probed
> mdio_bus fa200e00: error probing PHY at address 0

Is this all of the log output related to the PHY? And are you sure your PHY
address is zero?

Usually at most 32 PHY devices can reside on one MDIO bus, so you can first dump
the PHY ID at each address to check whether a PHY is present. An ID value of
0xffff indicates that the address is invalid, if I recall correctly.

But I think the PHY driver already does the above on Linux.

So it looks like the MDIO driver is not compatible with your platform. I
recommend you first debug the MDIO driver until it reads a valid PHY ID, and in
particular find out where and why it stops at address '0'. Once you can get a
valid PHY ID you can move on to the PHY driver.
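
The address scan described above could be sketched roughly as follows. This is only an illustration, not the kernel's actual probing code: `mdio_read_fn`, `scan_mdio_bus()` and `fake_mdio_read()` are hypothetical names, and the fake bus with its invented ID exists only to make the sketch self-contained. The register numbers 2 and 3 are the standard MII PHY identifier registers.

```c
#include <stdint.h>

#define MII_PHYSID1  0x02 /* standard MII register: PHY identifier, upper 16 bits */
#define MII_PHYSID2  0x03 /* standard MII register: PHY identifier, lower 16 bits */
#define PHY_MAX_ADDR 32   /* MDIO uses 5-bit addresses, so at most 32 PHYs per bus */

/* Hypothetical accessor type: read one 16-bit register from one PHY address. */
typedef uint16_t (*mdio_read_fn)(int phy_addr, int reg);

/* Return a bitmask with bit N set when address N answered with a usable ID. */
static uint32_t scan_mdio_bus(mdio_read_fn mdio_read)
{
	uint32_t found = 0;

	for (int addr = 0; addr < PHY_MAX_ADDR; addr++) {
		uint16_t hi = mdio_read(addr, MII_PHYSID1);
		uint16_t lo = mdio_read(addr, MII_PHYSID2);

		/* An undriven MDIO line reads back as all ones: nothing here. */
		if (hi == 0xffff && lo == 0xffff)
			continue;
		found |= (uint32_t)1 << addr;
	}
	return found;
}

/* Made-up bus for illustration: a single PHY at address 1 with an invented ID. */
static uint16_t fake_mdio_read(int phy_addr, int reg)
{
	if (phy_addr != 1)
		return 0xffff;
	if (reg == MII_PHYSID1)
		return 0x1234;
	if (reg == MII_PHYSID2)
		return 0x5678;
	return 0;
}
```

If every address comes back as all ones, the MDIO access path itself (pin setup, clocking, reset) is the more likely suspect than the PHY address.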

> 
> I don't know if AM79C874 requires any special handling. But from the
> comment in mdiobb_cmd() there seems to be something special.
>         /*
>          * Send a 32 bit preamble ('1's) with an extra '1' bit for good
>          * measure.  The IEEE spec says this is a PHY optional
>          * requirement.  The AMD 79C874 requires one after power up and
>          * one after a MII communications error.  This means that we are
>          * doing more preambles than we need, but it is safer and will be
>          * much more robust.
>          */
> 
> If there is any network action in u-boot, e.g., tftp or ping, the PHY
> can be successfully probed after that. Any hints what went wrong with

On bootstrap the driver should reset MDIO bus/PHY before probing PHY again.

Tiejun

> the PHY?
> 
> Thanks,
> -Shawn.
> 

^ permalink raw reply

* RE: [PATCH 1/2] P4080/eLBC: Make Freescale elbc interrupt common to elbc devices
From: Zang Roy-R61911 @ 2010-10-20  8:33 UTC (permalink / raw)
  To: Kumar Gala
  Cc: Wood Scott-B07421, dedekind1, Lan Chunhe-B25806, linuxppc-dev,
	linux-mtd, akpm, dwmw2, Gala Kumar-B11780
In-Reply-To: <707F7370-D01B-462F-B896-D7F677AED8EB@kernel.crashing.org>



> -----Original Message-----
> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> Sent: Wednesday, October 20, 2010 14:55 PM
> To: Zang Roy-R61911
> Cc: linux-mtd@lists.infradead.org; Wood Scott-B07421; dedekind1@gmail.com;
> Lan Chunhe-B25806; linuxppc-dev@ozlabs.org; akpm@linux-foundation.org;
> dwmw2@infradead.org; Gala Kumar-B11780
> Subject: Re: [PATCH 1/2] P4080/eLBC: Make Freescale elbc interrupt common
> to elbc devices
>
>
> On Oct 20, 2010, at 12:12 AM, Zang Roy-R61911 wrote:
>
> >
> >
> >> -----Original Message-----
> >> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> >> Sent: Tuesday, October 19, 2010 21:19 PM
> >> To: Zang Roy-R61911
> >> Cc: linux-mtd@lists.infradead.org; Wood Scott-B07421; dedekind1@gmail.com;
> >> Lan Chunhe-B25806; linuxppc-dev@ozlabs.org; akpm@linux-foundation.org;
> >> dwmw2@infradead.org; Gala Kumar-B11780
> >> Subject: Re: [PATCH 1/2] P4080/eLBC: Make Freescale elbc interrupt common
> >> to elbc devices
> >>
> >>
> >> On Oct 18, 2010, at 2:22 AM, Roy Zang wrote:
> >>
> >>> Move Freescale elbc interrupt from nand driver to elbc driver.
> >>> Then all elbc devices can use the interrupt instead of ONLY nand.
> >>>
> >>> For former nand driver, it had the two functions:
> >>>
> >>> 1. detecting nand flash partitions;
> >>> 2. registering elbc interrupt.
> >>>
> >>> Now, second function is removed to fsl_lbc.c.
> >>>
> >>> Signed-off-by: Lan Chunhe-B25806 <b25806@freescale.com>
> >>> Signed-off-by: Roy Zang <tie-fei.zang@freescale.com>
> >>> Reviewed-by: Anton Vorontsov <cbouatmailru@gmail.com>
> >>> Cc: Wood Scott-B07421 <B07421@freescale.com>
> >>> ---
> >>
> >> Roy, this is a nit, but are these really p4080 specific?  just wondering
> >> why the subject is P4080/eLBC:...
> > We start these code in P4080 project. Some customer want to track eLBC
> > error on P4080, but some of the code is limited in nand driver only ...
> > That is why P4080/eLBC ...
> > Roy
>
> sure, but is anything about these patches p4080 specific?
No.
Should I update the subject in a new version?
Thanks.
Roy

^ permalink raw reply

* Re: CONFIG_FEC is not good for mpc8xx ethernet?
From: Shawn Jin @ 2010-10-20  9:19 UTC (permalink / raw)
  To: tiejun.chen; +Cc: Scott Wood, ppcdev
In-Reply-To: <AANLkTinN-xOecVynPYxmLoT9HJKFs0iPo3pLXF05Tudv@mail.gmail.com>

> The problem for me is that the PHY failed to be probed. The related
> error messages are shown below. I even tried the patch Tiejun pointed
> out. But that doesn't help. The phy ID read from the bus was all Fs.
>
> FEC MII Bus: probed
> mdio_bus fa200e00: error probing PHY at address 0

I think I figured out the probing failure. My board uses PortD bit8 as
an input pin from phy's MDC. I didn't set up this pin assignment.

When probing the PHY, fs_enet_fec_mii_read() is called to get the phy
id, and the correct phy id was returned. However, when I tried to set up
the ip address using the command "ifconfig eth0 192.168.0.4", the same
function was called again, but this time fecp->fec_r_cntrl had
mysteriously become 0, so the kernel reported a BUG for that.

# ifconfig eth0 192.168.0.4
------------[ cut here ]------------
kernel BUG at drivers/net/fs_enet/mii-fec.c:58!
Oops: Exception in kernel mode, sig: 5 [#1]
MyMPC870
NIP: c012b79c LR: c012963c CTR: c012b77c
REGS: c7457c60 TRAP: 0700   Not tainted  (2.6.33.5)
MSR: 00029032 <EE,ME,CE,IR,DR>  CR: 24020042  XER: 20000000
TASK = c7840000[236] 'ifconfig' THREAD: c7456000
GPR00: 00000001 c7457d10 c7840000 c7845400 00000000 00000001 ffffffff 00000000
GPR08: c77c44fc c906ce00 c784806c 00000b9f 84020042 100b986c 10096042 1009604f
GPR16: 1009603b 10096030 10096001 100b188e c7457e18 ffff8914 c742430c c740b000
GPR24: c7424300 c78443c0 00000001 00000000 c7845428 c7845400 c7845600 c7845600
NIP [c012b79c] fs_enet_fec_mii_read+0x20/0x90
LR [c012963c] mdiobus_read+0x50/0x74
Call Trace:
[c7457d10] [c0115744] driver_bound+0x60/0xa0 (unreliable)
[c7457d30] [c0129094] genphy_config_init+0x24/0xd4
[c7457d40] [c0128920] phy_init_hw+0x4c/0x78
[c7457d50] [c0128a40] phy_connect_direct+0x24/0x88
[c7457d70] [c0133e50] of_phy_connect+0x48/0x6c
[c7457d90] [c012ae10] fs_enet_open+0xf0/0x2cc
[c7457db0] [c0148a54] dev_open+0x100/0x138
[c7457dd0] [c0146ca0] dev_change_flags+0x80/0x1a8
[c7457df0] [c018e104] devinet_ioctl+0x630/0x750
[c7457e60] [c018eb5c] inet_ioctl+0xcc/0xf8
[c7457e70] [c01370d8] sock_ioctl+0x60/0x28c
[c7457e90] [c007dbcc] vfs_ioctl+0x38/0x9c
[c7457ea0] [c007ddf0] do_vfs_ioctl+0x84/0x708
[c7457f10] [c007e4b4] sys_ioctl+0x40/0x74
[c7457f40] [c000de60] ret_from_syscall+0x0/0x38
Instruction dump:
80010014 7c0803a6 38210010 4e800020 81230018 81290000 7c0004ac 80090144
0c000000 4c00012c 68000004 5400f7fe <0f000000> 5484b810 64846002 54a5925a
---[ end trace 41bf95259a68372e ]---
Trace/breakpoint trap

I cannot find where the fec_r_cntrl would be reset to 0 after
fs_enet_mdio_probe() sets it to FEC_RCNTRL_MII_MODE. Odd?
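
For reference, the condition that trips here can be sketched as a plain bit test. This is a standalone illustration rather than the driver's code: `fec_mii_mode_enabled()` is a hypothetical helper, and the value 0x4 for FEC_RCNTRL_MII_MODE is an assumption (it is consistent with the xor-with-4 visible in the instruction dump above, but check it against the driver headers).

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value of the MII-mode bit in the FEC receive control register. */
#define FEC_RCNTRL_MII_MODE 0x00000004u

/* Hypothetical helper: true when the MII-mode bit is still set in r_cntrl. */
static bool fec_mii_mode_enabled(uint32_t r_cntrl)
{
	return (r_cntrl & FEC_RCNTRL_MII_MODE) != 0;
}
```

A register value of 0 fails this test, which matches the BUG reported once fec_r_cntrl has been cleared. Logging the register with such a check at the start and end of the open path would narrow down where it gets reset.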

Thanks,
-Shawn.

^ permalink raw reply

* Re: PROBLEM: memory corrupting bug, bisected to 6dda9d55
From: Benjamin Herrenschmidt @ 2010-10-20 10:32 UTC (permalink / raw)
  To: pacman; +Cc: Mel Gorman, linux-kernel, linux-mm, Andrew Morton, linuxppc-dev
In-Reply-To: <20101020032345.5240.qmail@kosh.dhis.org>

On Tue, 2010-10-19 at 22:23 -0500, pacman@kosh.dhis.org wrote:
> The diff fragment above applied inside prom_close_stdin, but there are
> some
> prom_printf calls after prom_close_stdin. Calling prom_printf after
> closing
> stdout sounds like it could be bad. If I moved it down below all the
> prom_printf's, it would be after the "quiesce" call. Would that be
> acceptable
> (or even interesting as an experiment)? Does a close need a quiesce
> after it?

Just try :-) "quiesce" is something that afaik only apple ever
implemented anyways. It uses hooks inside their OF to shut down all
drivers that do bus master (among other HW sanitization tasks).

Cheers,
Ben.

^ permalink raw reply

* Freescale P2020/ 85xx PCIe: DMA low throughtput
From: Natalie Shapira @ 2010-10-20 10:36 UTC (permalink / raw)
  To: galak, linuxppc-dev, leoli, zw

[-- Attachment #1: Type: text/plain, Size: 835 bytes --]

Hi,

I'm working on bring-up for a new board based on Freescale's P2020. I
have a programmable FPGA as a PCIe device with a buffer I can write to
and read from.
I want to test the performance of the PCIe bus.
I encountered a problem while doing DMA between the FPGA and DDR.
The whole buffer moves to and from the device without mismatches,
but with low throughput.
The thing is that the buffer is divided into many byte-sized transactions
instead of being transferred in a burst.
I must mention that even a word-sized buffer is divided into byte
transactions by the DMA (the core can read a word, so it seems to be the
DMA's fault).
I tried changing the latency timer, max latency, min latency and cache
line in the configuration space on both sides of the PCIe bus. It didn't
help.
Do you have any idea what it could be?

Thanks,
Natalie.

[-- Attachment #2: Type: text/html, Size: 1081 bytes --]

^ permalink raw reply

* RE: Freescale P2020/ 85xx PCIe: DMA low throughtput
From: Jenkins, Clive @ 2010-10-20 12:49 UTC (permalink / raw)
  To: Natalie Shapira, galak, linuxppc-dev, leoli, zw
In-Reply-To: <4CBEC62E.30900@extricom.com>

> Hi,
>
> I'm working on bring up for a new board based on Freescales p2020.
> I have a programmable FPGA as a PCIe device with a buffer I can
> write to and from.
> I want to test  performence for the PCIe bus.
> I encountered a problem while doing a DMA between the FPGA & DDR.
> The whole buffer  moves  to and from  the device  with out
> mismatches but with low throughtput.
> The thing is that the buffer divided to many transactions of byte
> size instead of transferring it in a burst.
> I must mention that even a buffer of word size, divided in to byte
> transactions by the DMA (the core can read a word so it seems like
> the DMA fault.
> I tried to change the latency timer, max latency, min latency and
> cache line in the configuration space of both sides of the pcie
> bus. It didn't help.
> Do you have an idea what can it be?
>
> Thanks,
> Natalie.
Assuming the P2020 has the usual 85xx-style DMA engine, you may have
the Bandwidth Control cleared to 0. This 4-bit field (BWC) restricts
the transfer size to 2^BWC bytes, for BWC=0,1,..0xa. 0xb-0xe are
reserved. 0xf disables bandwidth sharing to allow uninterrupted
transfers from each channel, so if you are using several channels
one channel can completely lock out other channels. BWC=0x8 at reset
(2^8 = 256 bytes). See the P2020 manual for more details.

BWC is the field with mask 0x0f000000 in the MR (Mode Register)
for the channel (0, 1, 2, 3), at offset 0x100, 0x180, 0x200,
0x280 relative to the base of the DMA controller.
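
The field arithmetic above can be sketched in C as follows. This is a minimal illustration of the mask and the 2^BWC rule as described in this message, not code taken from any driver; the macro and function names are hypothetical, and the register access itself (mapping the DMA block, big-endian reads and writes) is left out.

```c
#include <stdint.h>

/* BWC occupies the bits selected by mask 0x0f000000, as described above. */
#define DMA_MR_BWC_MASK  0x0f000000u
#define DMA_MR_BWC_SHIFT 24

/*
 * Decode BWC from a mode register value.
 * Returns the transfer size in bytes for BWC 0..0xa, 0 when bandwidth
 * sharing is disabled (BWC == 0xf), and -1 for the reserved encodings.
 */
static int32_t dma_bwc_bytes(uint32_t mr)
{
	uint32_t bwc = (mr & DMA_MR_BWC_MASK) >> DMA_MR_BWC_SHIFT;

	if (bwc <= 0xa)
		return (int32_t)(1u << bwc);
	if (bwc == 0xf)
		return 0;
	return -1;
}

/* Return mr with its BWC field replaced, leaving all other bits untouched. */
static uint32_t dma_mr_set_bwc(uint32_t mr, uint32_t bwc)
{
	return (mr & ~DMA_MR_BWC_MASK) |
	       ((bwc << DMA_MR_BWC_SHIFT) & DMA_MR_BWC_MASK);
}
```

With these helpers, a mode register read back with BWC=0 decodes to 1 byte per transfer, which would match the byte-sized transactions seen on the bus.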

Clive

^ permalink raw reply

* Re: PROBLEM: memory corrupting bug, bisected to 6dda9d55
From: pacman @ 2010-10-20 18:33 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <1287570736.2198.19.camel@pasglop>

Benjamin Herrenschmidt writes:
> 
> On Tue, 2010-10-19 at 22:23 -0500, pacman@kosh.dhis.org wrote:
> > The diff fragment above applied inside prom_close_stdin, but there are
> > some
> > prom_printf calls after prom_close_stdin. Calling prom_printf after
> > closing
> > stdout sounds like it could be bad. If I moved it down below all the
> > prom_printf's, it would be after the "quiesce" call. Would that be
> > acceptable
> > (or even interesting as an experiment)? Does a close need a quiesce
> > after it?
> 
> Just try :-) "quiesce" is something that afaik only apple ever
> implemented anyways. It uses hooks inside their OF to shut down all
> drivers that do bus master (among other HW sanitization tasks).

I booted a version with a prom_close_stdout after the last prom_debug. It
didn't have any effect. That 1000Hz clock was still ticking.

-- 
Alan Curry

^ permalink raw reply

* Re: [QUESTION] MPC8343 'internal only' DMA support
From: Timur Tabi @ 2010-10-20 19:45 UTC (permalink / raw)
  To: KRONSTORFER Horst; +Cc: linuxppc-dev
In-Reply-To: <2E0C184B35151B4690232FF6CFE53FE501569473@VIECLEX02.frequentis.frq>

On Tue, Oct 19, 2010 at 3:15 AM, KRONSTORFER Horst
<Horst.KRONSTORFER@frequentis.com> wrote:
> i assume the mpc8343 dma controllers ability to do internally controlled
> operations (csb/csb)
> is _not_ affected by deactivating externally controlled operations via
> pinmultiplexing in sicrl.
>
> am I correct?

Hmmm... maybe.  In general, if you want an external master for the DMA
controller, I think you need to enable that via various registers.  So
if you don't enable external master, you won't have one.

Does that answer your question?

-- 
Timur Tabi
Linux kernel developer at Freescale

^ permalink raw reply

* Re: PROBLEM: memory corrupting bug, bisected to 6dda9d55
From: Benjamin Herrenschmidt @ 2010-10-20 20:56 UTC (permalink / raw)
  To: pacman; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <20101020183336.1714.qmail@kosh.dhis.org>

On Wed, 2010-10-20 at 13:33 -0500, pacman@kosh.dhis.org wrote:
> > Just try :-) "quiesce" is something that afaik only apple ever
> > implemented anyways. It uses hooks inside their OF to shut down all
> > drivers that do bus master (among other HW sanitization tasks).
> 
> I booted a version with a prom_close_stdout after the last prom_debug. It
> didn't have any effect. That 1000Hz clock was still ticking. 

Ok so you'll have to make up a "workaround" in prom_init that looks for
OHCIs in the device-tree and disables them.

Check if the OHCI node has some existing f-code words you can use for
that with "dev /path-to-ohci words" in OF for example. If not, you may
need to use the low level register accessors. Use OF client interface
"interpret" to run forth code from C.

Cheers,
Ben.

^ permalink raw reply

