IP30: SMP, Almost there?

* IP30: SMP, Almost there?
@ 2015-05-18  5:39 Joshua Kinard
  2015-05-18 12:01 ` Joshua Kinard
  2015-05-24  3:17 ` Joshua Kinard
  0 siblings, 2 replies; 16+ messages in thread
From: Joshua Kinard @ 2015-05-18  5:39 UTC (permalink / raw)
  To: Linux MIPS List

So I've gotten the second CPU in Octane to "tick" again...somehow.  I am
certain someone's cat went missing in the process...

Anyways, it's booting into an initramfs and dying almost immediately with
errors from do_page_fault:

[   15.631359] do_page_fault(): sending SIGSEGV to init for invalid write
access to 0000000000000338
[   15.631395] epc = 0000000000478474 in busybox[400000+110000]
[   15.631408] ra  = 000000000047843c in busybox[400000+110000]

Segmentation fau[   17.399304] Instruction bus error, epc == 000000000041c000,
ra == 000000000041c5c8
lt
[   17.442702] Kernel panic - not syncing: Attempted to kill init!
exitcode=0x0000000a
[   17.442702]
[   17.470272] ---[ end Kernel panic - not syncing: Attempted to kill init!
exitcode=0x0000000a


So after some digging around, I found this thread from way back in 2006 that
seems almost identical:
http://www.linux-mips.org/archives/linux-mips/2006-09/msg00169.html

However, none of the flush_icache_range material from that thread seems to be
around or relevant anymore.  But I did enable one of the #if 0 debug blocks in
arch/mips/mm/fault.c and got this output:
[   16.755572] Cpu0[init:1:0000000000520378:1:a800000020360bfc]
[   16.772869] Cpu0[init:1:000000007ff45fb0:1:a80000002001cec4]
[   16.790102] Cpu0[init:1:0000000000400160:0:0000000000400160]
[   16.807563] Cpu0[init:1:000000000041c000:0:000000000041c000]
[   16.825027] Cpu0[init:1:0000000000521ff8:1:0000000000402380]
[   16.842141] Cpu0[init:1:0000000000522010:1:00000000004023d8]
[   16.859289] Cpu0[init:1:0000000000422a6c:0:0000000000422a6c]
[   16.876768] Cpu0[init:1:000000000051fffc:0:0000000000400320]
[   16.893915] Cpu0[init:1:00000000004ddaf4:0:00000000004ddaf4]
[   16.911389] Cpu0[init:1:000000000094d008:1:000000000040519c]
[   16.928527] Cpu0[init:1:00000000004e7d9b:0:0000000000404aec]
[   16.946000] Cpu0[init:1:0000000000503cde:0:0000000000428994]
[   16.963441] Cpu0[init:1:000000000047f2d4:0:000000000047f2d4]
[   16.980945] Cpu0[init:1:00000000004f76e8:0:000000000047f380]
[   16.998410] Cpu0[init:1:000000000094eff8:1:00000000004051a0]
[   17.015596] Cpu0[init:1:000000007ff449c8:0:a80000002001d668]
[   17.032716] Cpu0[init:1:000000007ff449d0:1:a800000020360a48]
[   17.050655] Cpu0[init:1:000000000094fff8:1:00000000004051a0]
[   17.068127] Cpu0[init:1:0000000000950ff8:1:00000000004051a0]
[   17.085615] Cpu0[init:1:0000000000952ff8:1:00000000004051a0]
[   17.102741] Cpu0[init:1:0000000000951000:1:0000000000472fc8]
[   17.121391] Cpu0[init:1:0000000000953ff8:1:00000000004051a0]
[   17.138756] Cpu0[init:1:0000000000954ff8:1:00000000004051a0]
[   17.156542] Cpu0[init:1:000000007ff44de8:1:0000000000403398]
[   15.613954] Cpu1[init:75:000000000040c1a0:0:000000000040c1a0]
[   15.614065] Cpu1[init:75:000000007ff44de8:1:0000000000403398]
[   15.631203] Cpu1[init:75:0000000000413b58:0:0000000000413b58]
[   15.631276] Cpu1[init:75:000000000047843c:0:000000000047843c]
[   15.631336] Cpu1[init:75:0000000000000338:1:0000000000478474]

The invalid address of 0x0000000000000338 (effectively a NULL pointer plus a
small offset, I believe) is pretty consistent with the netboot.  Sometimes I get
a panic in a mutex*slowpath function instead (I forget which one).  But the
failure is way more predictable with this netboot than with the disks inserted.
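
For anyone wanting to sift through a longer capture of this trace, here is a
minimal sketch of a parser for one line of it.  The field order (cpu, comm,
pid, faulting address, write flag, epc) is my reading of the disabled printk
in fault.c and should be treated as an assumption; the names fault_rec,
parse_fault_line and looks_like_null_deref are made up for illustration, and
the "[   15.631336] " timestamp is assumed to be stripped beforehand.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* One record of the Cpu%d[...] debug trace.  Field meanings are assumed
 * from the printk argument order: cpu, comm, pid, address, write, epc. */
struct fault_rec {
    int cpu;
    char comm[16];
    int pid;
    uint64_t addr;
    int write;
    uint64_t epc;
};

/* Returns 0 on success, -1 if the line does not have the expected shape. */
int parse_fault_line(const char *line, struct fault_rec *r)
{
    int n = sscanf(line,
                   "Cpu%d[%15[^:]:%d:%" SCNx64 ":%d:%" SCNx64 "]",
                   &r->cpu, r->comm, &r->pid, &r->addr, &r->write, &r->epc);
    return n == 6 ? 0 : -1;
}

/* A faulting address inside the first page is treated as a NULL-plus-offset
 * dereference.  The 4 KiB threshold is an arbitrary choice for this sketch. */
int looks_like_null_deref(const struct fault_rec *r)
{
    return r->addr < 0x1000;
}
```

Fed the last Cpu1 line above, this reports a write to 0x338 from epc 0x478474
and flags it as a likely NULL dereference.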

Any ideas where to start looking?

--J


* Re: IP30: SMP, Almost there?
  2015-05-18  5:39 IP30: SMP, Almost there? Joshua Kinard
@ 2015-05-18 12:01 ` Joshua Kinard
  2015-05-20  5:23   ` Joshua Kinard
  2015-05-22 16:25   ` Ralf Baechle
  2015-05-24  3:17 ` Joshua Kinard
  1 sibling, 2 replies; 16+ messages in thread
From: Joshua Kinard @ 2015-05-18 12:01 UTC (permalink / raw)
  To: linux-mips

On 05/18/2015 01:39, Joshua Kinard wrote:
> So I've gotten the second CPU in Octane to "tick" again...somehow.  I am
> certain someone's cat went missing in the process...
> 
> Anyways, it's booting into an initramfs and dying almost immediately with
> errors from do_page_fault:
> 
> [   15.631359] do_page_fault(): sending SIGSEGV to init for invalid write
> access to 0000000000000338
> [   15.631395] epc = 0000000000478474 in busybox[400000+110000]
> [   15.631408] ra  = 000000000047843c in busybox[400000+110000]
> 
> Segmentation fau[   17.399304] Instruction bus error, epc == 000000000041c000,
> ra == 000000000041c5c8
> lt
> [   17.442702] Kernel panic - not syncing: Attempted to kill init!
> exitcode=0x0000000a
> [   17.442702]
> [   17.470272] ---[ end Kernel panic - not syncing: Attempted to kill init!
> exitcode=0x0000000a
> 
> 
> So after some digging around, I found this thread from way back in 2006 that
> seems almost identical:
> http://www.linux-mips.org/archives/linux-mips/2006-09/msg00169.html
> 
> However, none of the stuff regarding flush_icache_range seems to be around nor
> relevant anymore.  But I did comment out one of the #if 0's in
> arch/mips/mm/fault.c and got this output:
> [   16.755572] Cpu0[init:1:0000000000520378:1:a800000020360bfc]
> [   16.772869] Cpu0[init:1:000000007ff45fb0:1:a80000002001cec4]
> [   16.790102] Cpu0[init:1:0000000000400160:0:0000000000400160]
> [   16.807563] Cpu0[init:1:000000000041c000:0:000000000041c000]
> [   16.825027] Cpu0[init:1:0000000000521ff8:1:0000000000402380]
> [   16.842141] Cpu0[init:1:0000000000522010:1:00000000004023d8]
> [   16.859289] Cpu0[init:1:0000000000422a6c:0:0000000000422a6c]
> [   16.876768] Cpu0[init:1:000000000051fffc:0:0000000000400320]
> [   16.893915] Cpu0[init:1:00000000004ddaf4:0:00000000004ddaf4]
> [   16.911389] Cpu0[init:1:000000000094d008:1:000000000040519c]
> [   16.928527] Cpu0[init:1:00000000004e7d9b:0:0000000000404aec]
> [   16.946000] Cpu0[init:1:0000000000503cde:0:0000000000428994]
> [   16.963441] Cpu0[init:1:000000000047f2d4:0:000000000047f2d4]
> [   16.980945] Cpu0[init:1:00000000004f76e8:0:000000000047f380]
> [   16.998410] Cpu0[init:1:000000000094eff8:1:00000000004051a0]
> [   17.015596] Cpu0[init:1:000000007ff449c8:0:a80000002001d668]
> [   17.032716] Cpu0[init:1:000000007ff449d0:1:a800000020360a48]
> [   17.050655] Cpu0[init:1:000000000094fff8:1:00000000004051a0]
> [   17.068127] Cpu0[init:1:0000000000950ff8:1:00000000004051a0]
> [   17.085615] Cpu0[init:1:0000000000952ff8:1:00000000004051a0]
> [   17.102741] Cpu0[init:1:0000000000951000:1:0000000000472fc8]
> [   17.121391] Cpu0[init:1:0000000000953ff8:1:00000000004051a0]
> [   17.138756] Cpu0[init:1:0000000000954ff8:1:00000000004051a0]
> [   17.156542] Cpu0[init:1:000000007ff44de8:1:0000000000403398]
> [   15.613954] Cpu1[init:75:000000000040c1a0:0:000000000040c1a0]
> [   15.614065] Cpu1[init:75:000000007ff44de8:1:0000000000403398]
> [   15.631203] Cpu1[init:75:0000000000413b58:0:0000000000413b58]
> [   15.631276] Cpu1[init:75:000000000047843c:0:000000000047843c]
> [   15.631336] Cpu1[init:75:0000000000000338:1:0000000000478474]
> 
> The invalid address (I believe what is effectively a NULL) of
> 0x0000000000000338 is pretty consistent with the netboot.  Sometimes I get a
> panic in a mutex*slowpath function (I forget which one).  But it's way more
> predictable with this netboot than with the disks inserted.

Apparently, setting cca=5 on the kernel command line improves things.  The
netboot can load busybox ash and move around.  But booting the real userland is
still very problematic (XFS filesystem pretty much blows up on mounting root).

What is the relationship between the cache-coherency algorithm and SMP?  IP30
hardware is supposed to be cache-coherent.  A value of '5' sets the processors
to "cacheable coherent exclusive on write" (per the R10K manual).  But I am not
sure why things are still flakey.
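
For reference, a small sketch of the coherency attribute encodings.  Only the
name for value 5 comes from the text above; the other names are the R10000
manual's encodings as I recall them and should be checked against the manual.
The helper names r10k_cca_name and config_k0 are hypothetical.  If the
firmware leaves kseg0 at a noncoherent attribute, that would be one plausible
reason an override is needed for SMP, but that is a guess, not something
verified here.

```c
#include <stdint.h>

/* Cache coherency attribute (CCA) encodings, assumed from the R10000
 * manual.  Values 0, 1 and 6 are treated as reserved in this sketch. */
const char *r10k_cca_name(unsigned cca)
{
    switch (cca & 7) {
    case 2:  return "uncached";
    case 3:  return "cacheable noncoherent";
    case 4:  return "cacheable coherent exclusive";
    case 5:  return "cacheable coherent exclusive on write";
    case 7:  return "uncached accelerated";
    default: return "reserved";
    }
}

/* K0, the kseg0 coherency attribute, is the low three bits of CP0 Config. */
unsigned config_k0(uint32_t config)
{
    return config & 7;
}
```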

--J


* Re: IP30: SMP, Almost there?
  2015-05-18 12:01 ` Joshua Kinard
@ 2015-05-20  5:23   ` Joshua Kinard
  2015-05-21  6:00     ` Joshua Kinard
  2015-05-22 16:25   ` Ralf Baechle
  1 sibling, 1 reply; 16+ messages in thread
From: Joshua Kinard @ 2015-05-20  5:23 UTC (permalink / raw)
  To: linux-mips

On 05/18/2015 08:01, Joshua Kinard wrote:
> On 05/18/2015 01:39, Joshua Kinard wrote:
>> So I've gotten the second CPU in Octane to "tick" again...somehow.  I am
>> certain someone's cat went missing in the process...
>>
>> Anyways, it's booting into an initramfs and dying almost immediately with
>> errors from do_page_fault:

I've stripped the kernel config down to practically nothing (9MB w/ debugging,
-Os, and a small initramfs), yet still, when CPU1 starts up and starts
scheduling, I am running into "scheduling while atomic" warnings (due to
enabling that option in kernel debugging).  CPU0 then joins in with scheduling
while atomic.  I've tried switching to mutexes instead of spinlocks, but no
dice.  It seems to be stemming from core kernel code, but I know the problem
has to be in the IP30 SMP code.

I've looked at all of the other MIPS SMP implementations, and by far the one
that looks the closest is the Sibyte 1250 SMP code.  Everything else is far too
different, what with multicores and multithreading.  However, adopting most of
the Sibyte semantics for CPU1 bringup doesn't appear to work.  Only thing I
haven't done yet is handle CPU affinity.

Do I need to define the irq_set_affinity() function pointer in struct irq_chip?
 IP27 doesn't have this and no real ill effects w/ SMP are seen on that system
(just the random hardlock due to that vmscan BUG() or disk I/O).  Am I not
masking all IRQs correctly and running into a case of recursive interrupts?  Is
there still a way to halt all interrupts globally (the old big lock) and use
that as a debugging aid?

--J


* Re: IP30: SMP, Almost there?
  2015-05-20  5:23   ` Joshua Kinard
@ 2015-05-21  6:00     ` Joshua Kinard
  2015-05-22 12:01       ` Maciej W. Rozycki
  2015-05-22 16:38       ` Ralf Baechle
  0 siblings, 2 replies; 16+ messages in thread
From: Joshua Kinard @ 2015-05-21  6:00 UTC (permalink / raw)
  To: linux-mips

On 05/20/2015 01:23, Joshua Kinard wrote:
> <stuff no one reads>

Found my "scheduling while atomic" bug.  Who knew that 'get_cpu_var' disables
preemption and thus increments preempt_count?  That got caught by
__schedule_bug, hence "scheduling while atomic".  Since I based some of the SMP
updates off of arch/mips/kernel/smp-bmips.c, I failed to notice originally that
that code used '__get_cpu_var', which is a different macro (and doesn't try
to disable preemption).  I looked at the latest smp-bmips.c code, and that is
now using __this_cpu_ptr(), which solved the scheduling problem.
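
To make the failure mode concrete, here is a user-space toy model of it.
None of this is kernel code: the toy_* names are invented, and the single
counter only imitates what preempt_count does for one CPU.  It shows why an
accessor that bumps the count trips the scheduler's check while one that
leaves the count alone does not.

```c
#include <stdio.h>

/* User-space toy model only; the real kernel macros are more involved. */
static int toy_preempt_count;
static int toy_percpu_value;    /* stands in for a per-CPU variable */

/* Models get_cpu_var(): disables preemption, then hands out the variable. */
int *toy_get_cpu_var(void)
{
    toy_preempt_count++;
    return &toy_percpu_value;
}

/* Models put_cpu_var(): re-enables preemption. */
void toy_put_cpu_var(void)
{
    toy_preempt_count--;
}

/* Models a this_cpu_ptr()-style access: the count is left untouched. */
int *toy_this_cpu_ptr(void)
{
    return &toy_percpu_value;
}

/* Models the check behind the warning: returns -1 when called with
 * preemption disabled, 0 otherwise. */
int toy_schedule(void)
{
    if (toy_preempt_count != 0) {
        fprintf(stderr, "BUG: scheduling while atomic (count=%d)\n",
                toy_preempt_count);
        return -1;
    }
    return 0;
}
```

Calling toy_schedule() between toy_get_cpu_var() and toy_put_cpu_var() fails,
while the same call after toy_this_cpu_ptr() succeeds.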

Still getting instruction bus errors, though.  But this time, one of the IBEs
happened in kernel context and I got an Oops for once:

[     0.148514] clocksource jiffies: mask: 0xffffffff max_cycles: 0xffffffff,
max_idle_ns: 19112604462750000 ns
[     0.198242] Switched to clocksource HEART
[     1.137056] Warning: unable to open an initial console.
[     1.156456] Freeing unused kernel memory: 4544K (a800000020190000 -
a800000020600000)
[     1.169048] Instruction bus error, epc == 00000000004289ac, ra ==
000000000047d054
[     1.183979] Instruction bus error, epc == 00000000004289ac, ra ==
000000000047d054
[     1.195707] Instruction bus error, epc == 000000000040448c, ra ==
0000000000404440
[     1.206829] Instruction bus error, epc == a8000000200ff12c, ra ==
a800000020104fec
[     1.209208] Oops[#1]:
[     1.209907] CPU: 1 PID: 42 Comm: init Not tainted
4.1.0-rc3-mipsgit-20150517 #100
[     1.212251] task: a80000009f372800 ti: a80000009f3bc000 task.ti:
a80000009f3bc000
[     1.214585] $ 0   : 0000000000000000 0000000000948ba9 000000009f35f000
a80000009f35f000
[     1.217115] $ 4   : a80000009f35f000 0000000000948ba0 0000000000000009
a80000009f35f000
[     1.219642] $ 8   : 0000000000000000 0000000000000000 0000000000000000
0000000000000fa4
[     1.222168] $12   : 0000000000000000 ffffffff84080018 0000000000000001
a80000009f3a8000
[     1.224694] $16   : a8000000232d3cc8 0000000000000009 0000000000000000
a800000020650000
[     1.227221] $20   : a80000009f3bfe00 a80000009f3bfdb0 0000000000000009
0000000000000009
[     1.229748] $24   : 0000000000000009 000000000047338c

[     1.232273] $28   : a80000009f3bc000 a80000009f3bfce0 0000000000948ba0
a800000020104fec
[     1.234802] Hi    : 00000000001dda1d
[     1.235897] Lo    : 000000000009f35f
[     1.237010] epc   : a8000000200ff12c __copy_user_common+0xa4/0x358
[     1.238933]     Not tainted
[     1.239801] ra    : a800000020104fec copy_page_from_iter+0x338/0x3f8
[     1.241777] Status: b004fce3 KX SX UX KERNEL EXL IE
[     1.243346] Cause : 00000018
[     1.244222] PrId  : 00000f24 (R14000)
[     1.245353] Process init (pid: 42, threadinfo=a80000009f3bc000,
task=a80000009f372800, tls=00000000005287dc
[     1.248431] Stack : 3200000000000000 000000000047338c a80000009f357a80
a80000009f357a80
o  0000000000000000 a80000009f357a80 0000000000000000 a80000009f3bfdb0
o  a80000009f357e80 0000000000000000 a80000002013e500 a800000020190000
o  a80000009f357c80 a8000000200c38ec a80000002013e528 a8000000232d3cc8
o  a80000009f357e80 a80000009f3bfe70 a80000009f3bfe70 0000000000948ba0
o  0000000000000009 0000000000000063 0000000000000030 0000000000000020
o  0000000000000000 a8000000200bbb14 00000001b004fce1 0000000000000000
o  0000000000000000 0000000000000000 0000000000000000 0000000000000000
o  0000000000948ba0 0000000000000009 a80000009f3bfe40 a80000009f357e80
o  ...
[     1.268825] Call Trace:
[     1.269571] [<a8000000200ff12c>] __copy_user_common+0xa4/0x358
[     1.271403] [<a800000020104fec>] copy_page_from_iter+0x338/0x3f8
[     1.273295] [<a8000000200c38ec>] pipe_write+0x27c/0x45c
[     1.274939] [<a8000000200bbb14>] __vfs_write+0xc8/0xfc
[     1.276539] [<a8000000200bbc78>] vfs_write+0xc8/0x11c
[     1.278112] [<a8000000200bbde0>] SyS_write+0x5c/0xb8
[     1.279670] [<a8000000200185d8>] handle_sys+0x138/0x15c
[     1.281294]
[     1.281726]
Code: cc810100  14d8ffea  00000000 <10c00070> 2cc80020  1500000e 30d80007
dca80000  dca90008
[     1.285075] ---[ end trace f34b7ca5fa10bf37 ]---
[     1.286573] Fatal exception: panic in 5 seconds
<snip>


Following the code path, __copy_user_common has this assembly:

    a8000000200ff120:   cc810100    pref   0x1,256(a0)
    a8000000200ff124:   14d8ffea    bne    a2,t8,a8000000200ff0d0
<__copy_user_common+0x48>
    a8000000200ff128:   00000000    nop
--> a8000000200ff12c:   10c00070    beqz   a2,a8000000200ff2f0
<__copy_user_common+0x268>
    a8000000200ff130:   2cc80020    sltiu  a4,a2,32
    a8000000200ff134:   1500000e    bnez   a4,a8000000200ff170
<__copy_user_common+0xe8>
    a8000000200ff138:   30d80007    andi   t8,a2,0x7
    a8000000200ff13c:   dca80000    ld     a4,0(a1)
    a8000000200ff140:   dca90008    ld     a5,8(a1)


Which is defined in arch/mips/lib/memcpy.S:

        PREFD(  1, 8*32(dst) )
        bne     len, rem, 1b
         nop

        /*
         * len == rem == the number of bytes left to copy < 8*NBYTES
         */
.Lcleanup_both_aligned\@:
-->     beqz    len, .Ldone\@
         sltu   t0, len, 4*NBYTES
        bnez    t0, .Lless_than_4units\@
         and    rem, len, (NBYTES-1)    # rem = len % NBYTES
        /*
         * len >= 4*NBYTES
         */
        LOAD( t0, UNIT(0)(src), .Ll_exc\@)
        LOAD( t1, UNIT(1)(src), .Ll_exc_copy\@)


Where I am lost is, though, why would I get an IBE on a 'beqz' instruction?
It's a valid instruction from MIPS-I ('beqz' is just 'beq' w/ $0 as rt).  The
R10K Manual states this:

"""
A Bus Error exception occurs when a processor block read, upgrade, or
double/single/partial-word read request receives an external ERR completion
response, or a processor double/single/partial-word read request receives an
external ACK completion response where the associated external
double/single/partial-word data response contains an uncorrectable error. This
exception is not maskable.
"""

My guess is there's still something not kosher with icache flushing somewhere.
 I can reboot this kernel multiple times and not always get the same IBE.  Most
happen in user context, which are impossible to trace.
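
As a sanity check on the 'beqz' reading, the marked word from the Code: line
(10c00070) can be decoded by hand as an I-type instruction.  This is a minimal
sketch; the struct and function names are made up for illustration.

```c
#include <stdint.h>

/* Fields of a MIPS I-type instruction word. */
struct mips_itype {
    unsigned opcode;    /* bits 31..26 */
    unsigned rs;        /* bits 25..21 */
    unsigned rt;        /* bits 20..16 */
    int16_t imm;        /* bits 15..0, sign-extended by the CPU */
};

struct mips_itype decode_itype(uint32_t insn)
{
    struct mips_itype d;

    d.opcode = insn >> 26;
    d.rs = (insn >> 21) & 0x1f;
    d.rt = (insn >> 16) & 0x1f;
    d.imm = (int16_t)(insn & 0xffff);
    return d;
}

/* Branch target = address of the delay slot + (offset << 2). */
uint64_t branch_target(uint64_t pc, int16_t imm)
{
    return pc + 4 + (uint64_t)((int64_t)imm * 4);
}
```

For 0x10c00070 this gives opcode 4 (beq), rs = 6 (a2), rt = 0 ($0) and an
offset of 0x70 words, so the target from a8000000200ff12c is
a8000000200ff2f0, matching the disassembly above.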

Anyone got ideas?  Is there some way to dump the contents of the icache and/or
dcache for debugging?

--J


* Re: IP30: SMP, Almost there?
  2015-05-21  6:00     ` Joshua Kinard
@ 2015-05-22 12:01       ` Maciej W. Rozycki
  2015-05-22 17:11         ` Ralf Baechle
  2015-05-22 16:38       ` Ralf Baechle
  1 sibling, 1 reply; 16+ messages in thread
From: Maciej W. Rozycki @ 2015-05-22 12:01 UTC (permalink / raw)
  To: Joshua Kinard; +Cc: linux-mips

On Thu, 21 May 2015, Joshua Kinard wrote:

> Where I am lost is, though, why would I get an IBE on a 'beqz' instruction?

 A bus error is an external event, a signal asserted to the CPU by bus 
logic on a failed read cycle.  Whether you get a Data or Instruction Bus 
Error exception (DBE vs IBE) merely depends on whether it was a data read 
or an instruction fetch cycle.  The class of the error is only resolved by
the CPU internally, as obviously external logic does not know why the CPU
put the failed read cycle on the bus.  Note that the read cycle might well
be a part of a cache fill.

 As it is a failure of a read that causes a bus error, it does not matter 
whether the instruction that was supposed to be fetched is valid or not.  
It has never been successfully fetched let alone decoded.  For an invalid 
instruction that has been fetched and decoded you'd get a Reserved 
Instruction exception instead.
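
 For reference, the "Cause : 00000018" value from the Oops can be decoded
along these lines.  This is only a sketch: the function names are invented,
and only the three exception codes discussed here are named.

```c
#include <stdint.h>

/* ExcCode lives in bits 6..2 of the CP0 Cause register. */
unsigned cause_exccode(uint32_t cause)
{
    return (cause >> 2) & 0x1f;
}

/* Names for the exception codes relevant to this discussion. */
const char *exccode_name(unsigned code)
{
    switch (code) {
    case 6:  return "IBE";    /* bus error on instruction fetch */
    case 7:  return "DBE";    /* bus error on data load/store */
    case 10: return "RI";     /* reserved instruction */
    default: return "other";
    }
}
```

 A Cause of 0x18 yields ExcCode 6, i.e. IBE, consistent with the
"Instruction bus error" messages in the log.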

 A typical reason for a bus error is a bus timeout, where no target on the 
bus responded to a cycle, a parity error of data presented on the bus or 
an uncorrected (multi-bit) memory access ECC error, driven by the memory 
controller in parallel to data presented.

 NB bus errors on write cycles, such as a bus timeout or an ECC error on a 
partial memory update (e.g. an uncached byte write), are asynchronous and 
normally do not cause a Bus Error exception.  A hardware interrupt is 
typically issued instead.

> My guess is there's still something not kosher with icache flushing somewhere.

 That would be odd.  Even if the state of the cache was inconsistent, I'd 
expect a Cache Error exception at worst, and rubbish returned typically, 
rather than a Bus Error exception.

> Anyone got ideas?  Is there some way to dump the contents of the icache and/or
> dcache for debugging?

 I'd rather expect an uncorrected ECC error being the cause here, maybe 
you need to clean the contacts of your memory modules.  From user 
documentation, such as a maintenance manual that should be available for 
your system, you might be able to infer which memory module the physical 
address of 0x200ff12c corresponds to and start by cleaning that module 
first.  Try to strip the system as much as possible and e.g. run with a 
single known-good memory module only (or whatever number of modules is the 
minimum).  Run any extra system diagnostics if provided by the firmware.

 It's interesting to note in the log you provided:

> [     1.169048] Instruction bus error, epc == 00000000004289ac, ra == 000000000047d054
> [     1.183979] Instruction bus error, epc == 00000000004289ac, ra == 000000000047d054
> [     1.195707] Instruction bus error, epc == 000000000040448c, ra == 0000000000404440
> [     1.206829] Instruction bus error, epc == a8000000200ff12c, ra == a800000020104fec

that the error always happens in the same 4th word (address ending with 
0xc) of a 16-byte span.  Which may indeed mean there's an issue with a 
particular memory module that supplies data for this word (assuming your 
system has a 128-bit memory controller data bus with 64-bit DRAM modules 
arranged in pairs and individually supplying data for each half of the bus 
or suchlike).
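
 The observation is easy to check mechanically.  A minimal sketch, with a
made-up helper name, that returns which 32-bit word of its 16-byte span an
address falls in:

```c
#include <stdint.h>

/* Index (0..3) of the 32-bit word within the 16-byte span holding addr. */
unsigned word_in_span(uint64_t addr)
{
    return (unsigned)((addr & 0xf) >> 2);
}
```

 All of the epc values quoted above (4289ac, 40448c and a8000000200ff12c)
give index 3, i.e. the 4th word.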

 Then checking and possibly tightening the power supply connection might 
be a good idea too.  Other connections may be worth checking, e.g. the CPU 
daughtercard(s) if applicable.  Also any problems with overheating like a 
loose heatsink, a blocked ventilation shaft and suchlike.  I'd definitely 
double-check memory first though.

 If that did not help, then I'd start suspecting your system is faulty. :(

  Maciej


* Re: IP30: SMP, Almost there?
  2015-05-22 12:01       ` Maciej W. Rozycki
@ 2015-05-22 17:11         ` Ralf Baechle
  2015-05-23 22:52           ` Joshua Kinard
  0 siblings, 1 reply; 16+ messages in thread
From: Ralf Baechle @ 2015-05-22 17:11 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Joshua Kinard, linux-mips

On Fri, May 22, 2015 at 01:01:01PM +0100, Maciej W. Rozycki wrote:

> > Where I am lost is, though, why would I get an IBE on a 'beqz' instruction?
> 
>  A bus error is an external event, a signal asserted to the CPU by bus 
> logic on a failed read cycle.  Whether you get a Data or Instruction Bus 
> Error exception (DBE vs IBE) merely depends on whether it was a data read 
> or an instruction fetch cycle.  The class of the error is only resolved by 
> the CPU internally as obviously any external logic does not know the 
> reason the CPU put the read cycle on the bus for that failed.  Note that 
> the read cycle might well be a part of a cache fill.
> 
>  As it is a failure of a read that causes a bus error, it does not matter 
> whether the instruction that was supposed to be fetched is valid or not.  
> It has never been successfully fetched let alone decoded.  For an invalid 
> instruction that has been fetched and decoded you'd get a Reserved 
> Instruction exception instead.
> 
>  A typical reason for a bus error is a bus timeout, where no target on the 
> bus responded to a cycle, a parity error of data presented on the bus or 
> an uncorrected (multi-bit) memory access ECC error, driven by the memory 
> controller in parallel to data presented.
> 
>  NB bus errors on write cycles, such as a bus timeout or an ECC error on a 
> partial memory update (e.g. an uncached byte write), are asynchronous and 
> normally do not cause a Bus Error exception.  A hardware interrupt is 
> typically issued instead.
> 
> > My guess is there's still something not kosher with icache flushing somewhere.
> 
>  That would be odd.  Even if the state of the cache was inconsistent, I'd 
> expect a Cache Error exception at worst, and rubbish returned typically, 
> rather than a Bus Error exception.
> 
> > Anyone got ideas?  Is there some way to dump the contents of the icache and/or
> > dcache for debugging?
> 
>  I'd rather expect an uncorrected ECC error being the cause here, maybe 
> you need to clean the contacts of your memory modules.  From user 
> documentation, such as a maintenance manual that should be available for 
> your system, you might be able to infer which memory module the physical 
> address of 0x200ff12c corresponds to and start by cleaning that module 
> first.  Try to strip the system as much as possible and e.g. run with a 
> single known-good memory module only (or whatever number of modules is the 
> minimum).  Run any extra system diagnostics if provided by the firmware.
> 
>  It's interesting to note in the log you provided:
> 
> > [     1.169048] Instruction bus error, epc == 00000000004289ac, ra == 000000000047d054
> > [     1.183979] Instruction bus error, epc == 00000000004289ac, ra == 000000000047d054
> > [     1.195707] Instruction bus error, epc == 000000000040448c, ra == 0000000000404440
> > [     1.206829] Instruction bus error, epc == a8000000200ff12c, ra == a800000020104fec
> 
> that the error always happens in the same 4th word (address ending with 
> 0xc) of a 16-byte span.  Which may indeed mean there's an issue with a 
> particular memory module that supplies data for this word (assuming your 
> system has a 128-bit memory controller data bus with 64-bit DRAM modules 
> arranged in pairs and individually supplying data for each half of the bus 
> or suchlike).
> 
>  Then checking and possibly tightening the power supply connection might 
> be a good idea too.  Other connections may be worth checking, e.g. the CPU 
> daughtercard(s) if applicable.  Also any problems with overheating like a 
> loose heatsink, a blocked ventilation shaft and suchlike.  I'd definitely 
> double-check memory first though.
> 
>  If that did not help, then I'd start suspecting your system is faulty. :(

He might run IRIX on it for testing.  Also I think one of the BSDs has
support.

Octane is a close relative of the IP27, which does ECC on anything and all,
with all addresses fully decoded.  So if software does something stupid,
hardware will notice, quickly though not necessarily in very obvious
ways.

Some of IP27's reactions are a bit unobvious though.  First, the uncached
address space (CCA 2) works differently than one might think.  IP27 uses
the R10000's uncached attribute feature, which subdivides the CPU's
uncached XKPHYS address space into four address spaces with the highest
address byte being 0x90, 0x92, 0x94 or 0x96.  The classic uncached
memory access happens with UC=3, that is the top address byte being
0x96.

Do not use that.  EVER.  It entirely bypasses the CPU's cache coherency
logic.  Due to all the consistency checking between the directory
caches and other involved agents the memory controller might detect the
inconsistency between cache and memory and send guess what, a bus
error.

For I/O purpose UC attribute value 1 is used, that is top byte 0x92.
UC values 0 and 2 allow direct manipulation of the directory caches
and atomic operations without the need to read the line into the CPU.

So that's what IP27 does.  Not sure how much of this behaviour its
little brother IP30 has copied.
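
The top-byte values above can be reproduced with a little address
arithmetic.  This is a sketch under assumptions: the bit positions (region in
63..62, CCA in 61..59, UC attribute in 58..57) are inferred from the
0x90/0x92/0x94/0x96 bytes rather than quoted from the manual, the 40-bit
physical address mask is likewise assumed, and the function names are made
up.

```c
#include <stdint.h>

/* Assumed width of the physical address field in this sketch. */
#define PADDR_MASK ((UINT64_C(1) << 40) - 1)

/* Builds an XKPHYS virtual address: region 2 in bits 63..62, the cache
 * coherency attribute in bits 61..59 and, for uncached accesses, the
 * uncached attribute (UC) in bits 58..57. */
uint64_t xkphys_addr(unsigned cca, unsigned uc, uint64_t paddr)
{
    return (UINT64_C(2) << 62) |
           ((uint64_t)(cca & 7) << 59) |
           ((uint64_t)(uc & 3) << 57) |
           (paddr & PADDR_MASK);
}

/* Highest byte of a 64-bit address. */
unsigned top_byte(uint64_t va)
{
    return (unsigned)(va >> 56);
}
```

With CCA 2, the UC values 0 through 3 give top bytes 0x90, 0x92, 0x94 and
0x96 respectively.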

  Ralf


* Re: IP30: SMP, Almost there?
  2015-05-22 17:11         ` Ralf Baechle
@ 2015-05-23 22:52           ` Joshua Kinard
  2015-05-24 14:25             ` Maciej W. Rozycki
  0 siblings, 1 reply; 16+ messages in thread
From: Joshua Kinard @ 2015-05-23 22:52 UTC (permalink / raw)
  To: Ralf Baechle, Maciej W. Rozycki; +Cc: linux-mips

On 05/22/2015 13:11, Ralf Baechle wrote:
> On Fri, May 22, 2015 at 01:01:01PM +0100, Maciej W. Rozycki wrote:
> 
>>> Where I am lost is, though, why would I get an IBE on a 'beqz' instruction?
>>
>>  A bus error is an external event, a signal asserted to the CPU by bus 
>> logic on a failed read cycle.  Whether you get a Data or Instruction Bus 
>> Error exception (DBE vs IBE) merely depends on whether it was a data read 
>> or an instruction fetch cycle.  The class of the error is only resolved by 
>> the CPU internally as obviously any external logic does not know the 
>> reason the CPU put the read cycle on the bus for that failed.  Note that 
>> the read cycle might well be a part of a cache fill.
>>
>>  As it is a failure of a read that causes a bus error, it does not matter 
>> whether the instruction that was supposed to be fetched is valid or not.  
>> It has never been successfully fetched let alone decoded.  For an invalid 
>> instruction that has been fetched and decoded you'd get a Reserved 
>> Instruction exception instead.
>>
>>  A typical reason for a bus error is a bus timeout, where no target on the 
>> bus responded to a cycle, a parity error of data presented on the bus or 
>> an uncorrected (multi-bit) memory access ECC error, driven by the memory 
>> controller in parallel to data presented.
>>
>>  NB bus errors on write cycles, such as a bus timeout or an ECC error on a 
>> partial memory update (e.g. an uncached byte write), are asynchronous and 
>> normally do not cause a Bus Error exception.  A hardware interrupt is 
>> typically issued instead.
>>
>>> My guess is there's still something not kosher with icache flushing somewhere.
>>
>>  That would be odd.  Even if the state of the cache was inconsistent, I'd 
>> expect a Cache Error exception at worst, and rubbish returned typically, 
>> rather than a Bus Error exception.
>>
>>> Anyone got ideas?  Is there some way to dump the contents of the icache and/or
>>> dcache for debugging?
>>
>>  I'd rather expect an uncorrected ECC error being the cause here, maybe 
>> you need to clean the contacts of your memory modules.  From user 
>> documentation, such as a maintenance manual that should be available for 
>> your system, you might be able to infer which memory module the physical 
>> address of 0x200ff12c corresponds to and start by cleaning that module 
>> first.  Try to strip the system as much as possible and e.g. run with a 
>> single known-good memory module only (or whatever number of modules is the 
>> minimum).  Run any extra system diagnostics if provided by the firmware.
>>
>>  It's interesting to note in the log you provided:
>>
>>> [     1.169048] Instruction bus error, epc == 00000000004289ac, ra == 000000000047d054
>>> [     1.183979] Instruction bus error, epc == 00000000004289ac, ra == 000000000047d054
>>> [     1.195707] Instruction bus error, epc == 000000000040448c, ra == 0000000000404440
>>> [     1.206829] Instruction bus error, epc == a8000000200ff12c, ra == a800000020104fec
>>
>> that the error always happens in the same 4th word (address ending with 
>> 0xc) of a 16-byte span.  Which may indeed mean there's an issue with a 
>> particular memory module that supplies data for this word (assuming your 
>> system has a 128-bit memory controller data bus with 64-bit DRAM modules 
>> arranged in pairs and individually supplying data for each half of the bus 
>> or suchlike).
>>
>>  Then checking and possibly tightening the power supply connection might 
>> be a good idea too.  Other connections may be worth checking, e.g. the CPU 
>> daughtercard(s) if applicable.  Also any problems with overheating like a 
>> loose heatsink, a blocked ventilation shaft and suchlike.  I'd definitely 
>> double-check memory first though.
>>
>>  If that did not help, then I'd start suspecting your system is faulty. :(
> 
> He might run IRIX on it for testing.  Also I think one of the BSDs has
> support.
> 
> Octane is a close relative of the IP27, which does ECC on anything and all,
> with all addresses fully decoded.  So if software does something stupid,
> hardware will notice, quickly though not necessarily in very obvious
> ways.

Probably can't hurt to boot IRIX on it and run the hardware diagnostics,
especially on the R14K module.  Doubtful that it's hardware, though.  A UP
kernel boots normally, and even an SMP kernel boots normally if I disable one
of the CPUs in ARCS.  The IBEs only show up when booting both CPUs, and CPU1
is the one generating them.

And yeah, OpenBSD supports IP30 (along with a few others).  I talk to Miod, the
main OpenBSD/sgi developer, sometimes.


> Some of IP27's reactions are a bit unobvious though.  First, the uncached
> address space (CCA 2) works differently than one might think.  IP27 uses
> the R10000's uncached attribute feature, which subdivides the CPU's
> uncached XKPHYS address space into four address spaces with the highest
> address byte being 0x90, 0x92, 0x94 or 0x96.  The classic uncached
> memory access happens with UC=3, that is the top address byte being
> 0x96.

I was reading the IRIX Device Driver Programming Guide (007-0911-210), Chapter
1, and saw the explanation for this.  It also covers how the memory addresses
are coded so that a specific node number can be encoded as well, to assist the
CPUs in accessing the memory closest to them.  Definitely interesting, but
Octane doesn't appear to use any of this.  As far as I can tell, its main
UNCAC_BASE and IO_BASE are the classic 0x9000000000000000, CAC_BASE is
0xa800000000000000, and MAP_BASE is 0xc000000000000000.  These are all the
defaults in mach-generic/spaces.h, so IP30 has never had to define a local
spaces.h override.

It DOES look like I need to hardcode the 'cca=5' bit, somewhere, though.
Whatever the Octane is booting up with does not work for SMP.


> Do not use that.  EVER.  It entirely bypasses the CPU's cache coherency
> logic.  Due to all the consistency checking between the directory
> caches and other involved agents the memory controller might detect the
> inconsistency between cache and memory and send guess what, a bus
> error.
> 
> For I/O purpose UC attribute value 1 is used, that is top byte 0x92.
> UC values 0 and 2 allow direct manipulation of the directory caches
> and atomic operations without the need to read the line into the CPU.
> 
> So that's what IP27 does.  Not sure how much of this behaviour its
> little brother IP30 has copied.

About the only things that I can determine that Octane uses from IP27 are the
CPUs, SCSI chips, and dreaded IOC3.  HEART seems to have the same logical
circuitry as HUB for the count/compare timers, but that seems to be where the
similarities end.  Unlike IP27, which doesn't hardwire the interrupt vectors
due to the distributed model of its design, Octane's HEART has 64 fixed
interrupts for controlling access to hardware (one bit in HEART_ISR for each
interrupt).

IRIX headers imply that there might be a WAR for HEART, too, given several
references to HEART_COHERENCY_WAR.  If that macro is defined, then several
dcache_wb_inval functions become available for HEART, so I wonder if I might
have a screwy revision of HEART.  Though I have a Rev F HEART, so that WAR
might be for a much older rev.  Hard to tell w/o the HEART ASIC documentation.

--J

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: IP30: SMP, Almost there?
  2015-05-23 22:52           ` Joshua Kinard
@ 2015-05-24 14:25             ` Maciej W. Rozycki
  0 siblings, 0 replies; 16+ messages in thread
From: Maciej W. Rozycki @ 2015-05-24 14:25 UTC (permalink / raw)
  To: Joshua Kinard, Ralf Baechle; +Cc: linux-mips

On Sat, 23 May 2015, Joshua Kinard wrote:

> > Some of IP27's reactions are a bit unobvious though.  First, the uncached
> > address space (CCA 2) works differently than one might think.  IP27 uses
> > the R10000's uncached attribute feature which subdivides the CPU's
> > uncached XKPHYS address space into four address spaces with the highest
> > address byte being 0x90, 0x92, 0x94 or 0x96.  The classic uncached
> > memory access happens with UC=3, that is the top address byte being
> > 0x96.

 Interesting.  For the record, this is noted in Section 6.23 "Support for 
Uncached Attribute" of the R10k manual, though the interpretation of the 
attributes is itself system-specific.
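
 For illustration, the top-byte values quoted above can be reproduced with a
small standalone sketch.  It assumes the XKPHYS layout of region select in
address bits 63:62, the CCA in bits 61:59 and the uncached attribute in bits
58:57; the helper names are made up for this example and are not kernel API.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed XKPHYS layout (64-bit virtual address):
 *   bits 63:62  region select, 0b10 for XKPHYS
 *   bits 61:59  cache coherency attribute (CCA)
 *   bits 58:57  uncached attribute (UC), only meaningful with CCA 2
 */
#define XKPHYS_REGION  UINT64_C(2)
#define CCA_UNCACHED   UINT64_C(2)

/* Hypothetical helper: base of one of the four uncached spaces, uc = 0..3. */
static uint64_t xkphys_uncached_base(unsigned int uc)
{
	assert(uc < 4);
	return (XKPHYS_REGION << 62) | (CCA_UNCACHED << 59) | ((uint64_t)uc << 57);
}

/* Hypothetical helper: the "highest address byte" the thread talks about. */
static unsigned int top_byte(uint64_t addr)
{
	return (unsigned int)(addr >> 56);
}
```

 With that layout, UC values 0 to 3 give top bytes 0x90, 0x92, 0x94 and 0x96,
and UC=0 yields the 0x9000000000000000 base that the generic spaces.h uses for
IO_BASE.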

 And it looks like we do handle the attributes correctly in `ioremap', via 
IO_BASE.  However it also looks to me like a corresponding update to 
`pte_to_entrylo' is needed so that we don't attempt an uncached virtual 
mapping with the wrong attribute (e.g. with an O_DSYNC mmap(2) of 
/dev/mem).  Ralf, WDYT?

> I was reading the IRIX Device Driver Programming Guide (007-0911-210), Chapter
> 1, and saw the explanation for this.  Also the bit on how the memory addresses
> are coded so that a reference to the specific node number can be encoded as
> well to assist the CPUs in accessing the memory closest to them.  Definitely
> interesting, but Octane doesn't appear to use any of this.  As far
> as I can tell, its main UNCAC_BASE and IO_BASE are the classic 0x9000000000000000,
> CAC_BASE is 0xa800000000000000, and MAP_BASE is 0xc000000000000000.  These are
> all the defaults in mach-generic/spaces.h, so IP30 has never had to define a
> local spaces.h override.
> 
> It DOES look like I need to hardcode the 'cca=5' bit, somewhere, though.
> Whatever the Octane is booting up with does not work for SMP.

 Hmm, `coherency_setup' normally does the sane thing, Config.K0 should 
have been correctly set up by the firmware.  If not (what is it then?), 
then it looks to me like a quirk to resolve in platform code; IMHO just 
rewrite Config.K0 with the right value early on, before `coherency_setup' 
is called.
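
 As a rough sketch of what "rewrite Config.K0" amounts to, here is the bit
manipulation on its own, outside the kernel.  The helper name is hypothetical;
a real quirk would apply the result to the CP0 register through the kernel's
c0_config accessors rather than to a plain integer.  The sample value is the
c0_config reading reported elsewhere in this thread.

```c
#include <stdint.h>

/* Config.K0 occupies bits 2:0 of the CP0 Config register. */
#define CONFIG_K0_MASK  UINT32_C(0x7)

/* Hypothetical helper: return config with its K0 field replaced by cca. */
static uint32_t config_set_k0(uint32_t config, uint32_t cca)
{
	return (config & ~CONFIG_K0_MASK) | (cca & CONFIG_K0_MASK);
}
```

 Applied to 0x6c1ab3a3 with a CCA of 5, this gives 0x6c1ab3a5, i.e. only the
low three bits change.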

> > Do not use that.  EVER.  It entirely bypasses the CPU's cache coherency
> > logic.  Due to all the consistency checking between the directory
> > caches and other involved agents the memory controller might detect the
> > inconsistency between cache and memory and send guess what, a bus
> > error.

 OK, that does sound like a plausible explanation for the bus error to me.

  Maciej

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: IP30: SMP, Almost there?
  2015-05-21  6:00     ` Joshua Kinard
  2015-05-22 12:01       ` Maciej W. Rozycki
@ 2015-05-22 16:38       ` Ralf Baechle
  2015-05-23 22:57         ` Joshua Kinard
  1 sibling, 1 reply; 16+ messages in thread
From: Ralf Baechle @ 2015-05-22 16:38 UTC (permalink / raw)
  To: Joshua Kinard; +Cc: linux-mips

On Thu, May 21, 2015 at 02:00:09AM -0400, Joshua Kinard wrote:

> Where I am lost is, though, why would I get an IBE on a 'beqz' instruction?
> It's a valid instruction from MIPS-I ('beqz' is just 'beq' w/ $0 as rt).  The
> R10K Manual states this:
> 
> """
> A Bus Error exception occurs when a processor block read, upgrade, or
> double/single/partial-word read request receives an external ERR completion
> response, or a processor double/single/partial-word read request receives an
> external ACK completion response where the associated external
> double/single/partial-word data response contains an uncorrectable error. This
> exception is not maskable.
> """
> 
> My guess is there's still something not kosher with icache flushing somewhere.
>  I can reboot this kernel multiple times and not always get the same IBE.  Most

Not flushing the I-cache, or flushing it improperly, will result in stale
instructions getting executed.  An IBE error otoh is the result of a bus error
being signalled for the CPU's attempt to load instructions from memory.  With
the exception of a few special cases I-cache flushing doesn't happen
when executing kernel code, but only for userland, and it's also somewhat
unlikely for improper I-cache flushing to result in an IBE error.

A huge problem tracking down the cause of a bus error is that they're
getting signalled by an external agent, that is, they are not generated by
the CPU itself and there may be a significant delay until the CPU
actually takes the exception.  In my experience the EPC is practically
always worthless in tracking down the cause of the bus error.  Details
depend on circumstances, as usual.

  Ralf

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: IP30: SMP, Almost there?
  2015-05-22 16:38       ` Ralf Baechle
@ 2015-05-23 22:57         ` Joshua Kinard
  0 siblings, 0 replies; 16+ messages in thread
From: Joshua Kinard @ 2015-05-23 22:57 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: linux-mips

On 05/22/2015 12:38, Ralf Baechle wrote:
> On Thu, May 21, 2015 at 02:00:09AM -0400, Joshua Kinard wrote:
> 
>> Where I am lost is, though, why would I get an IBE on a 'beqz' instruction?
>> It's a valid instruction from MIPS-I ('beqz' is just 'beq' w/ $0 as rt).  The
>> R10K Manual states this:
>>
>> """
>> A Bus Error exception occurs when a processor block read, upgrade, or
>> double/single/partial-word read request receives an external ERR completion
>> response, or a processor double/single/partial-word read request receives an
>> external ACK completion response where the associated external
>> double/single/partial-word data response contains an uncorrectable error. This
>> exception is not maskable.
>> """
>>
>> My guess is there's still something not kosher with icache flushing somewhere.
>>  I can reboot this kernel multiple times and not always get the same IBE.  Most
> 
> Not flushing the I-cache, or flushing it improperly, will result in stale
> instructions getting executed.  An IBE error otoh is the result of a bus error
> being signalled for the CPU's attempt to load instructions from memory.  With
> the exception of a few special cases I-cache flushing doesn't happen
> when executing kernel code, but only for userland, and it's also somewhat
> unlikely for improper I-cache flushing to result in an IBE error.

Well, the IBE's are happening in userland, loading init, on CPU1.  I hacked
together a basic bus error handler from IP27's and using that, instead of
seeing four IBE's in a row, I can get CPU1 to stall and dump whatever debug
data I want.  Downside is, I've only got the Odyssey Early console available,
so I have to take pictures of the debug text or oops data, then manually type
it into a text file.

Further experimenting with a dual R12K module suggests that whatever the
problem is, it's got something to do with the R14K.  I'm having better success
with the R12K dual module thus far.  More on that later...

> A huge problem tracking down the cause of a bus error is that they're
> getting signalled by an external agent, that is, they are not generated by
> the CPU itself and there may be a significant delay until the CPU
> actually takes the exception.  In my experience the EPC is practically
> always worthless in tracking down the cause of the bus error.  Details
> depend on circumstances, as usual.

I thought that agent might be HEART, but the HEART_CAUSE register reads
0x00000000 when an IBE happens, which means no issues from its end.

How does one probe the SysAD bus?  The R10K documentation has some breakdown of
the bit format of SysAD messages.  Is there a memory address somewhere that can
be used to read data off the bus or even talk to it to get error information
(like, does it have a CAUSE register or something)?

Otherwise, figuring out what's wrong with the R14K is going to take a long time...

--J

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: IP30: SMP, Almost there?
  2015-05-18 12:01 ` Joshua Kinard
  2015-05-20  5:23   ` Joshua Kinard
@ 2015-05-22 16:25   ` Ralf Baechle
  1 sibling, 0 replies; 16+ messages in thread
From: Ralf Baechle @ 2015-05-22 16:25 UTC (permalink / raw)
  To: Joshua Kinard; +Cc: linux-mips

On Mon, May 18, 2015 at 08:01:07AM -0400, Joshua Kinard wrote:

> What is the relationship between the cache-coherency algorithm and SMP?  IP30
> hardware is supposed to be cache-coherent.  A value of '5' sets the processors
> to "cacheable coherent exclusive on write" (per the R10K manual).  But I am not
> sure why things are still flakey.
> 
> --J

For a cache coherent platform with the R10000 you must use CCA 5 for all
RAM access or all hell will break loose.

For a 32 bit kernel this means the CCA bits of c0_config need to be set
to CCA 5.  64 bit kernels such as those on IP30 are running in XKPHYS, not
CKSEG0, but still need to use CCA 5.  That means the address bits that
select the CCA need to be set to 5.  Which means kernel addresses will
start with 0xa8.

The same holds true for TLB mappings, they also need to use mode 5.

Also, all accesses to a particular page of physical memory need to use the
same CCA.  Mixing modes is undefined and will in all likelihood set the
above-mentioned hell loose.

All SMP systems need to be coherent between their CPUs.  Traditionally
only SMP MIPS systems are coherent while systems that do not support
multiple processors are non-coherent.  Those may use CCA 3 but again
mixing is not permitted.

Finally there's CCA 2 which is uncached.  That is only sensible for
I/O purposes and data structures such as ethernet driver rings or gfx bitmaps.
Yet again, mixing multiple access modes is not permitted.

The kernel's cca command line option is a bit of a hack meant for hardware
testing and debug.  For a 64 bit kernel these lines in
<asm/mach-generic/spaces.h> select the suitable base address in XKPHYS:

#ifndef CAC_BASE
#ifdef CONFIG_DMA_NONCOHERENT
#define CAC_BASE                _AC(0x9800000000000000, UL)
#else
#define CAC_BASE                _AC(0xa800000000000000, UL)
#endif
#endif

So you simply need to not select DMA_NONCOHERENT for IP30 and the right
value of 0xa800000000000000 will be used for the kernel base address.

Btw, don't tinker with the CCA bits in c0_config; the firmware will have
configured that correctly for your platform.  The kernel reads that
value and uses it for the CCA field for any TLB mappings.
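
To make the relation between the CCA and the XKPHYS base addresses concrete,
here is a minimal standalone sketch.  It assumes the region select sits in
address bits 63:62 and the CCA in bits 61:59; the function names are invented
for this example and do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: XKPHYS base address for a given CCA (0..7). */
static uint64_t xkphys_base(unsigned int cca)
{
	assert(cca < 8);
	/* 0b10 in bits 63:62 selects XKPHYS, the CCA goes into bits 61:59. */
	return (UINT64_C(2) << 62) | ((uint64_t)cca << 59);
}

/* Hypothetical helper: recover the CCA from an XKPHYS address. */
static unsigned int xkphys_cca(uint64_t addr)
{
	return (unsigned int)((addr >> 59) & 0x7);
}
```

Under those assumptions CCA 5 gives 0xa800000000000000 and CCA 3 gives
0x9800000000000000, matching the two CAC_BASE values above, while CCA 2 gives
the uncached base 0x9000000000000000.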

  Ralf

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: IP30: SMP, Almost there?
  2015-05-18  5:39 IP30: SMP, Almost there? Joshua Kinard
  2015-05-18 12:01 ` Joshua Kinard
@ 2015-05-24  3:17 ` Joshua Kinard
  2015-06-01  5:08   ` Joshua Kinard
  2015-06-01 19:32   ` IP30: SMP, Almost there? Ralf Baechle
  1 sibling, 2 replies; 16+ messages in thread
From: Joshua Kinard @ 2015-05-24  3:17 UTC (permalink / raw)
  To: linux-mips, Ralf Baechle

On 05/18/2015 01:39, Joshua Kinard wrote:
> So I've gotten the second CPU in Octane to "tick" again...somehow.  I am
> certain someone's cat went missing in the process...

So, yeah, the problem appears to be specific to the R14000 CPU module.  I
swapped in an R12K dual CPU module, and after a little bit of tinkering to
revert a few hacks and clean up the code, it boots into SMP, mounts the
userland, and has successfully sync'ed a Gentoo Portage tree w/o annihilating
the XFS filesystem or the MD RAID5 array.  Even compiled a few C files.

# cat /proc/interrupts
           CPU0       CPU1
 14:          0          0     HEART  powerbtn
 15:          0          0     HEART  acfail
 16:          0      44887     HEART  qla1280
 17:          0      16904     HEART  qla1280
 18:       1853          0     HEART  ioc3-eth
 20:        243          0     HEART  ioc3-io
 46:     348850          0     HEART  cpu0-ipi
 47:          0     315948     HEART  cpu1-ipi
 50:       1268          0     HEART  heart_timer
 71:     118453     195177       CPU  timer

# cat /proc/cpuinfo
system type             : SGI Octane
machine                 : Unknown
processor               : 0
cpu model               : R12000 V3.5  FPU V0.0
BogoMIPS                : 600.47
byteorder               : big endian
wait instruction        : no
microsecond timers      : yes
tlb_entries             : 64
extra interrupt vector  : no
hardware watchpoint     : yes, count: 0, address/irw mask: []
isa                     : mips2 mips3 mips4
ASEs implemented        :
shadow register sets    : 1
kscratch registers      : 0
package                 : 0
core                    : 0
VCED exceptions         : not available
VCEI exceptions         : not available

processor               : 1
cpu model               : R12000 V3.5  FPU V0.0
BogoMIPS                : 600.47
byteorder               : big endian
wait instruction        : no
microsecond timers      : yes
tlb_entries             : 64
extra interrupt vector  : no
hardware watchpoint     : yes, count: 0, address/irw mask: []
isa                     : mips2 mips3 mips4
ASEs implemented        :
shadow register sets    : 1
kscratch registers      : 0
package                 : 0
core                    : 0
VCED exceptions         : not available
VCEI exceptions         : not available

I even got the IRQs to be fanned out across both CPUs.  Well, primarily the
qla1280 drivers.  They randomly hop between both CPUs, but no ill effects so far.

But if I boot that *same* working kernel on an R14000 dual module, I get handed
an IBE as soon as the userland mounts.  The only documented difference that I
can find on the R14000 is that it supports DDR memory, being able to do memory
operations on the rising edge and falling edge of each clock.  Not sure if that
matters to the kernel at all, but I know of nothing else that describes the
R14K's internals, such as if there's some new bit in CP0 config,
branch-diagnostic, status, etc, that might explain why these IBE's are happening.

Guess I need to hunt down my old dual R10K module next and verify that works
fine...

Also, is there a way to hardcode the cca=5 setting for IP30?  Maybe it needs to
be a hidden Kconfig item?  I tried setting cpu->writecombine in cpu-probe.c,
but no dice there.  If I boot an SMP kernel on dual R12K's w/o cca=5, I'll get
one or two pretty-specific oopses.  The one I did grab complains about bad
spinlock magic in the core tty driver somewhere.  I can transcribe that oops
later on if interested.

--J

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: IP30: SMP, Almost there?
  2015-05-24  3:17 ` Joshua Kinard
@ 2015-06-01  5:08   ` Joshua Kinard
  2015-06-01  6:00     ` IP30: SMP, Almost there! Joshua Kinard
  2015-06-01 19:32   ` IP30: SMP, Almost there? Ralf Baechle
  1 sibling, 1 reply; 16+ messages in thread
From: Joshua Kinard @ 2015-06-01  5:08 UTC (permalink / raw)
  To: linux-mips; +Cc: Ralf Baechle

On 05/23/2015 23:17, Joshua Kinard wrote:
> On 05/18/2015 01:39, Joshua Kinard wrote:
>> So I've gotten the second CPU in Octane to "tick" again...somehow.  I am
>> certain someone's cat went missing in the process...
> 
> So, yeah, the problem appears to be specific to the R14000 CPU module.  I
> swapped in an R12K dual CPU module, and after a little bit of tinkering to
> revert a few hacks and clean up the code, it boots into SMP, mounts the
> userland, and has successfully sync'ed a Gentoo Portage tree w/o annihilating
> the XFS filesystem or the MD RAID5 array.  Even compiled a few C files.
> 
[snip]
> 
> I even got the IRQs to be fanned out across both CPUs.  Well, primarily the
> qla1280 drivers.  They randomly hop between both CPUs, but no ill effects so far.
> 
> But if I boot that *same* working kernel on an R14000 dual module, I get handed
> an IBE as soon as the userland mounts.  The only documented difference that I
> can find on the R14000 is that it supports DDR memory, being able to do memory
> operations on the rising edge and falling edge of each clock.  Not sure if that
> matters to the kernel at all, but I know of nothing else that describes the
> R14K's internals, such as if there's some new bit in CP0 config,
> branch-diagnostic, status, etc, that might explain why these IBE's are happening.
> 
> Guess I need to hunt down my old dual R10K module next and verify that works
> fine...
> 
> Also, is there a way to hardcode the cca=5 setting for IP30?  Maybe it needs to
> be a hidden Kconfig item?  I tried setting cpu->writecombine in cpu-probe.c,
> but no dice there.  If I boot an SMP kernel on dual R12K's w/o cca=5, I'll get
> one or two pretty-specific oopses.  The one I did grab complains about bad
> spinlock magic in the core tty driver somewhere.  I can transcribe that oops
> later on if interested.

So far, the problem looks to have been blindly assigning all 64 HEART IRQs to
'handle_level_irq', including the SMP IPI IRQs.  I fixed that by assigning the
four IPI IRQs and four unused debug IRQs to 'handle_percpu_irq'.  So far, no
bus errors, even on R14000.  Also successfully tested 16KB PAGE_SIZE and no bus
errors.  Next, 64KB PAGE_SIZE w/ CONFIG_TRANSPARENT_HUGEPAGE, which was pretty
good at triggering bus errors.
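
As a rough model of that change, the sketch below picks a flow type per HEART
IRQ number.  It is a standalone illustration, not the actual IP30 code: only
IRQs 46 and 47 are confirmed as IPIs by the /proc/interrupts output earlier in
the thread, so the block of four IPIs starting at 46 and the placement of the
debug IRQs at 42 are assumptions, and all names here are hypothetical.

```c
#include <stdbool.h>

/* Which generic flow handler an IRQ would be registered with. */
enum heart_irq_flow { FLOW_LEVEL, FLOW_PERCPU };

#define HEART_IPI_BASE      46  /* cpu0-ipi = 46, cpu1-ipi = 47; 48 and 49 assumed */
#define HEART_DEBUG_BASE    42  /* assumed placement of the four debug IRQs */
#define HEART_PERCPU_COUNT   4  /* four IPI IRQs, four debug IRQs */

/* True if irq falls in the block of HEART_PERCPU_COUNT IRQs starting at base. */
static bool in_block(unsigned int irq, unsigned int base)
{
	return irq >= base && irq < base + HEART_PERCPU_COUNT;
}

/* Per-CPU flow for the IPI and debug IRQs, level flow for everything else. */
static enum heart_irq_flow heart_irq_flow(unsigned int irq)
{
	if (in_block(irq, HEART_IPI_BASE) || in_block(irq, HEART_DEBUG_BASE))
		return FLOW_PERCPU;
	return FLOW_LEVEL;
}
```

In the kernel the two outcomes would correspond to registering the IRQ with
'handle_percpu_irq' or 'handle_level_irq' respectively.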

</jinx>

--J

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: IP30: SMP, Almost there!
  2015-06-01  5:08   ` Joshua Kinard
@ 2015-06-01  6:00     ` Joshua Kinard
  0 siblings, 0 replies; 16+ messages in thread
From: Joshua Kinard @ 2015-06-01  6:00 UTC (permalink / raw)
  To: linux-mips; +Cc: Ralf Baechle

On 06/01/2015 01:08, Joshua Kinard wrote:
> On 05/23/2015 23:17, Joshua Kinard wrote:
>> On 05/18/2015 01:39, Joshua Kinard wrote:
>>> So I've gotten the second CPU in Octane to "tick" again...somehow.  I am
>>> certain someone's cat went missing in the process...
>>
>> So, yeah, the problem appears to be specific to the R14000 CPU module.  I
>> swapped in an R12K dual CPU module, and after a little bit of tinkering to
>> revert a few hacks and clean up the code, it boots into SMP, mounts the
>> userland, and has successfully sync'ed a Gentoo Portage tree w/o annihilating
>> the XFS filesystem or the MD RAID5 array.  Even compiled a few C files.
>>
> [snip]
>>
>> I even got the IRQs to be fanned out across both CPUs.  Well, primarily the
>> qla1280 drivers.  They randomly hop between both CPUs, but no ill effects so far.
>>
>> But if I boot that *same* working kernel on an R14000 dual module, I get handed
>> an IBE as soon as the userland mounts.  The only documented difference that I
>> can find on the R14000 is that it supports DDR memory, being able to do memory
>> operations on the rising edge and falling edge of each clock.  Not sure if that
>> matters to the kernel at all, but I know of nothing else that describes the
>> R14K's internals, such as if there's some new bit in CP0 config,
>> branch-diagnostic, status, etc, that might explain why these IBE's are happening.
>>
>> Guess I need to hunt down my old dual R10K module next and verify that works
>> fine...
>>
>> Also, is there a way to hardcode the cca=5 setting for IP30?  Maybe it needs to
>> be a hidden Kconfig item?  I tried setting cpu->writecombine in cpu-probe.c,
>> but no dice there.  If I boot an SMP kernel on dual R12K's w/o cca=5, I'll get
>> one or two pretty-specific oopses.  The one I did grab complains about bad
>> spinlock magic in the core tty driver somewhere.  I can transcribe that oops
>> later on if interested.
> 
> So far, the problem looks to have been blindly assigning all 64 HEART IRQs to
> 'handle_level_irq', including the SMP IPI IRQs.  I fixed that by assigning the
> four IPI IRQs and four unused debug IRQs to 'handle_percpu_irq'.  So far, no
> bus errors, even on R14000.  Also successfully tested 16KB PAGE_SIZE and no bus
> errors.  Next, 64KB PAGE_SIZE w/ CONFIG_TRANSPARENT_HUGEPAGE, which was pretty
> good at triggering bus errors.
> 
> </jinx>

CONFIG_TRANSPARENT_HUGEPAGE + HUGETLBFS is still not quite right on R14K CPUs.
I can very easily trigger bus errors with that config by running 'sync', 'ls',
or 'swapon'/'swapoff' in rapid succession in a minimal bash shell
(init=/bin/bash).  But this was doable even with a single R14K module, so it
has to be a different problem.

At least 16KB and 64KB PAGE_SIZE seem to work well enough now.  Progress!

Also, is there a clear-cut explanation of the difference between
read[bwlq]/write[bwlq] and the raw/__raw/____raw variants?  Which are safe to
use in machine-specific code (like the SMP or IRQ setup code) versus elsewhere?
Any warnings, gotchas, etc. one has to be aware of?

--J

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: IP30: SMP, Almost there?
  2015-05-24  3:17 ` Joshua Kinard
  2015-06-01  5:08   ` Joshua Kinard
@ 2015-06-01 19:32   ` Ralf Baechle
  2015-06-02  5:31     ` Joshua Kinard
  1 sibling, 1 reply; 16+ messages in thread
From: Ralf Baechle @ 2015-06-01 19:32 UTC (permalink / raw)
  To: Joshua Kinard; +Cc: linux-mips

On Sat, May 23, 2015 at 11:17:25PM -0400, Joshua Kinard wrote:

> I even got the IRQs to be fanned out across both CPUs.  Well, primarily the
> qla1280 drivers.  They randomly hop between both CPUs, but no ill effects so far.
> 
> But if I boot that *same* working kernel on an R14000 dual module, I get handed
> an IBE as soon as the userland mounts.  The only documented difference that I
> can find on the R14000 is that it supports DDR memory, being able to do memory
> operations on the rising edge and falling edge of each clock.  Not sure if that
> matters to the kernel at all, but I know of nothing else that describes the
> R14K's internals, such as if there's some new bit in CP0 config,
> branch-diagnostic, status, etc, that might explain why these IBE's are happening.
> 
> Guess I need to hunt down my old dual R10K module next and verify that works
> fine...
> 
> Also, is there a way to hardcode the cca=5 setting for IP30?  Maybe it needs to
> be a hidden Kconfig item?  I tried setting cpu->writecombine in cpu-probe.c,
> but no dice there.  If I boot an SMP kernel on dual R12K's w/o cca=5, I'll get
> one or two pretty-specific oopses.  The one I did grab complains about bad
> spinlock magic in the core tty driver somewhere.  I can transcribe that oops
> later on if interested.

Can you insert something like:

  printk("c0_config: %08x\n", read_c0_config());

into a kernel and boot it without cca=5?  I'm really curious what the
startup CCA is.

  Ralf

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: IP30: SMP, Almost there?
  2015-06-01 19:32   ` IP30: SMP, Almost there? Ralf Baechle
@ 2015-06-02  5:31     ` Joshua Kinard
  0 siblings, 0 replies; 16+ messages in thread
From: Joshua Kinard @ 2015-06-02  5:31 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: linux-mips

On 06/01/2015 15:32, Ralf Baechle wrote:
> On Sat, May 23, 2015 at 11:17:25PM -0400, Joshua Kinard wrote:
> 
>> I even got the IRQs to be fanned out across both CPUs.  Well, primarily the
>> qla1280 drivers.  They randomly hop between both CPUs, but no ill effects so far.
>>
>> But if I boot that *same* working kernel on an R14000 dual module, I get handed
>> an IBE as soon as the userland mounts.  The only documented difference that I
>> can find on the R14000 is that it supports DDR memory, being able to do memory
>> operations on the rising edge and falling edge of each clock.  Not sure if that
>> matters to the kernel at all, but I know of nothing else that describes the
>> R14K's internals, such as if there's some new bit in CP0 config,
>> branch-diagnostic, status, etc, that might explain why these IBE's are happening.
>>
>> Guess I need to hunt down my old dual R10K module next and verify that works
>> fine...
>>
>> Also, is there a way to hardcode the cca=5 setting for IP30?  Maybe it needs to
>> be a hidden Kconfig item?  I tried setting cpu->writecombine in cpu-probe.c,
>> but no dice there.  If I boot an SMP kernel on dual R12K's w/o cca=5, I'll get
>> one or two pretty-specific oopses.  The one I did grab complains about bad
>> spinlock magic in the core tty driver somewhere.  I can transcribe that oops
>> later on if interested.
> 
> Can you insert something like:
> 
>   printk("c0_config: %08x\n", read_c0_config());
> 
> into a kernel and boot it without cca=5?  I'm really curious what the
> startup CCA is.
> 
>   Ralf

It's cca=3 as the default.  Wasn't there a patch long ago that made that the
default?

                       D
              I   D    S     S   S  B S S  E   P  P C D   K
              C   C  0 D 0   C   S  E K B  C   M  E T N   0
             -----------------------------------------------
             xxx xxx x x xx xxx xxx x x x xxxx xx x x xx xxx
0x6c1ab3a3   011 011 0 0 00 011 010 1 0 1 1001 11 0 1 00 011    no cca
0x6c1ab3a5   011 011 0 0 00 011 010 1 0 1 1001 11 0 1 00 101    cca=5


I think cca=4 also works okay, but I have to test it a bit more.  Which is the
better one to stick with?
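
The breakdown in the table above can be checked mechanically with a small
standalone sketch.  The field positions are read off the table itself
(IC in bits 31:29, EC in bits 12:9, K0 in bits 2:0, and so on); the helper
name is made up for this example.

```c
#include <stdint.h>

/* Hypothetical helper: extract bits hi..lo (inclusive) from a 32-bit value. */
static uint32_t config_field(uint32_t config, unsigned int hi, unsigned int lo)
{
	uint32_t width = hi - lo + 1;
	uint32_t mask = (width >= 32) ? UINT32_MAX : ((UINT32_C(1) << width) - 1);

	return (config >> lo) & mask;
}
```

For example, config_field(0x6c1ab3a3, 2, 0) gives the K0 value 3, and the same
call on 0x6c1ab3a5 gives 5; the IC field (bits 31:29) is 3 in both.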

--J

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2015-06-02  7:06 UTC | newest]

Thread overview: 16+ messages
2015-05-18  5:39 IP30: SMP, Almost there? Joshua Kinard
2015-05-18 12:01 ` Joshua Kinard
2015-05-20  5:23   ` Joshua Kinard
2015-05-21  6:00     ` Joshua Kinard
2015-05-22 12:01       ` Maciej W. Rozycki
2015-05-22 17:11         ` Ralf Baechle
2015-05-23 22:52           ` Joshua Kinard
2015-05-24 14:25             ` Maciej W. Rozycki
2015-05-22 16:38       ` Ralf Baechle
2015-05-23 22:57         ` Joshua Kinard
2015-05-22 16:25   ` Ralf Baechle
2015-05-24  3:17 ` Joshua Kinard
2015-06-01  5:08   ` Joshua Kinard
2015-06-01  6:00     ` IP30: SMP, Almost there! Joshua Kinard
2015-06-01 19:32   ` IP30: SMP, Almost there? Ralf Baechle
2015-06-02  5:31     ` Joshua Kinard
