* IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
@ 2014-11-02 10:53 Joshua Kinard
  2014-11-03 18:52 ` David Daney
  0 siblings, 1 reply; 27+ messages in thread
From: Joshua Kinard @ 2014-11-02 10:53 UTC
  To: Linux MIPS List


So I have been testing the Onyx2 I have out the last few days with the IOC3
metadriver used on Octane, and I can get it to boot, but if
CONFIG_TRANSPARENT_HUGEPAGE is enabled in the kernel, bus errors can happen.

If I use CONFIG_PAGE_SIZE_4KB, I get bus errors rather frequently -- running
Gentoo's 'emerge' command can produce one.  Switch to CONFIG_PAGE_SIZE_16KB,
and the bus errors are far less frequent.  I suspect they will be rarer still
with CONFIG_PAGE_SIZE_64KB.

Disable CONFIG_TRANSPARENT_HUGEPAGE, and the machine works pretty well.  It's
been up for almost 8 hours compiling, and not a single bus error yet.  It's got
two node boards with dual R12K/400MHz CPUs per node.

I'm not really sure what CONFIG_TRANSPARENT_HUGEPAGE is enabling that's causing
R12K CPUs on the IP27 such a headache (and on Octane, really screws up R14K
CPUs).  I tried getting a core dump on one of the bus errors, but that produces a
truncated or corrupted core file that actually crashed GDB, plus I get a nice
oops message in dmesg:

[ 1302.260000] CPU: 0 PID: 1179 Comm: emerge Not tainted 3.17.1-mipsgit-20141006 #57
[ 1302.260000] task: a8000000ffbbf288 ti: a8000000fa6f0000 task.ti: a8000000fa6f0000
[ 1302.260000] $ 0   : 0000000000000000 0000000000000001 0000000000000000 a8000000ff5ad800
[ 1302.260000] $ 4   : a8000000006d5480 00000000000f9c00 00000001f380173f a800000001000000
[ 1302.260000] $ 8   : 00000001f380173f 0000000000100077 a8000000fe77a000 0000000000000000
[ 1302.260000] $12   : 0000000000660000 0000000000000000 0000000000000000 776bc40c00000004
[ 1302.260000] $16   : 0000000000e00000 0000000000000000 00000000018ee000 6db6db6db6db6db7
[ 1302.260000] $20   : 00000000000000ca a8000000006d5480 a8000000ff65fa68 0000000000001000
[ 1302.260000] $24   : 0000000000000000 a8000000000469c0
[ 1302.260000] $28   : a8000000fa6f0000 a8000000fa6f3a00 0000000000e00000 a800000000046720
[ 1302.260000] Hi    : 00000000002ed400
[ 1302.260000] Lo    : 00000000000f9c00
[ 1302.260000] epc   : a8000000000467e4 r4k_flush_cache_page+0x104/0x2e0
[ 1302.260000]     Not tainted
[ 1302.260000] ra    : a800000000046720 r4k_flush_cache_page+0x40/0x2e0
[ 1302.260000] Status: 90001ce3 KX SX UX KERNEL EXL IE
[ 1302.260000] Cause : 0000c010
[ 1302.260000] BadVA : 00000001f380173f
[ 1302.260000] PrId  : 00000e35 (R12000)
[ 1302.260000] Process emerge (pid: 1179, threadinfo=a8000000fa6f0000, task=a8000000ffbbf288, tls=00000000778d2490)
[ 1302.260000] Stack : a8000000ff65fa68 0000000000e00000 00000000000f9c00 a8000000006d5480
          a8000000ff65fa68 0000000000001000 0000000000e00000 a80000000010cb00
          a8000000046a2000 a8000000ff65fa68 00000000018ee000 6db6db6db6db6db7
          a8000000fe7fdce0 a8000000000375ec a8000000ff4e5800 a8000000005fbd90
          0000000300000080 a8000000ff668580 a8000000005fbd90 5349474900000080
          a8000000fa6f3ad8 a8000000005fbd90 0000000600000088 a8000000ff5ad928
          a8000000005fbd90 46494c4500002bf9 c000000000101000 0000000a00000080
          0000000000000000 0000000000000000 0000000000000000 0000000000000000
          0000000000000000 0000000000000000 0000000000000000 0000000000000000
          0000000000000000 0000000000000000 0000000000000000 0000000000000000
          ...
[ 1302.260000] Call Trace:
[ 1302.260000] [<a8000000000467e4>] r4k_flush_cache_page+0x104/0x2e0
[ 1302.260000] [<a80000000010cb00>] get_dump_page+0xc8/0xe8
[ 1302.260000] [<a8000000000375ec>] elf_core_dump+0x1294/0x14d8
[ 1302.260000] [<a8000000001b41e4>] do_coredump+0x5e4/0x1048
[ 1302.260000] [<a80000000005c0b8>] get_signal+0x1b8/0x710
[ 1302.260000] [<a8000000000299c0>] do_signal+0x18/0x240
[ 1302.260000] [<a80000000002a4c8>] do_notify_resume+0x70/0x88
[ 1302.260000] [<a8000000000255ac>] work_notifysig+0x10/0x18
[ 1302.260000]
[ 1302.260000]
Code: 0010327a  30c60ff8  00c8302d <dcc60000> 30c80001  1100003e  00000000  bfb40000  df880000
[ 1305.340000] ---[ end trace c7649a6433db8d18 ]---
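
The faulting instruction in the Code: line (the word in angle brackets) can be
decoded by hand.  The following is a minimal Python sketch, not kernel code;
the helper name decode_itype is made up for this illustration, and the
register values are copied from the "$ 4" row of the dump above.

```python
# Sketch: decode the MIPS I-type instruction word marked <...> in the oops.
OPCODE_LD = 0x37  # MIPS III "load doubleword"

def decode_itype(word):
    """Split a 32-bit I-type instruction into (opcode, rs, rt, signed imm)."""
    opcode = (word >> 26) & 0x3F
    rs = (word >> 21) & 0x1F      # base register
    rt = (word >> 16) & 0x1F      # destination register
    imm = word & 0xFFFF
    if imm >= 0x8000:             # sign-extend the 16-bit offset
        imm -= 0x10000
    return opcode, rs, rt, imm

# Registers $4..$7 as printed in the "$ 4" row of the oops.
regs_4_to_7 = [0xA8000000006D5480, 0x00000000000F9C00,
               0x00000001F380173F, 0xA800000001000000]

fields = decode_itype(0xDCC60000)
opcode, rs, rt, imm = fields
effective = regs_4_to_7[rs - 4] + imm

print(f"opcode={opcode:#x} rs=${rs} rt=${rt} imm={imm}")
print(f"effective address = {effective:#x}")
```

Under this decoding the instruction is "ld $6, 0($6)", and the effective
address equals the BadVA value in the dump, which is not 8-byte aligned.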

Thoughts?


-- 
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
4096R/D25D95E3 2011-03-28

"The past tempts us, the present confuses us, the future frightens us.  And our
lives slip away, moment by moment, lost in that vast, terrible in-between."

--Emperor Turhan, Centauri Republic


* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-02 10:53 IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors Joshua Kinard
@ 2014-11-03 18:52 ` David Daney
  2014-11-04  1:08   ` Joshua Kinard
  0 siblings, 1 reply; 27+ messages in thread
From: David Daney @ 2014-11-03 18:52 UTC
  To: Joshua Kinard; +Cc: Linux MIPS List

On 11/02/2014 02:53 AM, Joshua Kinard wrote:
>
> So I have been testing the Onyx2 I have out the last few days with the IOC3
> metadriver used on Octane, and I can get it to boot, but if
> CONFIG_TRANSPARENT_HUGEPAGE is enabled in the kernel, bus errors can happen.
>
> If I use CONFIG_PAGE_SIZE_4KB, I get bus errors rather frequently -- running
> Gentoo's 'emerge' command  can produce one.  Switch to CONFIG_PAGE_SIZE_16KB,
> and the bus errors are far less frequent.  I suspect CONFIG_PAGE_SIZE_64KB will
> be even less.
>
> Disable CONFIG_TRANSPARENT_HUGEPAGE, and the machine works pretty good.  It's
> been up for almost 8 hours compiling, and not a single bus error yet.  It's got
> 2x node board with dual R12K/400MHz CPUs per node.
>
> I'm not really sure what CONFIG_TRANSPARENT_HUGEPAGE is enabling that's causing
> R12K CPUs on the IP27 such a headache (and on Octane, really screws up R14K
> CPUs).  I tried getting a core dump on one of the bus errors, but that produces a
> truncated or corrupted core file that actually crashed GDB, plus I get a nice
> oops message in dmesg:

Well, as its name implies, if you enable CONFIG_TRANSPARENT_HUGEPAGE, 
huge pages will be created and used in the background transparently to 
the userspace application.

With 4KB base page size, the huge pages will be 2MB in size.  I don't
know much about the R10K/R12K/R14K CPUs, but it is possible that either
their TLBs cannot handle such pages, or that the TLB exception handlers
don't contain proper code for these CPUs.

For each doubling of the base PAGE_SIZE, the huge page size will
increase by a factor of 4.  So with 16KB base pages (two doublings from
4KB) the huge page size would be 32MB.  Since there are many fewer
opportunities to transparently use a 32MB page, I would expect any
errors related to huge pages to be correspondingly less frequent.

With 64KB PAGE_SIZE the huge page size is 512MB, and it is likely that
it could never be used by normal userspace programs.
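
The arithmetic can be checked with a short Python sketch.  It assumes 8-byte
PTEs on a 64-bit kernel and that a huge page covers what one page of PTEs
would map, i.e. PAGE_SIZE * (PAGE_SIZE / 8) bytes; the function name is made
up for this illustration.

```python
# Sketch: huge page size as a function of the base page size.
# Assumption: 8-byte PTEs, and one huge page spans the range that a single
# page-table page full of PTEs would otherwise map.
PTE_SIZE = 8  # bytes per PTE (assumed)

def huge_page_size(page_size):
    """Return the huge page size in bytes for a given base page size."""
    ptes_per_page = page_size // PTE_SIZE
    return page_size * ptes_per_page

for kb in (4, 16, 64):
    size = huge_page_size(kb * 1024)
    print(f"{kb}KB base page -> {size // (1024 * 1024)}MB huge page")
```

Doubling page_size doubles both factors of the product, which is where the
factor of 4 per doubling comes from.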



* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-03 18:52 ` David Daney
@ 2014-11-04  1:08   ` Joshua Kinard
  2014-11-04  1:23     ` David Daney
  2014-11-05 13:52     ` Ralf Baechle
  0 siblings, 2 replies; 27+ messages in thread
From: Joshua Kinard @ 2014-11-04  1:08 UTC
  To: David Daney; +Cc: Linux MIPS List

On 11/03/2014 13:52, David Daney wrote:
> On 11/02/2014 02:53 AM, Joshua Kinard wrote:
>>
>> So I have been testing the Onyx2 I have out the last few days with the IOC3
>> metadriver used on Octane, and I can get it to boot, but if
>> CONFIG_TRANSPARENT_HUGEPAGE is enabled in the kernel, bus errors can happen.
>>
>> If I use CONFIG_PAGE_SIZE_4KB, I get bus errors rather frequently -- running
>> Gentoo's 'emerge' command  can produce one.  Switch to CONFIG_PAGE_SIZE_16KB,
>> and the bus errors are far less frequent.  I suspect CONFIG_PAGE_SIZE_64KB will
>> be even less.
>>
>> Disable CONFIG_TRANSPARENT_HUGEPAGE, and the machine works pretty good.  It's
>> been up for almost 8 hours compiling, and not a single bus error yet.  It's got
>> 2x node board with dual R12K/400MHz CPUs per node.
>>
>> I'm not really sure what CONFIG_TRANSPARENT_HUGEPAGE is enabling that's causing
>> R12K CPUs on the IP27 such a headache (and on Octane, really screws up R14K
>> CPUs).  I tried getting a core dump on one of the bus errors, but that
>> produces a
>> truncated or corrupted core file that actually crashed GDB, plus I get a nice
>> oops message in dmesg:
> 
> Well, as its name implies, if you enable CONFIG_TRANSPARENT_HUGEPAGE, huge
> pages will be created and used in the background transparently to the userspace
> application.
> 
> With 4KB base page size, the huge pages will be 2MB in size..  I don't know
> much about the R10K/R12K/R14K CPUs, but it is possible that either their TLBs
> cannot handle such pages, or that the TLB Exception handlers don't contain
> proper code for these CPUs.
> 
> For each doubling of the base PAGE_SIZE, the huge page size will increase by a
> factor of 4.  So with 16KB base pages the huge page size would be 32MB, since
> there are many fewer opportunities to transparently use a 32MB page, I would
> expect any errors related to huge pages to be correspondingly less frequent.
> 
> With 64KB PAGE_SIZE the huge page size is 512MB, and It is likely that that
> could never be used by normal userspace programs.

I checked the R10K/R12K manual, and the PageMask register there has bits 24:13
open for setting a mask value.  It looks like these CPUs only support page
sizes from 4KB to 16MB (so a 2MB page size should work w/ transparent
hugepages).  I assume that the R14K on the Octane might be the same (but I
don't have a manual specific to the R14K, so I don't know).  All of the
remaining bits in that register read 0 and must have 0's written back.
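
To make the 24:13 limit concrete, here is a minimal Python sketch that
computes the PageMask value for each R4K-style page size and checks whether
it fits in that field.  It assumes the usual R4K arrangement in which one TLB
entry maps an even/odd pair of pages; the function names are made up for this
illustration.

```python
# Sketch: PageMask values on an R4K-style TLB.
# Assumption: each TLB entry maps an even/odd pair of pages, so the mask
# covers address bits from bit 13 up to and including the page-size bit.
PAGEMASK_FIELD = 0x01FFE000  # bits 24:13, as described above

def pagemask_for(page_size):
    """Return the PageMask value selecting pages of page_size bytes."""
    return (2 * page_size - 1) & ~0x1FFF

def fits_in_field(page_size):
    """True if the mask needs no bits outside 24:13."""
    return pagemask_for(page_size) & ~PAGEMASK_FIELD == 0

# R4K-style page sizes grow by a factor of 4: 4KB, 16KB, 64KB, ...
sizes = [4096 * 4 ** n for n in range(8)]
for size in sizes:
    print(f"{size // 1024:>7}KB  mask={pagemask_for(size):#010x}  "
          f"fits={fits_in_field(size)}")
```

If a 2MB huge page is installed as an even/odd pair of 1MB pages in a single
TLB entry (an assumption here, not something stated in this thread), it would
use the 1MB mask, which is well inside the field.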

I guess I could find a way to have the kernel trigger a non-fatal oops/dump the
registers on a bus error and get a look at the cause register to see if that
sheds any light on things.  Doesn't a SIGBUS on MIPS typically mean that an
address wasn't aligned on a 32-bit boundary?  Or could it also mean other things?
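
For the kernel oops quoted in this thread (which is not necessarily the same
fault as the original userspace bus errors), the Cause value can be decoded
by hand.  A minimal Python sketch, using the standard ExcCode field in bits
6:2 and listing only codes 0 through 7:

```python
# Sketch: decode the ExcCode field (bits 6:2) of the MIPS Cause register.
EXC_CODES = {
    0: "Int (interrupt)",
    1: "Mod (TLB modified)",
    2: "TLBL (TLB miss on load/fetch)",
    3: "TLBS (TLB miss on store)",
    4: "AdEL (address error on load/fetch)",
    5: "AdES (address error on store)",
    6: "IBE (bus error on instruction fetch)",
    7: "DBE (bus error on data access)",
}

def exc_code(cause):
    """Extract ExcCode from a Cause register value."""
    return (cause >> 2) & 0x1F

cause = 0x0000C010                 # "Cause :" line of the oops
bad_va = 0x00000001F380173F        # "BadVA :" line of the oops

code = exc_code(cause)
print(code, EXC_CODES.get(code, "other"))
print("BadVA 8-byte aligned:", bad_va % 8 == 0)
```

So this particular oops is an address error on a load from an unaligned
address, rather than a bus error exception (IBE/DBE) reported by the hardware.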

I believe that the R10K is largely compatible with the R4K-style TLB setup, but
Ralf or someone else more knowledgeable in that area will have to verify.  Maybe
the R10K-family CPUs need their own TLB routines, or what currently exists
needs modifications?  I have not tried to understand the whole TLB thing in
MIPS yet, so that's a bit of voodoo to me.

--J








* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-04  1:08   ` Joshua Kinard
@ 2014-11-04  1:23     ` David Daney
  2014-11-04  1:34       ` Joshua Kinard
                         ` (2 more replies)
  2014-11-05 13:52     ` Ralf Baechle
  1 sibling, 3 replies; 27+ messages in thread
From: David Daney @ 2014-11-04  1:23 UTC
  To: Joshua Kinard; +Cc: Linux MIPS List

On 11/03/2014 05:08 PM, Joshua Kinard wrote:
> On 11/03/2014 13:52, David Daney wrote:
>> On 11/02/2014 02:53 AM, Joshua Kinard wrote:
>>>
>>> So I have been testing the Onyx2 I have out the last few days with the IOC3
>>> metadriver used on Octane, and I can get it to boot, but if
>>> CONFIG_TRANSPARENT_HUGEPAGE is enabled in the kernel, bus errors can happen.
>>>
>>> If I use CONFIG_PAGE_SIZE_4KB, I get bus errors rather frequently -- running
>>> Gentoo's 'emerge' command  can produce one.  Switch to CONFIG_PAGE_SIZE_16KB,
>>> and the bus errors are far less frequent.  I suspect CONFIG_PAGE_SIZE_64KB will
>>> be even less.
>>>
>>> Disable CONFIG_TRANSPARENT_HUGEPAGE, and the machine works pretty good.  It's
>>> been up for almost 8 hours compiling, and not a single bus error yet.  It's got
>>> 2x node board with dual R12K/400MHz CPUs per node.
>>>
>>> I'm not really sure what CONFIG_TRANSPARENT_HUGEPAGE is enabling that's causing
>>> R12K CPUs on the IP27 such a headache (and on Octane, really screws up R14K
>>> CPUs).  I tried getting a core dump on one of the bus errors, but that
>>> produces a
>>> truncated or corrupted core file that actually crashed GDB, plus I get a nice
>>> oops message in dmesg:
>>
>> Well, as its name implies, if you enable CONFIG_TRANSPARENT_HUGEPAGE, huge
>> pages will be created and used in the background transparently to the userspace
>> application.
>>
>> With 4KB base page size, the huge pages will be 2MB in size..  I don't know
>> much about the R10K/R12K/R14K CPUs, but it is possible that either their TLBs
>> cannot handle such pages, or that the TLB Exception handlers don't contain
>> proper code for these CPUs.
>>
>> For each doubling of the base PAGE_SIZE, the huge page size will increase by a
>> factor of 4.  So with 16KB base pages the huge page size would be 32MB, since
>> there are many fewer opportunities to transparently use a 32MB page, I would
>> expect any errors related to huge pages to be correspondingly less frequent.
>>
>> With 64KB PAGE_SIZE the huge page size is 512MB, and It is likely that that
>> could never be used by normal userspace programs.
>
> I checked the R10K/R12K manual, and the PageMask register there has bits 24:13
> open for setting a mask value.  It looks like these CPUs only support a page
> size from 4KB to 16MB (so a 2MB page size should work w/ transparent
> hugepages).  I assume that the R14K on the Octane might be the same (but I
> don't have a manual specific to the R14k, so I don't know).  All of the
> remaining bits in that register read 0 and must have 0's written back.
>
> I guess I could find a way to have the kernel trigger a non-fatal oops/dump the
> registers on a bus error and get a look at the cause register to see if that
> sheds any light on things.  Doesn't a SIGBUS on MIPS typically mean that an
> address wasn't aligned on a 32-bit boundary?  Or could it also mean other things?
>
> I believe that the R10K is largely compatible with the R4K-style TLB setup, but
> Ralf or someone else more knowledge in that area will have to verify.  Maybe
> the R10k-family CPUs need their own TLB routines, or what currently exists
> needs modifications?  I have not tried to understand the whole TLB thing in
> MIPS yet, so that's a bit of voodoo to me.

I haven't checked, but there may be workarounds required in the TLB 
management code that are not in place for the huge page case.  When the 
huge TLB code was developed, we didn't do any testing on R10K.  Somebody 
should dump the exception handlers and carefully look at the rest of the 
huge TLB management code, and check to see that any required workarounds 
are in place.

David.




* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-04  1:23     ` David Daney
@ 2014-11-04  1:34       ` Joshua Kinard
  2014-11-04  1:43         ` David Daney
  2014-11-05  9:07       ` Joshua Kinard
  2014-11-05 16:09       ` Ralf Baechle
  2 siblings, 1 reply; 27+ messages in thread
From: Joshua Kinard @ 2014-11-04  1:34 UTC
  To: linux-mips

On 11/03/2014 20:23, David Daney wrote:
> On 11/03/2014 05:08 PM, Joshua Kinard wrote:
>> On 11/03/2014 13:52, David Daney wrote:
>>> On 11/02/2014 02:53 AM, Joshua Kinard wrote:
>>>>
>>>> So I have been testing the Onyx2 I have out the last few days with the IOC3
>>>> metadriver used on Octane, and I can get it to boot, but if
>>>> CONFIG_TRANSPARENT_HUGEPAGE is enabled in the kernel, bus errors can happen.
>>>>
>>>> If I use CONFIG_PAGE_SIZE_4KB, I get bus errors rather frequently -- running
>>>> Gentoo's 'emerge' command  can produce one.  Switch to CONFIG_PAGE_SIZE_16KB,
>>>> and the bus errors are far less frequent.  I suspect CONFIG_PAGE_SIZE_64KB
>>>> will
>>>> be even less.
>>>>
>>>> Disable CONFIG_TRANSPARENT_HUGEPAGE, and the machine works pretty good.  It's
>>>> been up for almost 8 hours compiling, and not a single bus error yet.  It's
>>>> got
>>>> 2x node board with dual R12K/400MHz CPUs per node.
>>>>
>>>> I'm not really sure what CONFIG_TRANSPARENT_HUGEPAGE is enabling that's
>>>> causing
>>>> R12K CPUs on the IP27 such a headache (and on Octane, really screws up R14K
>>>> CPUs).  I tried getting a core dump on one of the bus errors, but that
>>>> produces a
>>>> truncated or corrupted core file that actually crashed GDB, plus I get a nice
>>>> oops message in dmesg:
>>>
>>> Well, as its name implies, if you enable CONFIG_TRANSPARENT_HUGEPAGE, huge
>>> pages will be created and used in the background transparently to the userspace
>>> application.
>>>
>>> With 4KB base page size, the huge pages will be 2MB in size..  I don't know
>>> much about the R10K/R12K/R14K CPUs, but it is possible that either their TLBs
>>> cannot handle such pages, or that the TLB Exception handlers don't contain
>>> proper code for these CPUs.
>>>
>>> For each doubling of the base PAGE_SIZE, the huge page size will increase by a
>>> factor of 4.  So with 16KB base pages the huge page size would be 32MB, since
>>> there are many fewer opportunities to transparently use a 32MB page, I would
>>> expect any errors related to huge pages to be correspondingly less frequent.
>>>
>>> With 64KB PAGE_SIZE the huge page size is 512MB, and It is likely that that
>>> could never be used by normal userspace programs.
>>
>> I checked the R10K/R12K manual, and the PageMask register there has bits 24:13
>> open for setting a mask value.  It looks like these CPUs only support a page
>> size from 4KB to 16MB (so a 2MB page size should work w/ transparent
>> hugepages).  I assume that the R14K on the Octane might be the same (but I
>> don't have a manual specific to the R14k, so I don't know).  All of the
>> remaining bits in that register read 0 and must have 0's written back.
>>
>> I guess I could find a way to have the kernel trigger a non-fatal oops/dump the
>> registers on a bus error and get a look at the cause register to see if that
>> sheds any light on things.  Doesn't a SIGBUS on MIPS typically mean that an
>> address wasn't aligned on a 32-bit boundary?  Or could it also mean other
>> things?
>>
>> I believe that the R10K is largely compatible with the R4K-style TLB setup, but
>> Ralf or someone else more knowledge in that area will have to verify.  Maybe
>> the R10k-family CPUs need their own TLB routines, or what currently exists
>> needs modifications?  I have not tried to understand the whole TLB thing in
>> MIPS yet, so that's a bit of voodoo to me.
> 
> I haven't checked, but there may be workarounds required in the TLB management
> code that are not in place for the huge page case.  When the huge TLB code was
> developed, we didn't do any testing on R10K.  Somebody should dump the
> exception handlers and carefully look at the rest of the huge TLB management
> code, and check to see that any required workarounds are in place.

How does one dump the exception handlers?  Is it a debug switch somewhere?

--J









* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-04  1:34       ` Joshua Kinard
@ 2014-11-04  1:43         ` David Daney
  2014-11-04  5:51           ` Joshua Kinard
  0 siblings, 1 reply; 27+ messages in thread
From: David Daney @ 2014-11-04  1:43 UTC (permalink / raw)
  To: Joshua Kinard; +Cc: linux-mips

On 11/03/2014 05:34 PM, Joshua Kinard wrote:
> On 11/03/2014 20:23, David Daney wrote:
>> On 11/03/2014 05:08 PM, Joshua Kinard wrote:
>>> On 11/03/2014 13:52, David Daney wrote:
>>>> On 11/02/2014 02:53 AM, Joshua Kinard wrote:
>>>>>
>>>>> So I have been testing the Onyx2 I have out the last few days with the IOC3
>>>>> metadriver used on Octane, and I can get it to boot, but if
>>>>> CONFIG_TRANSPARENT_HUGEPAGE is enabled in the kernel, bus errors can happen.
>>>>>
>>>>> If I use CONFIG_PAGE_SIZE_4KB, I get bus errors rather frequently -- running
>>>>> Gentoo's 'emerge' command  can produce one.  Switch to CONFIG_PAGE_SIZE_16KB,
>>>>> and the bus errors are far less frequent.  I suspect CONFIG_PAGE_SIZE_64KB
>>>>> will
>>>>> be even less.
>>>>>
>>>>> Disable CONFIG_TRANSPARENT_HUGEPAGE, and the machine works pretty good.  It's
>>>>> been up for almost 8 hours compiling, and not a single bus error yet.  It's
>>>>> got
>>>>> 2x node board with dual R12K/400MHz CPUs per node.
>>>>>
>>>>> I'm not really sure what CONFIG_TRANSPARENT_HUGEPAGE is enabling that's
>>>>> causing
>>>>> R12K CPUs on the IP27 such a headache (and on Octane, really screws up R14K
>>>>> CPUs).  I tried getting a core dump on one of the bus errors, but that
>>>>> produces a
>>>>> truncated or corrupted core file that actually crashed GDB, plus I get a nice
>>>>> oops message in dmesg:
>>>>
>>>> Well, as its name implies, if you enable CONFIG_TRANSPARENT_HUGEPAGE, huge
>>>> pages will be created and used in the background transparently to the userspace
>>>> application.
>>>>
>>>> With 4KB base page size, the huge pages will be 2MB in size.  I don't know
>>>> much about the R10K/R12K/R14K CPUs, but it is possible that either their TLBs
>>>> cannot handle such pages, or that the TLB Exception handlers don't contain
>>>> proper code for these CPUs.
>>>>
>>>> For each doubling of the base PAGE_SIZE, the huge page size will increase by a
>>>> factor of 4.  So with 16KB base pages the huge page size would be 32MB.  Since
>>>> there are many fewer opportunities to transparently use a 32MB page, I would
>>>> expect any errors related to huge pages to be correspondingly less frequent.
>>>>
>>>> With 64KB PAGE_SIZE the huge page size is 512MB, and it is likely that that
>>>> could never be used by normal userspace programs.
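The sizes quoted above can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes 8-byte PTEs, so that one page full of PTEs maps PAGE_SIZE / 8 base pages, and it treats that span as the huge page size. The formula is an assumption picked because it matches the 2MB, 32MB and 512MB figures given here; the function name is made up for illustration.

```python
KB = 1024
MB = 1024 * KB


def huge_page_size(base_page_size, pte_size=8):
    """Bytes mapped by one page full of PTEs (assumed to equal the huge page size)."""
    ptes_per_page = base_page_size // pte_size
    return ptes_per_page * base_page_size


# Each doubling of the base page size multiplies the result by 4.
for base in (4 * KB, 16 * KB, 64 * KB):
    print(f"{base // KB}KB base pages -> {huge_page_size(base) // MB}MB huge pages")
```

Going from 4KB to 16KB is two doublings of the base page size, hence the factor of 16 between 2MB and 32MB.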
>>>
>>> I checked the R10K/R12K manual, and the PageMask register there has bits 24:13
>>> open for setting a mask value.  It looks like these CPUs only support a page
>>> size from 4KB to 16MB (so a 2MB page size should work w/ transparent
>>> hugepages).  I assume that the R14K on the Octane might be the same (but I
>>> don't have a manual specific to the R14k, so I don't know).  All of the
>>> remaining bits in that register read 0 and must have 0's written back.
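A rough way to check the "4KB to 16MB" range against "bits 24:13" is sketched below. It assumes the usual R4K-style encoding, where each TLB entry maps an even/odd pair of pages and the PageMask value is the pair size minus one with the low 13 bits cleared. The helper names are hypothetical, and the field width is taken from the description of the manual above rather than from the manual itself.

```python
KB = 1024
MB = 1024 * KB

# Writable PageMask bits on R10K/R12K as described above: bits 24:13.
R10K_PAGEMASK_FIELD = 0x01FFE000


def pagemask_for(page_size):
    """PageMask value for an entry whose even and odd pages are page_size bytes each."""
    return (2 * page_size - 1) & ~0x1FFF


def fits_r10k(page_size):
    """True if the mask for page_size only uses bits 24:13."""
    return pagemask_for(page_size) & ~R10K_PAGEMASK_FIELD == 0


# Page sizes grow by a factor of 4 per step: 4KB, 16KB, ..., 16MB, 64MB.
for n in range(8):
    size = (4 * KB) << (2 * n)
    print(f"{size // KB:>8}KB  PageMask=0x{pagemask_for(size):08x}  fits={fits_r10k(size)}")
```

Under these assumptions every size from 4KB through 16MB fits in bits 24:13, and 64MB is the first size that does not, which agrees with the range read from the manual.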
>>>
>>> I guess I could find a way to have the kernel trigger a non-fatal oops/dump the
>>> registers on a bus error and get a look at the cause register to see if that
>>> sheds any light on things.  Doesn't a SIGBUS on MIPS typically mean that an
>>> address wasn't aligned on a 32-bit boundary?  Or could it also mean other
>>> things?
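For reading the Cause register, a small decoder may help. This sketch pulls the ExcCode field (bits 6:2) out of a Cause value and names the first few codes; the table is deliberately partial and the function name is made up for illustration.

```python
# Partial table of MIPS ExcCode values (Cause register bits 6:2).
EXC_CODES = {
    0: "Int (interrupt)",
    1: "Mod (TLB modification)",
    2: "TLBL (TLB miss on load or fetch)",
    3: "TLBS (TLB miss on store)",
    4: "AdEL (address error on load or fetch)",
    5: "AdES (address error on store)",
    6: "IBE (bus error on instruction fetch)",
    7: "DBE (bus error on data access)",
}


def exc_code(cause):
    """Extract the ExcCode field from a Cause register value."""
    return (cause >> 2) & 0x1F


cause = 0x0000C010  # the "Cause" line of the oops quoted in this thread
code = exc_code(cause)
print(code, EXC_CODES.get(code, "not in this partial table"))
```

For the Cause value 0000c010 from the quoted oops this gives ExcCode 4, an address error on load, which is a different code from the IBE/DBE bus errors. Note that this oops was taken inside the kernel during the core dump (see the call trace), so it need not be the same fault that raised the original SIGBUS in userspace.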
>>>
>>> I believe that the R10K is largely compatible with the R4K-style TLB setup, but
>>> Ralf or someone else more knowledgeable in that area will have to verify.  Maybe
>>> the R10k-family CPUs need their own TLB routines, or what currently exists
>>> needs modifications?  I have not tried to understand the whole TLB thing in
>>> MIPS yet, so that's a bit of voodoo to me.
>>
>> I haven't checked, but there may be workarounds required in the TLB management
>> code that are not in place for the huge page case.  When the huge TLB code was
>> developed, we didn't do any testing on R10K.  Somebody should dump the
>> exception handlers and carefully look at the rest of the huge TLB management
>> code, and check to see that any required workarounds are in place.
>
> How does one dump the exception handlers?  Is it a debug switch somewhere?
>

Add as the very first line of tlbex.c "#define DEBUG 1"

Then rebuild, and pass "debug" on the kernel command line.

The output can be fed through gas, and then disassembled with objdump -d
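One preprocessing step is likely needed before gas will accept the log: dmesg lines carry a "[ 1302.260000] " style timestamp prefix, as seen in the oops in this thread. Below is a minimal sketch for stripping it. The function name is made up, and the sample lines are an assumption, since the exact format of the tlbex.c debug output is not shown here; the instruction word is taken from the disassembly attached later in the thread.

```python
import re

# Matches a leading dmesg timestamp such as "[ 1302.260000] ".
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s?")


def strip_dmesg_timestamps(text):
    """Remove the leading timestamp from every line; other lines pass through unchanged."""
    return "\n".join(TIMESTAMP.sub("", line) for line in text.splitlines())


# Hypothetical sample of dumped handler text (format assumed, not confirmed).
sample = (
    "[    0.000000] \t.set noreorder\n"
    "[    0.000000] \t.word\t0x403a2000\n"
)
print(strip_dmesg_timestamps(sample))
```

The cleaned text can then be saved to a file and run through gas and objdump -d as described above.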


> --J
>
>
>
>
>> David.
>>
>>
>>>
>>> --J
>>>
>>>
>>>
>>>>> [ 1302.260000] CPU: 0 PID: 1179 Comm: emerge Not tainted
>>>>> 3.17.1-mipsgit-20141006 #57
>>>>> [ 1302.260000] task: a8000000ffbbf288 ti: a8000000fa6f0000 task.ti:
>>>>> a8000000fa6f0000
>>>>> [ 1302.260000] $ 0   : 0000000000000000 0000000000000001 0000000000000000
>>>>> a8000000ff5ad800
>>>>> [ 1302.260000] $ 4   : a8000000006d5480 00000000000f9c00 00000001f380173f
>>>>> a800000001000000
>>>>> [ 1302.260000] $ 8   : 00000001f380173f 0000000000100077 a8000000fe77a000
>>>>> 0000000000000000
>>>>> [ 1302.260000] $12   : 0000000000660000 0000000000000000 0000000000000000
>>>>> 776bc40c00000004
>>>>> [ 1302.260000] $16   : 0000000000e00000 0000000000000000 00000000018ee000
>>>>> 6db6db6db6db6db7
>>>>> [ 1302.260000] $20   : 00000000000000ca a8000000006d5480 a8000000ff65fa68
>>>>> 0000000000001000
>>>>> [ 1302.260000] $24   : 0000000000000000 a8000000000469c0
>>>>> [ 1302.260000] $28   : a8000000fa6f0000 a8000000fa6f3a00 0000000000e00000
>>>>> a800000000046720
>>>>> [ 1302.260000] Hi    : 00000000002ed400
>>>>> [ 1302.260000] Lo    : 00000000000f9c00
>>>>> [ 1302.260000] epc   : a8000000000467e4 r4k_flush_cache_page+0x104/0x2e0
>>>>> [ 1302.260000]     Not tainted
>>>>> [ 1302.260000] ra    : a800000000046720 r4k_flush_cache_page+0x40/0x2e0
>>>>> [ 1302.260000] Status: 90001ce3 KX SX UX KERNEL EXL IE
>>>>> [ 1302.260000] Cause : 0000c010
>>>>> [ 1302.260000] BadVA : 00000001f380173f
>>>>> [ 1302.260000] PrId  : 00000e35 (R12000)
>>>>> [ 1302.260000] Process emerge (pid: 1179, threadinfo=a8000000fa6f0000,
>>>>> task=a8000000ffbbf288, tls=00000000778d2490)
>>>>> [ 1302.260000] Stack : a8000000ff65fa68 0000000000e00000 00000000000f9c00
>>>>> a8000000006d5480
>>>>>              a8000000ff65fa68 0000000000001000 0000000000e00000
>>>>> a80000000010cb00
>>>>>              a8000000046a2000 a8000000ff65fa68 00000000018ee000
>>>>> 6db6db6db6db6db7
>>>>>              a8000000fe7fdce0 a8000000000375ec a8000000ff4e5800
>>>>> a8000000005fbd90
>>>>>              0000000300000080 a8000000ff668580 a8000000005fbd90
>>>>> 5349474900000080
>>>>>              a8000000fa6f3ad8 a8000000005fbd90 0000000600000088
>>>>> a8000000ff5ad928
>>>>>              a8000000005fbd90 46494c4500002bf9 c000000000101000
>>>>> 0000000a00000080
>>>>>              0000000000000000 0000000000000000 0000000000000000
>>>>> 0000000000000000
>>>>>              0000000000000000 0000000000000000 0000000000000000
>>>>> 0000000000000000
>>>>>              0000000000000000 0000000000000000 0000000000000000
>>>>> 0000000000000000
>>>>>              ...
>>>>> [ 1302.260000] Call Trace:
>>>>> [ 1302.260000] [<a8000000000467e4>] r4k_flush_cache_page+0x104/0x2e0
>>>>> [ 1302.260000] [<a80000000010cb00>] get_dump_page+0xc8/0xe8
>>>>> [ 1302.260000] [<a8000000000375ec>] elf_core_dump+0x1294/0x14d8
>>>>> [ 1302.260000] [<a8000000001b41e4>] do_coredump+0x5e4/0x1048
>>>>> [ 1302.260000] [<a80000000005c0b8>] get_signal+0x1b8/0x710
>>>>> [ 1302.260000] [<a8000000000299c0>] do_signal+0x18/0x240
>>>>> [ 1302.260000] [<a80000000002a4c8>] do_notify_resume+0x70/0x88
>>>>> [ 1302.260000] [<a8000000000255ac>] work_notifysig+0x10/0x18
>>>>> [ 1302.260000]
>>>>> [ 1302.260000]
>>>>> Code: 0010327a  30c60ff8  00c8302d <dcc60000> 30c80001  1100003e  00000000
>>>>> bfb40000  df880000
>>>>> [ 1305.340000] ---[ end trace c7649a6433db8d18 ]---
>>>>>
>>>>> Thoughts?
>
>
>


* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-04  1:43         ` David Daney
@ 2014-11-04  5:51           ` Joshua Kinard
  0 siblings, 0 replies; 27+ messages in thread
From: Joshua Kinard @ 2014-11-04  5:51 UTC (permalink / raw)
  To: David Daney; +Cc: linux-mips

[-- Attachment #1: Type: text/plain, Size: 8557 bytes --]

On 11/03/2014 20:43, David Daney wrote:
> On 11/03/2014 05:34 PM, Joshua Kinard wrote:
>> On 11/03/2014 20:23, David Daney wrote:
>>> On 11/03/2014 05:08 PM, Joshua Kinard wrote:
>>>> On 11/03/2014 13:52, David Daney wrote:
>>>>> On 11/02/2014 02:53 AM, Joshua Kinard wrote:
>>>>>>
>>>>>> So I have been testing the Onyx2 I have out the last few days with the IOC3
>>>>>> metadriver used on Octane, and I can get it to boot, but if
>>>>>> CONFIG_TRANSPARENT_HUGEPAGE is enabled in the kernel, bus errors can happen.
>>>>>>
>>>>>> If I use CONFIG_PAGE_SIZE_4KB, I get bus errors rather frequently -- running
>>>>>> Gentoo's 'emerge' command  can produce one.  Switch to
>>>>>> CONFIG_PAGE_SIZE_16KB,
>>>>>> and the bus errors are far less frequent.  I suspect CONFIG_PAGE_SIZE_64KB
>>>>>> will
>>>>>> be even less.
>>>>>>
>>>>>> Disable CONFIG_TRANSPARENT_HUGEPAGE, and the machine works pretty good. 
>>>>>> It's
>>>>>> been up for almost 8 hours compiling, and not a single bus error yet.  It's
>>>>>> got
>>>>>> 2x node board with dual R12K/400MHz CPUs per node.
>>>>>>
>>>>>> I'm not really sure what CONFIG_TRANSPARENT_HUGEPAGE is enabling that's
>>>>>> causing
>>>>>> R12K CPUs on the IP27 such a headache (and on Octane, really screws up R14K
>>>>>> CPUs).  I tried getting a core dump on one of the bus errors, but that
>>>>>> produces a
>>>>>> truncated or corrupted core file that actually crashed GDB, plus I get a
>>>>>> nice
>>>>>> oops message in dmesg:
>>>>>
>>>>> Well, as its name implies, if you enable CONFIG_TRANSPARENT_HUGEPAGE, huge
>>>>> pages will be created and used in the background transparently to the
>>>>> userspace
>>>>> application.
>>>>>
>>>>> With 4KB base page size, the huge pages will be 2MB in size.  I don't know
>>>>> much about the R10K/R12K/R14K CPUs, but it is possible that either their TLBs
>>>>> cannot handle such pages, or that the TLB Exception handlers don't contain
>>>>> proper code for these CPUs.
>>>>>
>>>>> For each doubling of the base PAGE_SIZE, the huge page size will increase
>>>>> by a
>>>>> factor of 4.  So with 16KB base pages the huge page size would be 32MB.  Since
>>>>> there are many fewer opportunities to transparently use a 32MB page, I would
>>>>> expect any errors related to huge pages to be correspondingly less frequent.
>>>>>
>>>>> With 64KB PAGE_SIZE the huge page size is 512MB, and it is likely that that
>>>>> could never be used by normal userspace programs.
>>>>
>>>> I checked the R10K/R12K manual, and the PageMask register there has bits 24:13
>>>> open for setting a mask value.  It looks like these CPUs only support a page
>>>> size from 4KB to 16MB (so a 2MB page size should work w/ transparent
>>>> hugepages).  I assume that the R14K on the Octane might be the same (but I
>>>> don't have a manual specific to the R14k, so I don't know).  All of the
>>>> remaining bits in that register read 0 and must have 0's written back.
>>>>
>>>> I guess I could find a way to have the kernel trigger a non-fatal oops/dump
>>>> the
>>>> registers on a bus error and get a look at the cause register to see if that
>>>> sheds any light on things.  Doesn't a SIGBUS on MIPS typically mean that an
>>>> address wasn't aligned on a 32-bit boundary?  Or could it also mean other
>>>> things?
>>>>
>>>> I believe that the R10K is largely compatible with the R4K-style TLB setup,
>>>> but
>>>> Ralf or someone else more knowledgeable in that area will have to verify.  Maybe
>>>> the R10k-family CPUs need their own TLB routines, or what currently exists
>>>> needs modifications?  I have not tried to understand the whole TLB thing in
>>>> MIPS yet, so that's a bit of voodoo to me.
>>>
>>> I haven't checked, but there may be workarounds required in the TLB management
>>> code that are not in place for the huge page case.  When the huge TLB code was
>>> developed, we didn't do any testing on R10K.  Somebody should dump the
>>> exception handlers and carefully look at the rest of the huge TLB management
>>> code, and check to see that any required workarounds are in place.
>>
>> How does one dump the exception handlers?  Is it a debug switch somewhere?
>>
> 
> Add as the very first line of tlbex.c "#define DEBUG 1"
> 
> Then rebuild, and pass "debug" on the kernel command line.
> 
> The output can be fed through gas, and then disassembled with objdump -d

Had to fiddle with gas a little bit, but that was because I was using a
cross-compiler.  Got it to work, though.  tlb1-no-transparent-hugepage.dis is
without CONFIG_TRANSPARENT_HUGEPAGE, while tlb2-transparent-hugepage.dis is
with that option enabled.  I don't think any of the other patches/hacks I've
added to my IP27 build have affected the output.

I saw that CPU1 through CPU3 also dumped r4000_tlb_refill only, but that looks
to be the same across all of the CPUs, so I only compiled/disassembled the
output from CPU0.

--J


>>>>>> [ 1302.260000] CPU: 0 PID: 1179 Comm: emerge Not tainted
>>>>>> 3.17.1-mipsgit-20141006 #57
>>>>>> [ 1302.260000] task: a8000000ffbbf288 ti: a8000000fa6f0000 task.ti:
>>>>>> a8000000fa6f0000
>>>>>> [ 1302.260000] $ 0   : 0000000000000000 0000000000000001 0000000000000000
>>>>>> a8000000ff5ad800
>>>>>> [ 1302.260000] $ 4   : a8000000006d5480 00000000000f9c00 00000001f380173f
>>>>>> a800000001000000
>>>>>> [ 1302.260000] $ 8   : 00000001f380173f 0000000000100077 a8000000fe77a000
>>>>>> 0000000000000000
>>>>>> [ 1302.260000] $12   : 0000000000660000 0000000000000000 0000000000000000
>>>>>> 776bc40c00000004
>>>>>> [ 1302.260000] $16   : 0000000000e00000 0000000000000000 00000000018ee000
>>>>>> 6db6db6db6db6db7
>>>>>> [ 1302.260000] $20   : 00000000000000ca a8000000006d5480 a8000000ff65fa68
>>>>>> 0000000000001000
>>>>>> [ 1302.260000] $24   : 0000000000000000 a8000000000469c0
>>>>>> [ 1302.260000] $28   : a8000000fa6f0000 a8000000fa6f3a00 0000000000e00000
>>>>>> a800000000046720
>>>>>> [ 1302.260000] Hi    : 00000000002ed400
>>>>>> [ 1302.260000] Lo    : 00000000000f9c00
>>>>>> [ 1302.260000] epc   : a8000000000467e4 r4k_flush_cache_page+0x104/0x2e0
>>>>>> [ 1302.260000]     Not tainted
>>>>>> [ 1302.260000] ra    : a800000000046720 r4k_flush_cache_page+0x40/0x2e0
>>>>>> [ 1302.260000] Status: 90001ce3 KX SX UX KERNEL EXL IE
>>>>>> [ 1302.260000] Cause : 0000c010
>>>>>> [ 1302.260000] BadVA : 00000001f380173f
>>>>>> [ 1302.260000] PrId  : 00000e35 (R12000)
>>>>>> [ 1302.260000] Process emerge (pid: 1179, threadinfo=a8000000fa6f0000,
>>>>>> task=a8000000ffbbf288, tls=00000000778d2490)
>>>>>> [ 1302.260000] Stack : a8000000ff65fa68 0000000000e00000 00000000000f9c00
>>>>>> a8000000006d5480
>>>>>>              a8000000ff65fa68 0000000000001000 0000000000e00000
>>>>>> a80000000010cb00
>>>>>>              a8000000046a2000 a8000000ff65fa68 00000000018ee000
>>>>>> 6db6db6db6db6db7
>>>>>>              a8000000fe7fdce0 a8000000000375ec a8000000ff4e5800
>>>>>> a8000000005fbd90
>>>>>>              0000000300000080 a8000000ff668580 a8000000005fbd90
>>>>>> 5349474900000080
>>>>>>              a8000000fa6f3ad8 a8000000005fbd90 0000000600000088
>>>>>> a8000000ff5ad928
>>>>>>              a8000000005fbd90 46494c4500002bf9 c000000000101000
>>>>>> 0000000a00000080
>>>>>>              0000000000000000 0000000000000000 0000000000000000
>>>>>> 0000000000000000
>>>>>>              0000000000000000 0000000000000000 0000000000000000
>>>>>> 0000000000000000
>>>>>>              0000000000000000 0000000000000000 0000000000000000
>>>>>> 0000000000000000
>>>>>>              ...
>>>>>> [ 1302.260000] Call Trace:
>>>>>> [ 1302.260000] [<a8000000000467e4>] r4k_flush_cache_page+0x104/0x2e0
>>>>>> [ 1302.260000] [<a80000000010cb00>] get_dump_page+0xc8/0xe8
>>>>>> [ 1302.260000] [<a8000000000375ec>] elf_core_dump+0x1294/0x14d8
>>>>>> [ 1302.260000] [<a8000000001b41e4>] do_coredump+0x5e4/0x1048
>>>>>> [ 1302.260000] [<a80000000005c0b8>] get_signal+0x1b8/0x710
>>>>>> [ 1302.260000] [<a8000000000299c0>] do_signal+0x18/0x240
>>>>>> [ 1302.260000] [<a80000000002a4c8>] do_notify_resume+0x70/0x88
>>>>>> [ 1302.260000] [<a8000000000255ac>] work_notifysig+0x10/0x18
>>>>>> [ 1302.260000]
>>>>>> [ 1302.260000]
>>>>>> Code: 0010327a  30c60ff8  00c8302d <dcc60000> 30c80001  1100003e  00000000
>>>>>> bfb40000  df880000
>>>>>> [ 1305.340000] ---[ end trace c7649a6433db8d18 ]---
>>>>>>
>>>>>> Thoughts?


-- 
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
4096R/D25D95E3 2011-03-28

"The past tempts us, the present confuses us, the future frightens us.  And our
lives slip away, moment by moment, lost in that vast, terrible in-between."

--Emperor Turhan, Centauri Republic

[-- Attachment #2: tlb1-no-transparent-hugepage.dis --]
[-- Type: text/plain, Size: 7693 bytes --]


tlb1:     file format elf64-tradbigmips


Disassembly of section .text:

0000000000000000 <tlbmiss_handler>:
   0:	40252000 	dmfc0	a1,$4
   4:	00052dfa 	dsrl	a1,a1,0x17
   8:	3c06a800 	lui	a2,0xa800
   c:	00063438 	dsll	a2,a2,0x10
  10:	64c60091 	daddiu	a2,a2,145
  14:	00063438 	dsll	a2,a2,0x10
  18:	00c5302d 	daddu	a2,a2,a1
  1c:	fcc44100 	sd	a0,16640(a2)
  20:	03e00008 	jr	ra
  24:	00000000 	nop
	...

0000000000000048 <r4000_tlb_load>:
  48:	403a2000 	dmfc0	k0,$4
  4c:	001ad6ba 	dsrl	k0,k0,0x1a
  50:	001ad1f8 	dsll	k0,k0,0x7
  54:	3c1ba800 	lui	k1,0xa800
  58:	001bdc38 	dsll	k1,k1,0x10
  5c:	677b0091 	daddiu	k1,k1,145
  60:	001bdc38 	dsll	k1,k1,0x10
  64:	677b5500 	daddiu	k1,k1,21760
  68:	035bd02d 	daddu	k0,k0,k1
  6c:	ff410000 	sd	at,0(k0)
  70:	ff420008 	sd	v0,8(k0)
  74:	403b4000 	dmfc0	k1,$8
  78:	001b0abe 	dsrl32	at,k1,0xa
  7c:	14200026 	bnez	at,118 <r4000_tlb_load+0xd0>
  80:	40212000 	dmfc0	at,$4
  84:	00010dfa 	dsrl	at,at,0x17
  88:	3c1ba800 	lui	k1,0xa800
  8c:	001bdc38 	dsll	k1,k1,0x10
  90:	677b0091 	daddiu	k1,k1,145
  94:	001bdc38 	dsll	k1,k1,0x10
  98:	003b082d 	daddu	at,at,k1
  9c:	403b4000 	dmfc0	k1,$8
  a0:	dc214100 	ld	at,16640(at)
  a4:	001bdeba 	dsrl	k1,k1,0x1a
  a8:	337bfff8 	andi	k1,k1,0xfff8
  ac:	003b082d 	daddu	at,at,k1
  b0:	403b4000 	dmfc0	k1,$8
  b4:	dc210000 	ld	at,0(at)
  b8:	001bdb7a 	dsrl	k1,k1,0xd
  bc:	337bfff8 	andi	k1,k1,0xfff8
  c0:	003b082d 	daddu	at,at,k1
  c4:	d03b0000 	lld	k1,0(at)
  c8:	42000008 	tlbp
  cc:	33620003 	andi	v0,k1,0x3
  d0:	38420003 	xori	v0,v0,0x3
  d4:	14400016 	bnez	v0,130 <r4000_tlb_load+0xe8>
  d8:	377b0048 	ori	k1,k1,0x48
  dc:	f03b0000 	scd	k1,0(at)
  e0:	5360fff8 	beqzl	k1,c4 <r4000_tlb_load+0x7c>
  e4:	00000000 	nop
  e8:	34210008 	ori	at,at,0x8
  ec:	38210008 	xori	at,at,0x8
  f0:	dc3b0000 	ld	k1,0(at)
  f4:	dc210008 	ld	at,8(at)
  f8:	001bd97a 	dsrl	k1,k1,0x5
  fc:	40bb1000 	dmtc0	k1,$2
 100:	0001097a 	dsrl	at,at,0x5
 104:	40a11800 	dmtc0	at,$3
 108:	42000002 	tlbwi
 10c:	df410000 	ld	at,0(k0)
 110:	df420008 	ld	v0,8(k0)
 114:	42000018 	eret
 118:	3c01a800 	lui	at,0xa800
 11c:	00010c38 	dsll	at,at,0x10
 120:	6421008f 	daddiu	at,at,143
 124:	00010c38 	dsll	at,at,0x10
 128:	1000ffde 	b	a4 <r4000_tlb_load+0x5c>
 12c:	64210000 	daddiu	at,at,0
 130:	df410000 	ld	at,0(k0)
 134:	df420008 	ld	v0,8(k0)
 138:	08010cc8 	j	43320 <r4000_tlb_refill+0x42cd8>
 13c:	00000000 	nop
	...

0000000000000248 <r4000_tlb_store>:
 248:	403a2000 	dmfc0	k0,$4
 24c:	001ad6ba 	dsrl	k0,k0,0x1a
 250:	001ad1f8 	dsll	k0,k0,0x7
 254:	3c1ba800 	lui	k1,0xa800
 258:	001bdc38 	dsll	k1,k1,0x10
 25c:	677b0091 	daddiu	k1,k1,145
 260:	001bdc38 	dsll	k1,k1,0x10
 264:	677b5500 	daddiu	k1,k1,21760
 268:	035bd02d 	daddu	k0,k0,k1
 26c:	ff410000 	sd	at,0(k0)
 270:	ff420008 	sd	v0,8(k0)
 274:	403b4000 	dmfc0	k1,$8
 278:	001b0abe 	dsrl32	at,k1,0xa
 27c:	14200027 	bnez	at,31c <r4000_tlb_store+0xd4>
 280:	40212000 	dmfc0	at,$4
 284:	00010dfa 	dsrl	at,at,0x17
 288:	3c1ba800 	lui	k1,0xa800
 28c:	001bdc38 	dsll	k1,k1,0x10
 290:	677b0091 	daddiu	k1,k1,145
 294:	001bdc38 	dsll	k1,k1,0x10
 298:	003b082d 	daddu	at,at,k1
 29c:	403b4000 	dmfc0	k1,$8
 2a0:	dc214100 	ld	at,16640(at)
 2a4:	001bdeba 	dsrl	k1,k1,0x1a
 2a8:	337bfff8 	andi	k1,k1,0xfff8
 2ac:	003b082d 	daddu	at,at,k1
 2b0:	403b4000 	dmfc0	k1,$8
 2b4:	dc210000 	ld	at,0(at)
 2b8:	001bdb7a 	dsrl	k1,k1,0xd
 2bc:	337bfff8 	andi	k1,k1,0xfff8
 2c0:	003b082d 	daddu	at,at,k1
 2c4:	d03b0000 	lld	k1,0(at)
 2c8:	42000008 	tlbp
 2cc:	33620005 	andi	v0,k1,0x5
 2d0:	38420005 	xori	v0,v0,0x5
 2d4:	14400017 	bnez	v0,334 <r4000_tlb_store+0xec>
 2d8:	00000000 	nop
 2dc:	377b00d8 	ori	k1,k1,0xd8
 2e0:	f03b0000 	scd	k1,0(at)
 2e4:	5360fff7 	beqzl	k1,2c4 <r4000_tlb_store+0x7c>
 2e8:	00000000 	nop
 2ec:	34210008 	ori	at,at,0x8
 2f0:	38210008 	xori	at,at,0x8
 2f4:	dc3b0000 	ld	k1,0(at)
 2f8:	dc210008 	ld	at,8(at)
 2fc:	001bd97a 	dsrl	k1,k1,0x5
 300:	40bb1000 	dmtc0	k1,$2
 304:	0001097a 	dsrl	at,at,0x5
 308:	40a11800 	dmtc0	at,$3
 30c:	42000002 	tlbwi
 310:	df410000 	ld	at,0(k0)
 314:	df420008 	ld	v0,8(k0)
 318:	42000018 	eret
 31c:	3c01a800 	lui	at,0xa800
 320:	00010c38 	dsll	at,at,0x10
 324:	6421008f 	daddiu	at,at,143
 328:	00010c38 	dsll	at,at,0x10
 32c:	1000ffdd 	b	2a4 <r4000_tlb_store+0x5c>
 330:	64210000 	daddiu	at,at,0
 334:	df410000 	ld	at,0(k0)
 338:	df420008 	ld	v0,8(k0)
 33c:	08010d13 	j	4344c <r4000_tlb_refill+0x42e04>
 340:	00000000 	nop
	...

0000000000000448 <r4000_tlb_modify>:
 448:	403a2000 	dmfc0	k0,$4
 44c:	001ad6ba 	dsrl	k0,k0,0x1a
 450:	001ad1f8 	dsll	k0,k0,0x7
 454:	3c1ba800 	lui	k1,0xa800
 458:	001bdc38 	dsll	k1,k1,0x10
 45c:	677b0091 	daddiu	k1,k1,145
 460:	001bdc38 	dsll	k1,k1,0x10
 464:	677b5500 	daddiu	k1,k1,21760
 468:	035bd02d 	daddu	k0,k0,k1
 46c:	ff410000 	sd	at,0(k0)
 470:	ff420008 	sd	v0,8(k0)
 474:	403b4000 	dmfc0	k1,$8
 478:	001b0abe 	dsrl32	at,k1,0xa
 47c:	14200025 	bnez	at,514 <r4000_tlb_modify+0xcc>
 480:	40212000 	dmfc0	at,$4
 484:	00010dfa 	dsrl	at,at,0x17
 488:	3c1ba800 	lui	k1,0xa800
 48c:	001bdc38 	dsll	k1,k1,0x10
 490:	677b0091 	daddiu	k1,k1,145
 494:	001bdc38 	dsll	k1,k1,0x10
 498:	003b082d 	daddu	at,at,k1
 49c:	403b4000 	dmfc0	k1,$8
 4a0:	dc214100 	ld	at,16640(at)
 4a4:	001bdeba 	dsrl	k1,k1,0x1a
 4a8:	337bfff8 	andi	k1,k1,0xfff8
 4ac:	003b082d 	daddu	at,at,k1
 4b0:	403b4000 	dmfc0	k1,$8
 4b4:	dc210000 	ld	at,0(at)
 4b8:	001bdb7a 	dsrl	k1,k1,0xd
 4bc:	337bfff8 	andi	k1,k1,0xfff8
 4c0:	003b082d 	daddu	at,at,k1
 4c4:	d03b0000 	lld	k1,0(at)
 4c8:	42000008 	tlbp
 4cc:	33620004 	andi	v0,k1,0x4
 4d0:	10400016 	beqz	v0,52c <r4000_tlb_modify+0xe4>
 4d4:	377b00d8 	ori	k1,k1,0xd8
 4d8:	f03b0000 	scd	k1,0(at)
 4dc:	5360fff9 	beqzl	k1,4c4 <r4000_tlb_modify+0x7c>
 4e0:	00000000 	nop
 4e4:	34210008 	ori	at,at,0x8
 4e8:	38210008 	xori	at,at,0x8
 4ec:	dc3b0000 	ld	k1,0(at)
 4f0:	dc210008 	ld	at,8(at)
 4f4:	001bd97a 	dsrl	k1,k1,0x5
 4f8:	40bb1000 	dmtc0	k1,$2
 4fc:	0001097a 	dsrl	at,at,0x5
 500:	40a11800 	dmtc0	at,$3
 504:	42000002 	tlbwi
 508:	df410000 	ld	at,0(k0)
 50c:	df420008 	ld	v0,8(k0)
 510:	42000018 	eret
 514:	3c01a800 	lui	at,0xa800
 518:	00010c38 	dsll	at,at,0x10
 51c:	6421008f 	daddiu	at,at,143
 520:	00010c38 	dsll	at,at,0x10
 524:	1000ffdf 	b	4a4 <r4000_tlb_modify+0x5c>
 528:	64210000 	daddiu	at,at,0
 52c:	df410000 	ld	at,0(k0)
 530:	df420008 	ld	v0,8(k0)
 534:	08010d13 	j	4344c <r4000_tlb_refill+0x42e04>
 538:	00000000 	nop
	...

0000000000000648 <r4000_tlb_refill>:
 648:	07410006 	bgez	k0,664 <r4000_tlb_refill+0x1c>
 64c:	3c1ba800 	lui	k1,0xa800
 650:	001bdc38 	dsll	k1,k1,0x10
 654:	677b008f 	daddiu	k1,k1,143
 658:	001bdc38 	dsll	k1,k1,0x10
 65c:	10000026 	b	6f8 <r4000_tlb_refill+0xb0>
 660:	677b0000 	daddiu	k1,k1,0
 664:	3c1ba800 	lui	k1,0xa800
 668:	001bdc38 	dsll	k1,k1,0x10
 66c:	677b0004 	daddiu	k1,k1,4
 670:	001bdc38 	dsll	k1,k1,0x10
 674:	677b3320 	daddiu	k1,k1,13088
 678:	03600008 	jr	k1
 67c:	00000000 	nop
	...
 6c8:	403a4000 	dmfc0	k0,$8
 6cc:	001adabe 	dsrl32	k1,k0,0xa
 6d0:	1760ffdd 	bnez	k1,648 <r4000_tlb_refill>
 6d4:	403b2000 	dmfc0	k1,$4
 6d8:	001bddfa 	dsrl	k1,k1,0x17
 6dc:	3c1aa800 	lui	k0,0xa800
 6e0:	001ad438 	dsll	k0,k0,0x10
 6e4:	675a0091 	daddiu	k0,k0,145
 6e8:	001ad438 	dsll	k0,k0,0x10
 6ec:	037ad82d 	daddu	k1,k1,k0
 6f0:	403a4000 	dmfc0	k0,$8
 6f4:	df7b4100 	ld	k1,16640(k1)
 6f8:	001ad6ba 	dsrl	k0,k0,0x1a
 6fc:	335afff8 	andi	k0,k0,0xfff8
 700:	037ad82d 	daddu	k1,k1,k0
 704:	403aa000 	dmfc0	k0,$20
 708:	df7b0000 	ld	k1,0(k1)
 70c:	001ad13a 	dsrl	k0,k0,0x4
 710:	335afff0 	andi	k0,k0,0xfff0
 714:	037ad82d 	daddu	k1,k1,k0
 718:	df7a0000 	ld	k0,0(k1)
 71c:	df7b0008 	ld	k1,8(k1)
 720:	001ad17a 	dsrl	k0,k0,0x5
 724:	40ba1000 	dmtc0	k0,$2
 728:	001bd97a 	dsrl	k1,k1,0x5
 72c:	40bb1800 	dmtc0	k1,$3
 730:	42000006 	tlbwr
 734:	42000018 	eret
	...

[-- Attachment #3: tlb2-transparent-hugepage.dis --]
[-- Type: text/plain, Size: 10634 bytes --]


tlb2:     file format elf64-tradbigmips


Disassembly of section .text:

0000000000000000 <tlbmiss_handler>:
   0:	40252000 	dmfc0	a1,$4
   4:	00052dfa 	dsrl	a1,a1,0x17
   8:	3c06a800 	lui	a2,0xa800
   c:	00063438 	dsll	a2,a2,0x10
  10:	64c60092 	daddiu	a2,a2,146
  14:	00063438 	dsll	a2,a2,0x10
  18:	00c5302d 	daddu	a2,a2,a1
  1c:	fcc44100 	sd	a0,16640(a2)
  20:	03e00008 	jr	ra
  24:	00000000 	nop
	...

0000000000000048 <r4000_tlb_load>:
  48:	403a2000 	dmfc0	k0,$4
  4c:	001ad6ba 	dsrl	k0,k0,0x1a
  50:	001ad1f8 	dsll	k0,k0,0x7
  54:	3c1ba800 	lui	k1,0xa800
  58:	001bdc38 	dsll	k1,k1,0x10
  5c:	677b0092 	daddiu	k1,k1,146
  60:	001bdc38 	dsll	k1,k1,0x10
  64:	677b5500 	daddiu	k1,k1,21760
  68:	035bd02d 	daddu	k0,k0,k1
  6c:	ff410000 	sd	at,0(k0)
  70:	ff420008 	sd	v0,8(k0)
  74:	403b4000 	dmfc0	k1,$8
  78:	001b0abe 	dsrl32	at,k1,0xa
  7c:	14200029 	bnez	at,124 <r4000_tlb_load+0xdc>
  80:	40212000 	dmfc0	at,$4
  84:	00010dfa 	dsrl	at,at,0x17
  88:	3c1ba800 	lui	k1,0xa800
  8c:	001bdc38 	dsll	k1,k1,0x10
  90:	677b0092 	daddiu	k1,k1,146
  94:	001bdc38 	dsll	k1,k1,0x10
  98:	003b082d 	daddu	at,at,k1
  9c:	403b4000 	dmfc0	k1,$8
  a0:	dc214100 	ld	at,16640(at)
  a4:	001bdeba 	dsrl	k1,k1,0x1a
  a8:	337bfff8 	andi	k1,k1,0xfff8
  ac:	003b082d 	daddu	at,at,k1
  b0:	dc3b0000 	ld	k1,0(at)
  b4:	337b0020 	andi	k1,k1,0x20
  b8:	17600020 	bnez	k1,13c <r4000_tlb_load+0xf4>
  bc:	403b4000 	dmfc0	k1,$8
  c0:	dc210000 	ld	at,0(at)
  c4:	001bdb7a 	dsrl	k1,k1,0xd
  c8:	337bfff8 	andi	k1,k1,0xfff8
  cc:	003b082d 	daddu	at,at,k1
  d0:	d03b0000 	lld	k1,0(at)
  d4:	42000008 	tlbp
  d8:	33620003 	andi	v0,k1,0x3
  dc:	38420003 	xori	v0,v0,0x3
  e0:	1440002c 	bnez	v0,194 <r4000_tlb_load+0x14c>
  e4:	377b0108 	ori	k1,k1,0x108
  e8:	f03b0000 	scd	k1,0(at)
  ec:	5360fff8 	beqzl	k1,d0 <r4000_tlb_load+0x88>
  f0:	00000000 	nop
  f4:	34210008 	ori	at,at,0x8
  f8:	38210008 	xori	at,at,0x8
  fc:	dc3b0000 	ld	k1,0(at)
 100:	dc210008 	ld	at,8(at)
 104:	001bd9fa 	dsrl	k1,k1,0x7
 108:	40bb1000 	dmtc0	k1,$2
 10c:	000109fa 	dsrl	at,at,0x7
 110:	40a11800 	dmtc0	at,$3
 114:	42000002 	tlbwi
 118:	df410000 	ld	at,0(k0)
 11c:	df420008 	ld	v0,8(k0)
 120:	42000018 	eret
 124:	3c01a800 	lui	at,0xa800
 128:	00010c38 	dsll	at,at,0x10
 12c:	64210090 	daddiu	at,at,144
 130:	00010c38 	dsll	at,at,0x10
 134:	1000ffdb 	b	a4 <r4000_tlb_load+0x5c>
 138:	64210000 	daddiu	at,at,0
 13c:	d03b0000 	lld	k1,0(at)
 140:	33620003 	andi	v0,k1,0x3
 144:	38420003 	xori	v0,v0,0x3
 148:	14400012 	bnez	v0,194 <r4000_tlb_load+0x14c>
 14c:	42000008 	tlbp
 150:	377b0108 	ori	k1,k1,0x108
 154:	f03b0000 	scd	k1,0(at)
 158:	1360fff8 	beqz	k1,13c <r4000_tlb_load+0xf4>
 15c:	dc3b0000 	ld	k1,0(at)
 160:	3c010040 	lui	at,0x40
 164:	001bd9fa 	dsrl	k1,k1,0x7
 168:	40bb1000 	dmtc0	k1,$2
 16c:	0361d82d 	daddu	k1,k1,at
 170:	40bb1800 	dmtc0	k1,$3
 174:	3c1b1fff 	lui	k1,0x1fff
 178:	377be000 	ori	k1,k1,0xe000
 17c:	409b2800 	mtc0	k1,$5
 180:	42000002 	tlbwi
 184:	3c1b0001 	lui	k1,0x1
 188:	377be000 	ori	k1,k1,0xe000
 18c:	1000ffe2 	b	118 <r4000_tlb_load+0xd0>
 190:	409b2800 	mtc0	k1,$5
 194:	df410000 	ld	at,0(k0)
 198:	df420008 	ld	v0,8(k0)
 19c:	08010f14 	j	43c50 <r4000_tlb_refill+0x43608>
 1a0:	00000000 	nop
	...

0000000000000248 <r4000_tlb_store>:
 248:	403a2000 	dmfc0	k0,$4
 24c:	001ad6ba 	dsrl	k0,k0,0x1a
 250:	001ad1f8 	dsll	k0,k0,0x7
 254:	3c1ba800 	lui	k1,0xa800
 258:	001bdc38 	dsll	k1,k1,0x10
 25c:	677b0092 	daddiu	k1,k1,146
 260:	001bdc38 	dsll	k1,k1,0x10
 264:	677b5500 	daddiu	k1,k1,21760
 268:	035bd02d 	daddu	k0,k0,k1
 26c:	ff410000 	sd	at,0(k0)
 270:	ff420008 	sd	v0,8(k0)
 274:	403b4000 	dmfc0	k1,$8
 278:	001b0abe 	dsrl32	at,k1,0xa
 27c:	1420002a 	bnez	at,328 <r4000_tlb_store+0xe0>
 280:	40212000 	dmfc0	at,$4
 284:	00010dfa 	dsrl	at,at,0x17
 288:	3c1ba800 	lui	k1,0xa800
 28c:	001bdc38 	dsll	k1,k1,0x10
 290:	677b0092 	daddiu	k1,k1,146
 294:	001bdc38 	dsll	k1,k1,0x10
 298:	003b082d 	daddu	at,at,k1
 29c:	403b4000 	dmfc0	k1,$8
 2a0:	dc214100 	ld	at,16640(at)
 2a4:	001bdeba 	dsrl	k1,k1,0x1a
 2a8:	337bfff8 	andi	k1,k1,0xfff8
 2ac:	003b082d 	daddu	at,at,k1
 2b0:	dc3b0000 	ld	k1,0(at)
 2b4:	337b0020 	andi	k1,k1,0x20
 2b8:	17600021 	bnez	k1,340 <r4000_tlb_store+0xf8>
 2bc:	403b4000 	dmfc0	k1,$8
 2c0:	dc210000 	ld	at,0(at)
 2c4:	001bdb7a 	dsrl	k1,k1,0xd
 2c8:	337bfff8 	andi	k1,k1,0xfff8
 2cc:	003b082d 	daddu	at,at,k1
 2d0:	d03b0000 	lld	k1,0(at)
 2d4:	42000008 	tlbp
 2d8:	33620005 	andi	v0,k1,0x5
 2dc:	38420005 	xori	v0,v0,0x5
 2e0:	1440002e 	bnez	v0,39c <r4000_tlb_store+0x154>
 2e4:	00000000 	nop
 2e8:	377b0318 	ori	k1,k1,0x318
 2ec:	f03b0000 	scd	k1,0(at)
 2f0:	5360fff7 	beqzl	k1,2d0 <r4000_tlb_store+0x88>
 2f4:	00000000 	nop
 2f8:	34210008 	ori	at,at,0x8
 2fc:	38210008 	xori	at,at,0x8
 300:	dc3b0000 	ld	k1,0(at)
 304:	dc210008 	ld	at,8(at)
 308:	001bd9fa 	dsrl	k1,k1,0x7
 30c:	40bb1000 	dmtc0	k1,$2
 310:	000109fa 	dsrl	at,at,0x7
 314:	40a11800 	dmtc0	at,$3
 318:	42000002 	tlbwi
 31c:	df410000 	ld	at,0(k0)
 320:	df420008 	ld	v0,8(k0)
 324:	42000018 	eret
 328:	3c01a800 	lui	at,0xa800
 32c:	00010c38 	dsll	at,at,0x10
 330:	64210090 	daddiu	at,at,144
 334:	00010c38 	dsll	at,at,0x10
 338:	1000ffda 	b	2a4 <r4000_tlb_store+0x5c>
 33c:	64210000 	daddiu	at,at,0
 340:	d03b0000 	lld	k1,0(at)
 344:	33620005 	andi	v0,k1,0x5
 348:	38420005 	xori	v0,v0,0x5
 34c:	14400013 	bnez	v0,39c <r4000_tlb_store+0x154>
 350:	00000000 	nop
 354:	42000008 	tlbp
 358:	377b0318 	ori	k1,k1,0x318
 35c:	f03b0000 	scd	k1,0(at)
 360:	1360fff7 	beqz	k1,340 <r4000_tlb_store+0xf8>
 364:	dc3b0000 	ld	k1,0(at)
 368:	3c010040 	lui	at,0x40
 36c:	001bd9fa 	dsrl	k1,k1,0x7
 370:	40bb1000 	dmtc0	k1,$2
 374:	0361d82d 	daddu	k1,k1,at
 378:	40bb1800 	dmtc0	k1,$3
 37c:	3c1b1fff 	lui	k1,0x1fff
 380:	377be000 	ori	k1,k1,0xe000
 384:	409b2800 	mtc0	k1,$5
 388:	42000002 	tlbwi
 38c:	3c1b0001 	lui	k1,0x1
 390:	377be000 	ori	k1,k1,0xe000
 394:	1000ffe1 	b	31c <r4000_tlb_store+0xd4>
 398:	409b2800 	mtc0	k1,$5
 39c:	df410000 	ld	at,0(k0)
 3a0:	df420008 	ld	v0,8(k0)
 3a4:	08010f5f 	j	43d7c <r4000_tlb_refill+0x43734>
 3a8:	00000000 	nop
	...

0000000000000448 <r4000_tlb_modify>:
 448:	403a2000 	dmfc0	k0,$4
 44c:	001ad6ba 	dsrl	k0,k0,0x1a
 450:	001ad1f8 	dsll	k0,k0,0x7
 454:	3c1ba800 	lui	k1,0xa800
 458:	001bdc38 	dsll	k1,k1,0x10
 45c:	677b0092 	daddiu	k1,k1,146
 460:	001bdc38 	dsll	k1,k1,0x10
 464:	677b5500 	daddiu	k1,k1,21760
 468:	035bd02d 	daddu	k0,k0,k1
 46c:	ff410000 	sd	at,0(k0)
 470:	ff420008 	sd	v0,8(k0)
 474:	403b4000 	dmfc0	k1,$8
 478:	001b0abe 	dsrl32	at,k1,0xa
 47c:	14200028 	bnez	at,520 <r4000_tlb_modify+0xd8>
 480:	40212000 	dmfc0	at,$4
 484:	00010dfa 	dsrl	at,at,0x17
 488:	3c1ba800 	lui	k1,0xa800
 48c:	001bdc38 	dsll	k1,k1,0x10
 490:	677b0092 	daddiu	k1,k1,146
 494:	001bdc38 	dsll	k1,k1,0x10
 498:	003b082d 	daddu	at,at,k1
 49c:	403b4000 	dmfc0	k1,$8
 4a0:	dc214100 	ld	at,16640(at)
 4a4:	001bdeba 	dsrl	k1,k1,0x1a
 4a8:	337bfff8 	andi	k1,k1,0xfff8
 4ac:	003b082d 	daddu	at,at,k1
 4b0:	dc3b0000 	ld	k1,0(at)
 4b4:	337b0020 	andi	k1,k1,0x20
 4b8:	1760001f 	bnez	k1,538 <r4000_tlb_modify+0xf0>
 4bc:	403b4000 	dmfc0	k1,$8
 4c0:	dc210000 	ld	at,0(at)
 4c4:	001bdb7a 	dsrl	k1,k1,0xd
 4c8:	337bfff8 	andi	k1,k1,0xfff8
 4cc:	003b082d 	daddu	at,at,k1
 4d0:	d03b0000 	lld	k1,0(at)
 4d4:	42000008 	tlbp
 4d8:	33620004 	andi	v0,k1,0x4
 4dc:	1040002b 	beqz	v0,58c <r4000_tlb_modify+0x144>
 4e0:	377b0318 	ori	k1,k1,0x318
 4e4:	f03b0000 	scd	k1,0(at)
 4e8:	5360fff9 	beqzl	k1,4d0 <r4000_tlb_modify+0x88>
 4ec:	00000000 	nop
 4f0:	34210008 	ori	at,at,0x8
 4f4:	38210008 	xori	at,at,0x8
 4f8:	dc3b0000 	ld	k1,0(at)
 4fc:	dc210008 	ld	at,8(at)
 500:	001bd9fa 	dsrl	k1,k1,0x7
 504:	40bb1000 	dmtc0	k1,$2
 508:	000109fa 	dsrl	at,at,0x7
 50c:	40a11800 	dmtc0	at,$3
 510:	42000002 	tlbwi
 514:	df410000 	ld	at,0(k0)
 518:	df420008 	ld	v0,8(k0)
 51c:	42000018 	eret
 520:	3c01a800 	lui	at,0xa800
 524:	00010c38 	dsll	at,at,0x10
 528:	64210090 	daddiu	at,at,144
 52c:	00010c38 	dsll	at,at,0x10
 530:	1000ffdc 	b	4a4 <r4000_tlb_modify+0x5c>
 534:	64210000 	daddiu	at,at,0
 538:	d03b0000 	lld	k1,0(at)
 53c:	33620004 	andi	v0,k1,0x4
 540:	10400012 	beqz	v0,58c <r4000_tlb_modify+0x144>
 544:	42000008 	tlbp
 548:	377b0318 	ori	k1,k1,0x318
 54c:	f03b0000 	scd	k1,0(at)
 550:	1360fff9 	beqz	k1,538 <r4000_tlb_modify+0xf0>
 554:	dc3b0000 	ld	k1,0(at)
 558:	3c010040 	lui	at,0x40
 55c:	001bd9fa 	dsrl	k1,k1,0x7
 560:	40bb1000 	dmtc0	k1,$2
 564:	0361d82d 	daddu	k1,k1,at
 568:	40bb1800 	dmtc0	k1,$3
 56c:	3c1b1fff 	lui	k1,0x1fff
 570:	377be000 	ori	k1,k1,0xe000
 574:	409b2800 	mtc0	k1,$5
 578:	42000002 	tlbwi
 57c:	3c1b0001 	lui	k1,0x1
 580:	377be000 	ori	k1,k1,0xe000
 584:	1000ffe3 	b	514 <r4000_tlb_modify+0xcc>
 588:	409b2800 	mtc0	k1,$5
 58c:	df410000 	ld	at,0(k0)
 590:	df420008 	ld	v0,8(k0)
 594:	08010f5f 	j	43d7c <r4000_tlb_refill+0x43734>
 598:	00000000 	nop
	...

0000000000000648 <r4000_tlb_refill>:
 648:	df7a0000 	ld	k0,0(k1)
 64c:	3c1b0040 	lui	k1,0x40
 650:	001ad1fa 	dsrl	k0,k0,0x7
 654:	40ba1000 	dmtc0	k0,$2
 658:	035bd02d 	daddu	k0,k0,k1
 65c:	40ba1800 	dmtc0	k0,$3
 660:	3c1a1fff 	lui	k0,0x1fff
 664:	375ae000 	ori	k0,k0,0xe000
 668:	409a2800 	mtc0	k0,$5
 66c:	42000006 	tlbwr
 670:	3c1a0001 	lui	k0,0x1
 674:	375ae000 	ori	k0,k0,0xe000
 678:	10000031 	b	740 <r4000_tlb_refill+0xf8>
 67c:	409a2800 	mtc0	k0,$5
 680:	07410006 	bgez	k0,69c <r4000_tlb_refill+0x54>
 684:	3c1ba800 	lui	k1,0xa800
 688:	001bdc38 	dsll	k1,k1,0x10
 68c:	677b0090 	daddiu	k1,k1,144
 690:	001bdc38 	dsll	k1,k1,0x10
 694:	10000018 	b	6f8 <r4000_tlb_refill+0xb0>
 698:	677b0000 	daddiu	k1,k1,0
 69c:	3c1ba800 	lui	k1,0xa800
 6a0:	001bdc38 	dsll	k1,k1,0x10
 6a4:	677b0004 	daddiu	k1,k1,4
 6a8:	001bdc38 	dsll	k1,k1,0x10
 6ac:	677b3c50 	daddiu	k1,k1,15440
 6b0:	03600008 	jr	k1
 6b4:	00000000 	nop
	...
 6c8:	403a4000 	dmfc0	k0,$8
 6cc:	001adabe 	dsrl32	k1,k0,0xa
 6d0:	1760ffeb 	bnez	k1,680 <r4000_tlb_refill+0x38>
 6d4:	403b2000 	dmfc0	k1,$4
 6d8:	001bddfa 	dsrl	k1,k1,0x17
 6dc:	3c1aa800 	lui	k0,0xa800
 6e0:	001ad438 	dsll	k0,k0,0x10
 6e4:	675a0092 	daddiu	k0,k0,146
 6e8:	001ad438 	dsll	k0,k0,0x10
 6ec:	037ad82d 	daddu	k1,k1,k0
 6f0:	403a4000 	dmfc0	k0,$8
 6f4:	df7b4100 	ld	k1,16640(k1)
 6f8:	001ad6ba 	dsrl	k0,k0,0x1a
 6fc:	335afff8 	andi	k0,k0,0xfff8
 700:	037ad82d 	daddu	k1,k1,k0
 704:	df7a0000 	ld	k0,0(k1)
 708:	335a0020 	andi	k0,k0,0x20
 70c:	1740ffce 	bnez	k0,648 <r4000_tlb_refill>
 710:	403aa000 	dmfc0	k0,$20
 714:	df7b0000 	ld	k1,0(k1)
 718:	001ad13a 	dsrl	k0,k0,0x4
 71c:	335afff0 	andi	k0,k0,0xfff0
 720:	037ad82d 	daddu	k1,k1,k0
 724:	df7a0000 	ld	k0,0(k1)
 728:	df7b0008 	ld	k1,8(k1)
 72c:	001ad1fa 	dsrl	k0,k0,0x7
 730:	40ba1000 	dmtc0	k0,$2
 734:	001bd9fa 	dsrl	k1,k1,0x7
 738:	40bb1800 	dmtc0	k1,$3
 73c:	42000006 	tlbwr
 740:	42000018 	eret
	...

* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-04  1:23     ` David Daney
  2014-11-04  1:34       ` Joshua Kinard
@ 2014-11-05  9:07       ` Joshua Kinard
  2014-11-05 10:21         ` Ralf Baechle
  2014-11-05 16:09       ` Ralf Baechle
  2 siblings, 1 reply; 27+ messages in thread
From: Joshua Kinard @ 2014-11-05  9:07 UTC (permalink / raw)
  To: linux-mips; +Cc: Ralf Baechle

On 11/03/2014 20:23, David Daney wrote:
> On 11/03/2014 05:08 PM, Joshua Kinard wrote:
>> On 11/03/2014 13:52, David Daney wrote:
>>> On 11/02/2014 02:53 AM, Joshua Kinard wrote:
>>>>
>>>> So I have been testing the Onyx2 I have out the last few days with the IOC3
>>>> metadriver used on Octane, and I can get it to boot, but if
>>>> CONFIG_TRANSPARENT_HUGEPAGE is enabled in the kernel, bus errors can happen.
>>>>
>>>> If I use CONFIG_PAGE_SIZE_4KB, I get bus errors rather frequently -- running
>>>> Gentoo's 'emerge' command  can produce one.  Switch to CONFIG_PAGE_SIZE_16KB,
>>>> and the bus errors are far less frequent.  I suspect CONFIG_PAGE_SIZE_64KB
>>>> will
>>>> be even less.
>>>>
>>>> Disable CONFIG_TRANSPARENT_HUGEPAGE, and the machine works pretty good.  It's
>>>> been up for almost 8 hours compiling, and not a single bus error yet.  It's
>>>> got
>>>> 2x node board with dual R12K/400MHz CPUs per node.
>>>>
>>>> I'm not really sure what CONFIG_TRANSPARENT_HUGEPAGE is enabling that's
>>>> causing
>>>> R12K CPUs on the IP27 such a headache (and on Octane, really screws up R14K
>>>> CPUs).  I tried getting a core dump on one of the bus errors, but that
>>>> produces a
>>>> truncated or corrupted core file that actually crashed GDB, plus I get a nice
>>>> oops message in dmesg:
>>>
>>> Well, as its name implies, if you enable CONFIG_TRANSPARENT_HUGEPAGE, huge
>>> pages will be created and used in the background transparently to the userspace
>>> application.
>>>
>>> With 4KB base page size, the huge pages will be 2MB in size..  I don't know
>>> much about the R10K/R12K/R14K CPUs, but it is possible that either their TLBs
>>> cannot handle such pages, or that the TLB Exception handlers don't contain
>>> proper code for these CPUs.
>>>
>>> For each doubling of the base PAGE_SIZE, the huge page size will increase by a
>>> factor of 4.  So with 16KB base pages the huge page size would be 32MB, since
>>> there are many fewer opportunities to transparently use a 32MB page, I would
>>> expect any errors related to huge pages to be correspondingly less frequent.
>>>
>>> With 64KB PAGE_SIZE the huge page size is 512MB, and It is likely that that
>>> could never be used by normal userspace programs.
>>
>> I checked the R10K/R12K manual, and the PageMask register there has bits 24:13
>> open for setting a mask value.  It looks like these CPUs only support a page
>> size from 4KB to 16MB (so a 2MB page size should work w/ transparent
>> hugepages).  I assume that the R14K on the Octane might be the same (but I
>> don't have a manual specific to the R14k, so I don't know).  All of the
>> remaining bits in that register read 0 and must have 0's written back.
>>
>> I guess I could find a way to have the kernel trigger a non-fatal oops/dump the
>> registers on a bus error and get a look at the cause register to see if that
>> sheds any light on things.  Doesn't a SIGBUS on MIPS typically mean that an
>> address wasn't aligned on a 32-bit boundary?  Or could it also mean other
>> things?
>>
>> I believe that the R10K is largely compatible with the R4K-style TLB setup, but
>> Ralf or someone else more knowledge in that area will have to verify.  Maybe
>> the R10k-family CPUs need their own TLB routines, or what currently exists
>> needs modifications?  I have not tried to understand the whole TLB thing in
>> MIPS yet, so that's a bit of voodoo to me.
> 
> I haven't checked, but there may be workarounds required in the TLB management
> code that are not in place for the huge page case.  When the huge TLB code was
> developed, we didn't do any testing on R10K.  Somebody should dump the
> exception handlers and carefully look at the rest of the huge TLB management
> code, and check to see that any required workarounds are in place.
> 
> David.

I did some digging, and it looks like Ralf added CPU_SUPPORTS_HUGEPAGES support
a few years ago to most of the CPUs:
http://marc.info/?l=git-commits-head&m=135552890201646&w=2

It was pointed out to me off list that this statement for the PageMask register
in the R10K manual may explain things:

"""TLB read and write operations use this register as either a source or a
destination; when virtual addresses are presented for translation into physical
address, the corresponding bits in the TLB identify which virtual address bits
among bits 24:13 are used in the comparison. When the Mask field is not one of
the values shown in Table 13-6, the operation of the TLB is undefined. The 0
field is reserved; it must be written as zeroes, and returns zeroes when read."""

2MB page sizes aren't explicitly listed in this table in the manual, so writing
a value for that size into bits 24:13 of PageMask might be leading to this
"undefined behavior", which on the R12K might include the random bus
errors/segfaults, and on the R14K triggers an IBE that needs a cold reboot.

The only other R10K system I have is the IP28, but I haven't gotten that to
boot up in a few years.

Checking the NEC Vr-Series programming manual and the PMC-Sierra RM7000 manual,
at least the R5000 and RM7000 also carry this restriction because they have the
same bits defined in PageMask.

My O2 w/ RM7K is out of commission at the moment, so I can't test for that.
Anyone got an R5K/R5200/RM7K O2/Indy/I2 and can check that CPU?

-- 
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
4096R/D25D95E3 2011-03-28

"The past tempts us, the present confuses us, the future frightens us.  And our
lives slip away, moment by moment, lost in that vast, terrible in-between."

--Emperor Turhan, Centauri Republic

* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-05  9:07       ` Joshua Kinard
@ 2014-11-05 10:21         ` Ralf Baechle
  0 siblings, 0 replies; 27+ messages in thread
From: Ralf Baechle @ 2014-11-05 10:21 UTC (permalink / raw)
  To: Joshua Kinard; +Cc: linux-mips

On Wed, Nov 05, 2014 at 04:07:24AM -0500, Joshua Kinard wrote:

> It was pointed out to me off list that this statement for the PageMask register
> in the R10K manual may explain things:
> 
> """TLB read and write operations use this register as either a source or a
> destination; when virtual addresses are presented for translation into physical
> address, the corresponding bits in the TLB identify which virtual address bits
> among bits 24:13 are used in the comparison. When the Mask field is not one of
> the values shown in Table 13-6, the operation of the TLB is undefined. The 0
> field is reserved; it must be written as zeroes, and returns zeroes when read."""
> 
> 2MB page sizes aren't explicitly listed in this table in the manual, so setting
> bits 24:13 in PageMask might be leading to this "undefined behavior", which on
> R12K might include the random bus errors/segfaults, and R14K triggers an IBE
> that needs a cold reboot.

All MIPS CPUs with an R4000-style TLB have this restriction.  Because the
behaviour for such mask values is undefined, what actually happens is likely
to differ between CPU types.

2MB pages will be loaded into the TLB as a pair of adjacent 1MB pages.
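
To illustrate that even/odd pairing, here is a minimal sketch in Python (not
kernel code).  It assumes the R4000-style EntryLo layout, where the PFN field
starts at bit 6 and counts 4KB frames; the function name is made up for
illustration.

```python
# Sketch only: how a huge page is split across the even/odd halves of
# one R4000-style TLB entry.  Assumes the EntryLo PFN field starts at
# bit 6 and counts 4KB frames; the names here are illustrative.

PFN_SHIFT = 6        # position of the PFN field in EntryLo0/EntryLo1
FRAME_SHIFT = 12     # the PFN counts 4KB frames

def entrylo_pair_delta(huge_page_size):
    """Value to add to EntryLo0 to obtain EntryLo1 for the odd half."""
    half = huge_page_size // 2                   # each half maps one "page"
    return (half >> FRAME_SHIFT) << PFN_SHIFT    # same as huge_page_size >> 7

for base_kb, huge in ((4, 2 << 20), (16, 32 << 20), (64, 512 << 20)):
    print("%2dKB base: half = %3dMB, EntryLo1 - EntryLo0 = %#x"
          % (base_kb, huge >> 21, entrylo_pair_delta(huge)))
```

The 64KB row gives 0x400000, which is the value the `lui at,0x40` /
`daddu k1,k1,at` pair in the handler dump earlier in the thread adds between
the two EntryLo writes, so that dump appears to come from a 64KB-page kernel.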

> The only other R10K system I have is the IP28, but I haven't gotten that to
> boot up in a few years.

  Ralf

* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-04  1:08   ` Joshua Kinard
  2014-11-04  1:23     ` David Daney
@ 2014-11-05 13:52     ` Ralf Baechle
  1 sibling, 0 replies; 27+ messages in thread
From: Ralf Baechle @ 2014-11-05 13:52 UTC (permalink / raw)
  To: Joshua Kinard; +Cc: David Daney, Linux MIPS List

On Mon, Nov 03, 2014 at 08:08:58PM -0500, Joshua Kinard wrote:

> I guess I could find a way to have the kernel trigger a non-fatal oops/dump the
> registers on a bus error and get a look at the cause register to see if that
> sheds any light on things.  Doesn't a SIGBUS on MIPS typically mean that an
> address wasn't aligned on a 32-bit boundary?  Or could it also mean other things?
> 
> I believe that the R10K is largely compatible with the R4K-style TLB setup, but
> Ralf or someone else more knowledge in that area will have to verify.  Maybe
> the R10k-family CPUs need their own TLB routines, or what currently exists
> needs modifications?  I have not tried to understand the whole TLB thing in
> MIPS yet, so that's a bit of voodoo to me.

Voodoo that normally works a lot better than the conventional code it replaced!

The R10000 TLB is basically the all dancing, all singing version of other
MIPS TLBs.  Noteworthy differences are that TLB hazards are handled in hardware
and that the R10000 automatically detects multiple matching TLB entries on a
TLB write in which case it will automatically invalidate the old entry before
writing the new entry.  It is also the only MIPS CPU to implement a c0_framemask
register, but as far as I understand it, the only software handling that
register needs is initialization to zero, which essentially disables it.
The R10000 supports a maximum page size of 16M.

  Ralf

* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-04  1:23     ` David Daney
  2014-11-04  1:34       ` Joshua Kinard
  2014-11-05  9:07       ` Joshua Kinard
@ 2014-11-05 16:09       ` Ralf Baechle
  2014-11-07 10:22         ` Joshua Kinard
  2 siblings, 1 reply; 27+ messages in thread
From: Ralf Baechle @ 2014-11-05 16:09 UTC (permalink / raw)
  To: David Daney; +Cc: Joshua Kinard, Linux MIPS List

On Mon, Nov 03, 2014 at 05:23:29PM -0800, David Daney wrote:

> I haven't checked, but there may be workarounds required in the TLB
> management code that are not in place for the huge page case.  When the huge
> TLB code was developed, we didn't do any testing on R10K.  Somebody should
> dump the exception handlers and carefully look at the rest of the huge TLB
> management code, and check to see that any required workarounds are in
> place.

Joshua, if you happen to have R10000 errata sheets around, maybe you could
check if there's anything suspicious?  Off the top of my head I don't recall
any R10000 TLB errata, but the R10000 had plenty of errata due to its - by
the standards of the time - high complexity.

  Ralf

* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-05 16:09       ` Ralf Baechle
@ 2014-11-07 10:22         ` Joshua Kinard
  2014-11-07 18:30           ` David Daney
  0 siblings, 1 reply; 27+ messages in thread
From: Joshua Kinard @ 2014-11-07 10:22 UTC (permalink / raw)
  To: Ralf Baechle, David Daney; +Cc: Linux MIPS List

On 11/05/2014 11:09, Ralf Baechle wrote:
> On Mon, Nov 03, 2014 at 05:23:29PM -0800, David Daney wrote:
> 
>> I haven't checked, but there may be workarounds required in the TLB
>> management code that are not in place for the huge page case.  When the huge
>> TLB code was developed, we didn't do any testing on R10K.  Somebody should
>> dump the exception handlers and carefully look at the rest of the huge TLB
>> management code, and check to see that any required workarounds are in
>> place.
> 
> Joshua, if you happen to have R10000 errata sheets around, maybe you could
> check if there's anything suspicious?  Off the top of my head I don't recall
> any R10000 TLB erratas but the R10000 had plenty of erratas due to it's - by
> the standards of the time - high complexity.
> 
>   Ralf

All I have are errata sheets for Rev 2.3, 2.4, and 2.5 of the R10K.  Nothing
specific on the R12K, and nil for the R14K/R16K.

That said, poking through other areas of the R10K/R12K User Manual, there are
paragraphs titled "Errata" and regarding the PageMask register or TLB, they
state this:

Page 41
The calculated address is translated from a 44-bit virtual address into a
40-bit physical address using a translation-lookaside buffer. The TLB contains
64 entries, each of which can translate two pages. Each entry can select a page
size ranging from 4 Kbytes to 16 Mbytes, inclusive, in __powers__ of 4, as
shown in Figure 1-6.

Page 316:
Translated virtual addresses retrieve data in blocks, which are called pages.
In the R10000 processor, the size of each page may be selected from a range
that runs from 4 Kbytes to 16 Mbytes inclusive, __in_powers_of_4__ (that is, 4
Kbytes, 16 Kbytes, 64 Kbytes, etc.).

So my guess is unless hugepages can happen in powers of 4, they're not
compatible w/ the R10K-series (and likely not the R5K/RM7K, either, since they
all have the same 24:13 bits in the PageMask register).  It seems the logical
choice would be to remove 'select CPU_SUPPORTS_HUGEPAGES' from CPU_R5000,
CPU_NEVADA, CPU_R10000, and CPU_RM7000 in arch/mips/Kconfig.

-- 
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
4096R/D25D95E3 2011-03-28

"The past tempts us, the present confuses us, the future frightens us.  And our
lives slip away, moment by moment, lost in that vast, terrible in-between."

--Emperor Turhan, Centauri Republic

* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-07 10:22         ` Joshua Kinard
@ 2014-11-07 18:30           ` David Daney
  2014-11-09  0:09             ` Joshua Kinard
  0 siblings, 1 reply; 27+ messages in thread
From: David Daney @ 2014-11-07 18:30 UTC (permalink / raw)
  To: Joshua Kinard; +Cc: Ralf Baechle, Linux MIPS List

On 11/07/2014 02:22 AM, Joshua Kinard wrote:
[...]
>
> So my guess is unless hugepages can happen in powers of 4,

Huge pages are currently only supported on MIPS64 for this reason.

huge_page_mask_size = (normal_page_size/8 * normal_page_size) / 2;

If you take log2 of everything you get

huge_page_mask_bits = normal_page_bits - 3 + normal_page_bits - 1
   = 2 * normal_page_bits - 4 (always even)

So all page sizes result in huge pages that meet the power of 4 criterion.
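
The arithmetic above can be checked with a short Python sketch (not kernel
code).  It assumes the "/8" comes from 8-byte page table entries, so one page
of PTEs covers page_size/8 normal pages; the helper names are hypothetical.

```python
# Sketch of the arithmetic above; not kernel code.  Assumes 8-byte page
# table entries, so one page of PTEs holds page_size/8 entries.

def huge_page_mask_size(normal_page_size):
    # A huge page spans (page_size/8) normal pages; the TLB entry maps it
    # as an even/odd pair, so each half covers half of that.
    return (normal_page_size // 8 * normal_page_size) // 2

def is_power_of_4(n):
    # A power of 4 is a power of 2 whose log2 is even.
    return n > 0 and (n & (n - 1)) == 0 and (n.bit_length() - 1) % 2 == 0

for page_bits in (12, 13, 14, 15, 16):        # 4KB .. 64KB base pages
    size = huge_page_mask_size(1 << page_bits)
    mask_bits = size.bit_length() - 1
    assert mask_bits == 2 * page_bits - 4     # the log2 identity above
    print("%2dKB base -> %3dMB per half (2^%d), power of 4: %s"
          % ((1 << page_bits) >> 10, size >> 20, mask_bits,
             is_power_of_4(size)))
```

With 4-byte entries the "- 3" would become "- 2" and the exponent would come
out odd, which is presumably why the formula only works out on MIPS64.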

> they're not
> compatible w/ the R10K-series (and likely not the R5K/RM7K, either, since they
> all have the same 24:13 bits in the PageMask register).  It seems the logical
> choice would be to remove 'select CPU_SUPPORTS_HUGEPAGES' from CPU_R5000,
> CPU_NEVADA, CPU_R10000, and CPU_RM7000 in arch/mips/Kconfig.
>

* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-07 18:30           ` David Daney
@ 2014-11-09  0:09             ` Joshua Kinard
  2014-11-10  7:04               ` Joshua Kinard
  0 siblings, 1 reply; 27+ messages in thread
From: Joshua Kinard @ 2014-11-09  0:09 UTC (permalink / raw)
  To: David Daney; +Cc: Ralf Baechle, Linux MIPS List

On 11/07/2014 13:30, David Daney wrote:
> On 11/07/2014 02:22 AM, Joshua Kinard wrote:
> [...]
>>
>> So my guess is unless hugepages can happen in powers of 4,
> 
> Huge  pages are currently only supported on MIPS64 for this reason.
> 
> huge_page_mask_size = (normal_page_size/8 * normal_page_size) / 2;
> 
> If you take log2 of everything you get
> 
> huge_page_mask_bits = normal_page_bits - 3 + normal_page_bits - 1
>   = 2 * normal_page_bits - 4 (always even)
> 
> So all page sizes result in huge pages that meet the power of 4 criterion.

Well, looks like I'll have to bisect to hunt the problem down.  Obviously there
is something with transparent hugepages that the R10K-family dislikes.  Just a
question of "what?".  Seems like I'm the only one left with this kind of
equipment and interest to play with it :)

-- 
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
4096R/D25D95E3 2011-03-28

"The past tempts us, the present confuses us, the future frightens us.  And our
lives slip away, moment by moment, lost in that vast, terrible in-between."

--Emperor Turhan, Centauri Republic

* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-09  0:09             ` Joshua Kinard
@ 2014-11-10  7:04               ` Joshua Kinard
  2014-11-10 10:51                 ` Ralf Baechle
  0 siblings, 1 reply; 27+ messages in thread
From: Joshua Kinard @ 2014-11-10  7:04 UTC (permalink / raw)
  To: David Daney; +Cc: Ralf Baechle, Linux MIPS List

On 11/08/2014 19:09, Joshua Kinard wrote:
> On 11/07/2014 13:30, David Daney wrote:
>> On 11/07/2014 02:22 AM, Joshua Kinard wrote:
>> [...]
>>>
>>> So my guess is unless hugepages can happen in powers of 4,
>>
>> Huge  pages are currently only supported on MIPS64 for this reason.
>>
>> huge_page_mask_size = (normal_page_size/8 * normal_page_size) / 2;
>>
>> If you take log2 of everything you get
>>
>> huge_page_mask_bits = normal_page_bits - 3 + normal_page_bits - 1
>>   = 2 * normal_page_bits - 4 (always even)
>>
>> So all page sizes result in huge pages that meet the power of 4 criterion.
> 
> Well, looks like I'll have to bisect to hunt the problem down.  Obviously there
> is something with transparent hugepages that the R10K-family dislikes.  Just a
> question of "what?".  Seems like I'm the only one left with this kind of
> equipment and interest to play with it :)

I gave up on bisecting this.  3.7 and 3.9 kernels are not bootable on my Onyx2
w/o additional patches to fix the PCI probing code to deal with the card cage I
have in my system (basically, it stops probing after it discovers the first PCI
bus).  Even with that fixed, normal init refused to load on those kernels, and
dash as init just outright crashed.  Must be some other IP27 bug that was fixed
at some point, and I didn't feel like applying multiple patches to every bisect
checkout, which might've altered results and led me to blaming the wrong commit.

It does look like the PageMask register is getting set to the correct values on
PAGE_SIZE_4KB and PAGE_SIZE_16KB when a hugepage is needed (PM_1M and PM_16M).
The PAGE_SIZE_64KB case wouldn't be valid on R10K, as that uses PM_256M for a
hugepage, which sets bits 28:13 in PageMask, and that would lead to "undefined
behavior".  I'm assuming another register is getting set to an incorrect value
in the huge page case (EntryLo0 or EntryLo1?  EntryHi?), but I don't have the
required knowledge to fiddle w/ the TLB code to figure it out.
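
For reference, a rough Python sketch of that PageMask check (not kernel code).
It assumes the usual R4000-style encoding, where the mask for a per-half page
size S is (2*S - 1) with the low 13 bits cleared, and it only tests whether
the value stays inside bits 24:13; it does not consult the manual's table.
The function names are made up for illustration.

```python
# Sketch only.  PageMask encoding for R4000-style TLBs: for a per-half
# page size S the register holds ones in bits 13 .. log2(S), i.e.
# (2*S - 1) with the low 13 bits cleared.

R10K_MASK_FIELD = 0x01ffe000      # bits 24:13, the writable field on R10K

def pagemask_for(half_size):
    return (2 * half_size - 1) & ~0x1fff

def huge_half_size(base_page_size):
    # Same arithmetic David gave: (page_size/8 * page_size) / 2
    return (base_page_size // 8 * base_page_size) // 2

def fits_r10k(mask):
    # True if no bit outside 24:13 is set.
    return (mask & ~R10K_MASK_FIELD) == 0

for base in (4 << 10, 16 << 10, 64 << 10):
    mask = pagemask_for(huge_half_size(base))
    print("PAGE_SIZE_%dKB: PageMask = %#010x, within bits 24:13: %s"
          % (base >> 10, mask, fits_r10k(mask)))
```

The 4KB and 16KB cases give 0x001fe000 (PM_1M) and 0x01ffe000 (PM_16M), both
inside the field; the 64KB case gives 0x1fffe000 (PM_256M), which is not.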

So, I sent in the patch that marks CPU_SUPPORTS_HUGEPAGES as BROKEN until
someone feels like tackling it (if ever).

Sidenote: Is it possible to add additional CP0 registers to a register dump on
a panic or oops?  I looked around ptrace.c and ptrace.h and see where these
registers are set up and printed out, but I can't find where the actual
values are fetched from the CPU and put into struct pt_regs.  I am assuming
it's a snippet of asm somewhere.  Adding R10K's PageMask, Config, ErrorEPC, and
Context/XContext registers seems like useful debugging info.

-- 
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
4096R/D25D95E3 2011-03-28

"The past tempts us, the present confuses us, the future frightens us.  And our
lives slip away, moment by moment, lost in that vast, terrible in-between."

--Emperor Turhan, Centauri Republic

* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-10  7:04               ` Joshua Kinard
@ 2014-11-10 10:51                 ` Ralf Baechle
  2014-11-10 11:20                   ` Thomas Bogendoerfer
  2014-11-10 11:22                   ` Joshua Kinard
  0 siblings, 2 replies; 27+ messages in thread
From: Ralf Baechle @ 2014-11-10 10:51 UTC (permalink / raw)
  To: Thomas Bogendoerfer, Joshua Kinard; +Cc: David Daney, Linux MIPS List

Thomas,

can you test CONFIG_TRANSPARENT_HUGEPAGE on an IP28?

All in all the R10000's TLB is unproblematic; my gut feeling is that
rather something else specific to IP27 is spoiling the broth.

  Ralf

* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-10 10:51                 ` Ralf Baechle
@ 2014-11-10 11:20                   ` Thomas Bogendoerfer
  2014-11-10 14:22                     ` Joshua Kinard
  2014-11-10 21:30                     ` Thomas Bogendoerfer
  2014-11-10 11:22                   ` Joshua Kinard
  1 sibling, 2 replies; 27+ messages in thread
From: Thomas Bogendoerfer @ 2014-11-10 11:20 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Joshua Kinard, David Daney, Linux MIPS List

On Mon, Nov 10, 2014 at 11:51:06AM +0100, Ralf Baechle wrote:
> Thomas,
> 
> can you test CONFIG_TRANSPARENT_HUGEPAGE on an IP28?
> 
> All in all the R10000's TLB is unproblematic; my gut feeling is that
> rather something else specific to IP27 is spoiling the broth.

I'll give it a spin later today.

Thomas.

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea.                                                [ RFC1925, 2.3 ]

* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-10 10:51                 ` Ralf Baechle
  2014-11-10 11:20                   ` Thomas Bogendoerfer
@ 2014-11-10 11:22                   ` Joshua Kinard
  1 sibling, 0 replies; 27+ messages in thread
From: Joshua Kinard @ 2014-11-10 11:22 UTC (permalink / raw)
  To: Ralf Baechle, Thomas Bogendoerfer; +Cc: David Daney, Linux MIPS List

On 11/10/2014 05:51, Ralf Baechle wrote:
> Thomas,
> 
> can you test CONFIG_TRANSPARENT_HUGEPAGE on an IP28?
> 
> All in all the R10000's TLB is unproblematic; my gut feeling is that
> rather something else specific to IP27 is spoiling the broth.
> 
>   Ralf

I don't know if it's specific to IP27.  I have problems on the Octane w/ an
R14000 and CONFIG_TRANSPARENT_HUGEPAGE (instruction bus errors, needs cold
reboot to clear).  I didn't have the same issues w/ the R12000 CPU module
installed, but I did not test things as thoroughly the last time I installed
it.  I'll see about swapping the R12K module back in tonight or tomorrow and
doing the same tests as on the IP27 that can trigger problems.

--J


> On Mon, Nov 10, 2014 at 02:04:10AM -0500, Joshua Kinard wrote:
>> Date:   Mon, 10 Nov 2014 02:04:10 -0500
>> From: Joshua Kinard <kumba@gentoo.org>
>> To: David Daney <ddaney.cavm@gmail.com>
>> CC: Ralf Baechle <ralf@linux-mips.org>, Linux MIPS List
>>  <linux-mips@linux-mips.org>
>> Subject: Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
>> Content-Type: text/plain; charset=windows-1252
>>
>> On 11/08/2014 19:09, Joshua Kinard wrote:
>>> On 11/07/2014 13:30, David Daney wrote:
>>>> On 11/07/2014 02:22 AM, Joshua Kinard wrote:
>>>> [...]
>>>>>
>>>>> So my guess is unless hugepages can happen in powers of 4,
>>>>
>>>> Huge  pages are currently only supported on MIPS64 for this reason.
>>>>
>>>> huge_page_mask_size = (normal_page_size/8 * normal_page_size) / 2;
>>>>
>>>> If you take log2 of everything you get
>>>>
>>>> huge_page_mask_bits = normal_page_bits - 3 + normal_page_bits - 1
>>>>   = 2 * normal_page_bits - 4 (always even)
>>>>
>>>> So all page sizes result in huge pages that meet the power of 4 criterion.
>>>
>>> Well, looks like I'll have to bisect to hunt the problem down.  Obviously there
>>> is something with transparent hugepages that the R10K-family dislikes.  Just a
>>> question of "what?".  Seems like I'm the only one left with this kind of
>>> equipment and interest to play with it :)
>>
>> I gave up on bisecting this.  3.7 and 3.9 kernels are not bootable on my Onyx2
>> w/o additional patches to fix the PCI probing code to deal with the card cage I
>> have in my system (basically, it stops probing after it discovers the first PCI
>> bus).  Even with that fixed, normal init refused to load on those kernels, and
>> dash as init just outright crashed.  Must be some other IP27 bug that was fixed
>> at some point, and I didn't feel like applying multiple patches to every bisect
>> checkout, which might've altered results and led me to blaming the wrong commit.
>>
>> It does look like the PageMask register is getting set to the correct values on
>> PAGE_SIZE_4K and PAGE_SIZE_16K when a hugepage is needed (PM_1M and PM_16M).
>> The PAGE_SIZE_64K case wouldn't be valid on R10k, as that uses PM_256M for a
>> hugepage, which is bits 28:13 in PageMask and that would lead to "undefined
>> behavior".  I'm assuming another register is getting set to an incorrect value
>> in the huge pagecase (EntryLo0 or EntryLo1?  EntryHi?), but I don't have the
>> required knowledge to fiddle w/ the TLB code to figure it out.
>>
>> So, I sent in the patch that marks CPU_SUPPORTS_HUGEPAGES as BROKEN until
>> someone feels like tackling it (if ever).
>>
>> Sidenote: Is it possible to add additional CP0 registers to a register dump on
>> a panic or oops?  I looked around ptrace.c and ptrace.h and see where these
>> registers are set up and printed out, but I can't find out where the actual
>> values are fetched from the CPU and put into struct pt_regs.  I am assuming
>> it's a snippet of asm somewhere.  Adding R10K's PageMask, Config, ErrorEPC, and
>> Context/XContext registers seems like useful debugging info.
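
As a sanity check on the arithmetic quoted above, here is a minimal user-space
sketch in C.  It assumes the quoted relation huge_page_mask_bits =
2 * normal_page_bits - 4, a PageMask layout where the mask field starts at
bit 13, and an R10000 mask field limited to bits 24:13 (which would explain why
PM_256M, reaching bit 28, falls outside it).  The helper names and the
R10K_PAGEMASK_FIELD constant are made up for illustration; they are not kernel
symbols.

```c
#include <stdint.h>

/*
 * log2 of the per-entry huge page size, following the quoted formula:
 * huge_page_mask_bits = 2 * normal_page_bits - 4.
 */
static unsigned int huge_page_mask_bits(unsigned int normal_page_bits)
{
	return 2 * normal_page_bits - 4;
}

/*
 * PageMask value for a page of 2^bits bytes.
 * The mask field starts at bit 13, so bits 12:0 are cleared.
 */
static uint32_t pagemask_for_bits(unsigned int bits)
{
	uint32_t size = (uint32_t)1 << bits;

	return ((size - 1) << 1) & ~(uint32_t)0x1fff;
}

/* Assumption: the R10000 PageMask field covers bits 24:13 only. */
#define R10K_PAGEMASK_FIELD 0x01ffe000u

/* Nonzero if the mask sets no bits outside the assumed R10000 field. */
static int pagemask_valid_on_r10k(uint32_t mask)
{
	return (mask & ~R10K_PAGEMASK_FIELD) == 0;
}
```

Under those assumptions, 4K and 16K base pages give 20 and 24 mask bits (the
PM_1M and PM_16M cases), while 64K gives 28 bits (PM_256M), which the validity
check rejects.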

* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-10 11:20                   ` Thomas Bogendoerfer
@ 2014-11-10 14:22                     ` Joshua Kinard
  2014-11-10 16:55                       ` David Daney
  2014-11-10 21:30                     ` Thomas Bogendoerfer
  1 sibling, 1 reply; 27+ messages in thread
From: Joshua Kinard @ 2014-11-10 14:22 UTC (permalink / raw)
  To: Thomas Bogendoerfer, Ralf Baechle; +Cc: David Daney, Linux MIPS List

On 11/10/2014 06:20, Thomas Bogendoerfer wrote:
> On Mon, Nov 10, 2014 at 11:51:06AM +0100, Ralf Baechle wrote:
>> Thomas,
>>
>> can you test CONFIG_TRANSPARENT_HUGEPAGE on an IP28?
>>
>> All in all the R10000's TLB is unproblematic; my gut feeling is that
>> rather something else specific to IP27 is spoiling the broth.
> 
> I'll give it a spin later today.
> 
> Thomas.

Try testing with and without CONFIG_HUGETLBFS in the kernel.  File systems ->
Pseudo filesystems -> HugeTLB file system support

So far, it seems adding that option in with CONFIG_TRANSPARENT_HUGEPAGE makes
both IP27 and IP30 behave.  Without, I get data bus errors or segfaults on IP27
running Gentoo's "emerge" program on PAGE_SIZE_4K.

IP30 seems to be fine on an R12000 with or without that option, but I only have
a dual R12K module to test against.  I've only had the R14K dual module for a
few days, and I could not reproduce the bus errors on that module, either.  So
I wonder if there is something funny with the hardware on the single R14K
module, which I did get IBE's on before.  And whether that will behave once
CONFIG_HUGETLBFS is in the kernel.

If so, maybe the fix is to make CONFIG_HUGETLBFS automatically selected if
CONFIG_TRANSPARENT_HUGEPAGE is enabled?

--J

* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-10 14:22                     ` Joshua Kinard
@ 2014-11-10 16:55                       ` David Daney
  2014-11-10 17:03                         ` Ralf Baechle
  0 siblings, 1 reply; 27+ messages in thread
From: David Daney @ 2014-11-10 16:55 UTC (permalink / raw)
  To: Joshua Kinard; +Cc: Thomas Bogendoerfer, Ralf Baechle, Linux MIPS List

On 11/10/2014 06:22 AM, Joshua Kinard wrote:
> On 11/10/2014 06:20, Thomas Bogendoerfer wrote:
>> On Mon, Nov 10, 2014 at 11:51:06AM +0100, Ralf Baechle wrote:
>>> Thomas,
>>>
>>> can you test CONFIG_TRANSPARENT_HUGEPAGE on an IP28?
>>>
>>> All in all the R10000's TLB is unproblematic; my gut feeling is that
>>> rather something else specific to IP27 is spoiling the broth.
>>
>> I'll give it a spin later today.
>>
>> Thomas.
>
> Try testing with and without CONFIG_HUGETLBFS in the kernel.  File systems ->
> Pseudo filesystems -> HugeTLB file system support
>
> So far, it seems adding that option in with CONFIG_TRANSPARENT_HUGEPAGE makes
> both IP27 and IP30 behave.  Without, I get data bus errors or segfaults on IP27
> running Gentoo's "emerge" program on PAGE_SIZE_4K.
>
> IP30 seems to be fine on an R12000 with or without that option, but I only have
> a dual R12K module to test against.  I've only had the R14K dual module for a
> few days, and I could not reproduce the bus errors on that module, either.  So
> I wonder if there is something funny with the hardware on the single R14K
> module, which I did get IBE's on before.  And whether that will behave once
> CONFIG_HUGETLBFS is in the kernel.
>
> If so, maybe the fix is to make CONFIG_HUGETLBFS automatically selected if
> CONFIG_TRANSPARENT_HUGEPAGE is enabled?
>

Yes, you may be on to something here.  Certainly basic huge TLB support
must be in place for TRANSPARENT_HUGEPAGE to work.

It could be that the Kconfig symbols for the various portions of huge 
page support are missing the required dependencies.

FWIW, I always build with a huge page Kconfig options set.

I have:
$ grep HUGE .config
CONFIG_SYS_SUPPORTS_HUGETLBFS=y
CONFIG_MIPS_HUGE_TLB_SUPPORT=y
CONFIG_CPU_SUPPORTS_HUGEPAGES=y
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
# CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y

I suspect that you may not need CONFIG_HUGETLBFS, but 
CONFIG_HUGETLB_PAGE is probably essential.
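
To make that suspicion easy to check against a given build, here is a small
hedged sketch in C that scans the text of a .config for the combination
discussed here: CONFIG_TRANSPARENT_HUGEPAGE=y without CONFIG_HUGETLB_PAGE=y.
The function names are hypothetical helpers written for this example, not part
of any kernel tool, and whether that combination is really the trigger is
still only a guess at this point in the thread.

```c
#include <string.h>

/* Return 1 if the .config text contains the exact line "<symbol>=y". */
static int config_enabled(const char *config_text, const char *symbol)
{
	size_t len = strlen(symbol);
	const char *p = config_text;

	while (p && *p) {
		/* Match only at the start of a line, followed by "=y". */
		if (strncmp(p, symbol, len) == 0 &&
		    strncmp(p + len, "=y", 2) == 0) {
			char end = p[len + 2];

			if (end == '\n' || end == '\r' || end == '\0')
				return 1;
		}
		/* Advance to the start of the next line, if any. */
		p = strchr(p, '\n');
		if (p)
			p++;
	}
	return 0;
}

/*
 * Flag the combination suspected in this thread: transparent huge
 * pages enabled without CONFIG_HUGETLB_PAGE.
 */
static int thp_without_hugetlb_page(const char *config_text)
{
	return config_enabled(config_text, "CONFIG_TRANSPARENT_HUGEPAGE") &&
	       !config_enabled(config_text, "CONFIG_HUGETLB_PAGE");
}
```

Matching the whole "SYMBOL=y" line keeps CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
and "# ... is not set" comments from being mistaken for the base option.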



David Daney


> --J
>

* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-10 16:55                       ` David Daney
@ 2014-11-10 17:03                         ` Ralf Baechle
  2014-11-10 17:29                           ` David Daney
  2014-11-11 11:11                           ` Joshua Kinard
  0 siblings, 2 replies; 27+ messages in thread
From: Ralf Baechle @ 2014-11-10 17:03 UTC (permalink / raw)
  To: David Daney; +Cc: Joshua Kinard, Thomas Bogendoerfer, Linux MIPS List

On Mon, Nov 10, 2014 at 08:55:09AM -0800, David Daney wrote:

> Yes, you may be on to something here.  Certainly basic huge TLB support must
> be in place for TRANSPARENT_HUGEPAGE to work.
> 
> It could be that the Kconfig symbols for the various portions of huge page
> support are missing the required dependencies.
> 
> FWIW, I always build with a huge page Kconfig options set.
> 
> I have:
> $ grep HUGE .config
> CONFIG_SYS_SUPPORTS_HUGETLBFS=y
> CONFIG_MIPS_HUGE_TLB_SUPPORT=y
> CONFIG_CPU_SUPPORTS_HUGEPAGES=y
> CONFIG_TRANSPARENT_HUGEPAGE=y
> CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
> # CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
> CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
> CONFIG_HUGETLBFS=y
> CONFIG_HUGETLB_PAGE=y
> 
> I suspect that you may not need CONFIG_HUGETLBFS, but CONFIG_HUGETLB_PAGE is
> probably essential.

IP27 is also the only in-tree MIPS system with NUMA - and its NUMA support
is not in the best state, to say the least.  Just an observation -
at this point in time there is no obvious connection between either

  R10000 <-> transparent huge page

or

  NUMA <-> transparent huge page

  Ralf

* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-10 17:03                         ` Ralf Baechle
@ 2014-11-10 17:29                           ` David Daney
  2014-11-11 11:11                           ` Joshua Kinard
  1 sibling, 0 replies; 27+ messages in thread
From: David Daney @ 2014-11-10 17:29 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Joshua Kinard, Thomas Bogendoerfer, Linux MIPS List

On 11/10/2014 09:03 AM, Ralf Baechle wrote:
> On Mon, Nov 10, 2014 at 08:55:09AM -0800, David Daney wrote:
>
>> Yes, you may be on to something here.  Certainly basic huge TLB support must
>> be in place for TRANSPARENT_HUGEPAGE to work.
>>
>> It could be that the Kconfig symbols for the various portions of huge page
>> support are missing the required dependencies.
>>
>> FWIW, I always build with a huge page Kconfig options set.
>>
>> I have:
>> $ grep HUGE .config
>> CONFIG_SYS_SUPPORTS_HUGETLBFS=y
>> CONFIG_MIPS_HUGE_TLB_SUPPORT=y
>> CONFIG_CPU_SUPPORTS_HUGEPAGES=y
>> CONFIG_TRANSPARENT_HUGEPAGE=y
>> CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
>> # CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
>> CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
>> CONFIG_HUGETLBFS=y
>> CONFIG_HUGETLB_PAGE=y
>>
>> I suspect that you may not need CONFIG_HUGETLBFS, but CONFIG_HUGETLB_PAGE is
>> probably essential.
>
> IP27 is also the only in-tree MIPS system with NUMA - and its NUMA support
> is not in the best state, to say the least.  Just an observation -
> at this point in time there is no obvious connection between either
>
>    R10000 <-> transparent huge page
>
> or
>
>    NUMA <-> transparent huge page
>
>    Ralf
>

FYI, I am running with CONFIG_TRANSPARENT_HUGEPAGE on a 2-node NUMA 
system (48 CPUs per node) OCTEON III, and the huge pages have not been 
an issue.  So I don't think there are any inherent NUMA issues with 
HUGEPAGES.

David Daney

* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-10 11:20                   ` Thomas Bogendoerfer
  2014-11-10 14:22                     ` Joshua Kinard
@ 2014-11-10 21:30                     ` Thomas Bogendoerfer
  2014-11-11  7:47                       ` Ralf Baechle
  1 sibling, 1 reply; 27+ messages in thread
From: Thomas Bogendoerfer @ 2014-11-10 21:30 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Joshua Kinard, David Daney, Linux MIPS List

On Mon, Nov 10, 2014 at 12:20:39PM +0100, Thomas Bogendoerfer wrote:
> On Mon, Nov 10, 2014 at 11:51:06AM +0100, Ralf Baechle wrote:
> > Thomas,
> > 
> > can you test CONFIG_TRANSPARENT_HUGEPAGE on an IP28?
> > 
> > All in all the R10000's TLB is unproblematic; my gut feeling is that
> > rather something else specific to IP27 is spoiling the broth.
> 
> I'll give it a spin later today.

looks like IP28 has more problems than HUGEPAGES... even without
huge pages enabled it locks up during upgrading debian packages:-(
My gut feeling is that there is another spot hitting the ll/sc errata
stuff for this old R10k CPU.

So no new data out of that.

Thomas.

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea.                                                [ RFC1925, 2.3 ]

* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-10 21:30                     ` Thomas Bogendoerfer
@ 2014-11-11  7:47                       ` Ralf Baechle
  2014-11-11  9:24                         ` Thomas Bogendoerfer
  0 siblings, 1 reply; 27+ messages in thread
From: Ralf Baechle @ 2014-11-11  7:47 UTC (permalink / raw)
  To: Thomas Bogendoerfer; +Cc: Joshua Kinard, David Daney, Linux MIPS List

On Mon, Nov 10, 2014 at 10:30:10PM +0100, Thomas Bogendoerfer wrote:

> looks like IP28 has more problems than HUGEPAGES... even without
> huge pages enabled it locks up during upgrading debian packages:-(
> My gut feeling is that there is another spot hitting the ll/sc errata
> stuff for this old R10k CPU.

You have the dreaded v2.6 CPU?

  Ralf

* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-11  7:47                       ` Ralf Baechle
@ 2014-11-11  9:24                         ` Thomas Bogendoerfer
  2014-11-11  9:38                           ` Ralf Baechle
  0 siblings, 1 reply; 27+ messages in thread
From: Thomas Bogendoerfer @ 2014-11-11  9:24 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Joshua Kinard, David Daney, Linux MIPS List

On Tue, Nov 11, 2014 at 08:47:58AM +0100, Ralf Baechle wrote:
> On Mon, Nov 10, 2014 at 10:30:10PM +0100, Thomas Bogendoerfer wrote:
> 
> > looks like IP28 has more problems than HUGEPAGES... even without
> > huge pages enabled it locks up during upgrading debian packages:-(
> > My gut feeling is that there is another spot hitting the ll/sc errata
> > stuff for this old R10k CPU.
> 
> You have the dreaded v2.6 CPU?

V2.5 even.

Thomas.

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea.                                                [ RFC1925, 2.3 ]

* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-11  9:24                         ` Thomas Bogendoerfer
@ 2014-11-11  9:38                           ` Ralf Baechle
  0 siblings, 0 replies; 27+ messages in thread
From: Ralf Baechle @ 2014-11-11  9:38 UTC (permalink / raw)
  To: Thomas Bogendoerfer; +Cc: Joshua Kinard, David Daney, Linux MIPS List

On Tue, Nov 11, 2014 at 10:24:07AM +0100, Thomas Bogendoerfer wrote:

> On Tue, Nov 11, 2014 at 08:47:58AM +0100, Ralf Baechle wrote:
> > On Mon, Nov 10, 2014 at 10:30:10PM +0100, Thomas Bogendoerfer wrote:
> > 
> > > looks like IP28 has more problems than HUGEPAGES... even without
> > > huge pages enabled it locks up during upgrading debian packages:-(
> > > My gut feeling is that there is another spot hitting the ll/sc errata
> > > stuff for this old R10k CPU.
> > 
> > You have the dreaded v2.6 CPU?
> 
> V2.5 even.

I'm impressed.  Not sure if I've ever seen a v2.5 errata sheet.  So far
I thought the v2.6 CPUs in my one Origin were the only ones that ever
left the SGI campus.

  Ralf

* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
  2014-11-10 17:03                         ` Ralf Baechle
  2014-11-10 17:29                           ` David Daney
@ 2014-11-11 11:11                           ` Joshua Kinard
  1 sibling, 0 replies; 27+ messages in thread
From: Joshua Kinard @ 2014-11-11 11:11 UTC (permalink / raw)
  To: Ralf Baechle, David Daney; +Cc: Thomas Bogendoerfer, Linux MIPS List

On 11/10/2014 12:03, Ralf Baechle wrote:
> On Mon, Nov 10, 2014 at 08:55:09AM -0800, David Daney wrote:
> 
>> Yes, you may be on to something here.  Certainly basic huge TLB support must
>> be in place for TRANSPARENT_HUGEPAGE to work.
>>
>> It could be that the Kconfig symbols for the various portions of huge page
>> support are missing the required dependencies.
>>
>> FWIW, I always build with a huge page Kconfig options set.
>>
>> I have:
>> $ grep HUGE .config
>> CONFIG_SYS_SUPPORTS_HUGETLBFS=y
>> CONFIG_MIPS_HUGE_TLB_SUPPORT=y
>> CONFIG_CPU_SUPPORTS_HUGEPAGES=y
>> CONFIG_TRANSPARENT_HUGEPAGE=y
>> CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
>> # CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
>> CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
>> CONFIG_HUGETLBFS=y
>> CONFIG_HUGETLB_PAGE=y
>>
>> I suspect that you may not need CONFIG_HUGETLBFS, but CONFIG_HUGETLB_PAGE is
>> probably essential.
> 
> IP27 is also the only in-tree MIPS system with NUMA - and its NUMA support
> is not in the best state, to say the least.  Just an observation -
> at this point in time there is no obvious connection between either
> 
>   R10000 <-> transparent huge page
> 
> or
> 
>   NUMA <-> transparent huge page
> 
>   Ralf

I briefly tried NUMA on the Onyx2, and it failed to load init.  init actually
spat out its --help info and quit, which panicked the kernel.  So I didn't test
that too much more.  I am also booting an 'M' kernel, not an 'N'.

That said, I went back to playing around with the Octane, which also seems to
have issues when CONFIG_TRANSPARENT_HUGEPAGE is present.  I now think that it's
not hugepages support at all, but something in the code covered by
CONFIG_MIGRATION.

Booting a 3.17.2 kernel on the Octane with CONFIG_TRANSPARENT_HUGEPAGE but
without CONFIG_HUGETLBFS (and, consequently, without CONFIG_HUGETLB_PAGE),
didn't immediately trigger my instruction bus errors upon loading init, despite
multiple cold reboots.  It took several tries before I could get 3.17.2 to
trigger it.

Backtracking to 3.16, I found out that I could trigger the problem virtually
every single cold boot on 3.16.4, but NOT 3.16.5.  Going through 3.16.5's
changelog, I tried backing out several commits that dealt with transparent
hugepages, jiffies calculation, and finally hit on this one:
http://git.linux-mips.org/?p=ralf/linux.git;a=commit;h=e9203e7b4019370e6d8f69cbf71c052aad22ced7

"""
commit d3cb8bf6081b8b7a2dabb1264fe968fd870fa595 upstream.

A migration entry is marked as write if pte_write was true at the time the
entry was created. The VMA protections are not double checked when migration
entries are being removed as mprotect marks write-migration-entries as
read. It means that potentially we take a spurious fault to mark PTEs write
again but it's straight-forward. However, there is a race between write
migrations being marked read and migrations finishing. This potentially
allows a PTE to be write that should have been read. Close this race by
double checking the VMA permissions using maybe_mkwrite when migration
completes.
"""
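
To illustrate what that commit message describes, here is a toy user-space
model in C.  It is only a sketch: struct toy_pte, struct toy_vma and the toy_*
functions are hypothetical stand-ins invented for this example, not the
kernel's pte_t, vm_area_struct, or migration code.  The recheck_vma flag
switches between trusting the stale write bit recorded in the migration entry
and re-checking the VMA permissions in the manner of maybe_mkwrite.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for a PTE and a VMA. */
struct toy_pte { bool write; };
struct toy_vma { bool vm_write; };

/* Same idea as maybe_mkwrite(): grant write only if the VMA allows it. */
static struct toy_pte toy_maybe_mkwrite(struct toy_pte pte,
					const struct toy_vma *vma)
{
	if (vma->vm_write)
		pte.write = true;
	return pte;
}

/*
 * Restore a PTE when migration completes.  entry_was_write records
 * whether the PTE was writable when the migration entry was created.
 */
static struct toy_pte toy_finish_migration(bool entry_was_write,
					   const struct toy_vma *vma,
					   bool recheck_vma)
{
	struct toy_pte pte = { .write = false };

	if (entry_was_write) {
		if (recheck_vma)
			pte = toy_maybe_mkwrite(pte, vma); /* re-check the VMA */
		else
			pte.write = true; /* trusts the stale entry (racy) */
	}
	return pte;
}
```

In this model, a write migration entry that completes after the VMA has been
made read-only yields a writable PTE without the re-check and a read-only one
with it, which is the race the commit message says it closes.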

CONFIG_MIGRATION is enabled by default when you select
CONFIG_TRANSPARENT_HUGEPAGE, and when I backed that patch out of 3.16.5, the
frequency of a cold boot resulting in IBE's upon loading init increased -- 6
out of 7 reboots in one test run.

Leaving that patch backed out, I enabled CONFIG_HUGETLBFS and
CONFIG_HUGETLB_PAGE, and so far, out of five cold boots, all boot up fine.
This mirrors the behavior on the IP27 machine where CONFIG_HUGETLBFS seems to
fix problems.  I tried backing the migration patch out on the IP27 kernel and
it doesn't seem to have an effect there.

This seems to suggest that CONFIG_MIGRATION plays a part somehow, but only if
CONFIG_HUGETLB_PAGE is left out.  Doesn't look like CONFIG_HUGETLBFS matters,
as I haven't mounted that filesystem anywhere.

The symptoms on each system are different -- I only get IBE's on Octane,
sometimes mixed with DBE's, and usually when init loads.  If by luck, init
loads, the IBE's are not likely to happen and the machine seems to run fine.  I
also confirmed that the R12K module on Octane suffers the same problem -- seems
to be a bit more resilient, though.

IP27 only ever gets DBE's, and not usually while loading init, but when
executing other userland programs, like Gentoo's emerge (written in Python).

It also looks like turning on CONFIG_HUGETLBFS and CONFIG_HUGETLB_PAGE fixed my
problems on Octane w/ PAGE_SIZE_16K/PAGE_SIZE_64K triggering random
sigbus/sigsegv signals, too (if anyone remembers that mail thread from a few
months ago).

So I'm curious why CONFIG_HUGETLB_PAGE is hidden and selected only with
CONFIG_HUGETLBFS?  It does cause arch/mips/mm/hugetlbpage.c to get built, so
maybe that's the critical part?  If so, it seems then for MIPS, that should be
in the 'Kernel type' menu w/ CONFIG_TRANSPARENT_HUGEPAGE, and not invisibly
hidden away deep in the 'File systems' submenu.

--J

Thread overview: 27+ messages
2014-11-02 10:53 IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors Joshua Kinard
2014-11-03 18:52 ` David Daney
2014-11-04  1:08   ` Joshua Kinard
2014-11-04  1:23     ` David Daney
2014-11-04  1:34       ` Joshua Kinard
2014-11-04  1:43         ` David Daney
2014-11-04  5:51           ` Joshua Kinard
2014-11-05  9:07       ` Joshua Kinard
2014-11-05 10:21         ` Ralf Baechle
2014-11-05 16:09       ` Ralf Baechle
2014-11-07 10:22         ` Joshua Kinard
2014-11-07 18:30           ` David Daney
2014-11-09  0:09             ` Joshua Kinard
2014-11-10  7:04               ` Joshua Kinard
2014-11-10 10:51                 ` Ralf Baechle
2014-11-10 11:20                   ` Thomas Bogendoerfer
2014-11-10 14:22                     ` Joshua Kinard
2014-11-10 16:55                       ` David Daney
2014-11-10 17:03                         ` Ralf Baechle
2014-11-10 17:29                           ` David Daney
2014-11-11 11:11                           ` Joshua Kinard
2014-11-10 21:30                     ` Thomas Bogendoerfer
2014-11-11  7:47                       ` Ralf Baechle
2014-11-11  9:24                         ` Thomas Bogendoerfer
2014-11-11  9:38                           ` Ralf Baechle
2014-11-10 11:22                   ` Joshua Kinard
2014-11-05 13:52     ` Ralf Baechle
