* IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
@ 2014-11-02 10:53 Joshua Kinard
2014-11-03 18:52 ` David Daney
0 siblings, 1 reply; 27+ messages in thread
From: Joshua Kinard @ 2014-11-02 10:53 UTC (permalink / raw)
To: Linux MIPS List
So I have been testing the Onyx2 I have out the last few days with the IOC3
metadriver used on Octane, and I can get it to boot, but if
CONFIG_TRANSPARENT_HUGEPAGE is enabled in the kernel, bus errors can happen.
If I use CONFIG_PAGE_SIZE_4KB, I get bus errors rather frequently -- running
Gentoo's 'emerge' command can produce one. Switch to CONFIG_PAGE_SIZE_16KB,
and the bus errors are far less frequent. I suspect CONFIG_PAGE_SIZE_64KB will
make them rarer still.
Disable CONFIG_TRANSPARENT_HUGEPAGE, and the machine works pretty well. It's
been up for almost 8 hours compiling, and not a single bus error yet. It's got
two node boards, each with dual R12K/400MHz CPUs.
I'm not really sure what CONFIG_TRANSPARENT_HUGEPAGE enables that gives the
R12K CPUs on the IP27 such a headache (on Octane, it really screws up R14K
CPUs too). I tried getting a core dump from one of the bus errors, but that
produced a truncated or corrupted core file that actually crashed GDB, plus I
get a nice oops message in dmesg:
[ 1302.260000] CPU: 0 PID: 1179 Comm: emerge Not tainted 3.17.1-mipsgit-20141006 #57
[ 1302.260000] task: a8000000ffbbf288 ti: a8000000fa6f0000 task.ti: a8000000fa6f0000
[ 1302.260000] $ 0 : 0000000000000000 0000000000000001 0000000000000000 a8000000ff5ad800
[ 1302.260000] $ 4 : a8000000006d5480 00000000000f9c00 00000001f380173f a800000001000000
[ 1302.260000] $ 8 : 00000001f380173f 0000000000100077 a8000000fe77a000 0000000000000000
[ 1302.260000] $12 : 0000000000660000 0000000000000000 0000000000000000 776bc40c00000004
[ 1302.260000] $16 : 0000000000e00000 0000000000000000 00000000018ee000 6db6db6db6db6db7
[ 1302.260000] $20 : 00000000000000ca a8000000006d5480 a8000000ff65fa68 0000000000001000
[ 1302.260000] $24 : 0000000000000000 a8000000000469c0
[ 1302.260000] $28 : a8000000fa6f0000 a8000000fa6f3a00 0000000000e00000 a800000000046720
[ 1302.260000] Hi : 00000000002ed400
[ 1302.260000] Lo : 00000000000f9c00
[ 1302.260000] epc : a8000000000467e4 r4k_flush_cache_page+0x104/0x2e0
[ 1302.260000] Not tainted
[ 1302.260000] ra : a800000000046720 r4k_flush_cache_page+0x40/0x2e0
[ 1302.260000] Status: 90001ce3 KX SX UX KERNEL EXL IE
[ 1302.260000] Cause : 0000c010
[ 1302.260000] BadVA : 00000001f380173f
[ 1302.260000] PrId : 00000e35 (R12000)
[ 1302.260000] Process emerge (pid: 1179, threadinfo=a8000000fa6f0000, task=a8000000ffbbf288, tls=00000000778d2490)
[ 1302.260000] Stack : a8000000ff65fa68 0000000000e00000 00000000000f9c00 a8000000006d5480
a8000000ff65fa68 0000000000001000 0000000000e00000 a80000000010cb00
a8000000046a2000 a8000000ff65fa68 00000000018ee000 6db6db6db6db6db7
a8000000fe7fdce0 a8000000000375ec a8000000ff4e5800 a8000000005fbd90
0000000300000080 a8000000ff668580 a8000000005fbd90 5349474900000080
a8000000fa6f3ad8 a8000000005fbd90 0000000600000088 a8000000ff5ad928
a8000000005fbd90 46494c4500002bf9 c000000000101000 0000000a00000080
0000000000000000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 0000000000000000
...
[ 1302.260000] Call Trace:
[ 1302.260000] [<a8000000000467e4>] r4k_flush_cache_page+0x104/0x2e0
[ 1302.260000] [<a80000000010cb00>] get_dump_page+0xc8/0xe8
[ 1302.260000] [<a8000000000375ec>] elf_core_dump+0x1294/0x14d8
[ 1302.260000] [<a8000000001b41e4>] do_coredump+0x5e4/0x1048
[ 1302.260000] [<a80000000005c0b8>] get_signal+0x1b8/0x710
[ 1302.260000] [<a8000000000299c0>] do_signal+0x18/0x240
[ 1302.260000] [<a80000000002a4c8>] do_notify_resume+0x70/0x88
[ 1302.260000] [<a8000000000255ac>] work_notifysig+0x10/0x18
[ 1302.260000]
[ 1302.260000]
Code: 0010327a 30c60ff8 00c8302d <dcc60000> 30c80001 1100003e 00000000 bfb40000 df880000
[ 1305.340000] ---[ end trace c7649a6433db8d18 ]---
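For reference, the instruction word marked <dcc60000> in the Code: line can be decoded by hand. The Python sketch below assumes the standard MIPS I-type field layout (opcode in bits 31:26, rs in 25:21, rt in 20:16, signed 16-bit offset) and hard-codes values copied from the oops above. The helper name decode_itype is made up for illustration.

```python
# Values copied from the oops above.
FAULTING_WORD = 0xdcc60000      # the <...> word in the Code: line
REG_6 = 0x00000001f380173f      # third value on the "$ 4" row, i.e. $6
BAD_VA = 0x00000001f380173f     # BadVA line

OPCODE_LD = 0x37                # MIPS III "ld" (load doubleword)


def decode_itype(word):
    """Split a 32-bit MIPS I-type instruction into (opcode, rs, rt, imm)."""
    opcode = (word >> 26) & 0x3f
    rs = (word >> 21) & 0x1f
    rt = (word >> 16) & 0x1f
    imm = word & 0xffff
    if imm & 0x8000:            # sign-extend the 16-bit offset
        imm -= 0x10000
    return opcode, rs, rt, imm


opcode, rs, rt, imm = decode_itype(FAULTING_WORD)
effective_address = REG_6 + imm  # rs is $6 for this word, so base = $6

print(f"opcode={opcode:#x} rs=${rs} rt=${rt} offset={imm}")
print("is ld:", opcode == OPCODE_LD)
print("address matches BadVA:", effective_address == BAD_VA)
print("8-byte aligned:", effective_address % 8 == 0)
```

If that reading is right, the faulting instruction is an 8-byte load through $6, and $6 holds the same value as BadVA, which is not 8-byte aligned. That would make the kernel fault during the core dump look like a load from a bogus, unaligned pointer rather than an ordinary TLB miss.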
Thoughts?
--
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
4096R/D25D95E3 2011-03-28
"The past tempts us, the present confuses us, the future frightens us. And our
lives slip away, moment by moment, lost in that vast, terrible in-between."
--Emperor Turhan, Centauri Republic
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-02 10:53 IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors Joshua Kinard
@ 2014-11-03 18:52 ` David Daney
2014-11-04 1:08 ` Joshua Kinard
0 siblings, 1 reply; 27+ messages in thread
From: David Daney @ 2014-11-03 18:52 UTC (permalink / raw)
To: Joshua Kinard; +Cc: Linux MIPS List
On 11/02/2014 02:53 AM, Joshua Kinard wrote:
>
> So I have been testing the Onyx2 I have out the last few days with the IOC3
> metadriver used on Octane, and I can get it to boot, but if
> CONFIG_TRANSPARENT_HUGEPAGE is enabled in the kernel, bus errors can happen.
>
> If I use CONFIG_PAGE_SIZE_4KB, I get bus errors rather frequently -- running
> Gentoo's 'emerge' command can produce one. Switch to CONFIG_PAGE_SIZE_16KB,
> and the bus errors are far less frequent. I suspect CONFIG_PAGE_SIZE_64KB will
> be even less.
>
> Disable CONFIG_TRANSPARENT_HUGEPAGE, and the machine works pretty good. It's
> been up for almost 8 hours compiling, and not a single bus error yet. It's got
> 2x node board with dual R12K/400MHz CPUs per node.
>
> I'm not really sure what CONFIG_TRANSPARENT_HUGEPAGE is enabling that's causing
> R12K CPUs on the IP27 such a headache (and on Octane, really screws up R14K
> CPUs). I tried getting a core dump on one of the bus errors, but that produces a
> truncated or corrupted core file that actually crashed GDB, plus I get a nice
> oops message in dmesg:
Well, as its name implies, if you enable CONFIG_TRANSPARENT_HUGEPAGE,
huge pages will be created and used in the background transparently to
the userspace application.
With 4KB base page size, the huge pages will be 2MB in size. I don't
know much about the R10K/R12K/R14K CPUs, but it is possible that either
their TLBs cannot handle such pages, or that the TLB exception handlers
don't contain proper code for these CPUs.
For each doubling of the base PAGE_SIZE, the huge page size will
increase by a factor of 4. So with 16KB base pages the huge page size
would be 32MB. Since there are many fewer opportunities to transparently
use a 32MB page, I would expect any errors related to huge pages to be
correspondingly less frequent.
With 64KB PAGE_SIZE the huge page size is 512MB, and it is likely that
a page that large could never be used by normal userspace programs.
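The arithmetic behind those numbers can be sketched in a few lines of Python. This assumes 8-byte page table entries and that one huge page covers exactly the range mapped by one full page of PTEs. Both are assumptions made for illustration and are not read from the kernel source here.

```python
PTE_SIZE = 8  # bytes per page table entry on a 64-bit kernel (assumption)


def huge_page_size(base_page_size):
    """Huge page size if one huge page spans one full page worth of PTEs."""
    ptes_per_page = base_page_size // PTE_SIZE
    return base_page_size * ptes_per_page


KB = 1024
MB = 1024 * KB

for base_kb in (4, 16, 64):
    size = huge_page_size(base_kb * KB)
    print(f"{base_kb}KB base page -> {size // MB}MB huge page")
```

Doubling the base page size doubles both the page itself and the number of PTEs that fit in it, which is where the factor of 4 comes from.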
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-03 18:52 ` David Daney
@ 2014-11-04 1:08 ` Joshua Kinard
2014-11-04 1:23 ` David Daney
2014-11-05 13:52 ` Ralf Baechle
0 siblings, 2 replies; 27+ messages in thread
From: Joshua Kinard @ 2014-11-04 1:08 UTC (permalink / raw)
To: David Daney; +Cc: Linux MIPS List
On 11/03/2014 13:52, David Daney wrote:
> Well, as its name implies, if you enable CONFIG_TRANSPARENT_HUGEPAGE, huge
> pages will be created and used in the background transparently to the userspace
> application.
>
> With 4KB base page size, the huge pages will be 2MB in size. I don't know
> much about the R10K/R12K/R14K CPUs, but it is possible that either their TLBs
> cannot handle such pages, or that the TLB exception handlers don't contain
> proper code for these CPUs.
>
> For each doubling of the base PAGE_SIZE, the huge page size will increase by a
> factor of 4. So with 16KB base pages the huge page size would be 32MB. Since
> there are many fewer opportunities to transparently use a 32MB page, I would
> expect any errors related to huge pages to be correspondingly less frequent.
>
> With 64KB PAGE_SIZE the huge page size is 512MB, and it is likely that that
> could never be used by normal userspace programs.
I checked the R10K/R12K manual, and the PageMask register there has bits 24:13
open for setting a mask value. It looks like these CPUs only support a page
size from 4KB to 16MB (so a 2MB page size should work w/ transparent
hugepages). I assume that the R14K on the Octane might be the same (but I
don't have a manual specific to the R14k, so I don't know). All of the
remaining bits in that register read 0 and must have 0's written back.
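To make the PageMask reading concrete, here is a small Python sketch. It assumes the R4K-style encoding, in which the mask field (bits 24:13) is filled two bits at a time, and that each TLB entry maps an even/odd pair of pages, so a huge page fits in one entry when half of it is a supported page size. The function names are invented for illustration.

```python
MASK_SHIFT = 13    # mask field starts at bit 13
MASK_WIDTH = 12    # bits 24:13
BASE_PAGE = 4096   # page size when the mask field is all zeroes

KB = 1024
MB = 1024 * KB


def supported_page_sizes():
    """Return {page_size_in_bytes: PageMask register value}."""
    sizes = {}
    field = 0
    while field < (1 << MASK_WIDTH):
        sizes[(field + 1) * BASE_PAGE] = field << MASK_SHIFT
        field = (field << 2) | 0b11   # each step is 4x the previous size
    return sizes


def fits_one_tlb_entry(huge_page_size):
    """True if each half of the huge page is a supported page size."""
    return huge_page_size // 2 in supported_page_sizes()


for size, mask in supported_page_sizes().items():
    print(f"{size // KB:>6}KB  PageMask={mask:#010x}")

for huge in (2 * MB, 32 * MB, 512 * MB):
    print(f"{huge // MB}MB huge page fits one entry: {fits_one_tlb_entry(huge)}")
```

Under those assumptions a 2MB huge page (two 1MB halves) and a 32MB huge page (two 16MB halves) are within range, while a 512MB huge page would need a 256MB mask that bits 24:13 cannot express.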
I guess I could find a way to have the kernel trigger a non-fatal oops/dump the
registers on a bus error and get a look at the cause register to see if that
sheds any light on things. Doesn't a SIGBUS on MIPS typically mean that an
address wasn't aligned on a 32-bit boundary? Or could it also mean other things?
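As a starting point, the Cause value from the oops in the first message can be decoded by hand. This is only a sketch: it assumes the usual R4K-style layout with ExcCode in bits 6:2, lists only the first eight exception codes, and describes the kernel fault hit while writing the core dump, which is not necessarily the same thing as the original userland bus error.

```python
# Subset of the R4K-style ExcCode values (bits 6:2 of Cause).
EXC_CODES = {
    0: "Int (interrupt)",
    1: "Mod (TLB modification)",
    2: "TLBL (TLB miss on load/fetch)",
    3: "TLBS (TLB miss on store)",
    4: "AdEL (address error on load/fetch)",
    5: "AdES (address error on store)",
    6: "IBE (bus error on instruction fetch)",
    7: "DBE (bus error on data access)",
}

CAUSE = 0x0000c010            # "Cause :" line of the oops
BAD_VA = 0x00000001f380173f   # "BadVA :" line of the oops


def exc_code(cause):
    """Extract the ExcCode field from a Cause register value."""
    return (cause >> 2) & 0x1f


def describe(cause):
    code = exc_code(cause)
    return EXC_CODES.get(code, f"code {code} (not in this table)")


print("ExcCode:", exc_code(CAUSE), "->", describe(CAUSE))
print("BadVA aligned to 4 bytes:", BAD_VA % 4 == 0)
print("BadVA aligned to 8 bytes:", BAD_VA % 8 == 0)
```

On this reading there are two different flavours of error to keep apart: codes 4 and 5 are address errors (misalignment, or touching a segment the current mode may not access), while codes 6 and 7 are bus errors reported by the hardware. Which of these the userland SIGBUS corresponds to cannot be told from this trace alone.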
I believe that the R10K is largely compatible with the R4K-style TLB setup, but
Ralf or someone else more knowledgeable in that area will have to verify. Maybe
the R10k-family CPUs need their own TLB routines, or what currently exists
needs modifications? I have not tried to understand the whole TLB thing in
MIPS yet, so that's a bit of voodoo to me.
--J
--
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
4096R/D25D95E3 2011-03-28
"The past tempts us, the present confuses us, the future frightens us. And our
lives slip away, moment by moment, lost in that vast, terrible in-between."
--Emperor Turhan, Centauri Republic
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-04 1:08 ` Joshua Kinard
@ 2014-11-04 1:23 ` David Daney
2014-11-04 1:34 ` Joshua Kinard
` (2 more replies)
2014-11-05 13:52 ` Ralf Baechle
1 sibling, 3 replies; 27+ messages in thread
From: David Daney @ 2014-11-04 1:23 UTC (permalink / raw)
To: Joshua Kinard; +Cc: Linux MIPS List
On 11/03/2014 05:08 PM, Joshua Kinard wrote:
> I checked the R10K/R12K manual, and the PageMask register there has bits 24:13
> open for setting a mask value. It looks like these CPUs only support a page
> size from 4KB to 16MB (so a 2MB page size should work w/ transparent
> hugepages). I assume that the R14K on the Octane might be the same (but I
> don't have a manual specific to the R14k, so I don't know). All of the
> remaining bits in that register read 0 and must have 0's written back.
>
> I guess I could find a way to have the kernel trigger a non-fatal oops/dump the
> registers on a bus error and get a look at the cause register to see if that
> sheds any light on things. Doesn't a SIGBUS on MIPS typically mean that an
> address wasn't aligned on a 32-bit boundary? Or could it also mean other things?
>
> I believe that the R10K is largely compatible with the R4K-style TLB setup, but
> Ralf or someone else more knowledgeable in that area will have to verify. Maybe
> the R10k-family CPUs need their own TLB routines, or what currently exists
> needs modifications? I have not tried to understand the whole TLB thing in
> MIPS yet, so that's a bit of voodoo to me.
I haven't checked, but there may be workarounds required in the TLB
management code that are not in place for the huge page case. When the
huge TLB code was developed, we didn't do any testing on R10K. Somebody
should dump the exception handlers and carefully look at the rest of the
huge TLB management code, and check to see that any required workarounds
are in place.
David.
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-04 1:23 ` David Daney
@ 2014-11-04 1:34 ` Joshua Kinard
2014-11-04 1:43 ` David Daney
2014-11-05 9:07 ` Joshua Kinard
2014-11-05 16:09 ` Ralf Baechle
2 siblings, 1 reply; 27+ messages in thread
From: Joshua Kinard @ 2014-11-04 1:34 UTC (permalink / raw)
To: linux-mips
On 11/03/2014 20:23, David Daney wrote:
> I haven't checked, but there may be workarounds required in the TLB management
> code that are not in place for the huge page case. When the huge TLB code was
> developed, we didn't do any testing on R10K. Somebody should dump the
> exception handlers and carefully look at the rest of the huge TLB management
> code, and check to see that any required workarounds are in place.
How does one dump the exception handlers? Is it a debug switch somewhere?
--J
--
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
4096R/D25D95E3 2011-03-28
"The past tempts us, the present confuses us, the future frightens us. And our
lives slip away, moment by moment, lost in that vast, terrible in-between."
--Emperor Turhan, Centauri Republic
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-04 1:34 ` Joshua Kinard
@ 2014-11-04 1:43 ` David Daney
2014-11-04 5:51 ` Joshua Kinard
0 siblings, 1 reply; 27+ messages in thread
From: David Daney @ 2014-11-04 1:43 UTC (permalink / raw)
To: Joshua Kinard; +Cc: linux-mips
On 11/03/2014 05:34 PM, Joshua Kinard wrote:
> On 11/03/2014 20:23, David Daney wrote:
>> On 11/03/2014 05:08 PM, Joshua Kinard wrote:
>>> On 11/03/2014 13:52, David Daney wrote:
>>>> On 11/02/2014 02:53 AM, Joshua Kinard wrote:
>>>>>
>>>>> So I have been testing the Onyx2 I have out the last few days with the IOC3
>>>>> metadriver used on Octane, and I can get it to boot, but if
>>>>> CONFIG_TRANSPARENT_HUGEPAGE is enabled in the kernel, bus errors can happen.
>>>>>
>>>>> If I use CONFIG_PAGE_SIZE_4KB, I get bus errors rather frequently -- running
>>>>> Gentoo's 'emerge' command can produce one. Switch to CONFIG_PAGE_SIZE_16KB,
>>>>> and the bus errors are far less frequent. I suspect CONFIG_PAGE_SIZE_64KB
>>>>> will
>>>>> be even less.
>>>>>
>>>>> Disable CONFIG_TRANSPARENT_HUGEPAGE, and the machine works pretty good. It's
>>>>> been up for almost 8 hours compiling, and not a single bus error yet. It's
>>>>> got
>>>>> 2x node board with dual R12K/400MHz CPUs per node.
>>>>>
>>>>> I'm not really sure what CONFIG_TRANSPARENT_HUGEPAGE is enabling that's
>>>>> causing
>>>>> R12K CPUs on the IP27 such a headache (and on Octane, really screws up R14K
>>>>> CPUs). I tried getting a core dump on one of the bus errors, but that
>>>>> produces a
>>>>> truncated or corrupted core file that actually crashed GDB, plus I get a nice
>>>>> oops message in dmesg:
>>>>
>>>> Well, as its name implies, if you enable CONFIG_TRANSPARENT_HUGEPAGE, huge
>>>> pages will be created and used in the background transparently to the userspace
>>>> application.
>>>>
>>>> With 4KB base page size, the huge pages will be 2MB in size. I don't know
>>>> much about the R10K/R12K/R14K CPUs, but it is possible that either their TLBs
>>>> cannot handle such pages, or that the TLB Exception handlers don't contain
>>>> proper code for these CPUs.
>>>>
>>>> For each doubling of the base PAGE_SIZE, the huge page size will increase by a
>>>> factor of 4. So with 16KB base pages the huge page size would be 32MB. Since
>>>> there are many fewer opportunities to transparently use a 32MB page, I would
>>>> expect any errors related to huge pages to be correspondingly less frequent.
>>>>
>>>> With 64KB PAGE_SIZE the huge page size is 512MB, and it is likely that that
>>>> could never be used by normal userspace programs.
>>>
>>> I checked the R10K/R12K manual, and the PageMask register there has bits 24:13
>>> open for setting a mask value. It looks like these CPUs only support a page
>>> size from 4KB to 16MB (so a 2MB page size should work w/ transparent
>>> hugepages). I assume that the R14K on the Octane might be the same (but I
>>> don't have a manual specific to the R14k, so I don't know). All of the
>>> remaining bits in that register read 0 and must have 0's written back.
>>>
>>> I guess I could find a way to have the kernel trigger a non-fatal oops/dump the
>>> registers on a bus error and get a look at the cause register to see if that
>>> sheds any light on things. Doesn't a SIGBUS on MIPS typically mean that an
>>> address wasn't aligned on a 32-bit boundary? Or could it also mean other
>>> things?
>>>
>>> I believe that the R10K is largely compatible with the R4K-style TLB setup, but
>>> Ralf or someone else more knowledgeable in that area will have to verify. Maybe
>>> the R10k-family CPUs need their own TLB routines, or what currently exists
>>> needs modifications? I have not tried to understand the whole TLB thing in
>>> MIPS yet, so that's a bit of voodoo to me.
>>
>> I haven't checked, but there may be workarounds required in the TLB management
>> code that are not in place for the huge page case. When the huge TLB code was
>> developed, we didn't do any testing on R10K. Somebody should dump the
>> exception handlers and carefully look at the rest of the huge TLB management
>> code, and check to see that any required workarounds are in place.
>
> How does one dump the exception handlers? Is it a debug switch somewhere?
>
Add "#define DEBUG 1" as the very first line of tlbex.c.
Then rebuild, and pass "debug" on the kernel command line.
The output can be fed through gas, and then disassembled with objdump -d.
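As an alternative to assembling the dump with gas, the instruction words can
be pulled straight out of the log. The sketch below is not the gas route
described above; it assumes the DEBUG output carries one ".word 0x........"
directive per instruction (that format is an assumption here, so check it
against your own dmesg). The helper name words_to_binary and the sample log
text are hypothetical. Words are packed big-endian, as on IP27.

```python
import re
import struct

# Matches one 32-bit instruction word per ".word" directive (assumed format).
WORD_RE = re.compile(r"\.word\s+0x([0-9a-fA-F]{8})")


def words_to_binary(log_text: str) -> bytes:
    """Collect every '.word 0x........' value and pack it big-endian."""
    words = [int(m.group(1), 16) for m in WORD_RE.finditer(log_text)]
    return b"".join(struct.pack(">I", w) for w in words)


# Hypothetical log excerpt; the two words are the first two instructions of
# r4000_tlb_load as they appear in the disassembly attached to this thread.
sample = """
[    0.000000] \t.word\t0x403a2000
[    0.000000] \t.word\t0x001ad6ba
"""

blob = words_to_binary(sample)
print(blob.hex())
```

The resulting bytes can be written to a file and handed to a disassembler in
raw-binary mode, which avoids needing a working cross-assembler.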
> --J
>
>
>
>
>> David.
>>
>>
>>>
>>> --J
>>>
>>>
>>>
>>>>> [ 1302.260000] CPU: 0 PID: 1179 Comm: emerge Not tainted
>>>>> 3.17.1-mipsgit-20141006 #57
>>>>> [ 1302.260000] task: a8000000ffbbf288 ti: a8000000fa6f0000 task.ti:
>>>>> a8000000fa6f0000
>>>>> [ 1302.260000] $ 0 : 0000000000000000 0000000000000001 0000000000000000
>>>>> a8000000ff5ad800
>>>>> [ 1302.260000] $ 4 : a8000000006d5480 00000000000f9c00 00000001f380173f
>>>>> a800000001000000
>>>>> [ 1302.260000] $ 8 : 00000001f380173f 0000000000100077 a8000000fe77a000
>>>>> 0000000000000000
>>>>> [ 1302.260000] $12 : 0000000000660000 0000000000000000 0000000000000000
>>>>> 776bc40c00000004
>>>>> [ 1302.260000] $16 : 0000000000e00000 0000000000000000 00000000018ee000
>>>>> 6db6db6db6db6db7
>>>>> [ 1302.260000] $20 : 00000000000000ca a8000000006d5480 a8000000ff65fa68
>>>>> 0000000000001000
>>>>> [ 1302.260000] $24 : 0000000000000000 a8000000000469c0
>>>>> [ 1302.260000] $28 : a8000000fa6f0000 a8000000fa6f3a00 0000000000e00000
>>>>> a800000000046720
>>>>> [ 1302.260000] Hi : 00000000002ed400
>>>>> [ 1302.260000] Lo : 00000000000f9c00
>>>>> [ 1302.260000] epc : a8000000000467e4 r4k_flush_cache_page+0x104/0x2e0
>>>>> [ 1302.260000] Not tainted
>>>>> [ 1302.260000] ra : a800000000046720 r4k_flush_cache_page+0x40/0x2e0
>>>>> [ 1302.260000] Status: 90001ce3 KX SX UX KERNEL EXL IE
>>>>> [ 1302.260000] Cause : 0000c010
>>>>> [ 1302.260000] BadVA : 00000001f380173f
>>>>> [ 1302.260000] PrId : 00000e35 (R12000)
>>>>> [ 1302.260000] Process emerge (pid: 1179, threadinfo=a8000000fa6f0000,
>>>>> task=a8000000ffbbf288, tls=00000000778d2490)
>>>>> [ 1302.260000] Stack : a8000000ff65fa68 0000000000e00000 00000000000f9c00
>>>>> a8000000006d5480
>>>>> a8000000ff65fa68 0000000000001000 0000000000e00000
>>>>> a80000000010cb00
>>>>> a8000000046a2000 a8000000ff65fa68 00000000018ee000
>>>>> 6db6db6db6db6db7
>>>>> a8000000fe7fdce0 a8000000000375ec a8000000ff4e5800
>>>>> a8000000005fbd90
>>>>> 0000000300000080 a8000000ff668580 a8000000005fbd90
>>>>> 5349474900000080
>>>>> a8000000fa6f3ad8 a8000000005fbd90 0000000600000088
>>>>> a8000000ff5ad928
>>>>> a8000000005fbd90 46494c4500002bf9 c000000000101000
>>>>> 0000000a00000080
>>>>> 0000000000000000 0000000000000000 0000000000000000
>>>>> 0000000000000000
>>>>> 0000000000000000 0000000000000000 0000000000000000
>>>>> 0000000000000000
>>>>> 0000000000000000 0000000000000000 0000000000000000
>>>>> 0000000000000000
>>>>> ...
>>>>> [ 1302.260000] Call Trace:
>>>>> [ 1302.260000] [<a8000000000467e4>] r4k_flush_cache_page+0x104/0x2e0
>>>>> [ 1302.260000] [<a80000000010cb00>] get_dump_page+0xc8/0xe8
>>>>> [ 1302.260000] [<a8000000000375ec>] elf_core_dump+0x1294/0x14d8
>>>>> [ 1302.260000] [<a8000000001b41e4>] do_coredump+0x5e4/0x1048
>>>>> [ 1302.260000] [<a80000000005c0b8>] get_signal+0x1b8/0x710
>>>>> [ 1302.260000] [<a8000000000299c0>] do_signal+0x18/0x240
>>>>> [ 1302.260000] [<a80000000002a4c8>] do_notify_resume+0x70/0x88
>>>>> [ 1302.260000] [<a8000000000255ac>] work_notifysig+0x10/0x18
>>>>> [ 1302.260000]
>>>>> [ 1302.260000]
>>>>> Code: 0010327a 30c60ff8 00c8302d <dcc60000> 30c80001 1100003e 00000000
>>>>> bfb40000 df880000
>>>>> [ 1305.340000] ---[ end trace c7649a6433db8d18 ]---
>>>>>
>>>>> Thoughts?
>
>
>
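The huge page sizes quoted above (2MB, 32MB and 512MB for 4KB, 16KB and 64KB
base pages) are consistent with a huge page covering everything that one
last-level page table page can map. A minimal sketch of that arithmetic,
assuming 8-byte PTEs on a 64-bit kernel; the function name is made up for
illustration:

```python
PTE_SIZE = 8  # bytes per page table entry; assumed for a 64-bit kernel


def huge_page_size(page_size: int) -> int:
    """Bytes mapped by one page table page: entries per page * page size."""
    ptes_per_page = page_size // PTE_SIZE
    return ptes_per_page * page_size


for kb in (4, 16, 64):
    size = huge_page_size(kb * 1024)
    print(f"{kb}KB base pages -> {size // (1024 * 1024)}MB huge pages")
```

Doubling PAGE_SIZE doubles both the number of entries per table page and the
size each entry maps, which is where the factor of 4 per doubling comes from.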
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-04 1:43 ` David Daney
@ 2014-11-04 5:51 ` Joshua Kinard
0 siblings, 0 replies; 27+ messages in thread
From: Joshua Kinard @ 2014-11-04 5:51 UTC (permalink / raw)
To: David Daney; +Cc: linux-mips
[-- Attachment #1: Type: text/plain, Size: 8557 bytes --]
On 11/03/2014 20:43, David Daney wrote:
> On 11/03/2014 05:34 PM, Joshua Kinard wrote:
>> On 11/03/2014 20:23, David Daney wrote:
>>> On 11/03/2014 05:08 PM, Joshua Kinard wrote:
>>>> On 11/03/2014 13:52, David Daney wrote:
>>>>> On 11/02/2014 02:53 AM, Joshua Kinard wrote:
>>>>>>
>>>>>> So I have been testing the Onyx2 I have out the last few days with the IOC3
>>>>>> metadriver used on Octane, and I can get it to boot, but if
>>>>>> CONFIG_TRANSPARENT_HUGEPAGE is enabled in the kernel, bus errors can happen.
>>>>>>
>>>>>> If I use CONFIG_PAGE_SIZE_4KB, I get bus errors rather frequently -- running
>>>>>> Gentoo's 'emerge' command can produce one. Switch to
>>>>>> CONFIG_PAGE_SIZE_16KB,
>>>>>> and the bus errors are far less frequent. I suspect CONFIG_PAGE_SIZE_64KB
>>>>>> will
>>>>>> be even less.
>>>>>>
>>>>>> Disable CONFIG_TRANSPARENT_HUGEPAGE, and the machine works pretty good.
>>>>>> It's
>>>>>> been up for almost 8 hours compiling, and not a single bus error yet. It's
>>>>>> got
>>>>>> 2x node board with dual R12K/400MHz CPUs per node.
>>>>>>
>>>>>> I'm not really sure what CONFIG_TRANSPARENT_HUGEPAGE is enabling that's
>>>>>> causing
>>>>>> R12K CPUs on the IP27 such a headache (and on Octane, really screws up R14K
>>>>>> CPUs). I tried getting a core dump on one of the bus errors, but that
>>>>>> produces a
>>>>>> truncated or corrupted core file that actually crashed GDB, plus I get a
>>>>>> nice
>>>>>> oops message in dmesg:
>>>>>
>>>>> Well, as its name implies, if you enable CONFIG_TRANSPARENT_HUGEPAGE, huge
>>>>> pages will be created and used in the background transparently to the
>>>>> userspace
>>>>> application.
>>>>>
>>>>> With 4KB base page size, the huge pages will be 2MB in size. I don't know
>>>>> much about the R10K/R12K/R14K CPUs, but it is possible that either their TLBs
>>>>> cannot handle such pages, or that the TLB Exception handlers don't contain
>>>>> proper code for these CPUs.
>>>>>
>>>>> For each doubling of the base PAGE_SIZE, the huge page size will increase
>>>>> by a factor of 4. So with 16KB base pages the huge page size would be 32MB.
>>>>> Since there are many fewer opportunities to transparently use a 32MB page, I
>>>>> would expect any errors related to huge pages to be correspondingly less
>>>>> frequent.
>>>>>
>>>>> With 64KB PAGE_SIZE the huge page size is 512MB, and it is likely that that
>>>>> could never be used by normal userspace programs.
>>>>
>>>> I checked the R10K/R12K manual, and the PageMask register there has bits 24:13
>>>> open for setting a mask value. It looks like these CPUs only support a page
>>>> size from 4KB to 16MB (so a 2MB page size should work w/ transparent
>>>> hugepages). I assume that the R14K on the Octane might be the same (but I
>>>> don't have a manual specific to the R14k, so I don't know). All of the
>>>> remaining bits in that register read 0 and must have 0's written back.
>>>>
>>>> I guess I could find a way to have the kernel trigger a non-fatal oops/dump
>>>> the
>>>> registers on a bus error and get a look at the cause register to see if that
>>>> sheds any light on things. Doesn't a SIGBUS on MIPS typically mean that an
>>>> address wasn't aligned on a 32-bit boundary? Or could it also mean other
>>>> things?
>>>>
>>>> I believe that the R10K is largely compatible with the R4K-style TLB setup,
>>>> but
>>>> Ralf or someone else more knowledgeable in that area will have to verify. Maybe
>>>> the R10k-family CPUs need their own TLB routines, or what currently exists
>>>> needs modifications? I have not tried to understand the whole TLB thing in
>>>> MIPS yet, so that's a bit of voodoo to me.
>>>
>>> I haven't checked, but there may be workarounds required in the TLB management
>>> code that are not in place for the huge page case. When the huge TLB code was
>>> developed, we didn't do any testing on R10K. Somebody should dump the
>>> exception handlers and carefully look at the rest of the huge TLB management
>>> code, and check to see that any required workarounds are in place.
>>
>> How does one dump the exception handlers? Is it a debug switch somewhere?
>>
>
> Add "#define DEBUG 1" as the very first line of tlbex.c.
>
> Then rebuild, and pass "debug" on the kernel command line.
>
> The output can be fed through gas, and then disassembled with objdump -d.
I had to fiddle with gas a little bit, but that was because I was using a
cross-compiler. Got it to work, though. tlb1-no-transparent-hugepage.dis is
without CONFIG_TRANSPARENT_HUGEPAGE, while tlb2-transparent-hugepage.dis is
with that option enabled. I don't think any of the other patches/hacks I've
added to my IP27 build have affected the output.
I saw that CPU1 through CPU3 also dumped r4000_tlb_refill only, but that looks
to be the same across all of the CPUs, so I only compiled/disassembled the
output from CPU0.
--J
>>>>>> [ 1302.260000] CPU: 0 PID: 1179 Comm: emerge Not tainted
>>>>>> 3.17.1-mipsgit-20141006 #57
>>>>>> [ 1302.260000] task: a8000000ffbbf288 ti: a8000000fa6f0000 task.ti:
>>>>>> a8000000fa6f0000
>>>>>> [ 1302.260000] $ 0 : 0000000000000000 0000000000000001 0000000000000000
>>>>>> a8000000ff5ad800
>>>>>> [ 1302.260000] $ 4 : a8000000006d5480 00000000000f9c00 00000001f380173f
>>>>>> a800000001000000
>>>>>> [ 1302.260000] $ 8 : 00000001f380173f 0000000000100077 a8000000fe77a000
>>>>>> 0000000000000000
>>>>>> [ 1302.260000] $12 : 0000000000660000 0000000000000000 0000000000000000
>>>>>> 776bc40c00000004
>>>>>> [ 1302.260000] $16 : 0000000000e00000 0000000000000000 00000000018ee000
>>>>>> 6db6db6db6db6db7
>>>>>> [ 1302.260000] $20 : 00000000000000ca a8000000006d5480 a8000000ff65fa68
>>>>>> 0000000000001000
>>>>>> [ 1302.260000] $24 : 0000000000000000 a8000000000469c0
>>>>>> [ 1302.260000] $28 : a8000000fa6f0000 a8000000fa6f3a00 0000000000e00000
>>>>>> a800000000046720
>>>>>> [ 1302.260000] Hi : 00000000002ed400
>>>>>> [ 1302.260000] Lo : 00000000000f9c00
>>>>>> [ 1302.260000] epc : a8000000000467e4 r4k_flush_cache_page+0x104/0x2e0
>>>>>> [ 1302.260000] Not tainted
>>>>>> [ 1302.260000] ra : a800000000046720 r4k_flush_cache_page+0x40/0x2e0
>>>>>> [ 1302.260000] Status: 90001ce3 KX SX UX KERNEL EXL IE
>>>>>> [ 1302.260000] Cause : 0000c010
>>>>>> [ 1302.260000] BadVA : 00000001f380173f
>>>>>> [ 1302.260000] PrId : 00000e35 (R12000)
>>>>>> [ 1302.260000] Process emerge (pid: 1179, threadinfo=a8000000fa6f0000,
>>>>>> task=a8000000ffbbf288, tls=00000000778d2490)
>>>>>> [ 1302.260000] Stack : a8000000ff65fa68 0000000000e00000 00000000000f9c00
>>>>>> a8000000006d5480
>>>>>> a8000000ff65fa68 0000000000001000 0000000000e00000
>>>>>> a80000000010cb00
>>>>>> a8000000046a2000 a8000000ff65fa68 00000000018ee000
>>>>>> 6db6db6db6db6db7
>>>>>> a8000000fe7fdce0 a8000000000375ec a8000000ff4e5800
>>>>>> a8000000005fbd90
>>>>>> 0000000300000080 a8000000ff668580 a8000000005fbd90
>>>>>> 5349474900000080
>>>>>> a8000000fa6f3ad8 a8000000005fbd90 0000000600000088
>>>>>> a8000000ff5ad928
>>>>>> a8000000005fbd90 46494c4500002bf9 c000000000101000
>>>>>> 0000000a00000080
>>>>>> 0000000000000000 0000000000000000 0000000000000000
>>>>>> 0000000000000000
>>>>>> 0000000000000000 0000000000000000 0000000000000000
>>>>>> 0000000000000000
>>>>>> 0000000000000000 0000000000000000 0000000000000000
>>>>>> 0000000000000000
>>>>>> ...
>>>>>> [ 1302.260000] Call Trace:
>>>>>> [ 1302.260000] [<a8000000000467e4>] r4k_flush_cache_page+0x104/0x2e0
>>>>>> [ 1302.260000] [<a80000000010cb00>] get_dump_page+0xc8/0xe8
>>>>>> [ 1302.260000] [<a8000000000375ec>] elf_core_dump+0x1294/0x14d8
>>>>>> [ 1302.260000] [<a8000000001b41e4>] do_coredump+0x5e4/0x1048
>>>>>> [ 1302.260000] [<a80000000005c0b8>] get_signal+0x1b8/0x710
>>>>>> [ 1302.260000] [<a8000000000299c0>] do_signal+0x18/0x240
>>>>>> [ 1302.260000] [<a80000000002a4c8>] do_notify_resume+0x70/0x88
>>>>>> [ 1302.260000] [<a8000000000255ac>] work_notifysig+0x10/0x18
>>>>>> [ 1302.260000]
>>>>>> [ 1302.260000]
>>>>>> Code: 0010327a 30c60ff8 00c8302d <dcc60000> 30c80001 1100003e 00000000
>>>>>> bfb40000 df880000
>>>>>> [ 1305.340000] ---[ end trace c7649a6433db8d18 ]---
>>>>>>
>>>>>> Thoughts?
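On the question of what the cause register says: the oops quoted above
already includes Cause, BadVA and the faulting instruction word (the one in
angle brackets on the Code: line). A small sketch that decodes them, using
the standard MIPS layouts (ExcCode in Cause bits 6:2, I-type instruction
fields). Only the two address-error codes are named, and the helper names are
hypothetical:

```python
# Only the two address-error codes are listed; everything else prints "other".
EXC_NAMES = {
    4: "AdEL (address error, load or fetch)",
    5: "AdES (address error, store)",
}


def exc_code(cause: int) -> int:
    """Extract ExcCode from Cause bits 6:2."""
    return (cause >> 2) & 0x1F


def decode_itype(insn: int):
    """Split an I-type instruction into (opcode, rs, rt, signed imm)."""
    imm = insn & 0xFFFF
    if imm & 0x8000:
        imm -= 0x10000
    return insn >> 26, (insn >> 21) & 0x1F, (insn >> 16) & 0x1F, imm


# Values copied from the oops in this thread.
cause = 0x0000C010
badva = 0x00000001F380173F
insn = 0xDCC60000

code = exc_code(cause)
opcode, rs, rt, imm = decode_itype(insn)
print(f"ExcCode {code}: {EXC_NAMES.get(code, 'other')}")
print(f"opcode {opcode:#x} rs=${rs} rt=${rt} imm={imm}")
print(f"BadVA 8-byte aligned: {badva % 8 == 0}")
```

Opcode 0x37 is ld, so the faulting instruction is a 64-bit load through $6,
and $6 in the register dump holds the same value as BadVA. That address is
not doubleword-aligned, which fits an address error on load. Note that this
oops describes a fault taken while writing the core dump, which may or may
not share a cause with the SIGBUS that triggered the dump.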
--
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
4096R/D25D95E3 2011-03-28
"The past tempts us, the present confuses us, the future frightens us. And our
lives slip away, moment by moment, lost in that vast, terrible in-between."
--Emperor Turhan, Centauri Republic
[-- Attachment #2: tlb1-no-transparent-hugepage.dis --]
[-- Type: text/plain, Size: 7693 bytes --]
tlb1: file format elf64-tradbigmips
Disassembly of section .text:
0000000000000000 <tlbmiss_handler>:
0: 40252000 dmfc0 a1,$4
4: 00052dfa dsrl a1,a1,0x17
8: 3c06a800 lui a2,0xa800
c: 00063438 dsll a2,a2,0x10
10: 64c60091 daddiu a2,a2,145
14: 00063438 dsll a2,a2,0x10
18: 00c5302d daddu a2,a2,a1
1c: fcc44100 sd a0,16640(a2)
20: 03e00008 jr ra
24: 00000000 nop
...
0000000000000048 <r4000_tlb_load>:
48: 403a2000 dmfc0 k0,$4
4c: 001ad6ba dsrl k0,k0,0x1a
50: 001ad1f8 dsll k0,k0,0x7
54: 3c1ba800 lui k1,0xa800
58: 001bdc38 dsll k1,k1,0x10
5c: 677b0091 daddiu k1,k1,145
60: 001bdc38 dsll k1,k1,0x10
64: 677b5500 daddiu k1,k1,21760
68: 035bd02d daddu k0,k0,k1
6c: ff410000 sd at,0(k0)
70: ff420008 sd v0,8(k0)
74: 403b4000 dmfc0 k1,$8
78: 001b0abe dsrl32 at,k1,0xa
7c: 14200026 bnez at,118 <r4000_tlb_load+0xd0>
80: 40212000 dmfc0 at,$4
84: 00010dfa dsrl at,at,0x17
88: 3c1ba800 lui k1,0xa800
8c: 001bdc38 dsll k1,k1,0x10
90: 677b0091 daddiu k1,k1,145
94: 001bdc38 dsll k1,k1,0x10
98: 003b082d daddu at,at,k1
9c: 403b4000 dmfc0 k1,$8
a0: dc214100 ld at,16640(at)
a4: 001bdeba dsrl k1,k1,0x1a
a8: 337bfff8 andi k1,k1,0xfff8
ac: 003b082d daddu at,at,k1
b0: 403b4000 dmfc0 k1,$8
b4: dc210000 ld at,0(at)
b8: 001bdb7a dsrl k1,k1,0xd
bc: 337bfff8 andi k1,k1,0xfff8
c0: 003b082d daddu at,at,k1
c4: d03b0000 lld k1,0(at)
c8: 42000008 tlbp
cc: 33620003 andi v0,k1,0x3
d0: 38420003 xori v0,v0,0x3
d4: 14400016 bnez v0,130 <r4000_tlb_load+0xe8>
d8: 377b0048 ori k1,k1,0x48
dc: f03b0000 scd k1,0(at)
e0: 5360fff8 beqzl k1,c4 <r4000_tlb_load+0x7c>
e4: 00000000 nop
e8: 34210008 ori at,at,0x8
ec: 38210008 xori at,at,0x8
f0: dc3b0000 ld k1,0(at)
f4: dc210008 ld at,8(at)
f8: 001bd97a dsrl k1,k1,0x5
fc: 40bb1000 dmtc0 k1,$2
100: 0001097a dsrl at,at,0x5
104: 40a11800 dmtc0 at,$3
108: 42000002 tlbwi
10c: df410000 ld at,0(k0)
110: df420008 ld v0,8(k0)
114: 42000018 eret
118: 3c01a800 lui at,0xa800
11c: 00010c38 dsll at,at,0x10
120: 6421008f daddiu at,at,143
124: 00010c38 dsll at,at,0x10
128: 1000ffde b a4 <r4000_tlb_load+0x5c>
12c: 64210000 daddiu at,at,0
130: df410000 ld at,0(k0)
134: df420008 ld v0,8(k0)
138: 08010cc8 j 43320 <r4000_tlb_refill+0x42cd8>
13c: 00000000 nop
...
0000000000000248 <r4000_tlb_store>:
248: 403a2000 dmfc0 k0,$4
24c: 001ad6ba dsrl k0,k0,0x1a
250: 001ad1f8 dsll k0,k0,0x7
254: 3c1ba800 lui k1,0xa800
258: 001bdc38 dsll k1,k1,0x10
25c: 677b0091 daddiu k1,k1,145
260: 001bdc38 dsll k1,k1,0x10
264: 677b5500 daddiu k1,k1,21760
268: 035bd02d daddu k0,k0,k1
26c: ff410000 sd at,0(k0)
270: ff420008 sd v0,8(k0)
274: 403b4000 dmfc0 k1,$8
278: 001b0abe dsrl32 at,k1,0xa
27c: 14200027 bnez at,31c <r4000_tlb_store+0xd4>
280: 40212000 dmfc0 at,$4
284: 00010dfa dsrl at,at,0x17
288: 3c1ba800 lui k1,0xa800
28c: 001bdc38 dsll k1,k1,0x10
290: 677b0091 daddiu k1,k1,145
294: 001bdc38 dsll k1,k1,0x10
298: 003b082d daddu at,at,k1
29c: 403b4000 dmfc0 k1,$8
2a0: dc214100 ld at,16640(at)
2a4: 001bdeba dsrl k1,k1,0x1a
2a8: 337bfff8 andi k1,k1,0xfff8
2ac: 003b082d daddu at,at,k1
2b0: 403b4000 dmfc0 k1,$8
2b4: dc210000 ld at,0(at)
2b8: 001bdb7a dsrl k1,k1,0xd
2bc: 337bfff8 andi k1,k1,0xfff8
2c0: 003b082d daddu at,at,k1
2c4: d03b0000 lld k1,0(at)
2c8: 42000008 tlbp
2cc: 33620005 andi v0,k1,0x5
2d0: 38420005 xori v0,v0,0x5
2d4: 14400017 bnez v0,334 <r4000_tlb_store+0xec>
2d8: 00000000 nop
2dc: 377b00d8 ori k1,k1,0xd8
2e0: f03b0000 scd k1,0(at)
2e4: 5360fff7 beqzl k1,2c4 <r4000_tlb_store+0x7c>
2e8: 00000000 nop
2ec: 34210008 ori at,at,0x8
2f0: 38210008 xori at,at,0x8
2f4: dc3b0000 ld k1,0(at)
2f8: dc210008 ld at,8(at)
2fc: 001bd97a dsrl k1,k1,0x5
300: 40bb1000 dmtc0 k1,$2
304: 0001097a dsrl at,at,0x5
308: 40a11800 dmtc0 at,$3
30c: 42000002 tlbwi
310: df410000 ld at,0(k0)
314: df420008 ld v0,8(k0)
318: 42000018 eret
31c: 3c01a800 lui at,0xa800
320: 00010c38 dsll at,at,0x10
324: 6421008f daddiu at,at,143
328: 00010c38 dsll at,at,0x10
32c: 1000ffdd b 2a4 <r4000_tlb_store+0x5c>
330: 64210000 daddiu at,at,0
334: df410000 ld at,0(k0)
338: df420008 ld v0,8(k0)
33c: 08010d13 j 4344c <r4000_tlb_refill+0x42e04>
340: 00000000 nop
...
0000000000000448 <r4000_tlb_modify>:
448: 403a2000 dmfc0 k0,$4
44c: 001ad6ba dsrl k0,k0,0x1a
450: 001ad1f8 dsll k0,k0,0x7
454: 3c1ba800 lui k1,0xa800
458: 001bdc38 dsll k1,k1,0x10
45c: 677b0091 daddiu k1,k1,145
460: 001bdc38 dsll k1,k1,0x10
464: 677b5500 daddiu k1,k1,21760
468: 035bd02d daddu k0,k0,k1
46c: ff410000 sd at,0(k0)
470: ff420008 sd v0,8(k0)
474: 403b4000 dmfc0 k1,$8
478: 001b0abe dsrl32 at,k1,0xa
47c: 14200025 bnez at,514 <r4000_tlb_modify+0xcc>
480: 40212000 dmfc0 at,$4
484: 00010dfa dsrl at,at,0x17
488: 3c1ba800 lui k1,0xa800
48c: 001bdc38 dsll k1,k1,0x10
490: 677b0091 daddiu k1,k1,145
494: 001bdc38 dsll k1,k1,0x10
498: 003b082d daddu at,at,k1
49c: 403b4000 dmfc0 k1,$8
4a0: dc214100 ld at,16640(at)
4a4: 001bdeba dsrl k1,k1,0x1a
4a8: 337bfff8 andi k1,k1,0xfff8
4ac: 003b082d daddu at,at,k1
4b0: 403b4000 dmfc0 k1,$8
4b4: dc210000 ld at,0(at)
4b8: 001bdb7a dsrl k1,k1,0xd
4bc: 337bfff8 andi k1,k1,0xfff8
4c0: 003b082d daddu at,at,k1
4c4: d03b0000 lld k1,0(at)
4c8: 42000008 tlbp
4cc: 33620004 andi v0,k1,0x4
4d0: 10400016 beqz v0,52c <r4000_tlb_modify+0xe4>
4d4: 377b00d8 ori k1,k1,0xd8
4d8: f03b0000 scd k1,0(at)
4dc: 5360fff9 beqzl k1,4c4 <r4000_tlb_modify+0x7c>
4e0: 00000000 nop
4e4: 34210008 ori at,at,0x8
4e8: 38210008 xori at,at,0x8
4ec: dc3b0000 ld k1,0(at)
4f0: dc210008 ld at,8(at)
4f4: 001bd97a dsrl k1,k1,0x5
4f8: 40bb1000 dmtc0 k1,$2
4fc: 0001097a dsrl at,at,0x5
500: 40a11800 dmtc0 at,$3
504: 42000002 tlbwi
508: df410000 ld at,0(k0)
50c: df420008 ld v0,8(k0)
510: 42000018 eret
514: 3c01a800 lui at,0xa800
518: 00010c38 dsll at,at,0x10
51c: 6421008f daddiu at,at,143
520: 00010c38 dsll at,at,0x10
524: 1000ffdf b 4a4 <r4000_tlb_modify+0x5c>
528: 64210000 daddiu at,at,0
52c: df410000 ld at,0(k0)
530: df420008 ld v0,8(k0)
534: 08010d13 j 4344c <r4000_tlb_refill+0x42e04>
538: 00000000 nop
...
0000000000000648 <r4000_tlb_refill>:
648: 07410006 bgez k0,664 <r4000_tlb_refill+0x1c>
64c: 3c1ba800 lui k1,0xa800
650: 001bdc38 dsll k1,k1,0x10
654: 677b008f daddiu k1,k1,143
658: 001bdc38 dsll k1,k1,0x10
65c: 10000026 b 6f8 <r4000_tlb_refill+0xb0>
660: 677b0000 daddiu k1,k1,0
664: 3c1ba800 lui k1,0xa800
668: 001bdc38 dsll k1,k1,0x10
66c: 677b0004 daddiu k1,k1,4
670: 001bdc38 dsll k1,k1,0x10
674: 677b3320 daddiu k1,k1,13088
678: 03600008 jr k1
67c: 00000000 nop
...
6c8: 403a4000 dmfc0 k0,$8
6cc: 001adabe dsrl32 k1,k0,0xa
6d0: 1760ffdd bnez k1,648 <r4000_tlb_refill>
6d4: 403b2000 dmfc0 k1,$4
6d8: 001bddfa dsrl k1,k1,0x17
6dc: 3c1aa800 lui k0,0xa800
6e0: 001ad438 dsll k0,k0,0x10
6e4: 675a0091 daddiu k0,k0,145
6e8: 001ad438 dsll k0,k0,0x10
6ec: 037ad82d daddu k1,k1,k0
6f0: 403a4000 dmfc0 k0,$8
6f4: df7b4100 ld k1,16640(k1)
6f8: 001ad6ba dsrl k0,k0,0x1a
6fc: 335afff8 andi k0,k0,0xfff8
700: 037ad82d daddu k1,k1,k0
704: 403aa000 dmfc0 k0,$20
708: df7b0000 ld k1,0(k1)
70c: 001ad13a dsrl k0,k0,0x4
710: 335afff0 andi k0,k0,0xfff0
714: 037ad82d daddu k1,k1,k0
718: df7a0000 ld k0,0(k1)
71c: df7b0008 ld k1,8(k1)
720: 001ad17a dsrl k0,k0,0x5
724: 40ba1000 dmtc0 k0,$2
728: 001bd97a dsrl k1,k1,0x5
72c: 40bb1800 dmtc0 k1,$3
730: 42000006 tlbwr
734: 42000018 eret
...
[-- Attachment #3: tlb2-transparent-hugepage.dis --]
[-- Type: text/plain, Size: 10634 bytes --]
tlb2: file format elf64-tradbigmips
Disassembly of section .text:
0000000000000000 <tlbmiss_handler>:
0: 40252000 dmfc0 a1,$4
4: 00052dfa dsrl a1,a1,0x17
8: 3c06a800 lui a2,0xa800
c: 00063438 dsll a2,a2,0x10
10: 64c60092 daddiu a2,a2,146
14: 00063438 dsll a2,a2,0x10
18: 00c5302d daddu a2,a2,a1
1c: fcc44100 sd a0,16640(a2)
20: 03e00008 jr ra
24: 00000000 nop
...
0000000000000048 <r4000_tlb_load>:
48: 403a2000 dmfc0 k0,$4
4c: 001ad6ba dsrl k0,k0,0x1a
50: 001ad1f8 dsll k0,k0,0x7
54: 3c1ba800 lui k1,0xa800
58: 001bdc38 dsll k1,k1,0x10
5c: 677b0092 daddiu k1,k1,146
60: 001bdc38 dsll k1,k1,0x10
64: 677b5500 daddiu k1,k1,21760
68: 035bd02d daddu k0,k0,k1
6c: ff410000 sd at,0(k0)
70: ff420008 sd v0,8(k0)
74: 403b4000 dmfc0 k1,$8
78: 001b0abe dsrl32 at,k1,0xa
7c: 14200029 bnez at,124 <r4000_tlb_load+0xdc>
80: 40212000 dmfc0 at,$4
84: 00010dfa dsrl at,at,0x17
88: 3c1ba800 lui k1,0xa800
8c: 001bdc38 dsll k1,k1,0x10
90: 677b0092 daddiu k1,k1,146
94: 001bdc38 dsll k1,k1,0x10
98: 003b082d daddu at,at,k1
9c: 403b4000 dmfc0 k1,$8
a0: dc214100 ld at,16640(at)
a4: 001bdeba dsrl k1,k1,0x1a
a8: 337bfff8 andi k1,k1,0xfff8
ac: 003b082d daddu at,at,k1
b0: dc3b0000 ld k1,0(at)
b4: 337b0020 andi k1,k1,0x20
b8: 17600020 bnez k1,13c <r4000_tlb_load+0xf4>
bc: 403b4000 dmfc0 k1,$8
c0: dc210000 ld at,0(at)
c4: 001bdb7a dsrl k1,k1,0xd
c8: 337bfff8 andi k1,k1,0xfff8
cc: 003b082d daddu at,at,k1
d0: d03b0000 lld k1,0(at)
d4: 42000008 tlbp
d8: 33620003 andi v0,k1,0x3
dc: 38420003 xori v0,v0,0x3
e0: 1440002c bnez v0,194 <r4000_tlb_load+0x14c>
e4: 377b0108 ori k1,k1,0x108
e8: f03b0000 scd k1,0(at)
ec: 5360fff8 beqzl k1,d0 <r4000_tlb_load+0x88>
f0: 00000000 nop
f4: 34210008 ori at,at,0x8
f8: 38210008 xori at,at,0x8
fc: dc3b0000 ld k1,0(at)
100: dc210008 ld at,8(at)
104: 001bd9fa dsrl k1,k1,0x7
108: 40bb1000 dmtc0 k1,$2
10c: 000109fa dsrl at,at,0x7
110: 40a11800 dmtc0 at,$3
114: 42000002 tlbwi
118: df410000 ld at,0(k0)
11c: df420008 ld v0,8(k0)
120: 42000018 eret
124: 3c01a800 lui at,0xa800
128: 00010c38 dsll at,at,0x10
12c: 64210090 daddiu at,at,144
130: 00010c38 dsll at,at,0x10
134: 1000ffdb b a4 <r4000_tlb_load+0x5c>
138: 64210000 daddiu at,at,0
13c: d03b0000 lld k1,0(at)
140: 33620003 andi v0,k1,0x3
144: 38420003 xori v0,v0,0x3
148: 14400012 bnez v0,194 <r4000_tlb_load+0x14c>
14c: 42000008 tlbp
150: 377b0108 ori k1,k1,0x108
154: f03b0000 scd k1,0(at)
158: 1360fff8 beqz k1,13c <r4000_tlb_load+0xf4>
15c: dc3b0000 ld k1,0(at)
160: 3c010040 lui at,0x40
164: 001bd9fa dsrl k1,k1,0x7
168: 40bb1000 dmtc0 k1,$2
16c: 0361d82d daddu k1,k1,at
170: 40bb1800 dmtc0 k1,$3
174: 3c1b1fff lui k1,0x1fff
178: 377be000 ori k1,k1,0xe000
17c: 409b2800 mtc0 k1,$5
180: 42000002 tlbwi
184: 3c1b0001 lui k1,0x1
188: 377be000 ori k1,k1,0xe000
18c: 1000ffe2 b 118 <r4000_tlb_load+0xd0>
190: 409b2800 mtc0 k1,$5
194: df410000 ld at,0(k0)
198: df420008 ld v0,8(k0)
19c: 08010f14 j 43c50 <r4000_tlb_refill+0x43608>
1a0: 00000000 nop
...
0000000000000248 <r4000_tlb_store>:
248: 403a2000 dmfc0 k0,$4
24c: 001ad6ba dsrl k0,k0,0x1a
250: 001ad1f8 dsll k0,k0,0x7
254: 3c1ba800 lui k1,0xa800
258: 001bdc38 dsll k1,k1,0x10
25c: 677b0092 daddiu k1,k1,146
260: 001bdc38 dsll k1,k1,0x10
264: 677b5500 daddiu k1,k1,21760
268: 035bd02d daddu k0,k0,k1
26c: ff410000 sd at,0(k0)
270: ff420008 sd v0,8(k0)
274: 403b4000 dmfc0 k1,$8
278: 001b0abe dsrl32 at,k1,0xa
27c: 1420002a bnez at,328 <r4000_tlb_store+0xe0>
280: 40212000 dmfc0 at,$4
284: 00010dfa dsrl at,at,0x17
288: 3c1ba800 lui k1,0xa800
28c: 001bdc38 dsll k1,k1,0x10
290: 677b0092 daddiu k1,k1,146
294: 001bdc38 dsll k1,k1,0x10
298: 003b082d daddu at,at,k1
29c: 403b4000 dmfc0 k1,$8
2a0: dc214100 ld at,16640(at)
2a4: 001bdeba dsrl k1,k1,0x1a
2a8: 337bfff8 andi k1,k1,0xfff8
2ac: 003b082d daddu at,at,k1
2b0: dc3b0000 ld k1,0(at)
2b4: 337b0020 andi k1,k1,0x20
2b8: 17600021 bnez k1,340 <r4000_tlb_store+0xf8>
2bc: 403b4000 dmfc0 k1,$8
2c0: dc210000 ld at,0(at)
2c4: 001bdb7a dsrl k1,k1,0xd
2c8: 337bfff8 andi k1,k1,0xfff8
2cc: 003b082d daddu at,at,k1
2d0: d03b0000 lld k1,0(at)
2d4: 42000008 tlbp
2d8: 33620005 andi v0,k1,0x5
2dc: 38420005 xori v0,v0,0x5
2e0: 1440002e bnez v0,39c <r4000_tlb_store+0x154>
2e4: 00000000 nop
2e8: 377b0318 ori k1,k1,0x318
2ec: f03b0000 scd k1,0(at)
2f0: 5360fff7 beqzl k1,2d0 <r4000_tlb_store+0x88>
2f4: 00000000 nop
2f8: 34210008 ori at,at,0x8
2fc: 38210008 xori at,at,0x8
300: dc3b0000 ld k1,0(at)
304: dc210008 ld at,8(at)
308: 001bd9fa dsrl k1,k1,0x7
30c: 40bb1000 dmtc0 k1,$2
310: 000109fa dsrl at,at,0x7
314: 40a11800 dmtc0 at,$3
318: 42000002 tlbwi
31c: df410000 ld at,0(k0)
320: df420008 ld v0,8(k0)
324: 42000018 eret
328: 3c01a800 lui at,0xa800
32c: 00010c38 dsll at,at,0x10
330: 64210090 daddiu at,at,144
334: 00010c38 dsll at,at,0x10
338: 1000ffda b 2a4 <r4000_tlb_store+0x5c>
33c: 64210000 daddiu at,at,0
340: d03b0000 lld k1,0(at)
344: 33620005 andi v0,k1,0x5
348: 38420005 xori v0,v0,0x5
34c: 14400013 bnez v0,39c <r4000_tlb_store+0x154>
350: 00000000 nop
354: 42000008 tlbp
358: 377b0318 ori k1,k1,0x318
35c: f03b0000 scd k1,0(at)
360: 1360fff7 beqz k1,340 <r4000_tlb_store+0xf8>
364: dc3b0000 ld k1,0(at)
368: 3c010040 lui at,0x40
36c: 001bd9fa dsrl k1,k1,0x7
370: 40bb1000 dmtc0 k1,$2
374: 0361d82d daddu k1,k1,at
378: 40bb1800 dmtc0 k1,$3
37c: 3c1b1fff lui k1,0x1fff
380: 377be000 ori k1,k1,0xe000
384: 409b2800 mtc0 k1,$5
388: 42000002 tlbwi
38c: 3c1b0001 lui k1,0x1
390: 377be000 ori k1,k1,0xe000
394: 1000ffe1 b 31c <r4000_tlb_store+0xd4>
398: 409b2800 mtc0 k1,$5
39c: df410000 ld at,0(k0)
3a0: df420008 ld v0,8(k0)
3a4: 08010f5f j 43d7c <r4000_tlb_refill+0x43734>
3a8: 00000000 nop
...
0000000000000448 <r4000_tlb_modify>:
448: 403a2000 dmfc0 k0,$4
44c: 001ad6ba dsrl k0,k0,0x1a
450: 001ad1f8 dsll k0,k0,0x7
454: 3c1ba800 lui k1,0xa800
458: 001bdc38 dsll k1,k1,0x10
45c: 677b0092 daddiu k1,k1,146
460: 001bdc38 dsll k1,k1,0x10
464: 677b5500 daddiu k1,k1,21760
468: 035bd02d daddu k0,k0,k1
46c: ff410000 sd at,0(k0)
470: ff420008 sd v0,8(k0)
474: 403b4000 dmfc0 k1,$8
478: 001b0abe dsrl32 at,k1,0xa
47c: 14200028 bnez at,520 <r4000_tlb_modify+0xd8>
480: 40212000 dmfc0 at,$4
484: 00010dfa dsrl at,at,0x17
488: 3c1ba800 lui k1,0xa800
48c: 001bdc38 dsll k1,k1,0x10
490: 677b0092 daddiu k1,k1,146
494: 001bdc38 dsll k1,k1,0x10
498: 003b082d daddu at,at,k1
49c: 403b4000 dmfc0 k1,$8
4a0: dc214100 ld at,16640(at)
4a4: 001bdeba dsrl k1,k1,0x1a
4a8: 337bfff8 andi k1,k1,0xfff8
4ac: 003b082d daddu at,at,k1
4b0: dc3b0000 ld k1,0(at)
4b4: 337b0020 andi k1,k1,0x20
4b8: 1760001f bnez k1,538 <r4000_tlb_modify+0xf0>
4bc: 403b4000 dmfc0 k1,$8
4c0: dc210000 ld at,0(at)
4c4: 001bdb7a dsrl k1,k1,0xd
4c8: 337bfff8 andi k1,k1,0xfff8
4cc: 003b082d daddu at,at,k1
4d0: d03b0000 lld k1,0(at)
4d4: 42000008 tlbp
4d8: 33620004 andi v0,k1,0x4
4dc: 1040002b beqz v0,58c <r4000_tlb_modify+0x144>
4e0: 377b0318 ori k1,k1,0x318
4e4: f03b0000 scd k1,0(at)
4e8: 5360fff9 beqzl k1,4d0 <r4000_tlb_modify+0x88>
4ec: 00000000 nop
4f0: 34210008 ori at,at,0x8
4f4: 38210008 xori at,at,0x8
4f8: dc3b0000 ld k1,0(at)
4fc: dc210008 ld at,8(at)
500: 001bd9fa dsrl k1,k1,0x7
504: 40bb1000 dmtc0 k1,$2
508: 000109fa dsrl at,at,0x7
50c: 40a11800 dmtc0 at,$3
510: 42000002 tlbwi
514: df410000 ld at,0(k0)
518: df420008 ld v0,8(k0)
51c: 42000018 eret
520: 3c01a800 lui at,0xa800
524: 00010c38 dsll at,at,0x10
528: 64210090 daddiu at,at,144
52c: 00010c38 dsll at,at,0x10
530: 1000ffdc b 4a4 <r4000_tlb_modify+0x5c>
534: 64210000 daddiu at,at,0
538: d03b0000 lld k1,0(at)
53c: 33620004 andi v0,k1,0x4
540: 10400012 beqz v0,58c <r4000_tlb_modify+0x144>
544: 42000008 tlbp
548: 377b0318 ori k1,k1,0x318
54c: f03b0000 scd k1,0(at)
550: 1360fff9 beqz k1,538 <r4000_tlb_modify+0xf0>
554: dc3b0000 ld k1,0(at)
558: 3c010040 lui at,0x40
55c: 001bd9fa dsrl k1,k1,0x7
560: 40bb1000 dmtc0 k1,$2
564: 0361d82d daddu k1,k1,at
568: 40bb1800 dmtc0 k1,$3
56c: 3c1b1fff lui k1,0x1fff
570: 377be000 ori k1,k1,0xe000
574: 409b2800 mtc0 k1,$5
578: 42000002 tlbwi
57c: 3c1b0001 lui k1,0x1
580: 377be000 ori k1,k1,0xe000
584: 1000ffe3 b 514 <r4000_tlb_modify+0xcc>
588: 409b2800 mtc0 k1,$5
58c: df410000 ld at,0(k0)
590: df420008 ld v0,8(k0)
594: 08010f5f j 43d7c <r4000_tlb_refill+0x43734>
598: 00000000 nop
...
0000000000000648 <r4000_tlb_refill>:
648: df7a0000 ld k0,0(k1)
64c: 3c1b0040 lui k1,0x40
650: 001ad1fa dsrl k0,k0,0x7
654: 40ba1000 dmtc0 k0,$2
658: 035bd02d daddu k0,k0,k1
65c: 40ba1800 dmtc0 k0,$3
660: 3c1a1fff lui k0,0x1fff
664: 375ae000 ori k0,k0,0xe000
668: 409a2800 mtc0 k0,$5
66c: 42000006 tlbwr
670: 3c1a0001 lui k0,0x1
674: 375ae000 ori k0,k0,0xe000
678: 10000031 b 740 <r4000_tlb_refill+0xf8>
67c: 409a2800 mtc0 k0,$5
680: 07410006 bgez k0,69c <r4000_tlb_refill+0x54>
684: 3c1ba800 lui k1,0xa800
688: 001bdc38 dsll k1,k1,0x10
68c: 677b0090 daddiu k1,k1,144
690: 001bdc38 dsll k1,k1,0x10
694: 10000018 b 6f8 <r4000_tlb_refill+0xb0>
698: 677b0000 daddiu k1,k1,0
69c: 3c1ba800 lui k1,0xa800
6a0: 001bdc38 dsll k1,k1,0x10
6a4: 677b0004 daddiu k1,k1,4
6a8: 001bdc38 dsll k1,k1,0x10
6ac: 677b3c50 daddiu k1,k1,15440
6b0: 03600008 jr k1
6b4: 00000000 nop
...
6c8: 403a4000 dmfc0 k0,$8
6cc: 001adabe dsrl32 k1,k0,0xa
6d0: 1760ffeb bnez k1,680 <r4000_tlb_refill+0x38>
6d4: 403b2000 dmfc0 k1,$4
6d8: 001bddfa dsrl k1,k1,0x17
6dc: 3c1aa800 lui k0,0xa800
6e0: 001ad438 dsll k0,k0,0x10
6e4: 675a0092 daddiu k0,k0,146
6e8: 001ad438 dsll k0,k0,0x10
6ec: 037ad82d daddu k1,k1,k0
6f0: 403a4000 dmfc0 k0,$8
6f4: df7b4100 ld k1,16640(k1)
6f8: 001ad6ba dsrl k0,k0,0x1a
6fc: 335afff8 andi k0,k0,0xfff8
700: 037ad82d daddu k1,k1,k0
704: df7a0000 ld k0,0(k1)
708: 335a0020 andi k0,k0,0x20
70c: 1740ffce bnez k0,648 <r4000_tlb_refill>
710: 403aa000 dmfc0 k0,$20
714: df7b0000 ld k1,0(k1)
718: 001ad13a dsrl k0,k0,0x4
71c: 335afff0 andi k0,k0,0xfff0
720: 037ad82d daddu k1,k1,k0
724: df7a0000 ld k0,0(k1)
728: df7b0008 ld k1,8(k1)
72c: 001ad1fa dsrl k0,k0,0x7
730: 40ba1000 dmtc0 k0,$2
734: 001bd9fa dsrl k1,k1,0x7
738: 40bb1800 dmtc0 k1,$3
73c: 42000006 tlbwr
740: 42000018 eret
...
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-04 1:23 ` David Daney
2014-11-04 1:34 ` Joshua Kinard
@ 2014-11-05 9:07 ` Joshua Kinard
2014-11-05 10:21 ` Ralf Baechle
2014-11-05 16:09 ` Ralf Baechle
2 siblings, 1 reply; 27+ messages in thread
From: Joshua Kinard @ 2014-11-05 9:07 UTC (permalink / raw)
To: linux-mips; +Cc: Ralf Baechle
On 11/03/2014 20:23, David Daney wrote:
> On 11/03/2014 05:08 PM, Joshua Kinard wrote:
>> On 11/03/2014 13:52, David Daney wrote:
>>> On 11/02/2014 02:53 AM, Joshua Kinard wrote:
>>>>
>>>> So I have been testing the Onyx2 I have out the last few days with the IOC3
>>>> metadriver used on Octane, and I can get it to boot, but if
>>>> CONFIG_TRANSPARENT_HUGEPAGE is enabled in the kernel, bus errors can happen.
>>>>
>>>> If I use CONFIG_PAGE_SIZE_4KB, I get bus errors rather frequently -- running
>>>> Gentoo's 'emerge' command can produce one. Switch to CONFIG_PAGE_SIZE_16KB,
>>>> and the bus errors are far less frequent. I suspect CONFIG_PAGE_SIZE_64KB
>>>> will
>>>> be even less.
>>>>
>>>> Disable CONFIG_TRANSPARENT_HUGEPAGE, and the machine works pretty good. It's
>>>> been up for almost 8 hours compiling, and not a single bus error yet. It's
>>>> got
>>>> 2x node board with dual R12K/400MHz CPUs per node.
>>>>
>>>> I'm not really sure what CONFIG_TRANSPARENT_HUGEPAGE is enabling that's
>>>> causing
>>>> R12K CPUs on the IP27 such a headache (and on Octane, really screws up R14K
>>>> CPUs). I tried getting a core dump on one of the bus errors, but that
>>>> produces a
>>>> truncated or corrupted core file that actually crashed GDB, plus I get a nice
>>>> oops message in dmesg:
>>>
>>> Well, as its name implies, if you enable CONFIG_TRANSPARENT_HUGEPAGE, huge
>>> pages will be created and used in the background transparently to the userspace
>>> application.
>>>
>>> With 4KB base page size, the huge pages will be 2MB in size.. I don't know
>>> much about the R10K/R12K/R14K CPUs, but it is possible that either their TLBs
>>> cannot handle such pages, or that the TLB Exception handlers don't contain
>>> proper code for these CPUs.
>>>
>>> For each doubling of the base PAGE_SIZE, the huge page size will increase by a
>>> factor of 4. So with 16KB base pages the huge page size would be 32MB, since
>>> there are many fewer opportunities to transparently use a 32MB page, I would
>>> expect any errors related to huge pages to be correspondingly less frequent.
>>>
>>> With 64KB PAGE_SIZE the huge page size is 512MB, and It is likely that that
>>> could never be used by normal userspace programs.
>>
>> I checked the R10K/R12K manual, and the PageMask register there has bits 24:13
>> open for setting a mask value. It looks like these CPUs only support a page
>> size from 4KB to 16MB (so a 2MB page size should work w/ transparent
>> hugepages). I assume that the R14K on the Octane might be the same (but I
>> don't have a manual specific to the R14k, so I don't know). All of the
>> remaining bits in that register read 0 and must have 0's written back.
>>
>> I guess I could find a way to have the kernel trigger a non-fatal oops/dump the
>> registers on a bus error and get a look at the cause register to see if that
>> sheds any light on things. Doesn't a SIGBUS on MIPS typically mean that an
>> address wasn't aligned on a 32-bit boundary? Or could it also mean other
>> things?
>>
>> I believe that the R10K is largely compatible with the R4K-style TLB setup, but
>> Ralf or someone else more knowledge in that area will have to verify. Maybe
>> the R10k-family CPUs need their own TLB routines, or what currently exists
>> needs modifications? I have not tried to understand the whole TLB thing in
>> MIPS yet, so that's a bit of voodoo to me.
>
> I haven't checked, but there may be workarounds required in the TLB management
> code that are not in place for the huge page case. When the huge TLB code was
> developed, we didn't do any testing on R10K. Somebody should dump the
> exception handlers and carefully look at the rest of the huge TLB management
> code, and check to see that any required workarounds are in place.
>
> David.
I did some digging, and it looks like Ralf added CPU_SUPPORTS_HUGEPAGES support
a few years ago to most of the CPUs:
http://marc.info/?l=git-commits-head&m=135552890201646&w=2
It was pointed out to me off list that this statement for the PageMask register
in the R10K manual may explain things:
"""TLB read and write operations use this register as either a source or a
destination; when virtual addresses are presented for translation into physical
address, the corresponding bits in the TLB identify which virtual address bits
among bits 24:13 are used in the comparison. When the Mask field is not one of
the values shown in Table 13-6, the operation of the TLB is undefined. The 0
field is reserved; it must be written as zeroes, and returns zeroes when read."""
2MB page sizes aren't explicitly listed in this table in the manual, so setting
bits 24:13 in PageMask might be leading to this "undefined behavior", which on
R12K might include the random bus errors/segfaults, and on R14K triggers an IBE
(instruction bus error) that needs a cold reboot.
The only other R10K system I have is the IP28, but I haven't gotten that to
boot up in a few years.
Checking the NEC Vr-Series programming manual and the PMC-Sierra RM7000 manual,
at least the R5000 and RM7000 also carry this restriction because they have the
same bits defined in PageMask.
My O2 w/ RM7K is out of commission at the moment, so I can't test for that.
Anyone got an R5K/R5200/RM7K O2/Indy/I2 and can check that CPU?
--
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
4096R/D25D95E3 2011-03-28
"The past tempts us, the present confuses us, the future frightens us. And our
lives slip away, moment by moment, lost in that vast, terrible in-between."
--Emperor Turhan, Centauri Republic
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-05 9:07 ` Joshua Kinard
@ 2014-11-05 10:21 ` Ralf Baechle
0 siblings, 0 replies; 27+ messages in thread
From: Ralf Baechle @ 2014-11-05 10:21 UTC (permalink / raw)
To: Joshua Kinard; +Cc: linux-mips
On Wed, Nov 05, 2014 at 04:07:24AM -0500, Joshua Kinard wrote:
> It was pointed out to me off list that this statement for the PageMask register
> in the R10K manual may explain things:
>
> """TLB read and write operations use this register as either a source or a
> destination; when virtual addresses are presented for translation into physical
> address, the corresponding bits in the TLB identify which virtual address bits
> among bits 24:13 are used in the comparison. When the Mask field is not one of
> the values shown in Table 13-6, the operation of the TLB is undefined. The 0
> field is reserved; it must be written as zeroes, and returns zeroes when read."""
>
> 2MB page sizes aren't explicitly listed in this table in the manual, so setting
> bits 24:13 in PageMask might be leading to this "undefined behavior", which on
> R12K might include the random bus errors/segfaults, and R14K triggers an IBE
> that needs a cold reboot.
All MIPS CPUs with an R4000-style TLB have this restriction. Because the
behaviour for such bitmask values is undefined, the resulting behaviour is likely
to differ between CPU types.
2MB pages will be loaded into the TLB as a pair of adjacent 1MB pages.
> The only other R10K system I have is the IP28, but I haven't gotten that to
> boot up in a few years.
Ralf
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-04 1:08 ` Joshua Kinard
2014-11-04 1:23 ` David Daney
@ 2014-11-05 13:52 ` Ralf Baechle
1 sibling, 0 replies; 27+ messages in thread
From: Ralf Baechle @ 2014-11-05 13:52 UTC (permalink / raw)
To: Joshua Kinard; +Cc: David Daney, Linux MIPS List
On Mon, Nov 03, 2014 at 08:08:58PM -0500, Joshua Kinard wrote:
> I guess I could find a way to have the kernel trigger a non-fatal oops/dump the
> registers on a bus error and get a look at the cause register to see if that
> sheds any light on things. Doesn't a SIGBUS on MIPS typically mean that an
> address wasn't aligned on a 32-bit boundary? Or could it also mean other things?
>
> I believe that the R10K is largely compatible with the R4K-style TLB setup, but
> Ralf or someone else more knowledge in that area will have to verify. Maybe
> the R10k-family CPUs need their own TLB routines, or what currently exists
> needs modifications? I have not tried to understand the whole TLB thing in
> MIPS yet, so that's a bit of voodoo to me.
Voodoo that normally works a lot better than the conventional code it replaced!
The R10000 TLB is basically the all dancing, all singing version of other
MIPS TLBs. Noteworthy differences are that TLB hazards are handled in hardware
and that the R10000 automatically detects multiple matching TLB entries on a
TLB write in which case it will automatically invalidate the old entry before
writing the new entry. It also is the only MIPS CPU to implement a c0_framemask
register, but as far as I understand that functionality, the only software
handling the register needs is initialization to zero, which essentially
disables it. The R10000 supports a maximum page size of 16M.
Ralf
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-04 1:23 ` David Daney
2014-11-04 1:34 ` Joshua Kinard
2014-11-05 9:07 ` Joshua Kinard
@ 2014-11-05 16:09 ` Ralf Baechle
2014-11-07 10:22 ` Joshua Kinard
2 siblings, 1 reply; 27+ messages in thread
From: Ralf Baechle @ 2014-11-05 16:09 UTC (permalink / raw)
To: David Daney; +Cc: Joshua Kinard, Linux MIPS List
On Mon, Nov 03, 2014 at 05:23:29PM -0800, David Daney wrote:
> I haven't checked, but there may be workarounds required in the TLB
> management code that are not in place for the huge page case. When the huge
> TLB code was developed, we didn't do any testing on R10K. Somebody should
> dump the exception handlers and carefully look at the rest of the huge TLB
> management code, and check to see that any required workarounds are in
> place.
Joshua, if you happen to have R10000 errata sheets around, maybe you could
check if there's anything suspicious? Off the top of my head I don't recall
any R10000 TLB errata but the R10000 had plenty of errata due to its - by
the standards of the time - high complexity.
Ralf
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-05 16:09 ` Ralf Baechle
@ 2014-11-07 10:22 ` Joshua Kinard
2014-11-07 18:30 ` David Daney
0 siblings, 1 reply; 27+ messages in thread
From: Joshua Kinard @ 2014-11-07 10:22 UTC (permalink / raw)
To: Ralf Baechle, David Daney; +Cc: Linux MIPS List
On 11/05/2014 11:09, Ralf Baechle wrote:
> On Mon, Nov 03, 2014 at 05:23:29PM -0800, David Daney wrote:
>
>> I haven't checked, but there may be workarounds required in the TLB
>> management code that are not in place for the huge page case. When the huge
>> TLB code was developed, we didn't do any testing on R10K. Somebody should
>> dump the exception handlers and carefully look at the rest of the huge TLB
>> management code, and check to see that any required workarounds are in
>> place.
>
> Joshua, if you happen to have R10000 errata sheets around, maybe you could
> check if there's anything suspicious? Off the top of my head I don't recall
> any R10000 TLB erratas but the R10000 had plenty of erratas due to it's - by
> the standards of the time - high complexity.
>
> Ralf
All I have are errata sheets for Rev 2.3, 2.4, and 2.5 of the R10K. Nothing
specific on the R12K, and nil for the R14K/R16K.
That said, poking through other areas of the R10K/R12K User Manual, there are
paragraphs titled "Errata" and regarding the PageMask register or TLB, they
state this:
Page 41:
The calculated address is translated from a 44-bit virtual address into a
40-bit physical address using a translation-lookaside buffer. The TLB contains
64 entries, each of which can translate two pages. Each entry can select a page
size ranging from 4 Kbytes to 16 Mbytes, inclusive, in __powers__ of 4, as
shown in Figure 1-6.
Page 316:
Translated virtual addresses retrieve data in blocks, which are called pages.
In the R10000 processor, the size of each page may be selected from a range
that runs from 4 Kbytes to 16 Mbytes inclusive, __in_powers_of_4__ (that is, 4
Kbytes, 16 Kbytes, 64 Kbytes, etc.).
So my guess is unless hugepages can happen in powers of 4, they're not
compatible w/ the R10K-series (and likely not the R5K/RM7K, either, since they
all have the same 24:13 bits in the PageMask register). It seems the logical
choice would be to remove 'select CPU_SUPPORTS_HUGEPAGES' from CPU_R5000,
CPU_NEVADA, CPU_R10000, and CPU_RM7000 in arch/mips/Kconfig.
--
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
4096R/D25D95E3 2011-03-28
"The past tempts us, the present confuses us, the future frightens us. And our
lives slip away, moment by moment, lost in that vast, terrible in-between."
--Emperor Turhan, Centauri Republic
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-07 10:22 ` Joshua Kinard
@ 2014-11-07 18:30 ` David Daney
2014-11-09 0:09 ` Joshua Kinard
0 siblings, 1 reply; 27+ messages in thread
From: David Daney @ 2014-11-07 18:30 UTC (permalink / raw)
To: Joshua Kinard; +Cc: Ralf Baechle, Linux MIPS List
On 11/07/2014 02:22 AM, Joshua Kinard wrote:
[...]
>
> So my guess is unless hugepages can happen in powers of 4,
Huge pages are currently only supported on MIPS64 for this reason.
huge_page_mask_size = (normal_page_size/8 * normal_page_size) / 2;
If you take log2 of everything you get
huge_page_mask_bits = normal_page_bits - 3 + normal_page_bits - 1
                    = 2 * normal_page_bits - 4 (always even)
So all page sizes result in huge pages that meet the power of 4 criterion.
> they're not
> compatible w/ the R10K-series (and likely not the R5K/RM7K, either, since they
> all have the same 24:13 bits in the PageMask register). It seems the logical
> choice would be to remove 'select CPU_SUPPORTS_HUGEPAGES' from CPU_R5000,
> CPU_NEVADA, CPU_R10000, and CPU_RM7000 in arch/mips/Kconfig.
>
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-07 18:30 ` David Daney
@ 2014-11-09 0:09 ` Joshua Kinard
2014-11-10 7:04 ` Joshua Kinard
0 siblings, 1 reply; 27+ messages in thread
From: Joshua Kinard @ 2014-11-09 0:09 UTC (permalink / raw)
To: David Daney; +Cc: Ralf Baechle, Linux MIPS List
On 11/07/2014 13:30, David Daney wrote:
> On 11/07/2014 02:22 AM, Joshua Kinard wrote:
> [...]
>>
>> So my guess is unless hugepages can happen in powers of 4,
>
> Huge pages are currently only supported on MIPS64 for this reason.
>
> huge_page_mask_size = (normal_page_size/8 * normal_page_size) / 2;
>
> If you take log2 of everything you get
>
> huge_page_mask_bits = normal_page_bits - 3 + normal_page_bits - 1
> = 2 * normal_page_bits - 4 (always even)
>
> So all page sizes result in huge pages that meet the power of 4 criterion.
Well, looks like I'll have to bisect to hunt the problem down. Obviously there
is something with transparent hugepages that the R10K-family dislikes. Just a
question of "what?". Seems like I'm the only one left with this kind of
equipment and interest to play with it :)
--
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
4096R/D25D95E3 2011-03-28
"The past tempts us, the present confuses us, the future frightens us. And our
lives slip away, moment by moment, lost in that vast, terrible in-between."
--Emperor Turhan, Centauri Republic
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-09 0:09 ` Joshua Kinard
@ 2014-11-10 7:04 ` Joshua Kinard
2014-11-10 10:51 ` Ralf Baechle
0 siblings, 1 reply; 27+ messages in thread
From: Joshua Kinard @ 2014-11-10 7:04 UTC (permalink / raw)
To: David Daney; +Cc: Ralf Baechle, Linux MIPS List
On 11/08/2014 19:09, Joshua Kinard wrote:
> On 11/07/2014 13:30, David Daney wrote:
>> On 11/07/2014 02:22 AM, Joshua Kinard wrote:
>> [...]
>>>
>>> So my guess is unless hugepages can happen in powers of 4,
>>
>> Huge pages are currently only supported on MIPS64 for this reason.
>>
>> huge_page_mask_size = (normal_page_size/8 * normal_page_size) / 2;
>>
>> If you take log2 of everything you get
>>
>> huge_page_mask_bits = normal_page_bits - 3 + normal_page_bits - 1
>> = 2 * normal_page_bits - 4 (always even)
>>
>> So all page sizes result in huge pages that meet the power of 4 criterion.
>
> Well, looks like I'll have to bisect to hunt the problem down. Obviously there
> is something with transparent hugepages that the R10K-family dislikes. Just a
> question of "what?". Seems like I'm the only one left with this kind of
> equipment and interest to play with it :)
I gave up on bisecting this. 3.7 and 3.9 kernels are not bootable on my Onyx2
w/o additional patches to fix the PCI probing code to deal with the card cage I
have in my system (basically, it stops probing after it discovers the first PCI
bus). Even with that fixed, normal init refused to load on those kernels, and
dash as init just outright crashed. Must be some other IP27 bug that was fixed
at some point, and I didn't feel like applying multiple patches to every bisect
checkout, which might've altered results and led me to blaming the wrong commit.
It does look like the PageMask register is getting set to the correct values on
PAGE_SIZE_4K and PAGE_SIZE_16K when a hugepage is needed (PM_1M and PM_16M).
The PAGE_SIZE_64K case wouldn't be valid on R10k, as that uses PM_256M for a
hugepage, which is bits 28:13 in PageMask and that would lead to "undefined
behavior". I'm assuming another register is getting set to an incorrect value
in the huge page case (EntryLo0 or EntryLo1? EntryHi?), but I don't have the
required knowledge to fiddle w/ the TLB code to figure it out.
So, I sent in the patch that marks CPU_SUPPORTS_HUGEPAGES as BROKEN until
someone feels like tackling it (if ever).
Sidenote: Is it possible to add additional CP0 registers to a register dump on
a panic or oops? I looked around ptrace.c and ptrace.h and see where these
registers are set up and printed out, but I can't find out where the actual
values are fetched from the CPU and put into struct pt_regs. I am assuming
it's a snippet of asm somewhere. Adding R10K's PageMask, Config, ErrorEPC, and
Context/XContext registers seems like useful debugging info.
--
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
4096R/D25D95E3 2011-03-28
"The past tempts us, the present confuses us, the future frightens us. And our
lives slip away, moment by moment, lost in that vast, terrible in-between."
--Emperor Turhan, Centauri Republic
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-10 7:04 ` Joshua Kinard
@ 2014-11-10 10:51 ` Ralf Baechle
2014-11-10 11:20 ` Thomas Bogendoerfer
2014-11-10 11:22 ` Joshua Kinard
0 siblings, 2 replies; 27+ messages in thread
From: Ralf Baechle @ 2014-11-10 10:51 UTC (permalink / raw)
To: Thomas Bogendoerfer, Joshua Kinard; +Cc: David Daney, Linux MIPS List
Thomas,
can you test CONFIG_TRANSPARENT_HUGEPAGE on an IP28?
All in all the R10000's TLB is unproblematic; my gut feeling is that
rather something else specific to IP27 is spoiling the broth.
Ralf
On Mon, Nov 10, 2014 at 02:04:10AM -0500, Joshua Kinard wrote:
> Date: Mon, 10 Nov 2014 02:04:10 -0500
> From: Joshua Kinard <kumba@gentoo.org>
> To: David Daney <ddaney.cavm@gmail.com>
> CC: Ralf Baechle <ralf@linux-mips.org>, Linux MIPS List
> <linux-mips@linux-mips.org>
> Subject: Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
> Content-Type: text/plain; charset=windows-1252
>
> On 11/08/2014 19:09, Joshua Kinard wrote:
> > On 11/07/2014 13:30, David Daney wrote:
> >> On 11/07/2014 02:22 AM, Joshua Kinard wrote:
> >> [...]
> >>>
> >>> So my guess is unless hugepages can happen in powers of 4,
> >>
> >> Huge pages are currently only supported on MIPS64 for this reason.
> >>
> >> huge_page_mask_size = (normal_page_size/8 * normal_page_size) / 2;
> >>
> >> If you take log2 of everything you get
> >>
> >> huge_page_mask_bits = normal_page_bits - 3 + normal_page_bits - 1
> >> = 2 * normal_page_bits - 4 (always even)
> >>
> >> So all page sizes result in huge pages that meet the power of 4 criterion.
> >
> > Well, looks like I'll have to bisect to hunt the problem down. Obviously there
> > is something with transparent hugepages that the R10K-family dislikes. Just a
> > question of "what?". Seems like I'm the only one left with this kind of
> > equipment and interest to play with it :)
>
> I gave up on bisecting this. 3.7 and 3.9 kernels are not bootable on my Onyx2
> w/o additional patches to fix the PCI probing code to deal with the card cage I
> have in my system (basically, it stops probing after it discovers the first PCI
> bus). Even with that fixed, normal init refused to load on those kernels, and
> dash as init just outright crashed. Must be some other IP27 bug that was fixed
> at some point, and I didn't feel like applying multiple patches to every bisect
> checkout, which might've altered results and led me to blaming the wrong commit.
>
> It does look like the PageMask register is getting set to the correct values on
> PAGE_SIZE_4K and PAGE_SIZE_16K when a hugepage is needed (PM_1M and PM_16M).
> The PAGE_SIZE_64K case wouldn't be valid on R10k, as that uses PM_256M for a
> hugepage, which is bits 28:13 in PageMask and that would lead to "undefined
> behavior". I'm assuming another register is getting set to an incorrect value
> in the huge pagecase (EntryLo0 or EntryLo1? EntryHi?), but I don't have the
> required knowledge to fiddle w/ the TLB code to figure it out.
>
> So, I sent in the patch that marks CPU_SUPPORTS_HUGEPAGES as BROKEN until
> someone feels like tackling it (if ever).
>
> Sidenote: Is it possible to add additional CP0 registers to a register dump on
> a panic or oops? I looked around ptrace.c and ptrace.h and see where these
> registers are setup and printed out, but I can't find out where the actual
> values are fetched from the CPU and put into struct pt_regs. I am assuming
> it's a snippet of asm somewhere. Adding R10K's PageMask, Config, ErrorEpc, And
> Context/XContext registers seems like useful debugging info.
>
> --
> Joshua Kinard
> Gentoo/MIPS
> kumba@gentoo.org
> 4096R/D25D95E3 2011-03-28
>
> "The past tempts us, the present confuses us, the future frightens us. And our
> lives slip away, moment by moment, lost in that vast, terrible in-between."
>
> --Emperor Turhan, Centauri Republic
Ralf
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-10 10:51 ` Ralf Baechle
@ 2014-11-10 11:20 ` Thomas Bogendoerfer
2014-11-10 14:22 ` Joshua Kinard
2014-11-10 21:30 ` Thomas Bogendoerfer
2014-11-10 11:22 ` Joshua Kinard
1 sibling, 2 replies; 27+ messages in thread
From: Thomas Bogendoerfer @ 2014-11-10 11:20 UTC (permalink / raw)
To: Ralf Baechle; +Cc: Joshua Kinard, David Daney, Linux MIPS List
On Mon, Nov 10, 2014 at 11:51:06AM +0100, Ralf Baechle wrote:
> Thomas,
>
> can you test CONFIG_TRANSPARENT_HUGEPAGE on an IP28?
>
> All in all the R10000's TLB is unproblematic; my gut feeling is that
> rather something else specific to IP27 is spoiling the broth.
I'll give it a spin later today.
Thomas.
--
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea. [ RFC1925, 2.3 ]
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-10 10:51 ` Ralf Baechle
2014-11-10 11:20 ` Thomas Bogendoerfer
@ 2014-11-10 11:22 ` Joshua Kinard
1 sibling, 0 replies; 27+ messages in thread
From: Joshua Kinard @ 2014-11-10 11:22 UTC (permalink / raw)
To: Ralf Baechle, Thomas Bogendoerfer; +Cc: David Daney, Linux MIPS List
On 11/10/2014 05:51, Ralf Baechle wrote:
> Thomas,
>
> can you test CONFIG_TRANSPARENT_HUGEPAGE on an IP28?
>
> All in all the R10000's TLB is unproblematic; my gut feeling is that
> rather something else specific to IP27 is spoiling the broth.
>
> Ralf
I don't know if it's specific to IP27. I have problems on the Octane w/ an
R14000 and CONFIG_TRANSPARENT_HUGEPAGE (instruction bus errors, needs cold
reboot to clear). I didn't have the same issues w/ the R12000 CPU module
installed, but I did not test things as thoroughly the last time I installed
it. I'll see about swapping the R12K module back in tonight or tomorrow and
doing the same tests as on the IP27 that can trigger problems.
--J
> On Mon, Nov 10, 2014 at 02:04:10AM -0500, Joshua Kinard wrote:
>> Date: Mon, 10 Nov 2014 02:04:10 -0500
>> From: Joshua Kinard <kumba@gentoo.org>
>> To: David Daney <ddaney.cavm@gmail.com>
>> CC: Ralf Baechle <ralf@linux-mips.org>, Linux MIPS List
>> <linux-mips@linux-mips.org>
>> Subject: Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
>> Content-Type: text/plain; charset=windows-1252
>>
>> On 11/08/2014 19:09, Joshua Kinard wrote:
>>> On 11/07/2014 13:30, David Daney wrote:
>>>> On 11/07/2014 02:22 AM, Joshua Kinard wrote:
>>>> [...]
>>>>>
>>>>> So my guess is unless hugepages can happen in powers of 4,
>>>>
>>>> Huge pages are currently only supported on MIPS64 for this reason.
>>>>
>>>> huge_page_mask_size = (normal_page_size/8 * normal_page_size) / 2;
>>>>
>>>> If you take log2 of everything you get
>>>>
>>>> huge_page_mask_bits = normal_page_bits - 3 + normal_page_bits - 1
>>>> = 2 * normal_page_bits - 4 (always even)
>>>>
>>>> So all page sizes result in huge pages that meet the power of 4 criterion.
>>>
>>> Well, looks like I'll have to bisect to hunt the problem down. Obviously there
>>> is something with transparent hugepages that the R10K-family dislikes. Just a
>>> question of "what?". Seems like I'm the only one left with this kind of
>>> equipment and interest to play with it :)
>>
>> I gave up on bisecting this. 3.7 and 3.9 kernels are not bootable on my Onyx2
>> w/o additional patches to fix the PCI probing code to deal with the card cage I
>> have in my system (basically, it stops probing after it discovers the first PCI
>> bus). Even with that fixed, normal init refused to load on those kernels, and
>> dash as init just outright crashed. Must be some other IP27 bug that was fixed
>> at some point, and I didn't feel like applying multiple patches to every bisect
>> checkout, which might've altered results and led me to blaming the wrong commit.
>>
>> It does look like the PageMask register is getting set to the correct values on
>> PAGE_SIZE_4K and PAGE_SIZE_16K when a hugepage is needed (PM_1M and PM_16M).
>> The PAGE_SIZE_64K case wouldn't be valid on R10k, as that uses PM_256M for a
>> hugepage, which is bits 28:13 in PageMask and that would lead to "undefined
>> behavior". I'm assuming another register is getting set to an incorrect value
>> in the huge pagecase (EntryLo0 or EntryLo1? EntryHi?), but I don't have the
>> required knowledge to fiddle w/ the TLB code to figure it out.
>>
>> So, I sent in the patch that marks CPU_SUPPORTS_HUGEPAGES as BROKEN until
>> someone feels like tackling it (if ever).
>>
>> Sidenote: Is it possible to add additional CP0 registers to a register dump on
>> a panic or oops? I looked around ptrace.c and ptrace.h and see where these
>> registers are setup and printed out, but I can't find out where the actual
>> values are fetched from the CPU and put into struct pt_regs. I am assuming
>> it's a snippet of asm somewhere. Adding R10K's PageMask, Config, ErrorEpc, And
>> Context/XContext registers seems like useful debugging info.
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-10 11:20 ` Thomas Bogendoerfer
@ 2014-11-10 14:22 ` Joshua Kinard
2014-11-10 16:55 ` David Daney
2014-11-10 21:30 ` Thomas Bogendoerfer
1 sibling, 1 reply; 27+ messages in thread
From: Joshua Kinard @ 2014-11-10 14:22 UTC (permalink / raw)
To: Thomas Bogendoerfer, Ralf Baechle; +Cc: David Daney, Linux MIPS List
On 11/10/2014 06:20, Thomas Bogendoerfer wrote:
> On Mon, Nov 10, 2014 at 11:51:06AM +0100, Ralf Baechle wrote:
>> Thomas,
>>
>> can you test CONFIG_TRANSPARENT_HUGEPAGE on an IP28?
>>
>> All in all the R10000's TLB is unproblematic; my gut feeling is that
>> rather something else specific to IP27 is spoiling the broth.
>
> I'll give it a spin later today.
>
> Thomas.
Try testing with and without CONFIG_HUGETLBFS in the kernel. File systems ->
Pseudo filesystems -> HugeTLB file system support
So far, it seems adding that option in with CONFIG_TRANSPARENT_HUGEPAGE makes
both IP27 and IP30 behave. Without, I get data bus errors or segfaults on IP27
running Gentoo's "emerge" program on PAGE_SIZE_4K.
IP30 seems to be fine on an R12000 with or without that option, but I only have
a dual R12K module to test against. I've only had the R14K dual module for a
few days, and I could not reproduce the bus errors on that module, either. So
I wonder if there is something funny with the hardware on the single R14K
module, which I did get IBEs on before, and whether that will behave once
CONFIG_HUGETLBFS is in the kernel.
If so, maybe the fix is to have CONFIG_TRANSPARENT_HUGEPAGE automatically
select CONFIG_HUGETLBFS?
--J
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-10 14:22 ` Joshua Kinard
@ 2014-11-10 16:55 ` David Daney
2014-11-10 17:03 ` Ralf Baechle
0 siblings, 1 reply; 27+ messages in thread
From: David Daney @ 2014-11-10 16:55 UTC (permalink / raw)
To: Joshua Kinard; +Cc: Thomas Bogendoerfer, Ralf Baechle, Linux MIPS List
On 11/10/2014 06:22 AM, Joshua Kinard wrote:
> On 11/10/2014 06:20, Thomas Bogendoerfer wrote:
>> On Mon, Nov 10, 2014 at 11:51:06AM +0100, Ralf Baechle wrote:
>>> Thomas,
>>>
>>> can you test CONFIG_TRANSPARENT_HUGEPAGE on an IP28?
>>>
>>> All in all the R10000's TLB is unproblematic; my gut feeling is that
>>> rather something else specific to IP27 is spoiling the broth.
>>
>> I'll give it a spin later today.
>>
>> Thomas.
>
> Try testing with and without CONFIG_HUGETLBFS in the kernel. File systems ->
> Pseudo filesystems -> HugeTLB file system support
>
> So far, it seems adding that option in with CONFIG_TRANSPARENT_HUGEPAGE makes
> both IP27 and IP30 behave. Without, I get data bus errors or segfaults on IP27
> running Gentoo's "emerge" program on PAGE_SIZE_4K.
>
> IP30 seems to be fine on an R12000 with or without that option, but I only have
> a dual R12K module to test against. I've only had the R14K dual module for a
> few days, and I could not reproduce the bus errors on that module, either. So
> I wonder if there is something funny with the hardware on the single R14K
> module, which I did get IBE's on before. And whether that will behave once
> CONFIG_HUGETLBFS is in the kernel.
>
> If so, maybe the fix is to make CONFIG_HUGETLBFS automatically selected if
> CONFIG_TRANSPARENT_HUGEPAGE?
>
Yes, you may be on to something here. Certainly basic huge TLB support
must be in place for TRANSPARENT_HUGEPAGE to work.
It could be that the Kconfig symbols for the various portions of huge
page support are missing the required dependencies.
FWIW, I always build with the huge page Kconfig options set.
I have:
$ grep HUGE .config
CONFIG_SYS_SUPPORTS_HUGETLBFS=y
CONFIG_MIPS_HUGE_TLB_SUPPORT=y
CONFIG_CPU_SUPPORTS_HUGEPAGES=y
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
# CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y
I suspect that you may not need CONFIG_HUGETLBFS, but
CONFIG_HUGETLB_PAGE is probably essential.
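One way to check a build for the combination discussed here is sketched
below in Python. It is only an illustration, not something from the kernel
tree: the function names parse_config and thp_without_hugetlb_page are made
up for this example, and the only symbols it tests are
CONFIG_TRANSPARENT_HUGEPAGE and CONFIG_HUGETLB_PAGE from the listing above.

```python
import re


def parse_config(text):
    """Parse .config text into {symbol: value}.

    Lines of the form '# CONFIG_X is not set' are recorded as 'n'.
    The CONFIG_ prefix is stripped from the keys.
    """
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r"^CONFIG_(\w+)=(.*)$", line)
        if m:
            opts[m.group(1)] = m.group(2)
            continue
        m = re.match(r"^# CONFIG_(\w+) is not set$", line)
        if m:
            opts[m.group(1)] = "n"
    return opts


def thp_without_hugetlb_page(opts):
    """True if TRANSPARENT_HUGEPAGE is built in but HUGETLB_PAGE is not."""
    thp = opts.get("TRANSPARENT_HUGEPAGE") == "y"
    hugetlb_page = opts.get("HUGETLB_PAGE", "n") == "y"
    return thp and not hugetlb_page
```

Feeding it the contents of a .config, e.g.
thp_without_hugetlb_page(parse_config(open(".config").read())), would flag
the setup that appears to misbehave in this thread. Whether that
combination is actually the cause is still an open question here.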
David Daney
> --J
>
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-10 16:55 ` David Daney
@ 2014-11-10 17:03 ` Ralf Baechle
2014-11-10 17:29 ` David Daney
2014-11-11 11:11 ` Joshua Kinard
0 siblings, 2 replies; 27+ messages in thread
From: Ralf Baechle @ 2014-11-10 17:03 UTC (permalink / raw)
To: David Daney; +Cc: Joshua Kinard, Thomas Bogendoerfer, Linux MIPS List
On Mon, Nov 10, 2014 at 08:55:09AM -0800, David Daney wrote:
> Yes, you may be on to something here. Certainly basic huge TLB support must
> be in place for TRANSPARENT_HUGEPAGE to work.
>
> It could be that the Kconfig symbols for the various portions of huge page
> support are missing the required dependencies.
>
> FWIW, I always build with the huge page Kconfig options set.
>
> I have:
> $ grep HUGE .config
> CONFIG_SYS_SUPPORTS_HUGETLBFS=y
> CONFIG_MIPS_HUGE_TLB_SUPPORT=y
> CONFIG_CPU_SUPPORTS_HUGEPAGES=y
> CONFIG_TRANSPARENT_HUGEPAGE=y
> CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
> # CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
> CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
> CONFIG_HUGETLBFS=y
> CONFIG_HUGETLB_PAGE=y
>
> I suspect that you may not need CONFIG_HUGETLBFS, but CONFIG_HUGETLB_PAGE is
> probably essential.
IP27 also has NUMA as the only in-tree MIPS system - and its NUMA support
is not in the best support state to say the least. Just an observation -
at this point in time there is no obvious connection between either
R10000 <-> transparent huge page
or
NUMA <-> transparent huge page
Ralf
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-10 17:03 ` Ralf Baechle
@ 2014-11-10 17:29 ` David Daney
2014-11-11 11:11 ` Joshua Kinard
1 sibling, 0 replies; 27+ messages in thread
From: David Daney @ 2014-11-10 17:29 UTC (permalink / raw)
To: Ralf Baechle; +Cc: Joshua Kinard, Thomas Bogendoerfer, Linux MIPS List
On 11/10/2014 09:03 AM, Ralf Baechle wrote:
> On Mon, Nov 10, 2014 at 08:55:09AM -0800, David Daney wrote:
>
>> Yes, you may be on to something here. Certainly basic huge TLB support must
>> be in place for TRANSPARENT_HUGEPAGE to work.
>>
>> It could be that the Kconfig symbols for the various portions of huge page
>> support are missing the required dependencies.
>>
>> FWIW, I always build with the huge page Kconfig options set.
>>
>> I have:
>> $ grep HUGE .config
>> CONFIG_SYS_SUPPORTS_HUGETLBFS=y
>> CONFIG_MIPS_HUGE_TLB_SUPPORT=y
>> CONFIG_CPU_SUPPORTS_HUGEPAGES=y
>> CONFIG_TRANSPARENT_HUGEPAGE=y
>> CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
>> # CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
>> CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
>> CONFIG_HUGETLBFS=y
>> CONFIG_HUGETLB_PAGE=y
>>
>> I suspect that you may not need CONFIG_HUGETLBFS, but CONFIG_HUGETLB_PAGE is
>> probably essential.
>
> IP27 also has NUMA as the only in-tree MIPS system - and its NUMA support
> is not in the best support state to say the least. Just an observation -
> at this point in time there is no obvious connection between either
>
> R10000 <-> transparent huge page
>
> or
>
> NUMA <-> transparent huge page
>
> Ralf
>
FYI, I am running with CONFIG_TRANSPARENT_HUGEPAGE on a 2-node NUMA
system (48 CPUs per node) OCTEON III, and the huge pages have not been
an issue. So I don't think there are any inherent NUMA issues with
HUGEPAGES.
David Daney
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-10 11:20 ` Thomas Bogendoerfer
2014-11-10 14:22 ` Joshua Kinard
@ 2014-11-10 21:30 ` Thomas Bogendoerfer
2014-11-11 7:47 ` Ralf Baechle
1 sibling, 1 reply; 27+ messages in thread
From: Thomas Bogendoerfer @ 2014-11-10 21:30 UTC (permalink / raw)
To: Ralf Baechle; +Cc: Joshua Kinard, David Daney, Linux MIPS List
On Mon, Nov 10, 2014 at 12:20:39PM +0100, Thomas Bogendoerfer wrote:
> On Mon, Nov 10, 2014 at 11:51:06AM +0100, Ralf Baechle wrote:
> > Thomas,
> >
> > can you test CONFIG_TRANSPARENT_HUGEPAGE on an IP28?
> >
> > All in all the R10000's TLB is unproblematic; my gut feeling is that
> > rather something else specific to IP27 is spoiling the broth.
>
> I'll give it a spin later today.
Looks like IP28 has more problems than HUGEPAGES... even without
huge pages enabled it locks up while upgrading Debian packages :-(
My gut feeling is that there is another spot hitting the ll/sc errata
stuff for this old R10k CPU.
So no new data out of that.
Thomas.
--
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea. [ RFC1925, 2.3 ]
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-10 21:30 ` Thomas Bogendoerfer
@ 2014-11-11 7:47 ` Ralf Baechle
2014-11-11 9:24 ` Thomas Bogendoerfer
0 siblings, 1 reply; 27+ messages in thread
From: Ralf Baechle @ 2014-11-11 7:47 UTC (permalink / raw)
To: Thomas Bogendoerfer; +Cc: Joshua Kinard, David Daney, Linux MIPS List
On Mon, Nov 10, 2014 at 10:30:10PM +0100, Thomas Bogendoerfer wrote:
> looks like IP28 has more problems than HUGEPAGES... even without
> huge pages enabled it locks up during upgrading debian packages:-(
> My gut feeling is that there is another spot hitting the ll/sc errata
> stuff for this old R10k CPU.
You have the dreaded v2.6 CPU?
Ralf
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-11 7:47 ` Ralf Baechle
@ 2014-11-11 9:24 ` Thomas Bogendoerfer
2014-11-11 9:38 ` Ralf Baechle
0 siblings, 1 reply; 27+ messages in thread
From: Thomas Bogendoerfer @ 2014-11-11 9:24 UTC (permalink / raw)
To: Ralf Baechle; +Cc: Joshua Kinard, David Daney, Linux MIPS List
On Tue, Nov 11, 2014 at 08:47:58AM +0100, Ralf Baechle wrote:
> On Mon, Nov 10, 2014 at 10:30:10PM +0100, Thomas Bogendoerfer wrote:
>
> > looks like IP28 has more problems than HUGEPAGES... even without
> > huge pages enabled it locks up during upgrading debian packages:-(
> > My gut feeling is that there is another spot hitting the ll/sc errata
> > stuff for this old R10k CPU.
>
> You have the dreaded v2.6 CPU?
V2.5 even.
Thomas.
--
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea. [ RFC1925, 2.3 ]
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-11 9:24 ` Thomas Bogendoerfer
@ 2014-11-11 9:38 ` Ralf Baechle
0 siblings, 0 replies; 27+ messages in thread
From: Ralf Baechle @ 2014-11-11 9:38 UTC (permalink / raw)
To: Thomas Bogendoerfer; +Cc: Joshua Kinard, David Daney, Linux MIPS List
On Tue, Nov 11, 2014 at 10:24:07AM +0100, Thomas Bogendoerfer wrote:
> On Tue, Nov 11, 2014 at 08:47:58AM +0100, Ralf Baechle wrote:
> > On Mon, Nov 10, 2014 at 10:30:10PM +0100, Thomas Bogendoerfer wrote:
> >
> > > looks like IP28 has more problems than HUGEPAGES... even without
> > > huge pages enabled it locks up during upgrading debian packages:-(
> > > My gut feeling is that there is another spot hitting the ll/sc errata
> > > stuff for this old R10k CPU.
> >
> > You have the dreaded v2.6 CPU?
>
> V2.5 even.
I'm impressed. Not sure if I've ever seen a v2.5 errata sheet. So far
I thought the v2.6 CPUs in my one Origin were the only ones that ever
left the SGI campus.
Ralf
* Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
2014-11-10 17:03 ` Ralf Baechle
2014-11-10 17:29 ` David Daney
@ 2014-11-11 11:11 ` Joshua Kinard
1 sibling, 0 replies; 27+ messages in thread
From: Joshua Kinard @ 2014-11-11 11:11 UTC (permalink / raw)
To: Ralf Baechle, David Daney; +Cc: Thomas Bogendoerfer, Linux MIPS List
On 11/10/2014 12:03, Ralf Baechle wrote:
> On Mon, Nov 10, 2014 at 08:55:09AM -0800, David Daney wrote:
>
>> Yes, you may be on to something here. Certainly basic huge TLB support must
>> be in place for TRANSPARENT_HUGEPAGE to work.
>>
>> It could be that the Kconfig symbols for the various portions of huge page
>> support are missing the required dependencies.
>>
>> FWIW, I always build with the huge page Kconfig options set.
>>
>> I have:
>> $ grep HUGE .config
>> CONFIG_SYS_SUPPORTS_HUGETLBFS=y
>> CONFIG_MIPS_HUGE_TLB_SUPPORT=y
>> CONFIG_CPU_SUPPORTS_HUGEPAGES=y
>> CONFIG_TRANSPARENT_HUGEPAGE=y
>> CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
>> # CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
>> CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
>> CONFIG_HUGETLBFS=y
>> CONFIG_HUGETLB_PAGE=y
>>
>> I suspect that you may not need CONFIG_HUGETLBFS, but CONFIG_HUGETLB_PAGE is
>> probably essential.
>
> IP27 also has NUMA as the only in-tree MIPS system - and its NUMA support
> is not in the best support state to say the least. Just an observation -
> at this point in time there is no obvious connection between either
>
> R10000 <-> transparent huge page
>
> or
>
> NUMA <-> transparent huge page
>
> Ralf
I briefly tried NUMA on the Onyx2, and it failed to load init. init actually
spat out its --help info and quit, which panicked the kernel. So I didn't test
that too much more. I am also booting an 'M' kernel, not an 'N'.
That said, I went back to playing around with the Octane, which also seems to
have issues when CONFIG_TRANSPARENT_HUGEPAGE is present. I now think that it's
not hugepages support at all, but something in the code covered by
CONFIG_MIGRATION.
Booting a 3.17.2 kernel on the Octane with CONFIG_TRANSPARENT_HUGEPAGE but
without CONFIG_HUGETLBFS (and, consequently, without CONFIG_HUGETLB_PAGE),
didn't immediately trigger my instruction bus errors upon loading init, despite
multiple cold reboots. It took several tries before I could get 3.17.2 to
trigger it.
Backtracking to 3.16, I found out that I could trigger the problem virtually
every single cold boot on 3.16.4, but NOT 3.16.5. Going through 3.16.5's
changelog, I tried backing out several commits that dealt with transparent
hugepages, jiffies calculation, and finally hit on this one:
http://git.linux-mips.org/?p=ralf/linux.git;a=commit;h=e9203e7b4019370e6d8f69cbf71c052aad22ced7
"""
commit d3cb8bf6081b8b7a2dabb1264fe968fd870fa595 upstream.
A migration entry is marked as write if pte_write was true at the time the
entry was created. The VMA protections are not double checked when migration
entries are being removed as mprotect marks write-migration-entries as
read. It means that potentially we take a spurious fault to mark PTEs write
again but it's straight-forward. However, there is a race between write
migrations being marked read and migrations finishing. This potentially
allows a PTE to be write that should have been read. Close this race by
double checking the VMA permissions using maybe_mkwrite when migration
completes.
"""
CONFIG_MIGRATION is enabled by default when you select
CONFIG_TRANSPARENT_HUGEPAGE, and when I backed that patch out of 3.16.5, the
frequency of a cold boot resulting in IBE's upon loading init increased -- 6
out of 7 reboots in one test run.
Leaving that patch backed out, I enabled CONFIG_HUGETLBFS and
CONFIG_HUGETLB_PAGE, and so far, out of five cold boots, all boot up fine.
This mirrors the behavior on the IP27 machine where CONFIG_HUGETLBFS seems to
fix problems. I tried backing the migration patch out on the IP27 kernel and
it doesn't seem to have an effect there.
This seems to suggest that CONFIG_MIGRATION plays a part somehow, but only if
CONFIG_HUGETLB_PAGE is left out. Doesn't look like CONFIG_HUGETLBFS matters,
as I haven't mounted that filesystem anywhere.
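To double-check that no hugetlbfs instance is mounted, something like the
following Python sketch could be used. It assumes the usual /proc/mounts
layout (device, mount point, filesystem type, options, ...); the function
name hugetlbfs_mounts is made up for this illustration.

```python
def hugetlbfs_mounts(mounts_text):
    """Return the mount points whose filesystem type is hugetlbfs.

    mounts_text is the content of /proc/mounts (or a file in the same
    whitespace-separated format).
    """
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Field 0 is the device, field 1 the mount point, field 2 the fs type.
        if len(fields) >= 3 and fields[2] == "hugetlbfs":
            points.append(fields[1])
    return points
```

Calling hugetlbfs_mounts(open("/proc/mounts").read()) on the running system
should return an empty list if the filesystem really is not mounted
anywhere.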
The symptoms on each system are different -- I only get IBE's on Octane,
sometimes mixed with DBE's, and usually when init loads. If, by luck, init
loads, the IBE's are not likely to happen and the machine seems to run fine. I
also confirmed that the R12K module on Octane suffers the same problem -- seems
to be a bit more resilient, though.
IP27 only ever gets DBE's, and not usually while loading init, but when
executing other userland programs, like Gentoo's emerge (written in Python).
It also looks like turning on CONFIG_HUGETLBFS and CONFIG_HUGETLB_PAGE fixed my
problems on Octane w/ PAGE_SIZE_16K/PAGE_SIZE_64K triggering random
sigbus/sigsegv signals, too (if anyone remembers that mail thread from a few
months ago).
So I'm curious why CONFIG_HUGETLB_PAGE is hidden and selected only with
CONFIG_HUGETLBFS? It does cause arch/mips/mm/hugetlbpage.c to get built, so
maybe that's the critical part? If so, it seems that for MIPS it should be
in the 'Kernel type' menu w/ CONFIG_TRANSPARENT_HUGEPAGE, and not hidden
away deep in the 'File systems' submenu.
--J
end of thread, other threads:[~2014-11-11 11:11 UTC | newest]
Thread overview: 27+ messages
2014-11-02 10:53 IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors Joshua Kinard
2014-11-03 18:52 ` David Daney
2014-11-04 1:08 ` Joshua Kinard
2014-11-04 1:23 ` David Daney
2014-11-04 1:34 ` Joshua Kinard
2014-11-04 1:43 ` David Daney
2014-11-04 5:51 ` Joshua Kinard
2014-11-05 9:07 ` Joshua Kinard
2014-11-05 10:21 ` Ralf Baechle
2014-11-05 16:09 ` Ralf Baechle
2014-11-07 10:22 ` Joshua Kinard
2014-11-07 18:30 ` David Daney
2014-11-09 0:09 ` Joshua Kinard
2014-11-10 7:04 ` Joshua Kinard
2014-11-10 10:51 ` Ralf Baechle
2014-11-10 11:20 ` Thomas Bogendoerfer
2014-11-10 14:22 ` Joshua Kinard
2014-11-10 16:55 ` David Daney
2014-11-10 17:03 ` Ralf Baechle
2014-11-10 17:29 ` David Daney
2014-11-11 11:11 ` Joshua Kinard
2014-11-10 21:30 ` Thomas Bogendoerfer
2014-11-11 7:47 ` Ralf Baechle
2014-11-11 9:24 ` Thomas Bogendoerfer
2014-11-11 9:38 ` Ralf Baechle
2014-11-10 11:22 ` Joshua Kinard
2014-11-05 13:52 ` Ralf Baechle