From: Joshua Kinard <kumba@gentoo.org>
To: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Subject: Re: THP broken on OCTEON?
Date: Mon, 23 May 2016 15:40:36 -0400
Message-ID: <57435CB4.5080609@gentoo.org>
In-Reply-To: <20160523192219.GB24125@linux-mips.org>

On 05/23/2016 15:22, Ralf Baechle wrote:
> On Mon, May 23, 2016 at 02:57:30PM -0400, Joshua Kinard wrote:
> 
>> NAK, this issue looks completely different to IP30/IP27.  In this case, it
>> looks like the hardware is detecting the case where multiple TLB entries match
>> and it's killing the machine to avoid hardware damage.  I don't want to know
>> how the SGI systems handle this scenario (does the R10000 do a TLB shutdown??).
> 
> The R10000 detects duplicate entries when writing to the TLB and
> invalidates the previous entry.  That is, there will never be duplicate
> entries in the TLB and of course no TLB shutdown.
> 
> That's the theory.  I'm wondering how well that is going to work if
> the entries are having a different page size.
> 
> And Aaro doesn't always get machine checks, so it's not as if a
> duplicate entry is always written.
> 
>> On IP30, using THP usually results in instruction bus errors (IBE), after a set
>> time, depending on the machine's configuration (<2GB RAM, virtually instant on
>> userland init; >2GB RAM, might survive for a few minutes, even getting all the
>> way to runlevel 3 randomly).
>>
>> IP27 was somewhat similar to IP30, in that THP usually results in IBEs after a
>> few seconds of hitting userland bringup (bash is pretty quick at triggering an
>> IBE), but I haven't tried experimenting with varying the amount of RAM in that
>> machine, due to the fragility of pulling the nodeboards out constantly.  I also
>> haven't tried THP since refactoring/rewriting the IP27 code back in Feb to see
>> if I magically fixed it...

For IP30, I created a BUGS file in my local source (also in the IP30 patch I
still maintain) that documented some combinations of settings that affected THP
on the platform.  Most importantly, using a different PAGE_SIZE than 4KB also
required setting MAX_ZONE_ORDER to a decent value, too, else on Octane, it'd
hit IBEs as soon as the kernel executed /sbin/init.  It also depended on the
amount of RAM in that system:

> >2GB RAM:
>  - In order to use more than 2GB RAM in IP30/Octane requires selecting
>    VERY specific values for certain Kconfig options.  Specifically,
>    the following options under the "Kernel type" submenu:
>      - PAGE_SIZE
>      - Maximum Zone Order
>      - Transparent Hugepages (THP)
> 
>    A table of the specific settings is below:
>     PAGE_SIZE | Zone Order | THP
>    -----------|------------|-----
>        4KB    | 11 to 13   |  N
>       16KB    | 12 Only    |  Y
>       64KB*   | 14 Only    |  Y
> 
>    Any other configuration of these three options will likely lead to
>    Instruction Bus Errors (IBEs) when the kernel loads userland up (when it
>    execve()'s /sbin/init).  Even then, however, the machine will still be
>    very unstable (depending on the operations it does).  Heavy disk I/O
>    still seems capable of crashing the machine due to either NULL pointer
>    dereferences, unhandled kernel unaligned accesses, or Instruction Bus
>    Errors.
> 
>    * Impact users cannot currently use an Impact board with 64KB PAGE_SIZE,
>      THP, and >2GB RAM.  This will trigger a NULL pointer dereference in
>      impact_resize_kpool() (when called initially from impact_common_probe()
>      to set the initial 64KB kpool on pool '0') due to (possibly) vzalloc()
>      returning a NULL pointer when allocating kpool_virt[pool].
> 
>    * THP still has issues on R1x000 CPUs, so user beware.  YMMV.
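
To make the table above easier to check a .config against, here is a minimal
sketch in Python.  The names KNOWN_GOOD and is_known_good() are made up for
illustration; the rows only restate the BUGS table for IP30 with >2GB RAM and
say nothing about other platforms.

```python
# Rows restated from the BUGS table (IP30/Octane, >2GB RAM only):
# PAGE_SIZE in KB -> (allowed "Maximum Zone Order" values, THP setting).
KNOWN_GOOD = {
    4: (range(11, 14), False),   # 4KB:  zone order 11 to 13, THP off
    16: (range(12, 13), True),   # 16KB: zone order 12 only,  THP on
    64: (range(14, 15), True),   # 64KB: zone order 14 only,  THP on
}


def is_known_good(page_size_kb, zone_order, thp):
    """Return True if the combination appears in the table above."""
    row = KNOWN_GOOD.get(page_size_kb)
    if row is None:
        # PAGE_SIZE not listed in the table (e.g. 32KB): untested.
        return False
    orders, thp_expected = row
    return zone_order in orders and thp == thp_expected


if __name__ == "__main__":
    for combo in [(4, 11, False), (16, 12, True), (16, 13, True), (64, 14, True)]:
        print(combo, "listed" if is_known_good(*combo) else "not listed")
```

Anything reported as "not listed" is simply a combination the table does not
cover; per the notes above, even listed combinations can still be unstable.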


You might try some of those combinations and see if things improve on the
Octeon.  IP27 was equally affected by this, minus the bits about RAM and Impact
Gfx.  With THP turned off, IP30 can run 64KB PAGE_SIZE without issue (package
compiles are actually sped up quite significantly with a PAGE_SIZE above 4KB).
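
When trying those combinations, it may help to see how the zone order relates
to the huge page size.  The sketch below is only arithmetic and rests on two
assumptions that are not stated in the BUGS file: that the largest buddy
allocator block is PAGE_SIZE << (zone order - 1), as in kernels of this era,
and that a 64-bit MIPS huge page spans one full page of 8-byte PTEs.  The
function names are made up for illustration.

```python
KB = 1024
MB = 1024 * KB


def largest_buddy_block(page_size, zone_order):
    """Largest contiguous block the buddy allocator can hand out, in bytes.

    Assumes the largest allocation order is zone_order - 1.
    """
    return page_size << (zone_order - 1)


def huge_page_size(page_size, pte_size=8):
    """Huge page size in bytes, assuming one huge page covers a whole
    page table page of pte_size-byte entries (64-bit kernel)."""
    return page_size * (page_size // pte_size)


if __name__ == "__main__":
    # (PAGE_SIZE, lowest zone order listed for it in the table)
    for page_kb, order in [(4, 11), (16, 12), (64, 14)]:
        page = page_kb * KB
        print("%2dKB pages, zone order %2d: buddy max %3dMB, huge page %3dMB"
              % (page_kb, order,
                 largest_buddy_block(page, order) // MB,
                 huge_page_size(page) // MB))
```

Under those assumptions, the 16KB and 64KB rows are the points where the
largest buddy block is exactly one huge page in size.  Whether that is the
actual reason those values are required on IP30 is not something I've
confirmed.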

IP27 has a bug in it somewhere that causes an immediate Oops on 64KB PAGE_SIZE
that I haven't traced down yet (I have the Oops saved somewhere if needed).  So
I use 16KB on that system.

An O2 w/ an RM7000 has virtually no issues at all with 64KB or 16KB PAGE_SIZE
and THP, though it's been several months since I last booted my O2.

-- 
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
6144R/F5C6C943 2015-04-27
177C 1972 1FB8 F254 BAD0 3E72 5C63 F4E3 F5C6 C943

"The past tempts us, the present confuses us, the future frightens us.  And our
lives slip away, moment by moment, lost in that vast, terrible in-between."

--Emperor Turhan, Centauri Republic
