From: Arnd Bergmann <arnd@arndb.de>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: "linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
"Kirill A. Shutemov" <kirill@shutemov.name>,
Mark Langsdorf <mlangsdo@redhat.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: Linux 3.19-rc3
Date: Mon, 12 Jan 2015 14:15:53 +0100
Message-ID: <3565689.rjhDkKGSkL@wuerfel>
In-Reply-To: <20150112115342.GA19807@e104818-lin.cambridge.arm.com>

On Monday 12 January 2015 11:53:42 Catalin Marinas wrote:
> On Sat, Jan 10, 2015 at 08:16:02PM +0000, Arnd Bergmann wrote:
> > Regarding ARM64 in particular, I think it would be nice to investigate
> > how to extend the THP code to cover 64KB TLBs when running with the 4KB
> > page size. There is a hint bit in the page table to tell the CPU that
> > a set of 16 aligned pages can share one TLB, and it would be nice to
> > use that bit in Linux, and to make this case more common for anonymous
> > mappings, and possibly large file-based mappings.
>
> The generic THP code assumes that huge pages are done at the pmd level,
> which means 2MB for arm64 with 4KB page configuration. Hugetlb allows
> larger ptes which may not necessarily be at the pmd level, though we
> haven't implemented this on arm64 and it's not transparent either. As a
> first step it would be nice if at least we unify the APIs between
> hugetlbfs and THP (set_huge_pte_at vs. set_pmd_at).
>
> I think you could do some arch-only tricks by pretending that you have a
> pte with 16 entries only and a dummy pmd (without a corresponding
> hardware page table level) that can host a "huge" page (16 consecutive
> ptes). But we lose the 2MB transparent huge page as I don't see
> mm/huge_memory.c handling huge puds. We also lose the ability of
> building 4 real level page tables since we use the pmd as a dummy one.

Yes, it quickly gets ugly at that point.

> But it would be a nice investigation. Maybe something simpler like
> getting the mm layer to prefer contiguous 64KB ranges and we do the
> detection in the arch set_pte_at().

Doing the detection would be easy enough, I guess, and would immediately
help with the post-split THP mapping, but I don't think that by itself
would have a noticeable benefit on general workloads.
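
To make the detection idea concrete, here is a rough userspace sketch in C,
not kernel code. It models a PTE as a hypothetical struct (model_pte, with
made-up pfn/prot/valid fields that do not match the real arm64 descriptor
layout) and checks whether an aligned group of 16 entries maps 16 physically
contiguous, naturally aligned frames with identical attributes. That condition
is my assumption of when the hint bit could be set; the function name and the
CONT_PTES constant are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CONT_PTES 16u /* 16 x 4KB pages = one 64KB range */

/* Hypothetical simplified PTE; NOT the real arm64 descriptor layout. */
struct model_pte {
	uint64_t pfn;   /* physical frame number */
	uint32_t prot;  /* permission/attribute bits */
	bool valid;     /* entry is populated */
};

/*
 * Return true if the 16 entries starting at ptes[base] are all valid,
 * share the same attributes, and map a physically contiguous range
 * whose first frame is aligned to 16 frames (64KB).
 */
static bool range_is_contiguous(const struct model_pte *ptes, size_t base)
{
	/* The virtual side must start on a 16-entry boundary. */
	assert(base % CONT_PTES == 0);

	const struct model_pte *first = &ptes[base];

	if (!first->valid || first->pfn % CONT_PTES != 0)
		return false;

	for (size_t i = 1; i < CONT_PTES; i++) {
		const struct model_pte *p = &ptes[base + i];

		if (!p->valid || p->prot != first->prot ||
		    p->pfn != first->pfn + i)
			return false;
	}
	return true;
}
```

An arch set_pte_at() would presumably run a check of this kind whenever the
last missing entry of a group gets filled in, and would have to clear the hint
again before any single entry in the group changes.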

My first reaction to a change to the mm layer was that it's probably really
hard, but then again if we limit it to anonymous mappings, all we really
need is a modification in do_anonymous_page() to allocate a larger chunk
if possible and install n PTEs at a time, or fall back to the current
behavior if anything gets in the way. For completeness, the same thing
could be done in do_wp_page() for the case where an entire block of pages
is either not mapped or points to the zero page. Anything beyond that
probably adds more complexity than it gains.
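
A toy model of that fault-path logic, again as userspace C rather than a
patch: the page table is a plain array, NO_PFN marks an unmapped entry, and
toy_alloc / alloc_pages_toy() are hypothetical stand-ins for the page
allocator (can_alloc_block = false simulates fragmentation). None of these
names exist in the kernel. The sketch tries to populate the whole aligned
16-entry block and falls back to a single page if any neighbour is already
mapped, the block would run past the end of the table, or the large
allocation fails.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_PTES 16u
#define NO_PFN UINT64_MAX /* marks an unmapped entry / failed allocation */

/* Hypothetical toy allocator state; stands in for the page allocator. */
struct toy_alloc {
	uint64_t next_pfn;
	bool can_alloc_block; /* false models a fragmented system */
};

static uint64_t alloc_pages_toy(struct toy_alloc *a, unsigned int n)
{
	if (n > 1) {
		if (!a->can_alloc_block)
			return NO_PFN;
		/* Round up so the block is naturally aligned. */
		a->next_pfn = (a->next_pfn + n - 1) / n * n;
	}
	uint64_t pfn = a->next_pfn;

	a->next_pfn += n;
	return pfn;
}

/* Handle a fault on entry idx; returns the number of PTEs installed. */
static unsigned int anon_fault_model(uint64_t *ptes, size_t nptes,
				     size_t idx, struct toy_alloc *a)
{
	size_t base = idx - idx % BLOCK_PTES;
	bool block_ok = base + BLOCK_PTES <= nptes;

	/* Every entry of the aligned block must still be unmapped. */
	for (size_t i = 0; block_ok && i < BLOCK_PTES; i++)
		if (ptes[base + i] != NO_PFN)
			block_ok = false;

	if (block_ok) {
		uint64_t pfn = alloc_pages_toy(a, BLOCK_PTES);

		if (pfn != NO_PFN) {
			for (size_t i = 0; i < BLOCK_PTES; i++)
				ptes[base + i] = pfn + i;
			return BLOCK_PTES;
		}
	}

	/* Fallback: current behavior, one page for the faulting entry. */
	ptes[idx] = alloc_pages_toy(a, 1);
	return 1;
}
```

The real code would of course also have to respect VMA boundaries, locking
and memcg accounting, which the model ignores; the nptes bound only loosely
stands in for the VMA check.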

Do we have someone who could code this up and run some benchmarks to find
out the cost in terms of memory consumption, and the performance compared
to normal 4k pages and static 64k pages?

Do the Cortex-A53/A57 cores actually implement the necessary hardware
feature?

IIRC some x86 processors are also able to use larger TLBs for contiguous
page table entries even without an architected hint bit, so if one
could show this to perform better on x86, it would be much easier to
merge.

	Arnd