linuxppc-dev.lists.ozlabs.org archive mirror
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Scott Wood <scottwood@freescale.com>
Cc: linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 2/7] powerpc/mm: 64-bit 4k: use a PMD-based virtual page table
Date: Tue, 24 May 2011 06:51:01 +1000
Message-ID: <1306183861.7481.208.camel@pasglop>
In-Reply-To: <20110523135433.557e2d63@schlenkerla.am.freescale.net>

On Mon, 2011-05-23 at 13:54 -0500, Scott Wood wrote:
> On Sat, 21 May 2011 08:15:36 +1000
> Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> 
> > On Fri, 2011-05-20 at 15:57 -0500, Scott Wood wrote:
> > 
> > > I see a 2% cost going from virtual pmd to full 4-level walk in the
> > > benchmark mentioned above (some type of sort), and just under 3% in
> > > page-stride lat_mem_rd from lmbench.
> > > 
> > > OTOH, the virtual pmd approach still leaves the possibility of taking a
> > > bunch of virtual page table misses if non-localized accesses happen over a
> > > very large chunk of address space (tens of GiB), and we'd have one fewer
> > > type of TLB miss to worry about complexity-wise with a straight table walk.
> > > 
> > > Let me know what you'd prefer.
> > 
> > I'm tempted to kill the virtual linear feature altogether; it didn't
> > buy us that much. Have you checked whether you can snatch back some of
> > those cycles by hand-tuning the level walker?
> 
> That's after trying a bit of that (pulled the pgd load up before
> normal_tlb_miss, and some other reordering).  Not sure how much more can be
> squeezed out of it with such techniques, at least with e5500.
> 
> Hmm, in the normal miss case we know we're in the first EXTLB level,
> right?  So we could cut out a load/mfspr by subtracting EXTLB from r12
> to get the PACA (that load's latency is pretty well buried, but maybe we
> could replace it with loading pgd, replacing it later if it's a kernel
> region).  Maybe move pgd to the first EXTLB, so it's in the same cache line
> as the state save data. The PACA cacheline containing pgd is probably
> pretty hot in normal kernel code, but not so much in a long stretch of
> userspace plus TLB misses (other than for pgd itself).
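
The "subtract EXTLB from r12" idea above amounts to recovering a containing
structure from a pointer to one of its members. A minimal C sketch follows,
assuming a much-reduced, hypothetical PACA layout: `struct paca_sketch`,
`EX_TLB_WORDS`, and `paca_from_extlb0` are illustrative names, not the
kernel's real definitions.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_TLB_WORDS 12   /* hypothetical size of one save frame, in 64-bit words */

/* Hypothetical, much-reduced stand-in for the per-CPU PACA. */
struct paca_sketch {
    uint64_t *pgd;                      /* current page directory pointer */
    uint64_t  extlb[3][EX_TLB_WORDS];   /* one save frame per nesting level */
};

/*
 * Recover the PACA from a pointer to the first-level save frame.
 * This is only valid when the caller knows it is at nesting level 0,
 * which is the "normal miss" assumption made in the mail.
 */
static inline struct paca_sketch *paca_from_extlb0(uint64_t *frame)
{
    return (struct paca_sketch *)((char *)frame -
                                  offsetof(struct paca_sketch, extlb));
}
```

In the handler itself this would be a single subtract of a constant from the
frame pointer, with no load or mfspr.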

Is your linear mapping bolted? If it is, you may be able to cut out most
of the save/restore stuff (SRR0, SRR1, ...), since with a normal walk you
won't take nested misses.
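
For reference, a straight table walk is just a chain of dependent loads, one
per level. If every table is reached through a bolted linear mapping, none of
those loads can itself miss in the TLB. The sketch below is a user-space
model, not the kernel's walker: the 9-bit index width per level, `table_t`,
`level_index`, and `walk4` are assumptions made for illustration, and the
real powerpc 64-bit 4K layout uses different index widths.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT  12                       /* 4K pages, as in the patch subject */
#define LEVEL_BITS  9                        /* assumed index width per level */
#define LEVEL_MASK  ((1u << LEVEL_BITS) - 1)

/* One table at any level: an array of pointers to the next level. */
typedef struct table { void *slot[1 << LEVEL_BITS]; } table_t;

/* Index into the table at the given level (0 = pte ... 3 = pgd). */
static inline unsigned level_index(uint64_t ea, int level)
{
    return (unsigned)(ea >> (PAGE_SHIFT + level * LEVEL_BITS)) & LEVEL_MASK;
}

/* Full 4-level walk: pgd -> pud -> pmd -> pte. Returns NULL on a hole. */
static void *walk4(const table_t *pgd, uint64_t ea)
{
    const table_t *t = pgd;

    for (int level = 3; level > 0; level--) {
        t = t->slot[level_index(ea, level)];   /* one dependent load per level */
        if (!t)
            return NULL;
    }
    return t->slot[level_index(ea, 0)];
}
```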
 
> > Would it work/help to have a simple cache of the last pmd & address and
> > compare just that?
> 
> Maybe.
> 
> It would still slow down the case where you miss that cache -- not by as
> much as a virtual page table miss (and it wouldn't compete for TLB entries
> with actual user pages), but it would happen more often, since you'd only be
> able to cache one pmd.
>
> > Maybe in a SPRG or a known cache-hot location like
> > the PACA, in a line that we already load anyway?
> 
> A cache access is faster than a SPRG access on our chips (plus we
> don't have many to spare, especially if we want to avoid swapping SPRG4-7 on
> guest entry/exit in KVM), so I'd favor putting it in the PACA.
> 
> I'll try this stuff out and see what helps.
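
The single-entry "last pmd" cache discussed above could look roughly like
this in C. It is a sketch under stated assumptions, not code from the patch
series: `struct pmd_cache`, `lookup_pte_page`, `PMD_REGION_SHIFT`, and the
`demo_walk` stand-in for the full walk are all hypothetical names, and the
21-bit region size assumes 4K pages with 512 PTEs per PTE page.

```c
#include <stddef.h>
#include <stdint.h>

#define PMD_REGION_SHIFT 21   /* assumed: 4K pages * 512 PTEs per PTE page */

/* Would live in the PACA, in a cache line the handler already touches. */
struct pmd_cache {
    uint64_t tag;        /* ea >> PMD_REGION_SHIFT of the cached region */
    void    *pte_page;   /* PTE page for that region, NULL if empty */
};

typedef void *(*slow_walk_fn)(uint64_t ea);

/* Hit: one compare, no upper-level loads. Miss: full walk, then refill. */
static void *lookup_pte_page(struct pmd_cache *c, uint64_t ea,
                             slow_walk_fn slow_walk)
{
    uint64_t tag = ea >> PMD_REGION_SHIFT;

    if (c->pte_page && c->tag == tag)
        return c->pte_page;

    void *p = slow_walk(ea);
    if (p) {
        c->tag = tag;
        c->pte_page = p;
    }
    return p;
}

/* Stand-in for the real walk; counts calls so cache misses are visible. */
static int walk_count;
static char fake_pte_pages[8];

static void *demo_walk(uint64_t ea)
{
    walk_count++;
    return &fake_pte_pages[(ea >> PMD_REGION_SHIFT) & 7];
}
```

With only one entry, accesses that alternate between two pmd regions miss
every time, which is the "it would happen more often" cost mentioned above.
A real version would also have to invalidate the entry whenever the address
space or the upper-level tables change.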

Cool,

Cheers,
Ben.

Thread overview: 27+ messages
2011-05-18 21:04 [PATCH 1/7] powerpc/mm: 64-bit 4k: use page-sized PMDs Scott Wood
2011-05-18 21:05 ` [PATCH 2/7] powerpc/mm: 64-bit 4k: use a PMD-based virtual page table Scott Wood
2011-05-18 21:33   ` Benjamin Herrenschmidt
2011-05-20 20:57     ` Scott Wood
2011-05-20 22:15       ` Benjamin Herrenschmidt
2011-05-23 18:54         ` Scott Wood
2011-05-23 20:51           ` Benjamin Herrenschmidt [this message]
2011-05-23 23:31             ` Scott Wood
2011-05-24  2:52               ` Benjamin Herrenschmidt
2011-05-18 21:05 ` [PATCH 3/7] powerpc/mm: 64-bit tlb miss: get PACA from memory rather than SPR Scott Wood
2011-05-18 21:05 ` [PATCH 4/7] powerpc/mm: 64-bit: Don't load PACA in normal TLB miss exceptions Scott Wood
2011-05-18 21:05 ` [PATCH 5/7] powerpc/mm: 64-bit: don't handle non-standard page sizes Scott Wood
2011-05-18 21:36   ` Benjamin Herrenschmidt
2011-05-18 21:50     ` Scott Wood
2011-05-18 21:54       ` Benjamin Herrenschmidt
2011-05-18 21:05 ` [PATCH 6/7] powerpc/mm: 64-bit: tlb handler micro-optimization Scott Wood
2011-05-18 21:37   ` Benjamin Herrenschmidt
2011-05-18 21:51     ` Scott Wood
2011-05-18 21:54       ` Benjamin Herrenschmidt
2011-05-18 22:27         ` Scott Wood
2011-05-18 21:05 ` [PATCH 7/7] powerpc/e5500: set MMU_FTR_USE_PAIRED_MAS Scott Wood
2011-05-18 21:38   ` Benjamin Herrenschmidt
2011-05-18 21:52     ` Scott Wood
2011-05-18 21:58       ` Benjamin Herrenschmidt
2011-05-18 21:32 ` [PATCH 1/7] powerpc/mm: 64-bit 4k: use page-sized PMDs Benjamin Herrenschmidt
2011-05-18 21:46   ` Scott Wood
2011-05-18 21:52     ` Benjamin Herrenschmidt
