linuxppc-dev.lists.ozlabs.org archive mirror
From: Ayman El-Khashab <ayman@elkhashab.com>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@ozlabs.org
Subject: Re: ppc44x - how do i optimize driver for tlb hits
Date: Thu, 23 Sep 2010 17:35:16 -0500
Message-ID: <20100923223516.GA30033@crust.elkhashab.com>
In-Reply-To: <1285279264.5158.18.camel@pasglop>

On Fri, Sep 24, 2010 at 08:01:04AM +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2010-09-23 at 10:12 -0500, Ayman El-Khashab wrote:
> > I've implemented a working driver on my 460EX.  it allocates a couple
> > of buffers of 4MB each.  I have a custom memcmp algorithm in asm that
> > is extremely fast in user space, but 1/2 as fast when run on these
> > buffers.
> > 
> > my tests are showing that the algorithm seems to be memory bandwidth
> > bound.  my guess is that i am having tlb or cache misses (my algo
> > uses dcbt) that are slowing performance.  curiously when in user
> > space, i can affect the performance by small changes in the size of
> > the buffer, i.e. 4MB + 32B is fast, 4MB + 4K is much worse.
> > 
> > Can i adjust my driver code that is using kmalloc to make sure that
> > the ppc44x has 4MB tlb entries for these and that they stay put?
> 
> Anything you allocate with kmalloc() is going to be mapped by bolted
> 256M TLB entries, so there should be no TLB misses happening in the
> kernel case.
> 

Hi Ben, can you or somebody elaborate?  I saw the pinned TLB entry in
44x_mmu.c.  Perhaps I don't understand the code fully, but it appears to
map 256MB of "lowmem" into a pinned TLB entry.  I am not sure which
physical address range lowmem covers, but I assumed (possibly
incorrectly) that it is 0-256MB.  When I get the physical addresses for
my buffers after kmalloc, they are all within my DRAM but start at about
the 440MB mark.  I end up passing those physical addresses to my DMA
engine.
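
To make the address question concrete, here is a small user-space sketch
of the arithmetic only. It assumes a linear lowmem mapping at a
PAGE_OFFSET of 0xC0000000 and 256MB-aligned pinned chunks; both values,
and the helper names lowmem_virt and pin_chunk_index, are assumptions
for illustration and are not taken from 44x_mmu.c.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only; check the actual kernel config. */
#define ASSUMED_PAGE_OFFSET 0xC0000000u  /* assumed start of the linear map */
#define ASSUMED_PIN_SIZE    (1u << 28)   /* 256MB per pinned TLB entry */

/* Kernel virtual address of a physical address under a linear mapping. */
static uint32_t lowmem_virt(uint32_t phys)
{
    /* The sum must still fit in a 32-bit address space. */
    assert(phys <= UINT32_MAX - ASSUMED_PAGE_OFFSET);
    return phys + ASSUMED_PAGE_OFFSET;
}

/* Index of the 256MB-aligned chunk containing phys (0 = first 256MB). */
static uint32_t pin_chunk_index(uint32_t phys)
{
    return phys / ASSUMED_PIN_SIZE;
}
```

Under these assumptions a buffer at the 440MB mark (physical 0x1B800000)
falls in chunk 1, i.e. outside the first 256MB. Whether that chunk is
also covered by a pinned entry is exactly the open question.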

When my compare runs, it spends a huge amount of time in the assembly
code doing memory fetches, which makes me think that there are either
tons of cache misses (despite the prefetching) or the entries have been
purged from the TLB and must be obtained again.  As an experiment, I
disabled my cache prefetch code and the algo took forever.  Next I
altered the asm to process the same total amount of data, but by reading
a smaller region over and over, so that less is fetched from main
memory.  That executed very quickly.  From that I drew the conclusion
that the algorithm is memory bandwidth limited.
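
A minimal sketch of that second experiment in plain C, for anyone who
wants to reproduce the idea. This is not the asm routine; the names
read_windowed and time_windowed are hypothetical, and clock() is used
only as a coarse timer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/*
 * Read `total` 32-bit words, but only from the first `window` words of
 * buf, wrapping around as needed. A small window tends to stay in the
 * cache; a window as large as the buffer forces loads from main memory.
 */
static uint64_t read_windowed(const volatile uint32_t *buf,
                              size_t total, size_t window)
{
    uint64_t sum = 0;
    size_t pos = 0;

    assert(window > 0);
    for (size_t i = 0; i < total; i++) {
        sum += buf[pos];        /* volatile keeps the load from being elided */
        if (++pos == window)
            pos = 0;            /* wrap back to the start of the window */
    }
    return sum;
}

/* CPU seconds taken by one read_windowed() pass; the sum goes to sum_out. */
static double time_windowed(const uint32_t *buf, size_t total,
                            size_t window, uint64_t *sum_out)
{
    clock_t start = clock();

    *sum_out = read_windowed(buf, total, window);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_windowed() with the same total but window values of, say,
a few KB and the full 4MB gives the two cases compared above. If the
small window is much faster for the same number of loads, the loop is
limited by memory fetches rather than by the compare itself.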

In a standalone configuration (i.e. algorithm just using user memory,
everything else identical), the speedup is 2-3x.  So the limitation 
is not a hardware limit, it must be something that is happening when
I execute the loads.  (it is a compare algorithm, so it only does
loads).

Thanks
Ayman
