linuxppc-dev.lists.ozlabs.org archive mirror
From: Ayman El-Khashab <ayman@elkhashab.com>
To: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Cc: linuxppc-dev@ozlabs.org
Subject: Re: ppc44x - how do i optimize driver for tlb hits
Date: Fri, 24 Sep 2010 08:08:51 -0500	[thread overview]
Message-ID: <20100924130851.GA14016@crust.elkhashab.com> (raw)
In-Reply-To: <20100924103034.GA27958@zod.rchland.ibm.com>

On Fri, Sep 24, 2010 at 06:30:34AM -0400, Josh Boyer wrote:
> On Fri, Sep 24, 2010 at 02:43:52PM +1000, Benjamin Herrenschmidt wrote:
> >> The DMA is what I use in the "real world case" to get data into and out 
> >> of these buffers.  However, I can disable the DMA completely and do only
> >> the kmalloc.  In this case I still see the same poor performance.  My
> >> prefetching is part of my algo using the dcbt instructions.  I know the
> >> instructions are effective b/c without them the algo is much less 
> >> performant.  So yes, my prefetches are explicit.
> >
> >Could be some "effect" of the cache structure, L2 cache, cache geometry
> >(number of ways etc...). You might be able to alleviate that by changing
> >the "stride" of your prefetch.

My original theory was that it was taking lots of cache misses.  But since
the algorithm runs fast standalone and uses large buffers (4MB), most of
the cache gets flushed and replaced with my data anyway.  The cache is 32K,
8-way, 32 bytes/line, and I've crafted the algorithm around those parameters.

> >
> >Unfortunately, I'm not familiar enough with the 440 micro architecture
> >and its caches to be able to help you much here.
> 
> Also, doesn't kmalloc have a limit to the size of the request it will
> let you allocate?  I know in the distant past you could allocate 128K
> with kmalloc, and 2M with an explicit call to get_free_pages.  Anything
> larger than that had to use vmalloc.  The limit might indeed be higher
> now, but a 4MB kmalloc buffer sounds very large, given that it would be
> contiguous pages.  Two of them even less so.

I thought so too, but at least in the current implementation we found
empirically that we could kmalloc up to, but no more than, 4MB.  We have
also tried allocating the buffers in user memory, pinning them with
"get_user_pages", and building a scatter-gather list.  The compare code
performed no better that way.
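[Editor's note: for context, the get_user_pages path mentioned above looked roughly like this with the 2.6.3x-era API.  This is a hedged sketch, not the author's driver; uaddr, nr_pages, pages[] and sg[] are placeholder variables assumed to be set up elsewhere.]

```c
/* Pin nr_pages of a user buffer and describe them with a scatterlist,
 * using the get_user_pages() signature of 2.6.3x kernels. */
down_read(&current->mm->mmap_sem);
npinned = get_user_pages(current, current->mm, uaddr & PAGE_MASK,
                         nr_pages, 1 /* write */, 0 /* force */,
                         pages, NULL);
up_read(&current->mm->mmap_sem);

if (npinned > 0) {
    sg_init_table(sg, npinned);
    for (i = 0; i < npinned; i++)
        sg_set_page(&sg[i], pages[i], PAGE_SIZE, 0);
}
```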

I suppose another option is to use the kernel profiling support I always
see in the config but have never used.  Is that a viable way to figure out
what is happening here?
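[Editor's note: the usual in-kernel profiler of that era was driven through oprofile; a hedged sketch of a session, assuming oprofile support is built in and an uncompressed vmlinux with symbols is available.  The workload name is a placeholder.]

```
opcontrol --vmlinux=/path/to/vmlinux   # kernel image with symbols
opcontrol --start
./run_compare_workload                 # placeholder for the test case
opcontrol --stop
opreport -l                            # per-symbol breakdown, incl. cache events
```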

ayman

Thread overview: 11+ messages
2010-09-23 15:12 ppc44x - how do i optimize driver for tlb hits Ayman El-Khashab
2010-09-23 22:01 ` Benjamin Herrenschmidt
2010-09-23 22:35   ` Ayman El-Khashab
2010-09-24  1:07     ` Benjamin Herrenschmidt
2010-09-24  2:58       ` Ayman El-Khashab
2010-09-24  4:43         ` Benjamin Herrenschmidt
2010-09-24 10:30           ` Josh Boyer
2010-09-24 13:08             ` Ayman El-Khashab [this message]
2010-09-24 22:11               ` Benjamin Herrenschmidt
2010-10-03 19:13                 ` Ayman El-Khashab
2010-10-03 22:38                   ` Benjamin Herrenschmidt
