Re: TTM placement & caching issue/questions

From: Thomas Hellstrom <thellstrom@vmware.com>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@ozlabs.org, dri-devel@lists.freedesktop.org,
	Daniel Vetter <daniel@ffwll.ch>
Subject: Re: TTM placement & caching issue/questions
Date: Thu, 4 Sep 2014 12:23:48 +0200
Message-ID: <54083DB4.1050009@vmware.com>
In-Reply-To: <1409823823.4246.61.camel@pasglop>

On 09/04/2014 11:43 AM, Benjamin Herrenschmidt wrote:
> On Thu, 2014-09-04 at 11:34 +0200, Daniel Vetter wrote:
>> On Thu, Sep 04, 2014 at 09:44:04AM +0200, Thomas Hellstrom wrote:
>>> Last time I tested, (and it seems like Michel is on the same track),
>>> writing with the CPU to write-combined memory was substantially faster
>>> than writing to cached memory, with the additional side-effect that CPU
>>> caches are left unpolluted.
>>>
>>> Moreover (although only tested on Intel's embedded chipsets), texturing
>>> from cpu-cache-coherent PCI memory was a real GPU performance hog
>>> compared to texturing from non-snooped memory. Hence, whenever a buffer
>>> could be classified as GPU-read-only (or almost at least), it should be
>>> placed in write-combined memory.
>> Just a quick comment since this explicitly refers to Intel chips: On
>> desktop/laptop chips with the big shared l3/l4 caches it's the other way
>> round. Cached uploads are substantially faster than wc and not using
>> coherent access is a severe perf hit for texturing. I guess the hw guys
>> worked really hard to hide the snooping costs so that the gpu can benefit
>> from the massive bandwidth these caches can provide.
> This is similar to modern POWER chips as well. We have pretty big L3's
> (though not technically shared they are in a separate quadrant and we
> have a shared L4 in the memory buffer) and our fabric is generally
> optimized for cachable/coherent access performance. In fact, we only
> have so many credits for NC accesses on the bus...
>

Thanks to both of you for the update. I haven't dealt with real hardware
for a while.

/Thomas
