public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
From: Dave Gordon <david.s.gordon@intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>,
	Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
	intel-gfx@lists.freedesktop.org,
	Akash Goel <akash.goel@intel.com>,
	Mika Kuoppala <mika.kuoppala@intel.com>
Subject: Re: [PATCH] drm/i915: Use SSE4.1 movntdqa to accelerate reads from WC memory
Date: Mon, 18 Jul 2016 12:57:09 +0100	[thread overview]
Message-ID: <578CC415.202@intel.com> (raw)
In-Reply-To: <20160718113501.GH21839@nuc-i3427.alporthouse.com>

On 18/07/16 12:35, Chris Wilson wrote:
> On Mon, Jul 18, 2016 at 12:15:32PM +0100, Tvrtko Ursulin wrote:
>> I am not sure about this, but looking at the raid6 for example, it
>> has a lot more annotations in cases like this.
>>
>> It seems to be telling the compiler which memory ranges does each
>> instruction access, and also uses "asm volatile" - whether or not
>> that is really needed I don't know.
>>
>> For example:
>>                  asm volatile("movdqa %0,%%xmm4" :: "m" (dptr[z0][d]));
>>
>> And:
>>                  asm volatile("movdqa %%xmm4,%0" : "=m" (q[d]));
>>
>> Each one is telling the compiler the instruction is either reading
>> or writing respectively from a certain memory address.
>>
>> You don't have any of that, and don't even specify nothing as an
>> output parameter so I am not sure if your code is safe.
>
> The asm is correct. We do not modify either of the two pointers which we
> pass in via register inputs, but the memory behind them - hence the memory
> clobber.

This is a choice of how much we let the compiler decide about 
addressing, and how much we tell it about what the asm code really does. 
The examples above get the compiler to generate *any* suitable 
addressing mode for each specific location involved in the transfers, so 
the compiler knows a lot about what's happening and can track where each 
datum comes from and goes to.

OTOH Chris' code

+        asm("movntdqa   (%0), %%xmm0\n"
+            "movntdqa 16(%0), %%xmm1\n"
+            "movntdqa 32(%0), %%xmm2\n"
+            "movntdqa 48(%0), %%xmm3\n"
+            "movaps %%xmm0,   (%1)\n"
+            "movaps %%xmm1, 16(%1)\n"
+            "movaps %%xmm2, 32(%1)\n"
+            "movaps %%xmm3, 48(%1)\n"
+            :: "r" (src), "r" (dst) : "memory");

- doesn't need "volatile" because asm statements that have no output 
operands are implicitly volatile.

- makes the compiler give us the source and destination *addresses* in a 
register each; beyond that, it doesn't know what we're doing with them, 
so the third ("clobbers") parameter has to say "memory" i.e. treat *all* 
memory contents as unknown after this.

[[From GCC docs: The "memory" clobber tells the compiler that the 
assembly code performs memory reads or writes to items other than those 
listed in the input and output operands (for example, accessing the 
memory pointed to by one of the input parameters). To ensure memory 
contains correct values, GCC may need to flush specific register values 
to memory before executing the asm. Further, the compiler does not 
assume that any values read from memory before an asm remain unchanged 
after that asm; it reloads them as needed. Using the "memory" clobber 
effectively forms a read/write memory barrier for the compiler.]]

BTW, should we not tell it we've *also* clobbered %xmm[0-3]?

So they're both correct, just taking different approaches. I don't know 
which would give the best performance for this specific case.

.Dave.

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2016-07-18 11:57 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-16 12:07 [PATCH] drm/i915: Use SSE4.1 movntqda to accelerate reads from WC memory Chris Wilson
2016-07-16 12:32 ` ✗ Ro.CI.BAT: failure for " Patchwork
2016-07-16 12:33 ` [PATCH] " Chris Wilson
2016-07-16 12:45 ` [PATCH v2] " Chris Wilson
2016-07-16 12:53   ` [PATCH v3] " Chris Wilson
2016-07-16 15:44     ` [PATCH v5] drm/i915: Use SSE4.1 movntdqa " Chris Wilson
2016-07-17  3:21       ` Matt Turner
2016-07-17  8:12         ` Chris Wilson
2016-07-18  9:31       ` Tvrtko Ursulin
2016-07-18 10:01         ` Chris Wilson
2016-07-18 10:07           ` [PATCH] " Chris Wilson
2016-07-18 10:29             ` Chris Wilson
2016-07-18 11:15             ` Tvrtko Ursulin
2016-07-18 11:35               ` Chris Wilson
2016-07-18 11:57                 ` Dave Gordon [this message]
2016-07-18 12:56                   ` Tvrtko Ursulin
2016-07-18 13:46                     ` Tvrtko Ursulin
2016-07-18 15:06                       ` Tvrtko Ursulin
2016-07-18 16:05                         ` Dave Gordon
2016-07-19 10:26                         ` Tvrtko Ursulin
2016-07-18 13:48                     ` Dave Gordon
2016-07-18 12:07                 ` Tvrtko Ursulin
2016-07-19  6:50             ` Daniel Vetter
2016-07-16 13:12 ` ✗ Ro.CI.BAT: failure for drm/i915: Use SSE4.1 movntqda to accelerate reads from WC memory (rev2) Patchwork
2016-07-16 13:34 ` ✓ Ro.CI.BAT: success for drm/i915: Use SSE4.1 movntqda to accelerate reads from WC memory (rev3) Patchwork
2016-07-16 16:12 ` ✗ Ro.CI.BAT: failure for drm/i915: Use SSE4.1 movntqda to accelerate reads from WC memory (rev4) Patchwork
2016-07-18 10:59 ` ✗ Ro.CI.BAT: failure for drm/i915: Use SSE4.1 movntqda to accelerate reads from WC memory (rev5) Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=578CC415.202@intel.com \
    --to=david.s.gordon@intel.com \
    --cc=akash.goel@intel.com \
    --cc=chris@chris-wilson.co.uk \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=mika.kuoppala@intel.com \
    --cc=tvrtko.ursulin@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox