linux-assembly.vger.kernel.org archive mirror
* 4x4 single-precision matrix product with SSE
@ 2011-03-11 22:49 Nicolas Bock
  2011-03-12  8:32 ` Frederic Marmond
       [not found] ` <AANLkTimCWmanFU19admtg5q18HvCOxrdjm+9XWFT-0Zm@mail.gmail.com>
  0 siblings, 2 replies; 5+ messages in thread
From: Nicolas Bock @ 2011-03-11 22:49 UTC
  To: linux-assembly

Hello list,

I am writing an assembly function that multiplies two 4x4 single-precision
matrices. I wrote two versions, one using SSE, the other using SSE4.1. What
surprised me is that the SSE4.1 version fails to beat the SSE version;
it is in fact slightly slower.

Is this the right place to ask for help? If anyone is interested, I can
post some code, which might clarify the situation a bit.

If this is not the right place, please ignore me...

nick



Thread overview: 5+ messages
2011-03-11 22:49 4x4 single-precision matrix product with SSE Nicolas Bock
2011-03-12  8:32 ` Frederic Marmond
     [not found] ` <AANLkTimCWmanFU19admtg5q18HvCOxrdjm+9XWFT-0Zm@mail.gmail.com>
2011-03-13 20:23   ` Nicolas Bock
     [not found]     ` <AANLkTim-ZqzJ+2q+u=7+yRjzTf7FQDcuu-YDN=RV0H6X@mail.gmail.com>
     [not found]       ` <AANLkTimny0PkR0bYBjKgaH4j=_=2aL=rt=YcDjWeQCG6@mail.gmail.com>
2011-03-14 15:43         ` Fwd: " Nicolas Bock
2012-09-05 19:13           ` Nicolas Bock
