Re: hint? - Marcin Kościelnicki


From: "Marcin Kościelnicki" <markosc@interia.pl>
To: linux-assembly@vger.kernel.org
Subject: Re: hint?
Date: Fri, 15 Sep 2006 00:05:59 +0200
Message-ID: <200609150005.59939.markosc@interia.pl>
In-Reply-To: <20060914195852.88791.qmail@web50313.mail.yahoo.com>

> Hi.
> I read a hint in a magazine that if consecutive instructions
> do not use the same registers, a newer processor (mine is
> quite old, a PIII (Coppermine) 600 MHz) can execute them
> simultaneously, i.e.:
>
> mov %eax,%ebx
> mov %ecx,%edx
>
> can be done at almost the same time. I checked it with
> an extremely simple example:
>
> movl $3,%ebx
> movl $3,%edx
> movl $0b111111111111111111111111111,%ecx
>
> loop:
> sall $2,%eax
> orl  %ebx,%eax
> sall $2,%edi
> orl  %edx,%edi
>    loop loop
>
> runs faster (contrary to the claim in the magazine) than
>
> movl $3,%ebx
> movl $3,%edx
> movl $0b111111111111111111111111111,%ecx
>
> loop:
> sall $2,%eax
> sall $2,%edi
> orl  %ebx,%eax
> orl  %edx,%edi
>    loop loop
>
> So what is the real answer?

Doing mutually independent computations [that is, ones that can be executed 
in parallel without changing the outcome] simultaneously does indeed speed up 
the computation. HOWEVER, modern x86s look ahead several instructions at a 
time and can execute some instructions out of order when that provably does 
not change the result. Since your code snippets are quite short, the CPU sees 
the whole thing at once and effectively rearranges the code into the parallel 
version anyway. So basically, both versions should execute in about the same 
time, with minor differences due to internal chip details that vary between 
CPU models.

Scheduling operations for mutual independence is important when you're dealing 
with bigger pieces of code. If your function does the same long computation 
on 3 sets of data [and you can manage to fit 3 simultaneous computations in 
x86's small register set], it's much faster to interleave these three 
computations than to do them one after another.
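
To make the interleaving idea concrete, here is a minimal sketch in C rather 
than assembly. The mixing step (FNV-1a style) and the names mix, hash_one and 
hash_three are assumptions chosen only for illustration; any long chain where 
each step depends on the previous one would do. Whether it is actually faster 
depends on the CPU and on what the compiler does with it, so measure before 
relying on it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One step of a serial dependency chain: the result feeds the next step.
   (32-bit FNV-1a style mixing, picked purely as an illustration.) */
static inline uint32_t mix(uint32_t h, uint8_t byte)
{
    return (h ^ byte) * 16777619u;
}

/* Baseline: a single chain, every step waits for the previous one. */
uint32_t hash_one(const uint8_t *data, size_t n)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < n; i++)
        h = mix(h, data[i]);
    return h;
}

/* Interleaved: three independent chains advance in the same iteration,
   so independent work is available while each multiply is in flight.
   All three inputs are assumed to have the same length n. */
void hash_three(const uint8_t *a, const uint8_t *b, const uint8_t *c,
                size_t n, uint32_t out[3])
{
    assert(out != NULL);
    uint32_t ha = 2166136261u, hb = ha, hc = ha;
    for (size_t i = 0; i < n; i++) {
        ha = mix(ha, a[i]);
        hb = mix(hb, b[i]);
        hc = mix(hc, c[i]);
    }
    out[0] = ha;
    out[1] = hb;
    out[2] = hc;
}
```

hash_three produces exactly the same three values as three separate calls to 
hash_one; only the order in which the steps are issued differs. Three chains 
need three live accumulators plus pointers and a counter, which is about what 
fits in the 32-bit x86 register set.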

> Second, I always try not to use memory because
> it's extremely slow.
>
> But again, if I exchange the values of two registers
> using temporary memory:
>
> movl %eax,temp
> movl %ebx,%eax
> movl temp,%ebx
>
> it works faster than:
>
> movl %eax,%edx
> movl %ebx,%eax
> movl %edx,%ebx
>
> Why? Maybe Linux is doing something in the meantime
> ...?

Where did you get that result? On my machine, the version using memory is 1.5x 
slower than register-only, as expected.

> The amazing thing is that
>
> xchgl %eax,%ebx
>
> is the slowest :) Why?

See, x86 is 20 years old. Some things that seemed like a good idea 20 years 
ago proved to be teh suck by now. One major category of sucky things in x86 
are useless instructions. One of them is xchg for register-to-register 
exchange. It is slow because Intel only bothers to speed up instructions 
that people actually use. And no one uses it, since code rarely swaps contents 
of registers around. The variables usually just stay in one register. And 
there's no reason to move them around later -- what's gained by freeing this 
register only to occupy another one?

Note: xchg for memory is an entirely different beast. It is way slower than you 
might assume. That's because xchg for memory is actually useful, but only for 
one thing: as an atomic operation for implementing locking mechanisms for 
multi-threaded and/or multi-processor stuff. And locking needs special 
measures to be taken so that this location is always consistent among all CPU 
caches. And that is slow.
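
For illustration, the locking use looks roughly like this when written with 
C11 atomics instead of raw assembly. The type xchg_lock and its helper 
functions are hypothetical names invented for this sketch, not an existing 
API. On x86, compilers commonly turn atomic_exchange into an xchg with a 
memory operand, but that is up to the compiler.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal spinlock: 0 = free, 1 = held. */
typedef struct {
    atomic_int held;
} xchg_lock;

static inline void xchg_lock_init(xchg_lock *l)
{
    assert(l != NULL);
    atomic_init(&l->held, 0);
}

/* Atomically store 1 and fetch the old value in a single step.
   If the old value was 0, the lock was free and we now own it. */
static inline bool xchg_lock_try(xchg_lock *l)
{
    return atomic_exchange_explicit(&l->held, 1, memory_order_acquire) == 0;
}

/* Spin until the exchange observes the lock as free. */
static inline void xchg_lock_acquire(xchg_lock *l)
{
    while (!xchg_lock_try(l))
        ; /* busy-wait; each attempt is a full atomic exchange */
}

/* Releasing needs only a plain atomic store, not an exchange. */
static inline void xchg_lock_release(xchg_lock *l)
{
    atomic_store_explicit(&l->held, 0, memory_order_release);
}
```

Every attempt in the spin loop is an atomic exchange on the shared location, 
which is exactly the slow, cache-coherent operation described above, so a 
lock like this is only worth it for very short critical sections.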

> I really need speed in my programs, that's why I'm looking
> for any optimization...
>
> Lukas

Marcin Kościelnicki

