From: Ralf Baechle <ralf@linux-mips.org>
To: cee1 <fykcee1@gmail.com>
Cc: linux-mips@linux-mips.org, Chen Jie <chenj@lemote.com>
Subject: Re: [v5] MIPS: lib: csum_partial: more instruction paral
Date: Mon, 30 Mar 2015 22:10:15 +0200
Message-ID: <20150330201015.GA3757@linux-mips.org>
In-Reply-To: <1427389644-92793-1-git-send-email-fykcee1@gmail.com>
On Fri, Mar 27, 2015 at 01:07:24AM +0800, cee1 wrote:
> From: Chen Jie <chenj@lemote.com>
>
> Computing the sum introduces true data dependencies. This patch removes
> some of them, and hence increases instruction-level parallelism.
>
> This patch brings at most 50% csum performance gain on Loongson 3a
> processor in our test.
>
> One example about how this patch works is in CSUM_BIGCHUNK1:
> // ** original ** vs ** patch applied **
> ADDC(sum, t0) ADDC(t0, t1)
> ADDC(sum, t1) ADDC(t2, t3)
> ADDC(sum, t2) ADDC(sum, t0)
> ADDC(sum, t3) ADDC(sum, t2)
>
> In the original implementation, each ADDC(sum, ...) depends on the sum
> value updated by the previous ADDC (as a source operand).
>
> With this patch applied, the first two ADDC operations are independent,
> hence can be executed simultaneously if possible.
>
> Another example is in the "copy and sum calculating chunk":
> // ** original ** vs ** patch applied **
> STORE(t0, UNIT(0) ... STORE(t0, UNIT(0) ...
> ADDC(sum, t0) ADDC(t0, t1)
> STORE(t1, UNIT(1) ... STORE(t1, UNIT(1) ...
> ADDC(sum, t1) ADDC(sum, t0)
> STORE(t2, UNIT(2) ... STORE(t2, UNIT(2) ...
> ADDC(sum, t2) ADDC(t2, t3)
> STORE(t3, UNIT(3) ... STORE(t3, UNIT(3) ...
> ADDC(sum, t3) ADDC(sum, t2)
>
> With this patch applied, ADDC and the **next next** ADDC are independent.
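
For reference, the CSUM_BIGCHUNK1 reordering quoted above can be sketched
in C. This is only an illustration under stated assumptions: addc() models
ADDC as a 32-bit add with end-around carry, and the names addc,
csum_serial, csum_paired and check_orders_agree are hypothetical, not
taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed model of ADDC: 32-bit add, then fold the carry-out back in. */
static uint32_t addc(uint32_t sum, uint32_t v)
{
    sum += v;
    return sum + (sum < v);   /* (sum < v) is 1 exactly when the add wrapped */
}

/* Original order: each add needs the sum produced by the previous add. */
static uint32_t csum_serial(uint32_t sum, const uint32_t t[4])
{
    sum = addc(sum, t[0]);
    sum = addc(sum, t[1]);
    sum = addc(sum, t[2]);
    sum = addc(sum, t[3]);
    return sum;
}

/* Patched order: t0+t1 and t2+t3 do not depend on each other or on sum. */
static uint32_t csum_paired(uint32_t sum, const uint32_t t[4])
{
    uint32_t a = addc(t[0], t[1]);
    uint32_t b = addc(t[2], t[3]);

    sum = addc(sum, a);
    sum = addc(sum, b);
    return sum;
}

/* Both orders must agree, including when the adds produce carries. */
static void check_orders_agree(void)
{
    const uint32_t t[4] = { 0xFFFFFFFFu, 1u, 0x80000000u, 0x80000000u };

    assert(csum_serial(0, t) == csum_paired(0, t));
}
```

The serial version is a chain of four dependent adds, while the paired
version has a dependent chain of three with two adds free to issue
together. Whether that matters in practice depends on the pipeline, which
is the question raised below.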
This is interesting because even CPUs as old as the R2000 have a pipeline
bypass which allows an instruction to use a result written to a register
by the immediately preceding instruction.
Can you explain why this patch is so beneficial for Loongson 3A?

Thanks,

  Ralf