public inbox for linux-arm-kernel@lists.infradead.org
From: jszhang@marvell.com (Jisheng Zhang)
To: linux-arm-kernel@lists.infradead.org
Subject: lib/GCD.c regression on arm
Date: Mon, 18 Jul 2016 20:15:49 +0800
Message-ID: <20160718201549.61f135c8@xhacker>
In-Reply-To: <20160715135109.GA2657@linux-Precision-WorkStation-T5500>

Dear Cheah,

On Fri, 15 Jul 2016 21:51:10 +0800 Cheah Kok Cheong wrote:

> Commit fff7fb0b2d90 ("lib/GCD.c: use binary GCD algorithm instead of Euclidean")
> replaced the Euclidean algorithm totally with the Binary algorithm.
> Two variants were provided and selected via Kconfig depending on whether
> a fast __ffs (find least significant set bit) instruction is available.
> 
> For arm v5 and above the fast __ffs version is used as evident in
> arch/arm/mm/Kconfig.
> 
> I benchmarked the gcd performance using the code provided in the commit
> with a Cortex-A9 based Mediatek MT6577. Three runs at different settings
> were used.
> 
> The binary algorithm using fast __ffs is slower than the Euclidean
> algorithm. The non-ffs version [even/odd variant] performs comparably
> to the Euclidean algorithm.
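
For readers following along, here is a minimal userspace sketch of the two
approaches being compared: Euclidean GCD and a binary GCD that strips
trailing zeros in one step. It is not the kernel's lib/gcd.c. The function
names are hypothetical, and __builtin_ctzl() (GCC/Clang) is assumed here as
a stand-in for the kernel's __ffs().

```c
/* Euclidean GCD: repeated remainder, one integer division per step. */
static unsigned long gcd_euclid(unsigned long a, unsigned long b)
{
	while (b) {
		unsigned long t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/*
 * Binary GCD, "fast ffs" flavour: no division, only shifts and subtracts.
 * __builtin_ctzl() counts trailing zero bits; using it in place of the
 * kernel's __ffs() is an assumption of this sketch.
 */
static unsigned long gcd_binary_ffs(unsigned long a, unsigned long b)
{
	unsigned long r = a | b;
	unsigned int shift;

	if (!a || !b)
		return r;		/* gcd(x, 0) == x */

	shift = __builtin_ctzl(r);	/* power of two common to a and b */
	a >>= __builtin_ctzl(a);	/* make a odd */
	b >>= __builtin_ctzl(b);	/* make b odd */

	while (a != b) {
		if (a < b) {
			unsigned long t = a;

			a = b;
			b = t;
		}
		a -= b;			/* odd - odd: even and nonzero */
		a >>= __builtin_ctzl(a);	/* back to odd in one step */
	}
	return a << shift;		/* restore the common power of two */
}
```

Both functions should agree on every input, for example 6 for (12, 18).
Which one is faster depends on the relative cost of the divide and the
count-trailing-zeros instructions on the CPU in question, which is what the
measurements in this thread try to establish.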

Interesting. Using the code in the commit, I get the following results
on a Cortex-A53 (CA53) platform.

Built with an aarch64 toolchain, -O2 -mcpu=cortex-a53:

~ # /a53 -r 500000 -n 10
gcd0: elapsed 10170
gcd1: elapsed 11340
gcd2: elapsed 13590
gcd3: elapsed 11700
gcd4: elapsed 14230
PASS

Built with an armhf toolchain, -O2 -mcpu=cortex-a53:

~ # /a53_32 -r 500000 -n 10
gcd0: elapsed 9490
gcd1: elapsed 10220
gcd2: elapsed 10790
gcd3: elapsed 10270
gcd4: elapsed 10850
PASS
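
To make the two A53 runs easier to compare, the elapsed values can be
expressed relative to gcd0 of the same run. The sketch below only does that
arithmetic on the numbers quoted above. The helper names are hypothetical,
and it makes no assumption about which algorithm each gcdN label denotes in
the commit's test program.

```c
#include <stdio.h>

/* Elapsed values copied from the two runs above (gcd0..gcd4). */
static const unsigned long a53_aarch64[] = { 10170, 11340, 13590, 11700, 14230 };
static const unsigned long a53_armhf[]   = {  9490, 10220, 10790, 10270, 10850 };

/* Percentage by which value exceeds baseline (negative if it is lower). */
static double pct_over_baseline(double baseline, double value)
{
	return (value - baseline) / baseline * 100.0;
}

/* Print every gcdN (N >= 1) relative to gcd0 of the same run. */
static void report(const char *label, const unsigned long *elapsed, int n)
{
	printf("%s\n", label);
	for (int i = 1; i < n; i++)
		printf("  gcd%d vs gcd0: %+.1f%%\n", i,
		       pct_over_baseline((double)elapsed[0], (double)elapsed[i]));
}
```

Calling report("aarch64", a53_aarch64, 5) and report("armhf", a53_armhf, 5)
shows that, on these numbers, every gcd1..gcd4 result is higher than gcd0:
by roughly 11.5% to 39.9% in the aarch64 build and roughly 7.7% to 14.3% in
the armhf build.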


> 
> It will be interesting to see whether this is also true for other platforms
> with ARMv5 and above. Hopefully others will do some testing.
> If this is the case, then we should "select CPU_NO_EFFICIENT_FFS" in our
> Kconfig.
> 
> Thanks.
> Best Regards,
> Cheah
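
For context on the "Binary no ffs" column below: as I understand it,
selecting CPU_NO_EFFICIENT_FFS switches lib/gcd.c to an even/odd variant
that normalizes with one-bit shifts instead of __ffs(). A minimal userspace
sketch of that style follows. It is modelled on the idea rather than copied
from the kernel, and the function name is hypothetical.

```c
/*
 * Binary GCD, even/odd flavour: no count-trailing-zeros instruction needed.
 * r is the lowest set bit of a|b, i.e. the power of two common to both.
 * Once normalized, a and b are kept as odd multiples of r.
 */
static unsigned long gcd_binary_noffs(unsigned long a, unsigned long b)
{
	unsigned long r = a | b;

	if (!a || !b)
		return r;		/* gcd(x, 0) == x */

	r &= -r;			/* isolate the lowest set bit */

	while (!(b & r))		/* shift b down to an odd multiple of r */
		b >>= 1;
	if (b == r)
		return r;

	for (;;) {
		while (!(a & r))	/* shift a down to an odd multiple of r */
			a >>= 1;
		if (a == r)
			return r;
		if (a == b)
			return a;

		if (a < b) {
			unsigned long t = a;

			a = b;
			b = t;
		}
		a -= b;			/* now an even multiple of r */
		a >>= 1;
		if (a & r)		/* odd multiple: adding b makes it even */
			a += b;
		a >>= 1;		/* so this shift is always exact */
	}
}
```

For example, gcd_binary_noffs(12, 18) yields 6 and gcd_binary_noffs(48, 64)
yields 16. The loop trades the single __ffs() step for data-dependent
one-bit shifts, so it is the natural choice when __ffs() itself is slow or
emulated.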
> 
> cross compiled with '-O2'
> 
> Euclidean                 Binary with ffs           Binary no ffs
> 
> 
> gcd -r 50000 -n 10        
> 
> gcd0: elapsed 25766       gcd0: elapsed 25766       gcd0: elapsed 25765
> gcd1: elapsed 19994       gcd1: elapsed 20224       gcd1: elapsed 19843
> gcd2: elapsed 20071       gcd2: elapsed 20533       gcd2: elapsed 20151
> gcd3: elapsed 20070       gcd3: elapsed 20380       gcd3: elapsed 19919
> gcd4: elapsed 20148       gcd4: elapsed 20610       gcd4: elapsed 20151
> PASS                      PASS                      PASS
>            
> gcd0: elapsed 26690       gcd0: elapsed 26612       gcd0: elapsed 24381
> gcd1: elapsed 20224       gcd1: elapsed 20379       gcd1: elapsed 19765
> gcd2: elapsed 20224       gcd2: elapsed 20304       gcd2: elapsed 19842
> gcd3: elapsed 20148       gcd3: elapsed 20302       gcd3: elapsed 19919
> gcd4: elapsed 20301       gcd4: elapsed 20302       gcd4: elapsed 19919
> PASS                      PASS                      PASS
>                                          
> gcd0: elapsed 25842       gcd0: elapsed 26459       gcd0: elapsed 25457
> gcd1: elapsed 20454       gcd1: elapsed 20532       gcd1: elapsed 20225
> gcd2: elapsed 20378       gcd2: elapsed 20762       gcd2: elapsed 20226
> gcd3: elapsed 20378       gcd3: elapsed 20378       gcd3: elapsed 20148
> gcd4: elapsed 20532       gcd4: elapsed 20918       gcd4: elapsed 20301
> PASS                      PASS                      PASS
> 
> 
> gcd -r 1000 -n 100
>                                             
> gcd0: elapsed 245873      gcd0: elapsed 252957      gcd0: elapsed 245571
> gcd1: elapsed 191290      gcd1: elapsed 198345      gcd1: elapsed 192513
> gcd2: elapsed 192672      gcd2: elapsed 199579      gcd2: elapsed 192978
> gcd3: elapsed 191366      gcd3: elapsed 198728      gcd3: elapsed 192283
> gcd4: elapsed 193134      gcd4: elapsed 200884      gcd4: elapsed 193669
> PASS                      PASS                      PASS
> 
> gcd0: elapsed 245180      gcd0: elapsed 251113      gcd0: elapsed 250573
> gcd1: elapsed 191755      gcd1: elapsed 196800      gcd1: elapsed 194729
> gcd2: elapsed 192286      gcd2: elapsed 198654      gcd2: elapsed 195574
> gcd3: elapsed 191601      gcd3: elapsed 197344      gcd3: elapsed 194965
> gcd4: elapsed 193135      gcd4: elapsed 200268      gcd4: elapsed 197037
> PASS                      PASS                      PASS
> 
> gcd0: elapsed 243412      gcd0: elapsed 252189      gcd0: elapsed 247876
> gcd1: elapsed 190447      gcd1: elapsed 197192      gcd1: elapsed 193355
> gcd2: elapsed 192288      gcd2: elapsed 199042      gcd2: elapsed 193437
> gcd3: elapsed 190755      gcd3: elapsed 198957      gcd3: elapsed 193660
> gcd4: elapsed 192672      gcd4: elapsed 200346      gcd4: elapsed 194586
> PASS                      PASS                      PASS
> 
> 
> gcd -n 1000
> 
> gcd0: elapsed 2636655     gcd0: elapsed 2701340     gcd0: elapsed 2622109
> gcd1: elapsed 2055411     gcd1: elapsed 2153446     gcd1: elapsed 2053342
> gcd2: elapsed 2064420     gcd2: elapsed 2162496     gcd2: elapsed 2066503
> gcd3: elapsed 2055151     gcd3: elapsed 2163201     gcd3: elapsed 2055161
> gcd4: elapsed 2071591     gcd4: elapsed 2171636     gcd4: elapsed 2074488
> PASS                      PASS                      PASS
> 
> gcd0: elapsed 2636512     gcd0: elapsed 2719436     gcd0: elapsed 2613575
> gcd1: elapsed 2060157     gcd1: elapsed 2159284     gcd1: elapsed 2046187
> gcd2: elapsed 2069242     gcd2: elapsed 2163944     gcd2: elapsed 2056430
> gcd3: elapsed 2060436     gcd3: elapsed 2166796     gcd3: elapsed 2046933
> gcd4: elapsed 2074188     gcd4: elapsed 2176243     gcd4: elapsed 2065170
> PASS                      PASS                      PASS
> 
> gcd0: elapsed 2614949     gcd0: elapsed 2708342     gcd0: elapsed 2632962
> gcd1: elapsed 2044957     gcd1: elapsed 2157985     gcd1: elapsed 2055475
> gcd2: elapsed 2054496     gcd2: elapsed 2170720     gcd2: elapsed 2068926
> gcd3: elapsed 2044838     gcd3: elapsed 2167954     gcd3: elapsed 2055305
> gcd4: elapsed 2059033     gcd4: elapsed 2176002     gcd4: elapsed 2079856
> PASS                      PASS                      PASS
> 
> 


Thread overview: 3+ messages
2016-07-15 13:51 lib/GCD.c regression on arm Cheah Kok Cheong
2016-07-18 12:15 ` Jisheng Zhang [this message]
2016-07-19  6:52   ` Cheah Kok Cheong
