Buildroot Archive on lore.kernel.org
* [Buildroot] FFTW optimized for ARM and NEON FPU
@ 2015-02-24  3:57 guillaume william brs
  2015-02-24  9:44 ` Peter Kümmel
  2015-03-15 23:41 ` Yann E. MORIN
  0 siblings, 2 replies; 3+ messages in thread
From: guillaume william brs @ 2015-02-24  3:57 UTC (permalink / raw)
  To: buildroot

Hello,

I am working on a Cortex-A9 (Zynq SoC, mainly the ZC706).
FFT computations with this library were fairly slow with the
default configuration (20 MFlops).

I modified package/fftw.mk to optimize the library build when
an FPU is available (NEON in this case). This increased
throughput to 500-600 MFlops.

-- 
guillaume william bres-saix
software engineer
NIST - Time & Frequency div.
325 Broadway, Boulder, CO 80305.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: fftw.patch
Type: text/x-diff
Size: 665 bytes
Desc: not available
URL: <http://lists.busybox.net/pipermail/buildroot/attachments/20150223/11a9087c/attachment.bin>

* [Buildroot] FFTW optimized for ARM and NEON FPU
  2015-02-24  3:57 [Buildroot] FFTW optimized for ARM and NEON FPU guillaume william brs
@ 2015-02-24  9:44 ` Peter Kümmel
  2015-03-15 23:41 ` Yann E. MORIN
  1 sibling, 0 replies; 3+ messages in thread
From: Peter Kümmel @ 2015-02-24  9:44 UTC (permalink / raw)
  To: buildroot

> +FFTW_CONF_OPTS+= ARM_FLOAT_ABI=softfp

Shouldn't it use hardfp?
https://wiki.debian.org/ArmHardFloatPort/VfpComparison

On 24.02.2015 at 04:57, guillaume william brs wrote:
> Hello,
>
> I am working on a cortex-a9 (zynq SoC, mainly zc706)
> I faced fairly slow computations of FFT using this library
> with the default configuration (20MFlops).
>
> I modified package/fftw.mk to optimize the library
> compilation as long as an FPU is available,
> NEON in this case: this increased the number of MFlops to
> 500-600.

* [Buildroot] FFTW optimized for ARM and NEON FPU
  2015-02-24  3:57 [Buildroot] FFTW optimized for ARM and NEON FPU guillaume william brs
  2015-02-24  9:44 ` Peter Kümmel
@ 2015-03-15 23:41 ` Yann E. MORIN
  1 sibling, 0 replies; 3+ messages in thread
From: Yann E. MORIN @ 2015-03-15 23:41 UTC (permalink / raw)
  To: buildroot

Guillaume, All,

Sorry for the delay in the response...

On 2015-02-23 20:57 -0700, guillaume william brs spake thusly:
> I am working on a cortex-a9 (zynq SoC, mainly zc706)
> I faced fairly slow computations of FFT using this library with the default
> configuration (20MFlops).
> 
> I modified package/fftw.mk to optimize the library compilation as long as an
> FPU is available,
> NEON in this case: this increased the number of MFlops to 500-600.

There's recently been another attempt at making fftw more efficient:
    http://patchwork.ozlabs.org/patch/450317/

So, I decided to check whether we had other patches for fftw. And it
turned out there's yours! ;-)

So, I've had a look and have some comments, see below...

> -- 
> guillaume william bres-saix
> software engineer
> NIST - Time & Frequency div.
> 325 Broadway, Boulder, CO 80305.
> 

> --- fftw.mk	2015-02-23 19:20:04.849545269 -0700
> +++ fftw_fpu.mk	2015-02-23 19:26:47.598136607 -0700
> @@ -10,4 +10,17 @@
>  FFTW_LICENSE = GPLv2+
>  FFTW_LICENSE_FILES = COPYING
>  
> +ifeq ($(BR2_ARM_ENABLE_NEON),y)
> +FFTW_CONF_OPTS+= --enable-threads 

This should be conditional on BR2_TOOLCHAIN_HAS_THREADS; it has nothing
to do with NEON.
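
For illustration, the remark above could be sketched as follows. This is
a hedged sketch, not the resubmitted patch; it only reuses names that
already appear in this thread (FFTW_CONF_OPTS, BR2_TOOLCHAIN_HAS_THREADS,
--enable-threads).

```make
# Sketch: tie fftw thread support to the toolchain's thread support,
# independently of any NEON setting.
ifeq ($(BR2_TOOLCHAIN_HAS_THREADS),y)
FFTW_CONF_OPTS += --enable-threads
endif
```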

> +FFTW_CONF_OPTS+= --with-cpu=cortex-a9 
> +FFTW_CONF_OPTS+= --with-mode=arm 
> +FFTW_CONF_OPTS+= --enable-languages=c,c++ 

fftw does not seem to have such options...

> +FFTW_CONF_OPTS+= ARM_CPU_TYPE=cortex-a9 
> +FFTW_CONF_OPTS+= ARM_FLOAT_ABI=softfp

Those two variables are not used anywhere in fftw either...

> +FFTW_CONF_OPTS+= --disable-fortran 

OK.

> +FFTW_CONF_OPTS+= --enable-single

This one is handled in the patch I pointed to earlier.

> +FFTW_CONF_OPTS+= --enable-neon
> +FFTW_CONF_OPTS+= CFLAGS="$(TARGET_CFLAGS)-Ofast -mfpu=neon -mfloat-abi=softfp"

Hmm... -mfpu and -mfloat-abi should not be needed, since they are
already taken care of:
  - for internal toolchains, they are set as the defaults for gcc;
  - for external toolchains, they are automatically added by our wrapper.

Also, specifying them could conflict with the defaults.

I'll get this in order and resubmit...
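
As a rough sketch of the NEON part with the -mfpu and -mfloat-abi flags
left to the toolchain, something along these lines might do. It keeps the
condition and the way of passing CFLAGS from the quoted patch, which are
assumptions here and not necessarily what gets resubmitted. Grouping
--enable-single with --enable-neon is also only for illustration. Note
that the quoted line lacks a space between $(TARGET_CFLAGS) and -Ofast.

```make
# Sketch only: condition and option names are taken from the quoted patch.
ifeq ($(BR2_ARM_ENABLE_NEON),y)
# Assumption: single precision is selected here for illustration.
FFTW_CONF_OPTS += --enable-single --enable-neon
# Space added before -Ofast; no -mfpu/-mfloat-abi, since the toolchain
# defaults or the external toolchain wrapper already provide them.
FFTW_CONF_OPTS += CFLAGS="$(TARGET_CFLAGS) -Ofast"
endif
```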

Regards,
Yann E. MORIN.

> +endif
> +
>  $(eval $(autotools-package))


-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 223 225 172 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'
