All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jan Viktorin <viktorin@rehivetech.com>
To: thomas.monjalon@6wind.com
Cc: jerin.jacob@caviumnetworks.com, tomaszx.kulasek@intel.com,
	jianbo.liu@linaro.org, dev@dpdk.org
Subject: Re: [PATCH v2] arm: detect NEON by RTE_MACHINE_CPUFLAG_NEON flag only
Date: Sat, 19 Mar 2016 12:05:59 +0100	[thread overview]
Message-ID: <20160319120559.372e9088@jvn> (raw)
In-Reply-To: <1458379590-18618-1-git-send-email-viktorin@rehivetech.com>

On Sat, 19 Mar 2016 10:26:30 +0100
Jan Viktorin <viktorin@rehivetech.com> wrote:

> The RTE_MACHINE_CPUFLAG_NEON was only a result of the gcc testing. However,
> the target CPU may not support NEON or the user can disable to use it (as it
> does not always improve the performance).
> 
> The RTE_MACHINE_CPUFLAG_NEON detection is now based on both, the __ARM_NEON_FP
> feature from gcc and CONFIG_RTE_ARCH_ARM_NEON from the .config. The memcpy
> implemention is driven by RTE_MACHINE_CPUFLAG_NEON, so the reason to disable
> NEON is hidden for the actual code.

Unfortunately, I've overlooked a mistake. I have to remake the patch a
bit, sorry. I am a bit confused about the __ARM_NEON and __ARM_NEON_FP
settings.

The arm_neon.h is available only when the __ARM_NEON is present. But...

$ arm-buildroot-linux-gnueabi-gcc -dM -E - < /dev/null  | grep "_FP\|_NEON"
#define __ARM_FP 12
#define __ARM_NEON_FP 4
#define __VFP_FP__ 1

Without -mfpu=neon we don't have arm_neon.h. I consider this strange as
we are not interested in the FPU features but in the SIMD features...

$ arm-buildroot-linux-gnueabi-gcc -mfpu=neon -dM -E - < /dev/null  | grep "_FP\|_NEON"
#define __ARM_FP 12
#define __ARM_NEON_FP 4
#define __ARM_NEON__ 1
#define __VFP_FP__ 1
#define __ARM_NEON 1

$ arm-buildroot-linux-gnueabi-gcc -mfpu=neon-vfpv4 -dM -E - < /dev/null  | grep "_FP\|_NEON"
#define __ARM_FP 14
#define __ARM_NEON_FP 6
#define __FP_FAST_FMAF 1
#define __FP_FAST_FMAL 1
#define __ARM_NEON__ 1
#define __VFP_FP__ 1
#define __ARM_NEON 1
#define __FP_FAST_FMA 1

ARM64 is OK here...

$ aarch64-buildroot-linux-gnu-gcc -dM -E - < /dev/null | grep "NEON\|FP"
#define __FP_FAST_FMAF 1
#define __ARM_NEON 1
#define __FP_FAST_FMA 1

So...

> 
> Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
> ---
> v2: fix l3fwm_em.c to refer RTE_MACHINE_CPUFLAG_NEON instead of __ARM_NEON
> ---
>  examples/l3fwd/l3fwd_em.c                              | 2 +-
>  lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h | 4 ++--
>  mk/machine/armv7a/rte.vars.mk                          | 2 +-
>  mk/rte.cpuflags.mk                                     | 2 ++
>  4 files changed, 6 insertions(+), 4 deletions(-)
> 
[...]
>  #ifdef __cplusplus
>  }
> diff --git a/mk/machine/armv7a/rte.vars.mk b/mk/machine/armv7a/rte.vars.mk
> index 48d3979..7a167c1 100644
> --- a/mk/machine/armv7a/rte.vars.mk
> +++ b/mk/machine/armv7a/rte.vars.mk
> @@ -62,6 +62,6 @@ ifdef CONFIG_RTE_ARCH_ARM_TUNE
>  MACHINE_CFLAGS += -mtune=$(CONFIG_RTE_ARCH_ARM_TUNE)
>  endif
>  
> -ifeq ($(CONFIG_RTE_ARCH_ARM_NEON),y)
> +ifdef $(RTE_MACHINE_CPUFLAG_NEON)
>  MACHINE_CFLAGS += -mfpu=neon
>  endif

RTE_MACHINE_CPUFLAG_NEON is not *yet* set here (cpuflags are detected later)...
So the -mfpu=neon is never configured and the build fails. The
MACHINE_CFLAGS should rather depend on the CONFIG_RTE_ARCH_ARM_NEON
telling the build-system "we want NEON".

> diff --git a/mk/rte.cpuflags.mk b/mk/rte.cpuflags.mk
> index 19a3e7e..1947511 100644
> --- a/mk/rte.cpuflags.mk
> +++ b/mk/rte.cpuflags.mk
> @@ -111,9 +111,11 @@ CPUFLAGS += VSX
>  endif
>  
>  # ARM flags
> +ifeq ($(CONFIG_RTE_ARCH_ARM_NEON),y)
>  ifneq ($(filter $(AUTO_CPUFLAGS),__ARM_NEON_FP),)

Here, we should check __ARM_NEON (to be ARM32/64 compatible) but we
cannot see __ARM_NEON without the -mfpu=neon flag.

Jerin, does the current DPDK detect NEON feature on ARM64? I'd say, it
cannot.

So, we should probably check both __ARM_NEON and __ARM_NEON_FP here.

Another point, related to the original discussion:

http://dpdk.org/ml/archives/dev/2016-March/thread.html#35972

we should probably have a config option to enable memcpy optimizations
separated from the NEON support. The NEON support can then be detected
only by the __ARM_NEON flag. The ARMv7 would have the -mfpu=neon always
set. If somebody likes to customize this, she would do it by hand. The
result is, we correctly detect NEON during build time from the GCC.

>  CPUFLAGS += NEON
>  endif
> +endif
>  
>  ifneq ($(filter $(AUTO_CPUFLAGS),__ARM_FEATURE_CRC32),)
>  CPUFLAGS += CRC32



-- 
  Jan Viktorin                E-mail: Viktorin@RehiveTech.com
  System Architect            Web:    www.RehiveTech.com
  RehiveTech
  Brno, Czech Republic

  reply	other threads:[~2016-03-19 11:05 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-19  9:26 [PATCH v2] arm: detect NEON by RTE_MACHINE_CPUFLAG_NEON flag only Jan Viktorin
2016-03-19 11:05 ` Jan Viktorin [this message]
2016-03-19 19:58 ` [PATCH v3 0/4] " Jan Viktorin
2016-03-24 16:47   ` Thomas Monjalon
2016-03-19 19:58 ` [PATCH v3 1/4] arm: remove CONFIG_RTE_ARCH_ARM_NEON Jan Viktorin
2016-03-19 19:58 ` [PATCH v3 2/4] arm: detect NEON cpu feature by checking __ARM_NEON Jan Viktorin
2016-03-20 17:27   ` Jerin Jacob
2016-03-19 19:58 ` [PATCH v3 3/4] arm: detect NEON by checking RTE_MACHINE_CPUFLAG_NEON Jan Viktorin
2016-03-19 19:58 ` [PATCH v3 4/4] eal/arm: introduce CONFIG_RTE_ARCH_ARM_NEON_MEMCPY Jan Viktorin
2016-03-19 20:14   ` Thomas Monjalon
2016-03-20  9:41     ` Jan Viktorin
2016-03-20  9:46       ` Jan Viktorin
2016-03-20 10:33         ` Thomas Monjalon
2016-03-20 10:29       ` Thomas Monjalon
2016-03-20 17:38         ` Jerin Jacob
2016-03-21  5:42   ` Jianbo Liu
2016-03-21 12:21     ` Jan Viktorin
2016-03-21 13:24       ` Thomas Monjalon
2016-03-21 14:01         ` Jan Viktorin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160319120559.372e9088@jvn \
    --to=viktorin@rehivetech.com \
    --cc=dev@dpdk.org \
    --cc=jerin.jacob@caviumnetworks.com \
    --cc=jianbo.liu@linaro.org \
    --cc=thomas.monjalon@6wind.com \
    --cc=tomaszx.kulasek@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.