All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mark Hatle <mark.hatle@windriver.com>
To: Daniel Dragomir <daniel.dragomir@windriver.com>,
	<openembedded-core@lists.openembedded.org>
Cc: Cristian Bercaru <cristian.bercaru@windriver.com>
Subject: Re: [PATCH] add tunings for 32-bit ARMv8-a
Date: Tue, 16 Feb 2016 09:50:11 -0500	[thread overview]
Message-ID: <56C33723.30408@windriver.com> (raw)
In-Reply-To: <1455632918-31605-2-git-send-email-daniel.dragomir@windriver.com>

On 2/16/16 9:28 AM, Daniel Dragomir wrote:
> From: Cristian Bercaru <cristian.bercaru@windriver.com>
> 
> This patch adds tunes for 32-bit armv8 platforms. The user can select
> the vector floating-point instruction set: vfpv3, vfpv4 or fp-armv8
> and the Neon, crc and crypto extensions.
> 
> Signed-off-by: Cristian Bercaru <cristian.bercaru@windriver.com>
> Signed-off-by: Daniel Dragomir <daniel.dragomir@windriver.com>
> ---
>  meta/conf/machine/include/arm/arch-armv7a.inc      |  1 -
>  meta/conf/machine/include/arm/arch-armv8a.inc      | 65 ++++++++++++++++++++++
>  meta/conf/machine/include/arm/feature-arm-neon.inc |  3 -
>  meta/conf/machine/include/arm/feature-arm-vfp.inc  | 59 ++++++++++++++++----
>  4 files changed, 112 insertions(+), 16 deletions(-)
>  create mode 100644 meta/conf/machine/include/arm/arch-armv8a.inc
>  delete mode 100644 meta/conf/machine/include/arm/feature-arm-neon.inc
> 
> diff --git a/meta/conf/machine/include/arm/arch-armv7a.inc b/meta/conf/machine/include/arm/arch-armv7a.inc
> index d3b6f64..ac85fda 100644
> --- a/meta/conf/machine/include/arm/arch-armv7a.inc
> +++ b/meta/conf/machine/include/arm/arch-armv7a.inc
> @@ -6,7 +6,6 @@ TUNE_CCARGS .= "${@bb.utils.contains("TUNE_FEATURES", "armv7a", " -march=armv7-a
>  MACHINEOVERRIDES =. "${@bb.utils.contains("TUNE_FEATURES", "armv7a", "armv7a:", "" ,d)}"
>  
>  require conf/machine/include/arm/arch-armv6.inc
> -require conf/machine/include/arm/feature-arm-neon.inc
>  
>  # Little Endian base configs
>  AVAILTUNES += "armv7a armv7at armv7a-vfpv3d16 armv7at-vfpv3d16 armv7a-vfpv3 armv7at-vfpv3 armv7a-neon armv7at-neon"
> diff --git a/meta/conf/machine/include/arm/arch-armv8a.inc b/meta/conf/machine/include/arm/arch-armv8a.inc
> new file mode 100644
> index 0000000..1b062b8
> --- /dev/null
> +++ b/meta/conf/machine/include/arm/arch-armv8a.inc
> @@ -0,0 +1,65 @@
> +DEFAULTTUNE ?= "armv8a"
> +
> +TUNEVALID[armv8a] = "Enable instructions for ARMv8-a"
> +TUNEVALID[crc] = "Enable CRC instrucitons for ARMv8-a"
> +TUNECONFLICTS[armv8a] = "armv4 armv5 armv6 armv7 armv7a"
> +TUNE_CCARGS .= "${@bb.utils.contains("TUNE_FEATURES", "armv8a", " -march=armv8-a", "", d)}"
> +TUNE_CCARGS .= "${@bb.utils.contains("TUNE_FEATURES", "crc", "+crc", "", d)}"

I assume the idea here is to get -march=armv8-a or -march=armv8-a+crc correct?

My concern here is that there is nothing defined that forces the CRC feature to
also require armv8.  I'd definitely prefer if there was a check that BOTH armv8a
and crc was set before doing the +crc.  (I've been burned by this in the past.)

> +MACHINEOVERRIDES =. "${@bb.utils.contains("TUNE_FEATURES", "armv8a", "armv8a:", "" ,d)}"
> +
> +require conf/machine/include/arm/arch-armv7a.inc
> +
> +# Little Endian base configs
> +AVAILTUNES += "armv8a armv8a-vfpv3 armv8a-neon armv8a-vfpv4 armv8a-neon-vfpv4 armv8a-fp-armv8 armv8a-neon-fp-armv8 armv8a-crypto-neon-fp-armv8"
> +ARMPKGARCH_tune-armv8a ?= "armv8a"
> +ARMPKGARCH_tune-armv8a-vfpv3 ?= "armv8a"
> +ARMPKGARCH_tune-armv8a-neon ?= "armv8a"
> +ARMPKGARCH_tune-armv8a-vfpv4 ?= "armv8a"
> +ARMPKGARCH_tune-armv8a-neon-vfpv4 ?= "armv8a"
> +ARMPKGARCH_tune-armv8a-fp-armv8 ?= "armv8a"
> +ARMPKGARCH_tune-armv8a-neon-fp-armv8 ?= "armv8a"
> +ARMPKGARCH_tune-armv8a-crypto-neon-fp-armv8 ?= "armv8a"
> +TUNE_FEATURES_tune-armv8a ?= "arm armv8a vfp"
> +TUNE_FEATURES_tune-armv8a-vfpv3 ?= "${TUNE_FEATURES_tune-armv8a} vfpv3"
> +TUNE_FEATURES_tune-armv8a-neon ?= "${TUNE_FEATURES_tune-armv8a} neon"
> +TUNE_FEATURES_tune-armv8a-vfpv4 ?= "${TUNE_FEATURES_tune-armv8a} vfpv4"
> +TUNE_FEATURES_tune-armv8a-neon-vfpv4 ?= "${TUNE_FEATURES_tune-armv8a} neon vfpv4"
> +TUNE_FEATURES_tune-armv8a-fp-armv8 ?= "${TUNE_FEATURES_tune-armv8a} fp-armv8"
> +TUNE_FEATURES_tune-armv8a-neon-fp-armv8 ?= "${TUNE_FEATURES_tune-armv8a} neon fp-armv8"
> +TUNE_FEATURES_tune-armv8a-crypto-neon-fp-armv8 ?= "${TUNE_FEATURES_tune-armv8a} crypto neon fp-armv8"
> +PACKAGE_EXTRA_ARCHS_tune-armv8a = "${PACKAGE_EXTRA_ARCHS_tune-armv7a} armv8a armv8a-vfp"
> +PACKAGE_EXTRA_ARCHS_tune-armv8a-vfpv3 = "${PACKAGE_EXTRA_ARCHS_tune-armv8a} armv8a-vfp-vfpv3"
> +PACKAGE_EXTRA_ARCHS_tune-armv8a-neon = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-vfpv3} armv8a-vfp-neon"
> +PACKAGE_EXTRA_ARCHS_tune-armv8a-vfpv4 = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-vfpv3} armv8a-vfp-vfpv4"
> +PACKAGE_EXTRA_ARCHS_tune-armv8a-neon-vfpv4 = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-vfpv4} armv8a-vfp-neon armv8a-vfp-neon-vfpv4"
> +PACKAGE_EXTRA_ARCHS_tune-armv8a-fp-armv8 = "${PACKAGE_EXTRA_ARCHS_tune-armv8a} armv8a-vfp-fp-armv8"
> +PACKAGE_EXTRA_ARCHS_tune-armv8a-neon-fp-armv8 = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-fp-armv8} armv8a-vfp-neon-fp-armv8"
> +PACKAGE_EXTRA_ARCHS_tune-armv8a-crypto-neon-fp-armv8 = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-neon-fp-armv8} armv8a-vfp-crypto-neon-fp-armv8"
> +
> +# CRC configs
> +AVAILTUNES += "armv8a-crc armv8a-crc-vfpv3 armv8a-crc-neon armv8a-crc-vfpv4 armv8a-crc-neon-vfpv4 armv8a-crc-fp-armv8 armv8a-crc-neon-fp-armv8 armv8a-crc-crypto-neon-fp-armv8"
> +ARMPKGARCH_tune-armv8a-crc ?= "armv8a"
> +ARMPKGARCH_tune-armv8a-crc-vfpv3 ?= "armv8a"
> +ARMPKGARCH_tune-armv8a-crc-neon ?= "armv8a"
> +ARMPKGARCH_tune-armv8a-crc-vfpv4 ?= "armv8a"
> +ARMPKGARCH_tune-armv8a-crc-neon-vfpv4 ?= "armv8a"
> +ARMPKGARCH_tune-armv8a-crc-fp-armv8 ?= "armv8a"
> +ARMPKGARCH_tune-armv8a-crc-neon-fp-armv8 ?= "armv8a"
> +ARMPKGARCH_tune-armv8a-crc-crypto-neon-fp-armv8 ?= "armv8a"
> +TUNE_FEATURES_tune-armv8a-crc ?= "${TUNE_FEATURES_tune-armv8a} crc"
> +TUNE_FEATURES_tune-armv8a-crc-vfpv3 ?= "${TUNE_FEATURES_tune-armv8a-vfpv3} crc"
> +TUNE_FEATURES_tune-armv8a-crc-neon ?= "${TUNE_FEATURES_tune-armv8a-neon} crc"
> +TUNE_FEATURES_tune-armv8a-crc-vfpv4 ?= "${TUNE_FEATURES_tune-armv8a-vfpv4} crc"
> +TUNE_FEATURES_tune-armv8a-crc-neon-vfpv4 ?= "${TUNE_FEATURES_tune-armv8a-neon-vfpv4} crc"
> +TUNE_FEATURES_tune-armv8a-crc-fp-armv8 ?= "${TUNE_FEATURES_tune-armv8a-fp-armv8} crc"
> +TUNE_FEATURES_tune-armv8a-crc-neon-fp-armv8 ?= "${TUNE_FEATURES_tune-armv8a-neon-fp-armv8} crc"
> +TUNE_FEATURES_tune-armv8a-crc-crypto-neon-fp-armv8 ?= "${TUNE_FEATURES_tune-armv8a-crypto-neon-fp-armv8} crc"
> +PACKAGE_EXTRA_ARCHS_tune-armv8a-crc = "${PACKAGE_EXTRA_ARCHS_tune-armv8a} armv8a-crc-vfp"
> +PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-vfpv3 = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc} armv8a-vfp-vfpv3 armv8a-crc-vfp-vfpv3"
> +PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-neon = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-vfpv3} armv8a-vfp-neon armv8a-crc-vfp-neon"
> +PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-vfpv4 = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-vfpv3} armv8a-vfp-vfpv4 armv8a-crc-vfp-vfpv4"
> +PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-neon-vfpv4 = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-vfpv4} armv8a-vfp-neon armv8a-crc-vfp-neon armv8a-vfp-neon-vfpv4 armv8a-crc-vfp-neon-vfpv4"
> +PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-fp-armv8 = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc} armv8a-vfp-fp-armv8 armv8a-vfp-fp-crc-armv8"
> +PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-neon-fp-armv8 = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-fp-armv8} armv8a-vfp-neon-fp-armv8 armv8a-crc-vfp-neon-fp-armv8"
> +PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-crypto-neon-fp-armv8 = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-neon-fp-armv8} armv8a-vfp-crypto-neon-fp-armv8 armv8a-crc-vfp-crypto-neon-fp-armv8"
> +

I'll leave the changes below for others to comment on.  I'm not against them,
but I do expect some others may have concerns.

--Mark

> diff --git a/meta/conf/machine/include/arm/feature-arm-neon.inc b/meta/conf/machine/include/arm/feature-arm-neon.inc
> deleted file mode 100644
> index e8b2b85..0000000
> --- a/meta/conf/machine/include/arm/feature-arm-neon.inc
> +++ /dev/null
> @@ -1,3 +0,0 @@
> -TUNEVALID[neon] = "Enable Neon SIMD accelerator unit."
> -TUNE_CCARGS .= "${@bb.utils.contains("TUNE_FEATURES", "neon", bb.utils.contains("TUNE_FEATURES", "vfpv4", " -mfpu=neon-vfpv4", " -mfpu=neon", d), "" , d)}"
> -ARMPKGSFX_FPU .= "${@bb.utils.contains("TUNE_FEATURES", "neon", "-neon", "", d)}"
> diff --git a/meta/conf/machine/include/arm/feature-arm-vfp.inc b/meta/conf/machine/include/arm/feature-arm-vfp.inc
> index 3dfbeac..12a4e81 100644
> --- a/meta/conf/machine/include/arm/feature-arm-vfp.inc
> +++ b/meta/conf/machine/include/arm/feature-arm-vfp.inc
> @@ -1,17 +1,52 @@
>  TUNEVALID[vfp] = "Enable Vector Floating Point (vfp) unit."
> -ARMPKGSFX_FPU .= "${@bb.utils.contains("TUNE_FEATURES", "vfp", "-vfp", "" ,d)}"
> -
>  TUNEVALID[vfpv3d16] = "Enable Vector Floating Point Version 3 with 16 registers (vfpv3-d16) unit."
> -TUNE_CCARGS .= "${@bb.utils.contains("TUNE_FEATURES", "vfpv3d16", " -mfpu=vfpv3-d16", "", d)}"
> -ARMPKGSFX_FPU .= "${@bb.utils.contains("TUNE_FEATURES", "vfpv3d16", "-vfpv3d16", "" ,d)}"
> -
>  TUNEVALID[vfpv3] = "Enable Vector Floating Point Version 3 with 32 registers (vfpv3) unit."
> -TUNE_CCARGS .= "${@bb.utils.contains("TUNE_FEATURES", "vfpv3", " -mfpu=vfpv3", "", d)}"
> -ARMPKGSFX_FPU .= "${@bb.utils.contains("TUNE_FEATURES", "vfpv3", "-vfpv3", "" ,d)}"
> -
>  TUNEVALID[vfpv4] = "Enable Vector Floating Point Version 4 (vfpv4) unit."
> -ARMPKGSFX_FPU .= "${@bb.utils.contains("TUNE_FEATURES", "vfpv4", "-vfpv4", "" ,d)}"
> -
> +TUNEVALID[fp-armv8] = "Enable ARMv8 Vector Floating Point unit."
> +TUNEVALID[neon] = "Enable vfpv3 and Neon SIMD accelerator unit."
> +TUNEVALID[crypto] = "Enable ARMv8 crypto extension."
>  TUNEVALID[callconvention-hard] = "Enable EABI hard float call convention, requires VFP."
> -TUNE_CCARGS .= "${@bb.utils.contains("TUNE_FEATURES", "vfp", bb.utils.contains("TUNE_FEATURES", "callconvention-hard", " -mfloat-abi=hard", " -mfloat-abi=softfp", d), "" ,d)}"
> -ARMPKGSFX_EABI .= "${@bb.utils.contains("TUNE_FEATURES", [ "callconvention-hard", "vfp" ], "hf", "", d)}"
> +
> +python () {
> +	if bb.utils.contains("TUNE_FEATURES", "crc", True, False, d):
> +		d.appendVar("ARMPKGSFX_FPU", "-crc")
> +
> +	if bb.utils.contains("TUNE_FEATURES", "vfp", True, False, d):
> +		d.appendVar("ARMPKGSFX_FPU", "-vfp")
> +
> +	if bb.utils.contains("TUNE_FEATURES", "vfpv3d16", True, False, d):
> +		d.appendVar("TUNE_CCARGS", " -mfpu=vfpv3-d16")
> +		d.appendVar("ARMPKGSFX_FPU", "-vfpv3d16")
> +
> +	if bb.utils.contains("TUNE_FEATURES", "vfpv3", True, False, d):
> +			d.appendVar("TUNE_CCARGS", " -mfpu=vfpv3")
> +			d.appendVar("ARMPKGSFX_FPU", "-vfpv3")
> +	elif bb.utils.contains("TUNE_FEATURES", "vfpv4", True, False, d):
> +		if bb.utils.contains("TUNE_FEATURES", "neon", True, False, d):
> +			d.appendVar("TUNE_CCARGS", " -mfpu=neon-vfpv4")
> +			d.appendVar("ARMPKGSFX_FPU", "-neon-vfpv4")
> +		else:
> +			d.appendVar("TUNE_CCARGS", " -mfpu=vfpv4")
> +			d.appendVar("ARMPKGSFX_FPU", "-vfpv4")
> +	elif bb.utils.contains("TUNE_FEATURES", "fp-armv8", True, False, d):
> +		if bb.utils.contains("TUNE_FEATURES", "neon", True, False, d):
> +			if bb.utils.contains("TUNE_FEATURES", "crypto", True, False, d):
> +				d.appendVar("TUNE_CCARGS", " -mfpu=crypto-neon-fp-armv8")
> +				d.appendVar("ARMPKGSFX_FPU", "-crypto-neon-fp-armv8")
> +			else:
> +				d.appendVar("TUNE_CCARGS", " -mfpu=neon-fp-armv8")
> +				d.appendVar("ARMPKGSFX_FPU", "-neon-fp-armv8")
> +		else:
> +			d.appendVar("TUNE_CCARGS", " -mfpu=fp-armv8")
> +			d.appendVar("ARMPKGSFX_FPU", "-fp-armv8")
> +	elif bb.utils.contains("TUNE_FEATURES", "neon", True, False, d):
> +		d.appendVar("TUNE_CCARGS", " -mfpu=neon")
> +		d.appendVar("ARMPKGSFX_FPU", "-neon")
> +
> +	if bb.utils.contains("TUNE_FEATURES", "callconvention-hard", True, False, d):
> +		d.appendVar("TUNE_CCARGS", " -mfloat-abi=hard")
> +	else:
> +		d.appendVar("TUNE_CCARGS", " -mfloat-abi=softfp")
> +	if bb.utils.contains("TUNE_FEATURES", [ "vfp", "callconvention-hard" ], True, False, d):
> +		d.appendVar("ARMPKGSFX_EABI", "hf")
> +}
> 



  reply	other threads:[~2016-02-16 14:50 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-02-16 14:28 [PATCH] ARMv8 32-bit compiler tunings Daniel Dragomir
2016-02-16 14:28 ` [PATCH] add tunings for 32-bit ARMv8-a Daniel Dragomir
2016-02-16 14:50   ` Mark Hatle [this message]
2016-02-16 14:54   ` Martin Jansa
2016-02-16 16:08     ` Dragomir Daniel
2016-02-16 16:20       ` Martin Jansa
2016-02-18 13:28         ` Dragomir Daniel
2016-02-23 15:04         ` Martin Jansa
2016-02-16 20:20   ` Andre McCurdy
  -- strict thread matches above, loose matches on Subject: below --
2015-12-04 17:20 [PATCH] ARMv8 32-bit compiler tunings Cristian Bercaru
2015-12-04 17:20 ` [PATCH] add tunings for 32-bit ARMv8-a Cristian Bercaru
2015-12-09 15:47   ` Martin Jansa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56C33723.30408@windriver.com \
    --to=mark.hatle@windriver.com \
    --cc=cristian.bercaru@windriver.com \
    --cc=daniel.dragomir@windriver.com \
    --cc=openembedded-core@lists.openembedded.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.