* [meta-ti][master/kirkstone][PATCH] conf: machine: k3: Use Cortex-A53/A72 CPU tune
@ 2024-02-15 21:26 Andrew Davis
2024-02-16 7:07 ` [EXTERNAL] " Limaye, Aniket
2024-02-16 20:23 ` Denys Dmytriyenko
0 siblings, 2 replies; 5+ messages in thread
From: Andrew Davis @ 2024-02-15 21:26 UTC
To: Denys Dmytriyenko, Ryan Eatmon, meta-ti; +Cc: Andrew Davis
All current K3 devices use either A53 or A72. Use the compile tune
configuration specific for these to allow the compiler to make
better optimizations.
Signed-off-by: Andrew Davis <afd@ti.com>
---
meta-ti-bsp/conf/machine/include/k3.inc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/meta-ti-bsp/conf/machine/include/k3.inc b/meta-ti-bsp/conf/machine/include/k3.inc
index 2415f0ba..7c3579af 100644
--- a/meta-ti-bsp/conf/machine/include/k3.inc
+++ b/meta-ti-bsp/conf/machine/include/k3.inc
@@ -3,7 +3,7 @@
require conf/machine/include/ti-soc.inc
SOC_FAMILY:append = ":k3"
-require conf/machine/include/arm/arch-arm64.inc
+require conf/machine/include/arm/armv8a/tune-cortexa72-cortexa53.inc
BBMULTICONFIG += "k3r5"
--
2.39.2
* Re: [EXTERNAL] [meta-ti][master/kirkstone][PATCH] conf: machine: k3: Use Cortex-A53/A72 CPU tune
2024-02-15 21:26 [meta-ti][master/kirkstone][PATCH] conf: machine: k3: Use Cortex-A53/A72 CPU tune Andrew Davis
@ 2024-02-16 7:07 ` Limaye, Aniket
2024-02-16 20:23 ` Denys Dmytriyenko
1 sibling, 0 replies; 5+ messages in thread
From: Limaye, Aniket @ 2024-02-16 7:07 UTC
To: meta-ti; +Cc: Kumar, Udit
Hi Andrew,
I was testing this patch locally, and wanted to see if we get some perf
improvements with some benchmarks available on the default image. What
benchmark do you recommend testing this against?
I ran '/runLinpack/' on the tisdk-default-image on j7200 and did not see
any difference in the reported performance with and without this
patch... is this expected?
With the default image built on latest SDK, WITHOUT the patch:
Unrolled Single Precision 1845878 Kflops ; 10 Reps
With default image built WITH the patch:
Unrolled Single Precision 1857362 Kflops ; 10 Reps
Thanks,
Aniket
On 2/16/2024 2:56 AM, Andrew Davis via lists.yoctoproject.org wrote:
> All current K3 devices use either A53 or A72. Use the compile tune
> configuration specific for these to allow the compiler to make
> better optimizations.
>
> Signed-off-by: Andrew Davis<afd@ti.com>
> ---
> meta-ti-bsp/conf/machine/include/k3.inc | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/meta-ti-bsp/conf/machine/include/k3.inc b/meta-ti-bsp/conf/machine/include/k3.inc
> index 2415f0ba..7c3579af 100644
> --- a/meta-ti-bsp/conf/machine/include/k3.inc
> +++ b/meta-ti-bsp/conf/machine/include/k3.inc
> @@ -3,7 +3,7 @@
> require conf/machine/include/ti-soc.inc
> SOC_FAMILY:append = ":k3"
>
> -require conf/machine/include/arm/arch-arm64.inc
> +require conf/machine/include/arm/armv8a/tune-cortexa72-cortexa53.inc
>
> BBMULTICONFIG += "k3r5"
>
> --
> 2.39.2
>
* Re: [meta-ti][master/kirkstone][PATCH] conf: machine: k3: Use Cortex-A53/A72 CPU tune
2024-02-15 21:26 [meta-ti][master/kirkstone][PATCH] conf: machine: k3: Use Cortex-A53/A72 CPU tune Andrew Davis
2024-02-16 7:07 ` [EXTERNAL] " Limaye, Aniket
@ 2024-02-16 20:23 ` Denys Dmytriyenko
2024-02-20 14:31 ` Andrew Davis
1 sibling, 1 reply; 5+ messages in thread
From: Denys Dmytriyenko @ 2024-02-16 20:23 UTC
To: afd; +Cc: Denys Dmytriyenko, Ryan Eatmon, meta-ti
Unfortunately, NAK.
This is considered antisocial behavior for a BSP in the Yocto Project
world. And the performance benefit is questionable at 1%-2%, if any at all.
The proper place for any extra optimization tunes is in a distro config, maybe
even in the end customer's final product rather than a reference distro.
Consider a distro that supports multiple HW platforms and uses multiple BSPs
besides meta-ti - YoE, AGL, etc. You want common-denominator tunes there in
order to get the most binary re-use across the platforms.
For example, AGL goes to some extreme lengths to override such custom tunes
set by misbehaving BSPs and it's quite ugly.
Moreover, we went through this exercise many years ago, when we had our
ARMv7 platforms set to their corresponding cortex-a8/a9/a15 tunes
by default, but eventually ended up setting a common ARMv7 tune:
DEFAULTTUNE ?= "armv7athf-neon"
So, you should either leave the current arch-arm64.inc inclusion as is, or if
you insist on including tune-cortexa72-cortexa53.inc, set the default tune
back to plain aarch64:
DEFAULTTUNE ?= "aarch64"
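A minimal sketch of that second option follows. It assumes the oe-core tune include assigns its own DEFAULTTUNE with a weak "?=" (not confirmed in this thread); if so, the generic default has to be set before the require line, because the first weak assignment wins:

```bitbake
# meta-ti-bsp/conf/machine/include/k3.inc (sketch; the ordering is an assumption)
require conf/machine/include/ti-soc.inc
SOC_FAMILY:append = ":k3"

# Keep the generic tune as the default ...
DEFAULTTUNE ?= "aarch64"
# ... while still making the Cortex-A72/A53 tunes available for opt-in.
require conf/machine/include/arm/armv8a/tune-cortexa72-cortexa53.inc
```

With this layout the BSP default stays at the common denominator, and a distro or local.conf can still select a CPU-specific tune explicitly.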
On Thu, Feb 15, 2024 at 03:26:13PM -0600, Andrew Davis via lists.yoctoproject.org wrote:
> All current K3 devices use either A53 or A72. Use the compile tune
> configuration specific for these to allow the compiler to make
> better optimizations.
>
> Signed-off-by: Andrew Davis <afd@ti.com>
> ---
> meta-ti-bsp/conf/machine/include/k3.inc | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/meta-ti-bsp/conf/machine/include/k3.inc b/meta-ti-bsp/conf/machine/include/k3.inc
> index 2415f0ba..7c3579af 100644
> --- a/meta-ti-bsp/conf/machine/include/k3.inc
> +++ b/meta-ti-bsp/conf/machine/include/k3.inc
> @@ -3,7 +3,7 @@
> require conf/machine/include/ti-soc.inc
> SOC_FAMILY:append = ":k3"
>
> -require conf/machine/include/arm/arch-arm64.inc
> +require conf/machine/include/arm/armv8a/tune-cortexa72-cortexa53.inc
>
> BBMULTICONFIG += "k3r5"
>
> --
> 2.39.2
* Re: [meta-ti][master/kirkstone][PATCH] conf: machine: k3: Use Cortex-A53/A72 CPU tune
2024-02-16 20:23 ` Denys Dmytriyenko
@ 2024-02-20 14:31 ` Andrew Davis
2024-02-20 15:00 ` Ryan Eatmon
0 siblings, 1 reply; 5+ messages in thread
From: Andrew Davis @ 2024-02-20 14:31 UTC
To: Denys Dmytriyenko; +Cc: Denys Dmytriyenko, Ryan Eatmon, meta-ti
On 2/16/24 2:23 PM, Denys Dmytriyenko wrote:
> Unfortunately, NAK.
>
> This is considered an antisocial behavior for a BSP in the Yocto Project
> world. And the performance benefit is questionable with 1%-2%, if at all.
>
This started when a potential customer noticed that some benchmarks
(linpack, for instance) built and run on our SDK were being out-performed by
some other vendors, even though on paper our platforms should have been
the better performing ones.
After investigating, it turned out that these other vendors have these tune
options in their BSP layers, causing the performance discrepancy.
So the performance here, even a couple of percent, is very important.
> The proper place for any extra optimization tunes is in a distro config. Maybe
> even by end customer's final product, not a reference distro.
>
> Consider a distro that supports multiple HW platforms and uses multiple BSPs
> besides meta-ti - YoE, AGL, etc. You do want a common denominator tunes in
> order to get the most binary re-use across the platforms.
>
If one wants binary re-use they can override the tune. Otherwise maybe
they should be using Debian or some other binary distro. The main selling
point for Yocto IMHO is customizing like this. The best part of rebuilding
everything from scratch every time for every machine is we can have these
machine specific tunings.
> For example, AGL goes to some extreme lengths to override such custom tunes
> set by misbehaving BSPs and it's quite ugly.
>
Then we should work to make it easier to override for those folks, not simply
leave this performance on the table.
> And moreover, we've gone through this motion in the past many years ago when
> we had our ARMv7 platforms set to their corresponding cortex-a8/a9/a15 tunes
> by default, but eventually ended up setting a common ARMv7 tune:
>
> DEFAULTTUNE ?= "armv7athf-neon"
>
> So, you should either leave the current arch-arm64.inc inclusion as is, or if
> you insist on including tune-cortexa72-cortexa53.inc, set the default tune
> back to plain aarch64:
>
> DEFAULTTUNE ?= "aarch64"
>
I see our friends over in meta-xilinx are doing machine specific DEFAULTTUNEs.
I was thinking of matching that to keep our BSP performance competitive. But
as a compromise and to avoid "antisocial behavior" as you say, I think I can
live with DEFAULTTUNE ?= "aarch64".
Will resend with that.
Andrew
>
> On Thu, Feb 15, 2024 at 03:26:13PM -0600, Andrew Davis via lists.yoctoproject.org wrote:
>> All current K3 devices use either A53 or A72. Use the compile tune
>> configuration specific for these to allow the compiler to make
>> better optimizations.
>>
>> Signed-off-by: Andrew Davis <afd@ti.com>
>> ---
>> meta-ti-bsp/conf/machine/include/k3.inc | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/meta-ti-bsp/conf/machine/include/k3.inc b/meta-ti-bsp/conf/machine/include/k3.inc
>> index 2415f0ba..7c3579af 100644
>> --- a/meta-ti-bsp/conf/machine/include/k3.inc
>> +++ b/meta-ti-bsp/conf/machine/include/k3.inc
>> @@ -3,7 +3,7 @@
>> require conf/machine/include/ti-soc.inc
>> SOC_FAMILY:append = ":k3"
>>
>> -require conf/machine/include/arm/arch-arm64.inc
>> +require conf/machine/include/arm/armv8a/tune-cortexa72-cortexa53.inc
>>
>> BBMULTICONFIG += "k3r5"
>>
>> --
>> 2.39.2
* Re: [meta-ti][master/kirkstone][PATCH] conf: machine: k3: Use Cortex-A53/A72 CPU tune
2024-02-20 14:31 ` Andrew Davis
@ 2024-02-20 15:00 ` Ryan Eatmon
0 siblings, 0 replies; 5+ messages in thread
From: Ryan Eatmon @ 2024-02-20 15:00 UTC
To: Andrew Davis, Denys Dmytriyenko; +Cc: Denys Dmytriyenko, meta-ti
On 2/20/2024 8:31 AM, Andrew Davis wrote:
> On 2/16/24 2:23 PM, Denys Dmytriyenko wrote:
>> Unfortunately, NAK.
>>
>> This is considered an antisocial behavior for a BSP in the Yocto Project
>> world. And the performance benefit is questionable with 1%-2%, if at all.
>>
>
> This started when a potential customer noticed that some benchmarks
> (linpack, for instance) built and run on our SDK were being out-performed by
> some other vendors, even though on paper our platforms should have been
> the better performing ones.
>
> After investigating it turns out these other vendors have these tune
> options in their BSP layers, causing the performance discrepancy.
>
> So the performance here, even of a couple percent, is very important.
>
>> The proper place for any extra optimization tunes is in a distro
>> config. Maybe
>> even by end customer's final product, not a reference distro.
>>
>> Consider a distro that supports multiple HW platforms and uses
>> multiple BSPs
>> besides meta-ti - YoE, AGL, etc. You do want a common denominator
>> tunes in
>> order to get the most binary re-use across the platforms.
>>
>
> If one wants binary re-use they can override the tune. Otherwise maybe
> they should be using Debian or some other binary distro. The main selling
> point for Yocto IMHO is customizing like this. The best part of rebuilding
> everything from scratch every time for every machine is we can have these
> machine specific tunings.
>
>> For example, AGL goes to some extreme lengths to override such custom
>> tunes
>> set by misbehaving BSPs and it's quite ugly.
>>
>
> Then we should work to make it easier to override for those folks, not
> simply
> leave this performance on the table.
>
>> And moreover, we've gone through this motion in the past many years
>> ago when
>> we had our ARMv7 platforms set to their corresponding cortex-a8/a9/a15
>> tunes
>> by default, but eventually ended up setting a common ARMv7 tune:
>>
>> DEFAULTTUNE ?= "armv7athf-neon"
>>
>> So, you should either leave the current arch-arm64.inc inclusion as
>> is, or if
>> you insist on including tune-cortexa72-cortexa53.inc, set the default
>> tune
>> back to plain aarch64:
>>
>> DEFAULTTUNE ?= "aarch64"
>>
>
> I see our friends over in meta-xilinx are doing machine specific
> DEFAULTTUNEs.
> I was thinking of matching that to keep our BSP performance competitive.
> But
> as a compromise and to avoid "antisocial behavior" as you say, I think I
> can
> live with DEFAULTTUNE ?= "aarch64".
>
> Will resend with that.
So, if we include the more targeted tuning file but set DEFAULTTUNE to
the generic value, then how do our builds use the more targeted tuning? Is
that something we have to set in local.conf as part of our builds?
Or does some sort of magic select the correct tune automatically?
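One possible answer, sketched under assumptions: with the generic default in place, nothing would select the targeted tune automatically, so a distro config or local.conf would have to opt in explicitly. The tune name below is assumed to be the one defined by tune-cortexa72-cortexa53.inc and should be checked against that file's AVAILTUNES:

```bitbake
# local.conf or distro config (hypothetical opt-in, not part of the patch)
# A plain "=" assignment takes precedence over a weak "?=" default set by the BSP.
DEFAULTTUNE = "cortexa72-cortexa53"
```

This would keep the choice in the place suggested earlier in the thread, the distro or final product configuration, while the BSP itself stays on the generic aarch64 tune.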
> Andrew
>
>>
>> On Thu, Feb 15, 2024 at 03:26:13PM -0600, Andrew Davis via
>> lists.yoctoproject.org wrote:
>>> All current K3 devices use either A53 or A72. Use the compile tune
>>> configuration specific for these to allow the compiler to make
>>> better optimizations.
>>>
>>> Signed-off-by: Andrew Davis <afd@ti.com>
>>> ---
>>> meta-ti-bsp/conf/machine/include/k3.inc | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/meta-ti-bsp/conf/machine/include/k3.inc
>>> b/meta-ti-bsp/conf/machine/include/k3.inc
>>> index 2415f0ba..7c3579af 100644
>>> --- a/meta-ti-bsp/conf/machine/include/k3.inc
>>> +++ b/meta-ti-bsp/conf/machine/include/k3.inc
>>> @@ -3,7 +3,7 @@
>>> require conf/machine/include/ti-soc.inc
>>> SOC_FAMILY:append = ":k3"
>>> -require conf/machine/include/arm/arch-arm64.inc
>>> +require conf/machine/include/arm/armv8a/tune-cortexa72-cortexa53.inc
>>> BBMULTICONFIG += "k3r5"
>>> --
>>> 2.39.2
--
Ryan Eatmon reatmon@ti.com
-----------------------------------------
Texas Instruments, Inc. - LCPD - MGTS