All of lore.kernel.org
 help / color / mirror / Atom feed
From: Konstantin Ananyev <konstantin.ananyev@huawei.com>
To: "Tummala, Sivaprasad" <Sivaprasad.Tummala@amd.com>,
	Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>,
	Tyler Retzlaff <roretzla@linux.microsoft.com>,
	"Yigit, Ferruh" <Ferruh.Yigit@amd.com>
Cc: Bruce Richardson <bruce.richardson@intel.com>,
	"david.hunt@intel.com" <david.hunt@intel.com>,
	"anatoly.burakov@intel.com" <anatoly.burakov@intel.com>,
	"david.marchand@redhat.com" <david.marchand@redhat.com>,
	"thomas@monjalon.net" <thomas@monjalon.net>,
	"dev@dpdk.org" <dev@dpdk.org>
Subject: RE: [PATCH v5 3/3] power: amd power monitor support
Date: Thu, 28 Sep 2023 10:11:24 +0000	[thread overview]
Message-ID: <f9dc5bd4089f49578b423c453e0b258c@huawei.com> (raw)
In-Reply-To: <BL1PR12MB57774F58837F118786F1524686C2A@BL1PR12MB5777.namprd12.prod.outlook.com>


> > Caution: This message originated from an External Source. Use proper caution
> > when opening attachments, clicking links, or responding.
> >
> >
> > 25/08/2023 17:00, Tyler Retzlaff пишет:
> > > On Thu, Aug 24, 2023 at 10:04:42AM +0100, Ferruh Yigit wrote:
> > >> On 8/23/2023 5:03 PM, Tyler Retzlaff wrote:
> > >>> On Wed, Aug 23, 2023 at 10:19:39AM +0100, Ferruh Yigit wrote:
> > >>>> On 8/22/2023 11:30 PM, Konstantin Ananyev wrote:
> > >>>>> 18/08/2023 14:48, Bruce Richardson пишет:
> > >>>>>> On Fri, Aug 18, 2023 at 02:25:14PM +0100, Ferruh Yigit wrote:
> > >>>>>>> On 8/17/2023 3:18 PM, Konstantin Ananyev wrote:
> > >>>>>>>>
> > >>>>>>>>>> Caution: This message originated from an External Source. Use
> > >>>>>>>>>> proper caution when opening attachments, clicking links, or
> > >>>>>>>>>> responding.
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> On Wed, Aug 16, 2023 at 11:59:59AM -0700, Sivaprasad Tummala
> > wrote:
> > >>>>>>>>>>> mwaitx allows EPYC processors to enter a implementation
> > >>>>>>>>>>> dependent power/performance optimized state (C1 state) for a
> > >>>>>>>>>>> specific period or until a store to the monitored address
> > >>>>>>>>>>> range.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Signed-off-by: Sivaprasad Tummala
> > >>>>>>>>>>> <sivaprasad.tummala@amd.com>
> > >>>>>>>>>>> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
> > >>>>>>>>>>> ---
> > >>>>>>>>>>>    lib/eal/x86/rte_power_intrinsics.c | 77
> > >>>>>>>>>>> +++++++++++++++++++++++++-----
> > >>>>>>>>>>>    1 file changed, 66 insertions(+), 11 deletions(-)
> > >>>>>>>>>>>
> > >>>>>>>>>>> diff --git a/lib/eal/x86/rte_power_intrinsics.c
> > >>>>>>>>>>> b/lib/eal/x86/rte_power_intrinsics.c
> > >>>>>>>>>>> index 6eb9e50807..b4754e17da 100644
> > >>>>>>>>>>> --- a/lib/eal/x86/rte_power_intrinsics.c
> > >>>>>>>>>>> +++ b/lib/eal/x86/rte_power_intrinsics.c
> > >>>>>>>>>>> @@ -17,6 +17,60 @@ static struct power_wait_status {
> > >>>>>>>>>>>         volatile void *monitor_addr; /**< NULL if not
> > >>>>>>>>>>> currently sleeping */  } __rte_cache_aligned
> > >>>>>>>>>>> wait_status[RTE_MAX_LCORE];
> > >>>>>>>>>>>
> > >>>>>>>>>>> +/**
> > >>>>>>>>>>> + * These functions uses UMONITOR/UMWAIT instructions and
> > >>>>>>>>>>> +will
> > >>>>>>>>>>> enter C0.2
> > >>>>>>>>>> state.
> > >>>>>>>>>>> + * For more information about usage of these instructions,
> > >>>>>>>>>>> +please refer to
> > >>>>>>>>>>> + * Intel(R) 64 and IA-32 Architectures Software Developer's
> > Manual.
> > >>>>>>>>>>> + */
> > >>>>>>>>>>> +static void intel_umonitor(volatile void *addr) {
> > >>>>>>>>>>> +     /* UMONITOR */
> > >>>>>>>>>>> +     asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> > >>>>>>>>>>> +                     :
> > >>>>>>>>>>> +                     : "D"(addr)); }
> > >>>>>>>>>>> +
> > >>>>>>>>>>> +static void intel_umwait(const uint64_t timeout) {
> > >>>>>>>>>>> +     const uint32_t tsc_l = (uint32_t)timeout;
> > >>>>>>>>>>> +     const uint32_t tsc_h = (uint32_t)(timeout >> 32);
> > >>>>>>>>>>> +     /* UMWAIT */
> > >>>>>>>>>>> +     asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
> > >>>>>>>>>>> +                     : /* ignore rflags */
> > >>>>>>>>>>> +                     : "D"(0), /* enter C0.2 */
> > >>>>>>>>>>> +                     "a"(tsc_l), "d"(tsc_h)); }
> > >>>>>>>>>>
> > >>>>>>>>>> question and perhaps Anatoly Burakov can chime in with expertise.
> > >>>>>>>>>>
> > >>>>>>>>>> gcc/clang have built-in intrinsics for umonitor and umwait i
> > >>>>>>>>>> believe as per our other thread of discussion is there a
> > >>>>>>>>>> benefit to also providing inline assembly over just using the
> > >>>>>>>>>> intrinsics? I understand that the intrinsics may not exist
> > >>>>>>>>>> for the monitorx and mwaitx below so it is probably necessary
> > >>>>>>>>>> for amd.
> > >>>>>>>>>>
> > >>>>>>>>>> so the suggestion here is when they are available just use
> > >>>>>>>>>> the intrinsics.
> > >>>>>>>>>>
> > >>>>>>>>>> thanks
> > >>>>>>>>>>
> > >>>>>>>>> The gcc built-in functions
> > >>>>>>>>> __builtin_ia32_monitorx()/__builtin_ia32_mwaitx are available
> > >>>>>>>>> only when -mmwaitx is used specific for AMD platforms. On
> > >>>>>>>>> generic builds, these built-ins are not available and hence
> > >>>>>>>>> inline assembly is required here.
> > >>>>>>>>
> > >>>>>>>> Ok... but we can probably put them into a separate .c file that
> > >>>>>>>> will be compiled with that specific flag?
> > >>>>>>>> Same thing can be probably done for Intel specific instructions.
> > >>>>>>>> In general, I think it is much more preferable to use built-ins
> > >>>>>>>> vs inline assembly (if possible off-course).
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>> We don't compile different set of files for AMD and Intel, but
> > >>>>>>> there are runtime checks, so putting into separate file is not much
> > different.
> > >>>>>
> > >>>>> Well, we probably don't compile .c files for particular vendor,
> > >>>>> but we definitely do compile some .c files for particular ISA extensions.
> > >>>>> Let say there are files in lib/acl that requires various '-mavx512*'
> > >>>>> flags, same for other libs and PMDs.
> > >>>>> So still not clear to me why same approach can't be applied to
> > >>>>> power_instrincts.c?
> > >>>>>
> > >>>>>>>
> > >>>>>>> It may be an option to always enable compiler flag (-mmwaitx), I
> > >>>>>>> think it won't hurt other platforms but I am not sure about
> > >>>>>>> implications of this to other platforms (what was the motivation
> > >>>>>>> for the compiler guys to enable these build-ins with specific flag?).
> > >>>>>>>
> > >>>>>>> Also this requires detecting compiler that supports 'mmwaitx' or
> > >>>>>>> not, etc..
> > >>>>>>>
> > >>>>>> This is the biggest reason why we have in the past added support
> > >>>>>> for these instructions via asm bytes rather than intrinsics. It
> > >>>>>> takes a long time for end-user compilers, especially those in LTS
> > >>>>>> releases, to get the necessary intrinsics.
> > >>>>>
> > >>>>> Yep, understand.
> > >>>>> But why then we can't have both implementations?
> > >>>>> Let say if WAITPKG is defined we can use builtins for
> > >>>>> umonitor/umwait/tpause, otherwise we fallback to inline asm
> > implementation.
> > >>>>> Same story for MWAITX/monitorx.
> > >>>>>
> > >>>>
> > >>>> Yes this can be done,
> > >>>> it can be done either as different .c files per implementation, or
> > >>>> as #ifdef in same file.
> > >>>>
> > >>>> But eventually asm implementation is required, as fallback, and if
> > >>>> we will rely on asm implementation anyway, does it worth to have
> > >>>> the additional checks to be able to use built-in intrinsic?
> > >>>>
> > >>>> Does it helps to comment name of the built-in function to inline
> > >>>> assembly code, to document intention and another possible
> > implementation?
> > >>>
> > >>> the main value of preferring intrinsics is that when they are
> > >>> available they also work with msvc/windows. the msvc toolchain does
> > >>> not support inline asm. so some of the targets have to use
> > >>> intrinsics because that's all there is.
> > >>>
> > >>
> > >> How windows handles current power APIs without inline asm support,
> > >> like rte_power_intrinsics.c one?
> > >
> > > so this is a windows vs toolchain entanglement.
> > >
> > >> Also will using both built-in and inline assembly work for Windows,
> > >> since there may be compiler versions that doesn't support built-in
> > >> functions, they should disable APIs altogether, and this can create a
> > >> scenario that list of exposed APIs changes based on compiler version.
> > >
> > > so I don't intend to disable apis, theres usually a way to make them
> > > work and there should not be any api changes when done correctly.
> > >
> > > windows/clang/mingw
> > >      * inline asm may be used, but for me isn't preferred
> > >
> > > windows/msvc
> > >      * intrinsics (when available)
> > >      * non-inline asm in a .s (when no intrinsics available)
> > >      * keeping in mind that the compiler version isn't tied to windows
> > >        OS release so it's easier for me to document that you need a
> > >        newer compiler arbitrarily. The periods where there are no intrinsics
> > >        end up being short-lived.
> > >
> > > I'm on the hook for windows/msvc any stickyness dealing with it ends
> > > up being my problem.
> >
> > As I can read rte_power_instrintcts.c, for each set of power instructions we have
> > related static variable: wait*_supported.
> > So if we need to support compiler that supports neither new bultins nor inline-asm,
> > then it probably possible to rearrange the code to keep these static vars equal zero
> > for such case.
> >
> Konstantin,
> I preferred to use the inline asm for the new power instructions with this patch due to:
> 1. builtins may not always available based on compiler versions
> 2. additional complexity with versioning checks

I don't think we'll need to mock with version numbers etc.
AFAIK, each such feature as corresponding define, i.e.:
If WAITPKG is defined we can use builtins for umonitor/umwait/tpause,
If MWAITX is defined we can use mwaitx/monitorx, etc.

> 3. no additional performance benefit
> As an improvement, will add builtins support globally in a separate patch later.

Works for me, as long as we wouldn't forget to do it in future. 

 
> >
> > >
> > >>
> > >>>>
> > >>>>>> Consider a user running e.g. RHEL 8, who wants to take advantages
> > >>>>>> of the latest DPDK features; they should not be required to
> > >>>>>> upgrade their compiler - and possibly binutils/assembler - to do so.
> > >>>>>>
> > >>>>>> /Bruce
> > >>>>>


  reply	other threads:[~2023-09-28 10:11 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-13 11:53 [PATCH v2 1/3] eal: add x86 cpuid support for monitorx Sivaprasad Tummala
2023-04-13 11:53 ` [PATCH v2 2/3] doc: announce new cpu flag added to rte_cpu_flag_t Sivaprasad Tummala
2023-04-17  4:31   ` [PATCH v3 1/4] " Sivaprasad Tummala
2023-04-18  8:14     ` [PATCH v4 0/4] power: monitor support for AMD EPYC processors Sivaprasad Tummala
2023-04-18  8:25     ` Sivaprasad Tummala
2023-04-18  8:25       ` [PATCH v4 1/4] doc: announce new cpu flag added to rte_cpu_flag_t Sivaprasad Tummala
2023-04-18  8:52         ` Ferruh Yigit
2023-04-18  9:22           ` Bruce Richardson
2023-06-01  9:23             ` David Marchand
2023-07-05 11:32         ` Konstantin Ananyev
2023-08-16 18:59         ` [PATCH v5 1/3] eal: add x86 cpuid support for monitorx Sivaprasad Tummala
2023-08-16 18:59           ` [PATCH v5 2/3] eal: removed unnecessary checks in x86 power monitor APIs Sivaprasad Tummala
2023-08-16 18:59           ` [PATCH v5 3/3] power: amd power monitor support Sivaprasad Tummala
2023-08-16 19:27             ` Tyler Retzlaff
2023-08-17 11:34               ` Tummala, Sivaprasad
2023-08-17 14:18                 ` Konstantin Ananyev
2023-08-18 13:25                   ` Ferruh Yigit
2023-08-18 13:48                     ` Bruce Richardson
2023-08-21 15:42                       ` Tyler Retzlaff
2023-08-22 22:30                       ` Konstantin Ananyev
2023-08-23  9:19                         ` Ferruh Yigit
2023-08-23 16:03                           ` Tyler Retzlaff
2023-08-24  9:04                             ` Ferruh Yigit
2023-08-25 16:00                               ` Tyler Retzlaff
2023-08-30 22:45                                 ` Konstantin Ananyev
2023-09-27 10:38                                   ` Tummala, Sivaprasad
2023-09-28 10:11                                     ` Konstantin Ananyev [this message]
2023-10-06  8:26             ` David Marchand
2023-10-09  8:02               ` Tummala, Sivaprasad
2023-10-09 14:05             ` [PATCH v6 1/3] eal: add x86 cpuid support for monitorx Sivaprasad Tummala
2023-10-09 14:05               ` [PATCH v6 2/3] eal: removed unnecessary checks in x86 power monitor APIs Sivaprasad Tummala
2023-10-09 14:05               ` [PATCH v6 3/3] power: amd power monitor support Sivaprasad Tummala
2023-10-10  8:59                 ` David Marchand
2023-10-11  9:33                   ` Tummala, Sivaprasad
2023-10-10 10:14                 ` Konstantin Ananyev
2023-10-09 16:23               ` [PATCH v6 1/3] eal: add x86 cpuid support for monitorx Patrick Robb
2023-10-10  8:21                 ` David Marchand
2023-04-18  8:25       ` [PATCH v4 2/4] " Sivaprasad Tummala
2023-06-14 13:15         ` Burakov, Anatoly
2023-04-18  8:25       ` [PATCH v4 3/4] eal: removed unnecessary checks in x86 power monitor APIs Sivaprasad Tummala
2023-06-14 13:14         ` Burakov, Anatoly
2023-04-18  8:25       ` [PATCH v4 4/4] power: amd power monitor support Sivaprasad Tummala
2023-06-14 13:14         ` Burakov, Anatoly
2023-04-13 11:53 ` [PATCH v2 3/3] " Sivaprasad Tummala
2023-04-17  4:34   ` [PATCH v3 4/4] " Sivaprasad Tummala
2023-04-13 11:59 ` [PATCH v2 1/3] eal: add x86 cpuid support for monitorx David Marchand
2023-04-13 17:50   ` Tummala, Sivaprasad
2023-04-14  7:05     ` David Marchand
2023-04-14  8:51       ` Tummala, Sivaprasad
2023-04-14 11:48       ` Ferruh Yigit
2023-04-17  4:32 ` [PATCH v3 2/4] " Sivaprasad Tummala

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f9dc5bd4089f49578b423c453e0b258c@huawei.com \
    --to=konstantin.ananyev@huawei.com \
    --cc=Ferruh.Yigit@amd.com \
    --cc=Sivaprasad.Tummala@amd.com \
    --cc=anatoly.burakov@intel.com \
    --cc=bruce.richardson@intel.com \
    --cc=david.hunt@intel.com \
    --cc=david.marchand@redhat.com \
    --cc=dev@dpdk.org \
    --cc=konstantin.v.ananyev@yandex.ru \
    --cc=roretzla@linux.microsoft.com \
    --cc=thomas@monjalon.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.