From: Ingo Molnar <mingo@kernel.org>
To: Yi Sun <yi.sun@intel.com>
Cc: dave.hansen@intel.com, linux-kernel@vger.kernel.org,
x86@kernel.org, sohil.mehta@intel.com, ak@linux.intel.com,
ilpo.jarvinen@linux.intel.com, heng.su@intel.com,
tony.luck@intel.com, yi.sun@linux.intel.com, yu.c.chen@intel.com
Subject: Re: [PATCH v7 1/3] x86/fpu: Measure the Latency of XSAVE and XRSTOR
Date: Thu, 21 Sep 2023 09:03:26 +0200 [thread overview]
Message-ID: <ZQvqvpSbyub6gFZX@gmail.com> (raw)
In-Reply-To: <20230921062900.864679-2-yi.sun@intel.com>
* Yi Sun <yi.sun@intel.com> wrote:
> @@ -113,7 +116,7 @@ static inline u64 xfeatures_mask_independent(void)
> * original instruction which gets replaced. We need to use it here as the
> * address of the instruction where we might get an exception at.
> */
> -#define XSTATE_XSAVE(st, lmask, hmask, err) \
> +#define __XSTATE_XSAVE(st, lmask, hmask, err) \
> asm volatile(ALTERNATIVE_3(XSAVE, \
> XSAVEOPT, X86_FEATURE_XSAVEOPT, \
> XSAVEC, X86_FEATURE_XSAVEC, \
> @@ -130,7 +133,7 @@ static inline u64 xfeatures_mask_independent(void)
> * Use XRSTORS to restore context if it is enabled. XRSTORS supports compact
> * XSAVE area format.
> */
> -#define XSTATE_XRESTORE(st, lmask, hmask) \
> +#define __XSTATE_XRESTORE(st, lmask, hmask) \
> asm volatile(ALTERNATIVE(XRSTOR, \
> XRSTORS, X86_FEATURE_XSAVES) \
> "\n" \
> @@ -140,6 +143,35 @@ static inline u64 xfeatures_mask_independent(void)
> : "D" (st), "m" (*st), "a" (lmask), "d" (hmask) \
> : "memory")
>
> +#if defined(CONFIG_X86_DEBUG_FPU)
> +#define XSTATE_XSAVE(fps, lmask, hmask, err) \
> + do { \
> + struct fpstate *f = fps; \
> + u64 tc = -1; \
> + if (tracepoint_enabled(x86_fpu_latency_xsave)) \
> + tc = trace_clock(); \
> + __XSTATE_XSAVE(&f->regs.xsave, lmask, hmask, err); \
> + if (tracepoint_enabled(x86_fpu_latency_xsave)) \
> + trace_x86_fpu_latency_xsave(f, trace_clock() - tc);\
> + } while (0)
> +
> +#define XSTATE_XRESTORE(fps, lmask, hmask) \
> + do { \
> + struct fpstate *f = fps; \
> + u64 tc = -1; \
> + if (tracepoint_enabled(x86_fpu_latency_xrstor)) \
> + tc = trace_clock(); \
> + __XSTATE_XRESTORE(&f->regs.xsave, lmask, hmask); \
> + if (tracepoint_enabled(x86_fpu_latency_xrstor)) \
> + trace_x86_fpu_latency_xrstor(f, trace_clock() - tc);\
This v7 version does not adequately address the review feedback I gave for
v6: it adds tracing overhead to potential hot paths, and putting it behind
CONFIG_X86_DEBUG_FPU is not a solution either: it's default-y, so de-facto
enabled on all major distros...
It seems unnecessarily complex: why does it have to measure latency
directly? Tracepoints *by default* come with event timestamps. A latency
measurement tool should be able to subtract two timestamps to extract the
latency between two tracepoints...
In fact, function tracing is enabled on all major Linux distros:
kepler:~/tip> grep FUNCTION_TRACER /boot/config-6.2.0-33-generic
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_FUNCTION_TRACER=y
Why not just enable function tracing for the affected FPU context switching
functions?
The relevant functions are already standalone in a typical kernel:
xsave: # ffffffff8103cfe0 T save_fpregs_to_fpstate
xrstor: # ffffffff8103d160 T restore_fpregs_from_fpstate
xrstor_supervisor: # ffffffff8103dc50 T fpu__clear_user_states
... and FPU context switching overhead dominates the cost of these
functions.
Thanks,
Ingo
next prev parent reply other threads:[~2023-09-21 18:52 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-09-21 6:28 [PATCH v7 0/3] x86/fpu Measure the Latency of XSAVES and XRSTORS Yi Sun
2023-09-21 6:28 ` [PATCH v7 1/3] x86/fpu: Measure the Latency of XSAVE and XRSTOR Yi Sun
2023-09-21 7:03 ` Ingo Molnar [this message]
2023-09-21 8:24 ` Yi Sun
2023-09-21 16:52 ` Andi Kleen
2023-09-22 3:44 ` Yi Sun
2023-09-21 6:28 ` [PATCH v7 2/3] tools/testing/fpu: Add script to consume trace log of xsave latency Yi Sun
2023-09-21 6:29 ` [PATCH v7 3/3] tools/testing/fpu: Add a 'count' column Yi Sun
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZQvqvpSbyub6gFZX@gmail.com \
--to=mingo@kernel.org \
--cc=ak@linux.intel.com \
--cc=dave.hansen@intel.com \
--cc=heng.su@intel.com \
--cc=ilpo.jarvinen@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=sohil.mehta@intel.com \
--cc=tony.luck@intel.com \
--cc=x86@kernel.org \
--cc=yi.sun@intel.com \
--cc=yi.sun@linux.intel.com \
--cc=yu.c.chen@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.