From: Andi Kleen <ak@linux.intel.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: Yi Sun <yi.sun@intel.com>,
dave.hansen@intel.com, tglx@linutronix.de,
linux-kernel@vger.kernel.org, x86@kernel.org,
sohil.mehta@intel.com, ilpo.jarvinen@linux.intel.com,
heng.su@intel.com, tony.luck@intel.com,
dave.hansen@linux.intel.com, yi.sun@intel.intel.com
Subject: Re: [PATCH v6 1/3] x86/fpu: Measure the Latency of XSAVES and XRSTORS
Date: Sat, 2 Sep 2023 12:09:10 -0700
Message-ID: <ZPOIVmC6aY9GBtdJ@tassilo>
In-Reply-To: <ZPMTVNM2oBCdSYjJ@gmail.com>
> Instead of adding overhead to the regular FPU context saving/restoring code
> paths, could you add a helper function that has tracing code included, but
> which isn't otherwise used - and leave the regular code with no tracing
> overhead?
>
> This puts a bit of a long-term maintenance focus on making sure that the
> traced functionality won't bitrot, but I'd say that's preferable to adding
> tracing overhead.
Or just use PT (Intel Processor Trace):
% sudo perf record --kcore -e intel_pt/cyc=1,cyc_thresh=1/k --filter 'filter save_fpregs_to_fpstate' -a sleep 5
% sudo perf script --insn-trace --xed -F -comm,-tid,-dso,-sym,-symoff,+ipc
[000] 677203.751913565: ffffffffa7046230 nopw %ax, (%rax)
[000] 677203.751913565: ffffffffa7046234 nopl %eax, (%rax,%rax,1)
[000] 677203.751913565: ffffffffa7046239 mov %rdi, %rcx
[000] 677203.751913565: ffffffffa704623c nopl %eax, (%rax,%rax,1)
[000] 677203.751913565: ffffffffa7046241 movq 0x10(%rdi), %rsi
[000] 677203.751913565: ffffffffa7046245 movq 0x8(%rsi), %rax
[000] 677203.751913565: ffffffffa7046249 leaq 0x40(%rsi), %rdi
[000] 677203.751913565: ffffffffa704624d mov %rax, %rdx
[000] 677203.751913565: ffffffffa7046250 shr $0x20, %rdx
[000] 677203.751913565: ffffffffa7046254 xsaves64 (%rdi)
[000] 677203.751913565: ffffffffa7046258 xor %edi, %edi
[000] 677203.751913565: ffffffffa704625a movq 0x10(%rcx), %rax
[000] 677203.751913565: ffffffffa704625e testb $0xc0, 0x240(%rax)
[000] 677203.751913636: ffffffffa7046265 jz 0xffffffffa7046285 IPC: 0.16 (14/85)
...
So it took 85 cycles here.
(It includes a few extra instructions, but I bet they cost less than what
ftrace adds. This example is for XSAVES, but it can be extended similarly
for XRSTORS.)
-Andi
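
The cycle count above is read by hand from the "IPC: 0.16 (14/85)"
annotation, i.e. 14 instructions over 85 cycles. For a longer capture,
the same numbers could be pulled out of the perf script text output with
a small script. This is only a sketch and not part of the original mail:
the helper names ipc_samples and summarize are made up for illustration,
and it assumes the annotation keeps the "IPC: <ratio> (<insns>/<cycles>)"
shape shown in the trace.

```python
import re

# Matches the annotation that appears in the trace above, e.g.
# "IPC: 0.16 (14/85)" meaning 14 instructions over 85 cycles.
IPC_RE = re.compile(r"IPC:\s*([0-9.]+)\s*\((\d+)/(\d+)\)")


def ipc_samples(lines):
    """Yield (instructions, cycles) for every line carrying an IPC annotation."""
    for line in lines:
        match = IPC_RE.search(line)
        if match:
            yield int(match.group(2)), int(match.group(3))


def summarize(lines):
    """Return count/min/max/mean of the cycle counts, or None if none were found."""
    cycles = [c for _, c in ipc_samples(lines)]
    if not cycles:
        return None
    return {
        "count": len(cycles),
        "min": min(cycles),
        "max": max(cycles),
        "mean": sum(cycles) / len(cycles),
    }
```

Passing the lines of the perf script output (for example sys.stdin) to
summarize() gives a rough latency distribution. Each sample still covers
the few surrounding instructions as well as the XSAVES itself.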