From: Dapeng Mi <dapeng1.mi@linux.intel.com>
To: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Namhyung Kim <namhyung@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
Ian Rogers <irogers@google.com>,
Adrian Hunter <adrian.hunter@intel.com>,
Jiri Olsa <jolsa@kernel.org>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Andi Kleen <ak@linux.intel.com>,
Eranian Stephane <eranian@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>,
broonie@kernel.org, Ravi Bangoria <ravi.bangoria@amd.com>,
linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
Zide Chen <zide.chen@intel.com>,
Falcon Thomas <thomas.falcon@intel.com>,
Dapeng Mi <dapeng1.mi@intel.com>,
Xudong Hao <xudong.hao@intel.com>,
Dapeng Mi <dapeng1.mi@linux.intel.com>
Subject: [Patch v8 08/23] x86/fpu: Ensure TIF_NEED_FPU_LOAD is set after saving FPU state
Date: Fri, 29 May 2026 15:56:30 +0800 [thread overview]
Message-ID: <20260529075645.580362-9-dapeng1.mi@linux.intel.com> (raw)
In-Reply-To: <20260529075645.580362-1-dapeng1.mi@linux.intel.com>
Following Peter and Dave's suggestion, Ensure that the TIF_NEED_FPU_LOAD
flag is always set after saving the FPU state. This guarantees that the
user space FPU state has been saved whenever the TIF_NEED_FPU_LOAD flag
is set.
A subsequent patch will verify if the user space FPU state can be
retrieved from the saved task FPU state in the NMI context by checking
the TIF_NEED_FPU_LOAD flag.
Please check the below link to get more background about the suggestion.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/all/20251204154721.GB2619703@noisy.programming.kicks-ass.net/
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
arch/x86/include/asm/fpu/sched.h | 5 +++--
arch/x86/kernel/fpu/core.c | 27 ++++++++++++++++++++-------
2 files changed, 23 insertions(+), 9 deletions(-)
diff --git a/arch/x86/include/asm/fpu/sched.h b/arch/x86/include/asm/fpu/sched.h
index 89004f4ca208..dcb2fa5f06d6 100644
--- a/arch/x86/include/asm/fpu/sched.h
+++ b/arch/x86/include/asm/fpu/sched.h
@@ -10,6 +10,8 @@
#include <asm/trace/fpu.h>
extern void save_fpregs_to_fpstate(struct fpu *fpu);
+extern void update_fpu_state_and_flag(struct fpu *fpu,
+ struct task_struct *task);
extern void fpu__drop(struct task_struct *tsk);
extern int fpu_clone(struct task_struct *dst, u64 clone_flags, bool minimal,
unsigned long shstk_addr);
@@ -36,8 +38,7 @@ static inline void switch_fpu(struct task_struct *old, int cpu)
!(old->flags & (PF_KTHREAD | PF_USER_WORKER))) {
struct fpu *old_fpu = x86_task_fpu(old);
- set_tsk_thread_flag(old, TIF_NEED_FPU_LOAD);
- save_fpregs_to_fpstate(old_fpu);
+ update_fpu_state_and_flag(old_fpu, old);
/*
* The save operation preserved register state, so the
* fpu_fpregs_owner_ctx is still @old_fpu. Store the
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 608983806fd7..48d1ab50a961 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -213,6 +213,19 @@ void restore_fpregs_from_fpstate(struct fpstate *fpstate, u64 mask)
}
}
+/*
+ * Save the FPU register state in fpu->fpstate->regs and set
+ * TIF_NEED_FPU_LOAD subsequently.
+ *
+ * Must be called with fpregs_lock() held, ensuring flag
+ * TIF_NEED_FPU_LOAD is set last.
+ */
+void update_fpu_state_and_flag(struct fpu *fpu, struct task_struct *task)
+{
+ save_fpregs_to_fpstate(fpu);
+ set_tsk_thread_flag(task, TIF_NEED_FPU_LOAD);
+}
+
void fpu_reset_from_exception_fixup(void)
{
restore_fpregs_from_fpstate(&init_fpstate, XFEATURE_MASK_FPSTATE);
@@ -379,17 +392,19 @@ int fpu_swap_kvm_fpstate(struct fpu_guest *guest_fpu, bool enter_guest)
fpregs_lock();
if (!cur_fps->is_confidential && !test_thread_flag(TIF_NEED_FPU_LOAD))
- save_fpregs_to_fpstate(fpu);
+ update_fpu_state_and_flag(fpu, current);
/* Swap fpstate */
if (enter_guest) {
- fpu->__task_fpstate = cur_fps;
+ WRITE_ONCE(fpu->__task_fpstate, cur_fps);
+ barrier();
fpu->fpstate = guest_fps;
guest_fps->in_use = true;
} else {
guest_fps->in_use = false;
fpu->fpstate = fpu->__task_fpstate;
- fpu->__task_fpstate = NULL;
+ barrier();
+ WRITE_ONCE(fpu->__task_fpstate, NULL);
}
cur_fps = fpu->fpstate;
@@ -481,10 +496,8 @@ void kernel_fpu_begin_mask(unsigned int kfpu_mask)
this_cpu_write(kernel_fpu_allowed, false);
if (!(current->flags & (PF_KTHREAD | PF_USER_WORKER)) &&
- !test_thread_flag(TIF_NEED_FPU_LOAD)) {
- set_thread_flag(TIF_NEED_FPU_LOAD);
- save_fpregs_to_fpstate(x86_task_fpu(current));
- }
+ !test_thread_flag(TIF_NEED_FPU_LOAD))
+ update_fpu_state_and_flag(x86_task_fpu(current), current);
__cpu_invalidate_fpregs_state();
/* Put sane initial values into the control registers. */
--
2.34.1
next prev parent reply other threads:[~2026-05-29 8:03 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-29 7:56 [Patch v8 00/23] Support SIMD/eGPRs/SSP registers sampling for perf Dapeng Mi
2026-05-29 7:56 ` [Patch v8 01/23] perf/x86/intel: Validate return value of intel_pmu_init_hybrid() Dapeng Mi
2026-05-29 8:53 ` sashiko-bot
2026-05-29 11:11 ` Peter Zijlstra
2026-05-29 7:56 ` [Patch v8 02/23] perf/x86: Move hybrid PMU initialization before x86_pmu_starting_cpu() Dapeng Mi
2026-05-29 8:51 ` sashiko-bot
2026-05-29 7:56 ` [Patch v8 03/23] perf/x86/intel: Enable large PEBS sampling for XMMs Dapeng Mi
2026-05-29 7:56 ` [Patch v8 04/23] perf/x86/intel: Convert x86_perf_regs to per-cpu variables Dapeng Mi
2026-05-29 7:56 ` [Patch v8 05/23] perf: Eliminate duplicate arch-specific functions definations Dapeng Mi
2026-05-29 7:56 ` [Patch v8 06/23] perf/x86: Use x86_perf_regs in the x86 nmi handlers Dapeng Mi
2026-05-29 7:56 ` [Patch v8 07/23] x86/fpu/xstate: Add xsaves_nmi() helper Dapeng Mi
2026-05-29 8:56 ` sashiko-bot
2026-05-29 11:32 ` Peter Zijlstra
2026-05-29 7:56 ` Dapeng Mi [this message]
2026-05-29 7:56 ` [Patch v8 09/23] perf: Move and enhance has_extended_regs() for arch-specific use Dapeng Mi
2026-05-29 7:56 ` [Patch v8 10/23] perf/x86: Enable XMM Register Sampling for Non-PEBS Events Dapeng Mi
2026-05-29 9:02 ` sashiko-bot
2026-05-29 11:38 ` Peter Zijlstra
2026-05-29 7:56 ` [Patch v8 11/23] perf/x86: Enable XMM register sampling for REGS_USER case Dapeng Mi
2026-05-29 9:24 ` sashiko-bot
2026-05-29 11:42 ` Peter Zijlstra
2026-05-29 7:56 ` [Patch v8 12/23] perf: Add sampling support for SIMD registers Dapeng Mi
2026-05-29 8:36 ` sashiko-bot
2026-05-29 7:56 ` [Patch v8 13/23] perf/x86: Support XMM sampling using sample_simd_vec_reg_* fields Dapeng Mi
2026-05-29 8:49 ` sashiko-bot
2026-05-29 7:56 ` [Patch v8 14/23] perf/x86: Support YMM " Dapeng Mi
2026-05-29 8:47 ` sashiko-bot
2026-05-29 7:56 ` [Patch v8 15/23] perf/x86: Support ZMM " Dapeng Mi
2026-05-29 7:56 ` [Patch v8 16/23] perf/x86: Support OPMASK sampling using sample_simd_pred_reg_* fields Dapeng Mi
2026-05-29 9:21 ` sashiko-bot
2026-05-29 7:56 ` [Patch v8 17/23] perf: Enhance perf_reg_validate() with simd_enabled argument Dapeng Mi
2026-05-29 7:56 ` [Patch v8 18/23] perf/x86: Support eGPRs sampling using sample_regs_* fields Dapeng Mi
2026-05-29 9:31 ` sashiko-bot
2026-05-29 7:56 ` [Patch v8 19/23] perf/x86: Support SSP " Dapeng Mi
2026-05-29 10:03 ` sashiko-bot
2026-05-29 7:56 ` [Patch v8 20/23] perf/x86/intel: Support arch-PEBS based SIMD/eGPRs/SSP sampling Dapeng Mi
2026-05-29 9:45 ` sashiko-bot
2026-05-29 7:56 ` [Patch v8 21/23] perf/x86/intel: Enable PERF_PMU_CAP_SIMD_REGS capability Dapeng Mi
2026-05-29 10:43 ` sashiko-bot
2026-05-29 7:56 ` [Patch v8 22/23] perf/x86: Activate back-to-back NMI detection for arch-PEBS induced NMIs Dapeng Mi
2026-05-29 9:34 ` sashiko-bot
2026-05-29 7:56 ` [Patch v8 23/23] perf/x86/intel: Add sanity check for PEBS fragment size Dapeng Mi
2026-05-29 9:54 ` sashiko-bot
2026-05-29 8:32 ` [Patch v8 00/23] Support SIMD/eGPRs/SSP registers sampling for perf Mi, Dapeng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260529075645.580362-9-dapeng1.mi@linux.intel.com \
--to=dapeng1.mi@linux.intel.com \
--cc=acme@kernel.org \
--cc=adrian.hunter@intel.com \
--cc=ak@linux.intel.com \
--cc=alexander.shishkin@linux.intel.com \
--cc=broonie@kernel.org \
--cc=dapeng1.mi@intel.com \
--cc=dave.hansen@linux.intel.com \
--cc=eranian@google.com \
--cc=irogers@google.com \
--cc=jolsa@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-perf-users@vger.kernel.org \
--cc=mark.rutland@arm.com \
--cc=mingo@redhat.com \
--cc=namhyung@kernel.org \
--cc=peterz@infradead.org \
--cc=ravi.bangoria@amd.com \
--cc=tglx@linutronix.de \
--cc=thomas.falcon@intel.com \
--cc=xudong.hao@intel.com \
--cc=zide.chen@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox