All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH Dovetail 6.16 v3 0/3] Dovetail 6.16: Fix some regressions introduced during last forward port
@ 2025-09-05 14:26 Florian Bezdeka
  2025-09-05 14:26 ` [PATCH Dovetail 6.16 v3 1/3] x86/fpu: dovetail: Fix FPU corruption Florian Bezdeka
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Florian Bezdeka @ 2025-09-05 14:26 UTC (permalink / raw)
  To: xenomai; +Cc: Philippe Gerum, Jan Kiszka, Florian Bezdeka

Hi all,

seems the last forward port to 6.16 introduced some regressions.
Please review carefully, thanks!

With that applied I have two open problems:
  - Running dmesg kills the system, no idea yet
    Requires at least 3 CPUs on my qemu system to reproduce.
    Update: I know how to fix that, but not fully understood yet.
    Update: Fix posted to ML.

  - I'm able to reproduce [1] on 6.16 now
    Update: This one is understood now. Xenomai 3 bug.
    Tests already ongoing.
    Update: Fix posted to ML and merged into Xenomai next.

Cc: Philippe Gerum <rpm@xenomai.org>
Cc: Jan Kiszka <jan.kiszka@siemens.com>

[1] https://lore.kernel.org/xenomai/6fcb3f9830056d99161ea8f50a016b99d98a7c71.camel@siemens.com/T/

Signed-off-by: Florian Bezdeka <florian.bezdeka@siemens.com>
---
Changes in v3:
- Rework patch 1: Move preemption marker to thread local flags
- Added patch 2: Align arm64 and x86 fpu preemption code
- Patch 3: was Patch 2 in v2, updated to match latest Linux code
- Link to v2: https://lore.kernel.org/r/20250820-wip-flo-fixes-for-dovetail-6-15-v2-0-535e4fa355f7@siemens.com

Changes in v2:
- Patch 1: Reworked as discussed on the list
           Following the same pattern as for arm64 
- Link to v1: https://lore.kernel.org/r/20250820-wip-flo-fixes-for-dovetail-6-15-v1-0-d44f50cb4b0b@siemens.com

---
Florian Bezdeka (3):
      x86/fpu: dovetail: Fix FPU corruption
      arm64: fpsimd: dovetail: Align with x86 implementation
      irq_pipeline: Fix pipelining code for level triggered IRQs

 arch/arm64/include/asm/thread_info.h | 10 +++++-----
 arch/arm64/kernel/fpsimd.c           |  7 +++++--
 arch/x86/include/asm/fpu/sched.h     | 17 ++++-------------
 arch/x86/include/asm/fpu/types.h     | 12 ------------
 arch/x86/include/asm/thread_info.h   |  9 +++++----
 arch/x86/kernel/fpu/core.c           | 13 +++++--------
 kernel/irq/chip.c                    | 22 ++++++++++++++++++++--
 7 files changed, 44 insertions(+), 46 deletions(-)
---
base-commit: eec111a874f5899e1078ef785ccf29dff986e5a0
change-id: 20250820-wip-flo-fixes-for-dovetail-6-15-1368f7f6e0b1

Best regards,
-- 
Florian Bezdeka <florian.bezdeka@siemens.com>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH Dovetail 6.16 v3 1/3] x86/fpu: dovetail: Fix FPU corruption
  2025-09-05 14:26 [PATCH Dovetail 6.16 v3 0/3] Dovetail 6.16: Fix some regressions introduced during last forward port Florian Bezdeka
@ 2025-09-05 14:26 ` Florian Bezdeka
  2025-09-08 12:37   ` Philippe Gerum
  2025-09-05 14:26 ` [PATCH Dovetail 6.16 v3 2/3] arm64: fpsimd: dovetail: Align with x86 implementation Florian Bezdeka
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Florian Bezdeka @ 2025-09-05 14:26 UTC (permalink / raw)
  To: xenomai; +Cc: Philippe Gerum, Jan Kiszka, Florian Bezdeka

The kernel splat looked as follows and could be reproduced by running
the Xenomai 3 testsuite over a stress-ng produced load. The problem is
specific to Dovetail 6.16.

[19527.937613] ------------[ cut here ]------------
[19527.937614] WARNING: CPU: 0 PID: 14 at arch/x86/kernel/fpu/core.c:62 x86_task_fpu+0x18/0x30
[19527.937619] Modules linked in: intel_rapl_msr intel_rapl_common intel_pmc_core pmt_telemetry pmt_class intel_pmc_ssram_telemetry intel_vsec ghash_clmulni_intel sha512_ssse3 sha1_ssse3 aesni_intel gf128mul rapl pcspkr button evdev serio_raw joydev sg loop efi_pstore fuse dm_mod drm dax configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 sr_mod cdrom ata_generic sd_mod hid_generic hid_hyperv ata_piix psmouse hid hv_netvsc hv_storvsc libata scsi_transport_fc floppy i2c_piix4 i2c_smbus scsi_mod scsi_common [last unloaded: xeno_can]
[19527.937635] CPU: 0 UID: 0 PID: 14 Comm: ksoftirqd/0 Not tainted 6.16.0-hyprv-xenomai-0+ #196 PREEMPT(voluntary)
[19527.937637] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008  12/07/2018
[19527.937637] IRQ stage: Xenomai
[19527.937638] RIP: 0010:x86_task_fpu+0x18/0x30
[19527.937640] Code: 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 8d 87 00 20 00 00 f6 47 3e 20 75 05 c3 cc cc cc cc 90 <0f> 0b 90 31 c0 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f
[19527.937641] RSP: 0018:ffffba19c0083b60 EFLAGS: 00010002
[19527.937641] RAX: ffffa0a90133c3c0 RBX: 0000000000000001 RCX: ffffa0a93fc5a6e8
[19527.937642] RDX: 0000000000000000 RSI: 0000000000000007 RDI: ffffa0a90133a3c0
[19527.937643] RBP: ffffa0a93fc5a058 R08: 01364d9cb9685197 R09: ffffffff83e406f8
[19527.937643] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
[19527.937643] R13: ffffa0a90133a3c0 R14: ffffba19c0c62008 R15: 0000000000000001
[19527.937645] FS:  0000000000000000(0000) GS:ffffa0a9ba664000(0000) knlGS:0000000000000000
[19527.937646] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[19527.937646] CR2: 00007f5c694363c0 CR3: 000000003e038004 CR4: 00000000003706f0
[19527.937647] Call Trace:
[19527.937648]  <TASK>
[19527.937649]  fpu__suspend_inband+0x3c/0x50
[19527.937651]  dovetail_context_switch+0x130/0x240
[19527.937652]  ? xntimer_start+0x101/0x240
[19527.937656]  ___xnsched_run+0x1f9/0x4d0
[19527.937659]  run_oob_call+0x87/0x140
[19527.937662]  handle_irq_pipelined_finish+0x188/0x1a0
[19527.937664]  arch_pipeline_entry+0x43/0xf0
[19527.937667]  asm_sysvec_hyperv_stimer0+0x16/0x20
[19527.937669] RIP: 0010:kernel_fpu_end+0x5/0x50
[19527.937670] Code: 57 02 e8 9e 12 42 00 31 c0 c3 cc cc cc cc 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 <65> 0f b6 05 03 40 57 02 84 c0 75 31 65 c6 05 f7 3f 57 02 01 65 f6
[19527.937671] RSP: 0018:ffffba19c0083d28 EFLAGS: 00000246
[19527.937671] RAX: 000000000000000c RBX: 0000000000000001 RCX: ffffffff842f2260
[19527.937672] RDX: 0000000000000000 RSI: ffffffff84c67450 RDI: ffffffff84c673e0
[19527.937672] RBP: 0000000000000001 R08: ffffffff842f2260 R09: 0000000000000000
[19527.937672] R10: ffffffff84a080c0 R11: 0000000000000001 R12: ffffffff84c67410
[19527.937673] R13: ffffffff84c673e0 R14: 0000000000000040 R15: 0000000000000004
[19527.937675]  blake2s_compress+0x52/0xa0
[19527.937679]  blake2s_update+0x75/0x150
[19527.937681]  add_timer_randomness+0xac/0x1a0
[19527.937684]  scsi_end_request+0x297/0x3a0 [scsi_mod]
[19527.937697]  scsi_io_completion+0x55/0x6a0 [scsi_mod]
[19527.937707]  blk_done_softirq+0x46/0x60
[19527.937709]  handle_softirqs+0xd0/0x430
[19527.937711]  ? __pfx_smpboot_thread_fn+0x10/0x10
[19527.937713]  run_ksoftirqd+0x47/0x60
[19527.937714]  smpboot_thread_fn+0xec/0x220
[19527.937716]  kthread+0xf7/0x240
[19527.937718]  ? __pfx_kthread+0x10/0x10
[19527.937720]  ret_from_fork+0x28c/0x310
[19527.937721]  ? __pfx_kthread+0x10/0x10
[19527.937722]  ret_from_fork_asm+0x1a/0x30
[19527.937725]  </TASK>
[19527.937725] irq event stamp: 1270
[19527.937726] hardirqs last  enabled at (1269): [<ffffffff830b2b1c>] run_ksoftirqd+0x4c/0x60
[19527.937727] hardirqs last disabled at (1270): [<ffffffff83d90d53>] __schedule+0x853/0x1080
[19527.937729] softirqs last  enabled at (1268): [<ffffffff830b29d6>] handle_softirqs+0x346/0x430
[19527.937730] softirqs last disabled at (1215): [<ffffffff830b2b17>] run_ksoftirqd+0x47/0x60
[19527.937731] ---[ end trace 0000000000000000 ]---

The main problem was the x86_task_fpu(tsk) helper, which will deliver
NULL in case CONFIG_X86_DEBUG_FPU is enabled and
(tsk->flags & PF_KTHREAD) is true. We ended up in dereferencing a
NULL pointer.

This problem could be fixed by not using the x86_task_fpu().
But:
  - It seems that the location of struct fpu is under development.
  - Moving the preemption marker from struct fpu to
    struct thread_info::flags synchronizes the amd64 and arm64
    implementation.

Second, the early exit in fpu__resume_inband() was wrong, as that
prevented the FPU restore of in-band kthreads.

Link: https://lore.kernel.org/xenomai/80a48c1ddc3721eb8022b77548028aa8e7371928.camel@siemens.com/T/
Signed-off-by: Florian Bezdeka <florian.bezdeka@siemens.com>
---
 arch/x86/include/asm/fpu/sched.h   | 17 ++++-------------
 arch/x86/include/asm/fpu/types.h   | 12 ------------
 arch/x86/include/asm/thread_info.h |  9 +++++----
 arch/x86/kernel/fpu/core.c         | 13 +++++--------
 4 files changed, 14 insertions(+), 37 deletions(-)

diff --git a/arch/x86/include/asm/fpu/sched.h b/arch/x86/include/asm/fpu/sched.h
index ba677ffa4fbb..64595a81a40e 100644
--- a/arch/x86/include/asm/fpu/sched.h
+++ b/arch/x86/include/asm/fpu/sched.h
@@ -18,19 +18,9 @@ extern void fpu_flush_thread(void);
 
 #ifdef CONFIG_DOVETAIL
 
-static inline void oob_fpu_set_preempt(struct fpu *fpu)
+static inline bool oob_fpu_preempted(struct thread_info *old_ti)
 {
-	fpu->preempted = true;
-}
-
-static inline void oob_fpu_clear_preempt(struct fpu *fpu)
-{
-	fpu->preempted = false;
-}
-
-static inline bool oob_fpu_preempted(struct fpu *old_fpu)
-{
-	return old_fpu->preempted;
+	return test_thread_local_flags(_TLF_KERNEL_FPU_PREEMPTED);
 }
 
 #else
@@ -60,9 +50,10 @@ static inline void switch_fpu(struct task_struct *old, int cpu)
 {
 	if (!test_tsk_thread_flag(old, TIF_NEED_FPU_LOAD) &&
 		cpu_feature_enabled(X86_FEATURE_FPU)) {
+		struct thread_info *old_ti = task_thread_info(old);
 		struct fpu *old_fpu = x86_task_fpu(old);
 		if (!(old->flags & (PF_KTHREAD | PF_USER_WORKER)) &&
-			!oob_fpu_preempted(old_fpu)) {
+			!oob_fpu_preempted(old_ti)) {
 			set_tsk_thread_flag(old, TIF_NEED_FPU_LOAD);
 			save_fpregs_to_fpstate(old_fpu);
 			/*
diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index c329a878a581..1c94121acd3d 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -473,18 +473,6 @@ struct fpu {
 	 */
 	unsigned int			last_cpu;
 
-#ifdef CONFIG_DOVETAIL
-	/*
-	 * @preempted:
-	 *
-	 * When Dovetail is enabled, this flag is set for the inband
-	 * task context saved when entering a kernel_fpu_begin/end()
-	 * section before the latter got preempted by an out-of-band
-	 * task.
-	 */
-	bool				preempted : 1;
-#endif
-
 	/*
 	 * @avx512_timestamp:
 	 *
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 77c4c632cef2..2fc72d622229 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -237,10 +237,11 @@ static inline int arch_within_stack_frames(const void * const stack,
 #define TS_OFFSTAGE		0x0040	/* Thread is in-flight to OOB context */
 #define TS_OOBTRAP		0x0080	/* Handling a trap from OOB context */
 
-#define _TLF_OOB		TS_OOB
-#define _TLF_DOVETAIL		TS_DOVETAIL
-#define _TLF_OFFSTAGE		TS_OFFSTAGE
-#define _TLF_OOBTRAP		TS_OOBTRAP
+#define _TLF_OOB			TS_OOB
+#define _TLF_DOVETAIL			TS_DOVETAIL
+#define _TLF_OFFSTAGE			TS_OFFSTAGE
+#define _TLF_OOBTRAP			TS_OOBTRAP
+#define _TLF_KERNEL_FPU_PREEMPTED	0x0100	/* Dovetail: kernel mode FPU section preempted by OOB */
 
 #ifndef __ASSEMBLER__
 #ifdef CONFIG_COMPAT
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 7b87a04c0b6e..8dbebfd7e4b9 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -1044,7 +1044,6 @@ static int fpu__drop_kernel_fpstate(unsigned int cpu)
 void fpu__suspend_inband(void)
 {
 	struct fpu *kfpu = this_cpu_read(in_kernel_fpstate);
-	struct task_struct *tsk = current;
 
 	/*
 	 * If kernel_fpu_allowed is false, we are dealing with the
@@ -1054,23 +1053,21 @@ void fpu__suspend_inband(void)
 	if (!this_cpu_read(kernel_fpu_allowed)) {
 		save_fpregs_to_fpstate(kfpu);
 		__cpu_invalidate_fpregs_state();
-		oob_fpu_set_preempt(x86_task_fpu(tsk));
+		set_thread_local_flags(_TLF_KERNEL_FPU_PREEMPTED);
 	}
 }
 
 void fpu__resume_inband(void)
 {
 	struct fpu *kfpu = this_cpu_read(in_kernel_fpstate);
-	struct task_struct *tsk = current;
 
-	if (tsk->flags & PF_KTHREAD)
-		return;
-
-	if (oob_fpu_preempted(x86_task_fpu(tsk))) {
+	if (test_thread_local_flags(_TLF_KERNEL_FPU_PREEMPTED)) {
 		restore_fpregs_from_fpstate(kfpu->fpstate, XFEATURE_MASK_FPSTATE);
 		__cpu_invalidate_fpregs_state();
-		oob_fpu_clear_preempt(x86_task_fpu(tsk));
+		clear_thread_local_flags(_TLF_KERNEL_FPU_PREEMPTED);
 	} else {
+		if (current->flags & PF_KTHREAD)
+			return;
 		if (test_thread_flag(TIF_NEED_FPU_LOAD))
 			switch_fpu_return();
 	}

-- 
2.39.5


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH Dovetail 6.16 v3 2/3] arm64: fpsimd: dovetail: Align with x86 implementation
  2025-09-05 14:26 [PATCH Dovetail 6.16 v3 0/3] Dovetail 6.16: Fix some regressions introduced during last forward port Florian Bezdeka
  2025-09-05 14:26 ` [PATCH Dovetail 6.16 v3 1/3] x86/fpu: dovetail: Fix FPU corruption Florian Bezdeka
@ 2025-09-05 14:26 ` Florian Bezdeka
  2025-09-08 12:38   ` Philippe Gerum
  2025-09-05 14:26 ` [PATCH Dovetail 6.16 v3 3/3] irq_pipeline: Fix pipelining code for level triggered IRQs Florian Bezdeka
  2025-09-08  9:20 ` [PATCH Dovetail 6.16 v3 0/3] Dovetail 6.16: Fix some regressions introduced during last forward port Florian Bezdeka
  3 siblings, 1 reply; 9+ messages in thread
From: Florian Bezdeka @ 2025-09-05 14:26 UTC (permalink / raw)
  To: xenomai; +Cc: Philippe Gerum, Jan Kiszka, Florian Bezdeka

The FPU preemption marker is only accessed from the current task, so
we can avoid atomic operations. Moving the marker from the low level
flags to the local (synchronous) flags to make that clear.

Gives us some performance and aligns the arm64 implementation with x86.
---
 arch/arm64/include/asm/thread_info.h | 10 +++++-----
 arch/arm64/kernel/fpsimd.c           |  7 +++++--
 2 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 3fb3725fb004..a4405b11189b 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -75,7 +75,6 @@ void arch_setup_new_exec(void);
 #define TIF_SYSCALL_TRACEPOINT	10	/* syscall tracepoint for ftrace */
 #define TIF_SECCOMP		11	/* syscall secure computing */
 #define TIF_SYSCALL_EMU		12	/* syscall emulation active */
-#define TIF_KERNEL_FP_PREEMPTED	15	/* Dovetail: kernel mode FPSIMD section preempted by OOB */
 #define TIF_MAYDAY		16	/* Emergency trap pending */
 #define TIF_RETUSER		17	/* INBAND_TASK_RETUSER is pending */
 #define TIF_MEMDIE		18	/* is terminating due to OOM killer */
@@ -141,9 +140,10 @@ void arch_setup_new_exec(void);
 /*
  * Local (synchronous) thread flags.
  */
-#define _TLF_OOB		0x0001
-#define _TLF_DOVETAIL		0x0002
-#define _TLF_OFFSTAGE		0x0004
-#define _TLF_OOBTRAP		0x0008
+#define _TLF_OOB			0x0001
+#define _TLF_DOVETAIL			0x0002
+#define _TLF_OFFSTAGE			0x0004
+#define _TLF_OOBTRAP			0x0008
+#define _TLF_KERNEL_FPU_PREEMPTED	0x0010	/* Dovetail: kernel mode FPSIMD section preempted by OOB */
 
 #endif /* __ASM_THREAD_INFO_H */
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index aac0dbf6d84e..1f5fcac32b2d 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -2183,7 +2183,7 @@ void fpsimd_suspend_inband(void)
 	 */
 	if (test_thread_flag(TIF_KERNEL_FPSTATE)) {
 		fpsimd_save_state(kfpu);
-		set_thread_flag(TIF_KERNEL_FP_PREEMPTED);
+		set_thread_local_flags(_TLF_KERNEL_FPU_PREEMPTED);
 	}
 }
 
@@ -2191,8 +2191,11 @@ void fpsimd_resume_inband(void)
 {
 	struct user_fpsimd_state *kfpu = this_cpu_ptr(&in_kernel_fpstate);
 
-	if (test_and_clear_thread_flag(TIF_KERNEL_FP_PREEMPTED))
+	if (test_thread_local_flags(_TLF_KERNEL_FPU_PREEMPTED)) {
 		fpsimd_load_state(kfpu);
+		clear_thread_local_flags(_TLF_KERNEL_FPU_PREEMPTED);
+	}
+
 }
 
 #endif

-- 
2.39.5


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH Dovetail 6.16 v3 3/3] irq_pipeline: Fix pipelining code for level triggered IRQs
  2025-09-05 14:26 [PATCH Dovetail 6.16 v3 0/3] Dovetail 6.16: Fix some regressions introduced during last forward port Florian Bezdeka
  2025-09-05 14:26 ` [PATCH Dovetail 6.16 v3 1/3] x86/fpu: dovetail: Fix FPU corruption Florian Bezdeka
  2025-09-05 14:26 ` [PATCH Dovetail 6.16 v3 2/3] arm64: fpsimd: dovetail: Align with x86 implementation Florian Bezdeka
@ 2025-09-05 14:26 ` Florian Bezdeka
  2025-09-08 12:40   ` Philippe Gerum
  2025-09-08  9:20 ` [PATCH Dovetail 6.16 v3 0/3] Dovetail 6.16: Fix some regressions introduced during last forward port Florian Bezdeka
  3 siblings, 1 reply; 9+ messages in thread
From: Florian Bezdeka @ 2025-09-05 14:26 UTC (permalink / raw)
  To: xenomai; +Cc: Philippe Gerum, Jan Kiszka, Florian Bezdeka

During the last forward port to 6.16 the pipelining code was
accidentally removed from handle_level_irq().

The result was the following warning, observed on qemu/kvm:

[    0.532822] ------------[ cut here ]------------
[    0.532823] WARNING: CPU: 0 PID: 0 at ./include/linux/interrupt.h:670 update_process_times+0xc8/0xf0
[    0.532832] Modules linked in:
[    0.532836] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.16.0-hyprv-xenomai-0+ #212 PREEMPT(voluntary)
[    0.532840] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[    0.532841] IRQ stage: Linux
[    0.532843] RIP: 0010:update_process_times+0xc8/0xf0
[    0.532846] Code: e8 fd f1 01 00 84 c0 74 be 8b 05 17 5d d8 01 85 c0 74 1a 65 8b 05 30 13 6a 02 85 c0 75 0f 65 8b 05 25 10 6a 02 85 c0 75 04 900
[    0.532848] RSP: 0000:ffffffff82e03cb0 EFLAGS: 00010046
[    0.532851] RAX: 0000000000000000 RBX: ffff88803e64a800 RCX: 0000000000000000
[    0.532852] RDX: 00000000fffedb0a RSI: ffff88803e64cd80 RDI: ffff88803e64e040
[    0.532854] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
[    0.532855] R10: 2ecf57e6e4d1a59e R11: 0000000000000185 R12: 0000000000000000
[    0.532856] R13: ffff888004c52400 R14: 0000000000000000 R15: ffffffff82e277c0
[    0.532885] FS:  0000000000000000(0000) GS:ffff8880babc2000(0000) knlGS:0000000000000000
[    0.532887] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.532888] CR2: ffff888004801000 CR3: 0000000002e38000 CR4: 00000000000000b0
[    0.532889] Call Trace:
[    0.532891]  <TASK>
[    0.532895]  tick_periodic+0x29/0xe0
[    0.532909]  tick_handle_periodic+0x20/0x70
[    0.532914]  timer_interrupt+0x24/0x40
[    0.532918]  __handle_irq_event_percpu+0x9d/0x2e0
[    0.532924]  handle_irq_event+0x58/0xe0
[    0.532928]  handle_level_irq+0x166/0x1e0
[    0.532933]  generic_pipeline_irq_desc+0xc2/0x220
[    0.532937]  arch_handle_irq+0x5d/0x180
[    0.532943]  arch_pipeline_entry+0x38/0xf0
[    0.532950]  asm_common_interrupt+0x22/0x40
[    0.532953] RIP: 0010:unmask_8259A+0x68/0x70
[    0.532956] Code: c1 01 e6 21 0f b6 05 28 13 c1 01 e6 a1 48 c7 c7 b8 4f e4 82 48 89 de e8 06 3a 11 00 48 c7 c7 a0 4f e4 82 e8 da 6b 11 00 55 9d0
[    0.532957] RSP: 0000:ffffffff82e03e90 EFLAGS: 00000246
[    0.532959] RAX: 0000000000000001 RBX: ffffffff81233c31 RCX: ffffffff82e03e54
[    0.532960] RDX: 0000000000000000 RSI: ffffffff82e44fb8 RDI: ffffffff82e44fa0
[    0.532961] RBP: 0000000000000246 R08: 0000000000000001 R09: ffffffff82e286b0
[    0.532962] R10: 5878d26fbb66c2c1 R11: 0000000000000000 R12: ffffffff82e277c0
[    0.532963] R13: ffff88803ffd1f40 R14: ffffffffffffffea R15: ffffffff82e26e30
[    0.532966]  ? unmask_8259A+0x11/0x70
[    0.532976]  ? unmask_8259A+0x11/0x70
[    0.532979]  enable_IR_x2apic+0xf0/0x240
[    0.532986]  x86_64_probe_apic+0xb/0x50
[    0.532990]  apic_intr_mode_init+0x3e/0xf0
[    0.532992]  x86_late_time_init+0x20/0x40
[    0.532995]  start_kernel+0x75a/0x8b0
[    0.533002]  x86_64_start_reservations+0x14/0x30
[    0.533005]  x86_64_start_kernel+0xff/0x110
[    0.533009]  common_startup_64+0x13e/0x148
[    0.533022]  </TASK>

Forward porting the (old) code from 6.15 again to fix this problem.

Signed-off-by: Florian Bezdeka <florian.bezdeka@siemens.com>
---
 kernel/irq/chip.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 8a0831193d74..f5dfc378ccf2 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -846,15 +846,33 @@ static void cond_unmask_irq(struct irq_desc *desc)
  */
 void handle_level_irq(struct irq_desc *desc)
 {
+	int flow;
+
 	guard(hybrid_spinlock)(&desc->lock);
-	mask_ack_irq(desc);
 
-	if (!irq_can_handle(desc))
+	flow = get_flow_step(desc);
+	if (may_start_flow(flow)) {
+		mask_ack_irq(desc);
+
+		if (!irq_can_handle(desc))
+			return;
+	}
+
+	if (should_feed_pipeline(desc, flow)) {
+		if (handle_oob_irq(desc))
+			goto out_unmask;
+		return;
+	}
+
+	if (flow == IRQ_FLOW_FORWARD) {
+		forward_irq_event(desc);
 		return;
+	}
 
 	kstat_incr_irqs_this_cpu(desc);
 	handle_irq_event(desc);
 
+out_unmask:
 	cond_unmask_irq(desc);
 }
 EXPORT_SYMBOL_GPL(handle_level_irq);

-- 
2.39.5


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH Dovetail 6.16 v3 0/3] Dovetail 6.16: Fix some regressions introduced during last forward port
  2025-09-05 14:26 [PATCH Dovetail 6.16 v3 0/3] Dovetail 6.16: Fix some regressions introduced during last forward port Florian Bezdeka
                   ` (2 preceding siblings ...)
  2025-09-05 14:26 ` [PATCH Dovetail 6.16 v3 3/3] irq_pipeline: Fix pipelining code for level triggered IRQs Florian Bezdeka
@ 2025-09-08  9:20 ` Florian Bezdeka
  3 siblings, 0 replies; 9+ messages in thread
From: Florian Bezdeka @ 2025-09-08  9:20 UTC (permalink / raw)
  To: xenomai; +Cc: Philippe Gerum, Jan Kiszka

On Fri, 2025-09-05 at 16:26 +0200, Florian Bezdeka wrote:
> 
> Changes in v3:
> - Rework patch 1: Move preemption marker to thread local flags
> - Added patch 2: Align arm64 and x86 fpu preemption code
> - Patch 3: was Patch 2 in v2, updated to match latest Linux code
> - Link to v2: https://lore.kernel.org/r/20250820-wip-flo-fixes-for-dovetail-6-15-v2-0-535e4fa355f7@siemens.com
> 

Any further comments to this series?

Would be nice to get that in as it hinders testing the edge IRQ
handling because of the FPU stall and stalls my hyper-v enablement
branch as well.

Florian

> Changes in v2:
> - Patch 1: Reworked as discussed on the list
>            Following the same pattern as for arm64 
> - Link to v1: https://lore.kernel.org/r/20250820-wip-flo-fixes-for-dovetail-6-15-v1-0-d44f50cb4b0b@siemens.com
> 
> ---
> Florian Bezdeka (3):
>       x86/fpu: dovetail: Fix FPU corruption
>       arm64: fpsimd: dovetail: Align with x86 implementation
>       irq_pipeline: Fix pipelining code for level triggered IRQs
> 
>  arch/arm64/include/asm/thread_info.h | 10 +++++-----
>  arch/arm64/kernel/fpsimd.c           |  7 +++++--
>  arch/x86/include/asm/fpu/sched.h     | 17 ++++-------------
>  arch/x86/include/asm/fpu/types.h     | 12 ------------
>  arch/x86/include/asm/thread_info.h   |  9 +++++----
>  arch/x86/kernel/fpu/core.c           | 13 +++++--------
>  kernel/irq/chip.c                    | 22 ++++++++++++++++++++--
>  7 files changed, 44 insertions(+), 46 deletions(-)
> ---
> base-commit: eec111a874f5899e1078ef785ccf29dff986e5a0
> change-id: 20250820-wip-flo-fixes-for-dovetail-6-15-1368f7f6e0b1
> 
> Best regards,
> -- 
> Florian Bezdeka <florian.bezdeka@siemens.com>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH Dovetail 6.16 v3 1/3] x86/fpu: dovetail: Fix FPU corruption
  2025-09-05 14:26 ` [PATCH Dovetail 6.16 v3 1/3] x86/fpu: dovetail: Fix FPU corruption Florian Bezdeka
@ 2025-09-08 12:37   ` Philippe Gerum
  2025-09-08 12:53     ` Florian Bezdeka
  0 siblings, 1 reply; 9+ messages in thread
From: Philippe Gerum @ 2025-09-08 12:37 UTC (permalink / raw)
  To: Florian Bezdeka; +Cc: xenomai, Jan Kiszka

Florian Bezdeka <florian.bezdeka@siemens.com> writes:

> diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
> index 77c4c632cef2..2fc72d622229 100644
> --- a/arch/x86/include/asm/thread_info.h
> +++ b/arch/x86/include/asm/thread_info.h
> @@ -237,10 +237,11 @@ static inline int arch_within_stack_frames(const void * const stack,
>  #define TS_OFFSTAGE		0x0040	/* Thread is in-flight to OOB context */
>  #define TS_OOBTRAP		0x0080	/* Handling a trap from OOB context */
>  
> -#define _TLF_OOB		TS_OOB
> -#define _TLF_DOVETAIL		TS_DOVETAIL
> -#define _TLF_OFFSTAGE		TS_OFFSTAGE
> -#define _TLF_OOBTRAP		TS_OOBTRAP
> +#define _TLF_OOB			TS_OOB
> +#define _TLF_DOVETAIL			TS_DOVETAIL
> +#define _TLF_OFFSTAGE			TS_OFFSTAGE
> +#define _TLF_OOBTRAP			TS_OOBTRAP
> +#define _TLF_KERNEL_FPU_PREEMPTED	0x0100	/* Dovetail: kernel mode FPU section preempted by OOB */
>

The TS_* enumeration originates from the upstream implementation, so we
added our thread-synchronous flags there. The _TLF_* aliasing comes from
the Dovetail implementation, in order to get a common naming scheme
because not all archs would define TS_* flags. So we should keep it the
current way which is intended to prevent conflicting upstream addition
to go unnoticed, i.e. define TS_KERNEL_FPU_PREEMPTED then alias it to
_TLF_KERNEL_FPU_PREEMPTED.

The rest looks good to me.


-- 
Philippe.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH Dovetail 6.16 v3 2/3] arm64: fpsimd: dovetail: Align with x86 implementation
  2025-09-05 14:26 ` [PATCH Dovetail 6.16 v3 2/3] arm64: fpsimd: dovetail: Align with x86 implementation Florian Bezdeka
@ 2025-09-08 12:38   ` Philippe Gerum
  0 siblings, 0 replies; 9+ messages in thread
From: Philippe Gerum @ 2025-09-08 12:38 UTC (permalink / raw)
  To: Florian Bezdeka; +Cc: xenomai, Jan Kiszka

Florian Bezdeka <florian.bezdeka@siemens.com> writes:

> The FPU preemption marker is only accessed from the current task, so
> we can avoid atomic operations. Moving the marker from the low level
> flags to the local (synchronous) flags to make that clear.
>
> Gives us some performance and aligns the arm64 implementation with x86.
> ---
>  arch/arm64/include/asm/thread_info.h | 10 +++++-----
>  arch/arm64/kernel/fpsimd.c           |  7 +++++--
>  2 files changed, 10 insertions(+), 7 deletions(-)
>
> diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
> index 3fb3725fb004..a4405b11189b 100644
> --- a/arch/arm64/include/asm/thread_info.h
> +++ b/arch/arm64/include/asm/thread_info.h
> @@ -75,7 +75,6 @@ void arch_setup_new_exec(void);
>  #define TIF_SYSCALL_TRACEPOINT	10	/* syscall tracepoint for ftrace */
>  #define TIF_SECCOMP		11	/* syscall secure computing */
>  #define TIF_SYSCALL_EMU		12	/* syscall emulation active */
> -#define TIF_KERNEL_FP_PREEMPTED	15	/* Dovetail: kernel mode FPSIMD section preempted by OOB */
>  #define TIF_MAYDAY		16	/* Emergency trap pending */
>  #define TIF_RETUSER		17	/* INBAND_TASK_RETUSER is pending */
>  #define TIF_MEMDIE		18	/* is terminating due to OOM killer */
> @@ -141,9 +140,10 @@ void arch_setup_new_exec(void);
>  /*
>   * Local (synchronous) thread flags.
>   */
> -#define _TLF_OOB		0x0001
> -#define _TLF_DOVETAIL		0x0002
> -#define _TLF_OFFSTAGE		0x0004
> -#define _TLF_OOBTRAP		0x0008
> +#define _TLF_OOB			0x0001
> +#define _TLF_DOVETAIL			0x0002
> +#define _TLF_OFFSTAGE			0x0004
> +#define _TLF_OOBTRAP			0x0008
> +#define _TLF_KERNEL_FPU_PREEMPTED	0x0010	/* Dovetail: kernel mode FPSIMD section preempted by OOB */
>  
>  #endif /* __ASM_THREAD_INFO_H */
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index aac0dbf6d84e..1f5fcac32b2d 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -2183,7 +2183,7 @@ void fpsimd_suspend_inband(void)
>  	 */
>  	if (test_thread_flag(TIF_KERNEL_FPSTATE)) {
>  		fpsimd_save_state(kfpu);
> -		set_thread_flag(TIF_KERNEL_FP_PREEMPTED);
> +		set_thread_local_flags(_TLF_KERNEL_FPU_PREEMPTED);
>  	}
>  }
>  
> @@ -2191,8 +2191,11 @@ void fpsimd_resume_inband(void)
>  {
>  	struct user_fpsimd_state *kfpu = this_cpu_ptr(&in_kernel_fpstate);
>  
> -	if (test_and_clear_thread_flag(TIF_KERNEL_FP_PREEMPTED))
> +	if (test_thread_local_flags(_TLF_KERNEL_FPU_PREEMPTED)) {
>  		fpsimd_load_state(kfpu);
> +		clear_thread_local_flags(_TLF_KERNEL_FPU_PREEMPTED);
> +	}
> +
>  }
>  
>  #endif

Ack.

-- 
Philippe.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH Dovetail 6.16 v3 3/3] irq_pipeline: Fix pipelining code for level triggered IRQs
  2025-09-05 14:26 ` [PATCH Dovetail 6.16 v3 3/3] irq_pipeline: Fix pipelining code for level triggered IRQs Florian Bezdeka
@ 2025-09-08 12:40   ` Philippe Gerum
  0 siblings, 0 replies; 9+ messages in thread
From: Philippe Gerum @ 2025-09-08 12:40 UTC (permalink / raw)
  To: Florian Bezdeka; +Cc: xenomai, Jan Kiszka

Florian Bezdeka <florian.bezdeka@siemens.com> writes:

> During the last forward port to 6.16 the pipelining code was
> accidentally removed from handle_level_irq().
>
> Forward porting the (old) code from 6.15 again to fix this problem.
>
> Signed-off-by: Florian Bezdeka <florian.bezdeka@siemens.com>
> ---
>  kernel/irq/chip.c | 22 ++++++++++++++++++++--
>  1 file changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> index 8a0831193d74..f5dfc378ccf2 100644
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -846,15 +846,33 @@ static void cond_unmask_irq(struct irq_desc *desc)
>   */
>  void handle_level_irq(struct irq_desc *desc)
>  {
> +	int flow;
> +
>  	guard(hybrid_spinlock)(&desc->lock);
> -	mask_ack_irq(desc);
>  
> -	if (!irq_can_handle(desc))
> +	flow = get_flow_step(desc);
> +	if (may_start_flow(flow)) {
> +		mask_ack_irq(desc);
> +
> +		if (!irq_can_handle(desc))
> +			return;
> +	}
> +
> +	if (should_feed_pipeline(desc, flow)) {
> +		if (handle_oob_irq(desc))
> +			goto out_unmask;
> +		return;
> +	}
> +
> +	if (flow == IRQ_FLOW_FORWARD) {
> +		forward_irq_event(desc);
>  		return;
> +	}
>  
>  	kstat_incr_irqs_this_cpu(desc);
>  	handle_irq_event(desc);
>  
> +out_unmask:
>  	cond_unmask_irq(desc);
>  }
>  EXPORT_SYMBOL_GPL(handle_level_irq);

Ack.

-- 
Philippe.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH Dovetail 6.16 v3 1/3] x86/fpu: dovetail: Fix FPU corruption
  2025-09-08 12:37   ` Philippe Gerum
@ 2025-09-08 12:53     ` Florian Bezdeka
  0 siblings, 0 replies; 9+ messages in thread
From: Florian Bezdeka @ 2025-09-08 12:53 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai, Jan Kiszka

On Mon, 2025-09-08 at 14:37 +0200, Philippe Gerum wrote:
> Florian Bezdeka <florian.bezdeka@siemens.com> writes:
> 
> > diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
> > index 77c4c632cef2..2fc72d622229 100644
> > --- a/arch/x86/include/asm/thread_info.h
> > +++ b/arch/x86/include/asm/thread_info.h
> > @@ -237,10 +237,11 @@ static inline int arch_within_stack_frames(const void * const stack,
> >  #define TS_OFFSTAGE		0x0040	/* Thread is in-flight to OOB context */
> >  #define TS_OOBTRAP		0x0080	/* Handling a trap from OOB context */
> >  
> > -#define _TLF_OOB		TS_OOB
> > -#define _TLF_DOVETAIL		TS_DOVETAIL
> > -#define _TLF_OFFSTAGE		TS_OFFSTAGE
> > -#define _TLF_OOBTRAP		TS_OOBTRAP
> > +#define _TLF_OOB			TS_OOB
> > +#define _TLF_DOVETAIL			TS_DOVETAIL
> > +#define _TLF_OFFSTAGE			TS_OFFSTAGE
> > +#define _TLF_OOBTRAP			TS_OOBTRAP
> > +#define _TLF_KERNEL_FPU_PREEMPTED	0x0100	/* Dovetail: kernel mode FPU section preempted by OOB */
> > 
> 
> The TS_* enumeration originates from the upstream implementation, so we
> added our thread-synchronous flags there. The _TLF_* aliasing comes from
> the Dovetail implementation, in order to get a common naming scheme
> because not all archs would define TS_* flags. So we should keep it the
> current way which is intended to prevent conflicting upstream addition
> to go unnoticed, i.e. define TS_KERNEL_FPU_PREEMPTED then alias it to
> _TLF_KERNEL_FPU_PREEMPTED.

Got it. Will send out v4 soon. Thanks a lot!

> 
> The rest looks good to me.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2025-09-08 12:53 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-05 14:26 [PATCH Dovetail 6.16 v3 0/3] Dovetail 6.16: Fix some regressions introduced during last forward port Florian Bezdeka
2025-09-05 14:26 ` [PATCH Dovetail 6.16 v3 1/3] x86/fpu: dovetail: Fix FPU corruption Florian Bezdeka
2025-09-08 12:37   ` Philippe Gerum
2025-09-08 12:53     ` Florian Bezdeka
2025-09-05 14:26 ` [PATCH Dovetail 6.16 v3 2/3] arm64: fpsimd: dovetail: Align with x86 implementation Florian Bezdeka
2025-09-08 12:38   ` Philippe Gerum
2025-09-05 14:26 ` [PATCH Dovetail 6.16 v3 3/3] irq_pipeline: Fix pipelining code for level triggered IRQs Florian Bezdeka
2025-09-08 12:40   ` Philippe Gerum
2025-09-08  9:20 ` [PATCH Dovetail 6.16 v3 0/3] Dovetail 6.16: Fix some regressions introduced during last forward port Florian Bezdeka

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.