All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH Dovetail 6.16 0/2] Dovetail 6.16: Fix some regressions introduced during last forward port
@ 2025-08-19 22:53 Florian Bezdeka
  2025-08-19 22:53 ` [PATCH Dovetail 6.16 1/2] x86/fpu: dovetail: Fix FPU corruption Florian Bezdeka
  2025-08-19 22:53 ` [PATCH Dovetail 6.16 2/2] irq_pipeline: Fix pipelining code for level triggered IRQs Florian Bezdeka
  0 siblings, 2 replies; 9+ messages in thread
From: Florian Bezdeka @ 2025-08-19 22:53 UTC (permalink / raw)
  To: xenomai; +Cc: Philippe Gerum, Jan Kiszka, Florian Bezdeka

Hi all,

seems the last forward port to 6.16 introduced some regressions.
Please review carefully, thanks!

With that applied I have two open problems:
  - Running dmesg kills the system, no idea yet
    I tired to trace that but failed to get the traces out of the
    machine.
  - I'm able to reproduce [1] on 6.16 now, no idea yet

CC: Philippe Gerum <rpm@xenomai.org>
CC: Jan Kiszka <jan.kiszka@siemens.com>

[1] https://lore.kernel.org/xenomai/6fcb3f9830056d99161ea8f50a016b99d98a7c71.camel@siemens.com/T/

Signed-off-by: Florian Bezdeka <florian.bezdeka@siemens.com>
---
Florian Bezdeka (2):
      x86/fpu: dovetail: Fix FPU corruption
      irq_pipeline: Fix pipelining code for level triggered IRQs

 arch/x86/kernel/fpu/core.c |  3 +++
 kernel/irq/chip.c          | 33 +++++++++++++++++++++++++++++++--
 2 files changed, 34 insertions(+), 2 deletions(-)
---
base-commit: eec111a874f5899e1078ef785ccf29dff986e5a0
change-id: 20250820-wip-flo-fixes-for-dovetail-6-15-1368f7f6e0b1

Best regards,
-- 
Florian Bezdeka <florian.bezdeka@siemens.com>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH Dovetail 6.16 1/2] x86/fpu: dovetail: Fix FPU corruption
  2025-08-19 22:53 [PATCH Dovetail 6.16 0/2] Dovetail 6.16: Fix some regressions introduced during last forward port Florian Bezdeka
@ 2025-08-19 22:53 ` Florian Bezdeka
  2025-08-20 10:57   ` Philippe Gerum
  2025-08-20 14:21   ` Philippe Gerum
  2025-08-19 22:53 ` [PATCH Dovetail 6.16 2/2] irq_pipeline: Fix pipelining code for level triggered IRQs Florian Bezdeka
  1 sibling, 2 replies; 9+ messages in thread
From: Florian Bezdeka @ 2025-08-19 22:53 UTC (permalink / raw)
  To: xenomai; +Cc: Philippe Gerum, Jan Kiszka, Florian Bezdeka

Align fpu__suspend_inband() and fpu__resume_inband() regarding the
PF_KTHREAD handling.

The kernel splat looked as follows and could be reproduced by running
the Xenomai 3 testsuite over a stress-ng produced load.

[19527.937613] ------------[ cut here ]------------
[19527.937614] WARNING: CPU: 0 PID: 14 at arch/x86/kernel/fpu/core.c:62 x86_task_fpu+0x18/0x30
[19527.937619] Modules linked in: intel_rapl_msr intel_rapl_common intel_pmc_core pmt_telemetry pmt_class intel_pmc_ssram_telemetry intel_vsec ghash_clmulni_intel sha512_ssse3 sha1_ssse3 aesni_intel gf128mul rapl pcspkr button evdev serio_raw joydev sg loop efi_pstore fuse dm_mod drm dax configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 sr_mod cdrom ata_generic sd_mod hid_generic hid_hyperv ata_piix psmouse hid hv_netvsc hv_storvsc libata scsi_transport_fc floppy i2c_piix4 i2c_smbus scsi_mod scsi_common [last unloaded: xeno_can]
[19527.937635] CPU: 0 UID: 0 PID: 14 Comm: ksoftirqd/0 Not tainted 6.16.0-hyprv-xenomai-0+ #196 PREEMPT(voluntary)
[19527.937637] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008  12/07/2018
[19527.937637] IRQ stage: Xenomai
[19527.937638] RIP: 0010:x86_task_fpu+0x18/0x30
[19527.937640] Code: 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 8d 87 00 20 00 00 f6 47 3e 20 75 05 c3 cc cc cc cc 90 <0f> 0b 90 31 c0 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f
[19527.937641] RSP: 0018:ffffba19c0083b60 EFLAGS: 00010002
[19527.937641] RAX: ffffa0a90133c3c0 RBX: 0000000000000001 RCX: ffffa0a93fc5a6e8
[19527.937642] RDX: 0000000000000000 RSI: 0000000000000007 RDI: ffffa0a90133a3c0
[19527.937643] RBP: ffffa0a93fc5a058 R08: 01364d9cb9685197 R09: ffffffff83e406f8
[19527.937643] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
[19527.937643] R13: ffffa0a90133a3c0 R14: ffffba19c0c62008 R15: 0000000000000001
[19527.937645] FS:  0000000000000000(0000) GS:ffffa0a9ba664000(0000) knlGS:0000000000000000
[19527.937646] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[19527.937646] CR2: 00007f5c694363c0 CR3: 000000003e038004 CR4: 00000000003706f0
[19527.937647] Call Trace:
[19527.937648]  <TASK>
[19527.937649]  fpu__suspend_inband+0x3c/0x50
[19527.937651]  dovetail_context_switch+0x130/0x240
[19527.937652]  ? xntimer_start+0x101/0x240
[19527.937656]  ___xnsched_run+0x1f9/0x4d0
[19527.937659]  run_oob_call+0x87/0x140
[19527.937662]  handle_irq_pipelined_finish+0x188/0x1a0
[19527.937664]  arch_pipeline_entry+0x43/0xf0
[19527.937667]  asm_sysvec_hyperv_stimer0+0x16/0x20
[19527.937669] RIP: 0010:kernel_fpu_end+0x5/0x50
[19527.937670] Code: 57 02 e8 9e 12 42 00 31 c0 c3 cc cc cc cc 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 <65> 0f b6 05 03 40 57 02 84 c0 75 31 65 c6 05 f7 3f 57 02 01 65 f6
[19527.937671] RSP: 0018:ffffba19c0083d28 EFLAGS: 00000246
[19527.937671] RAX: 000000000000000c RBX: 0000000000000001 RCX: ffffffff842f2260
[19527.937672] RDX: 0000000000000000 RSI: ffffffff84c67450 RDI: ffffffff84c673e0
[19527.937672] RBP: 0000000000000001 R08: ffffffff842f2260 R09: 0000000000000000
[19527.937672] R10: ffffffff84a080c0 R11: 0000000000000001 R12: ffffffff84c67410
[19527.937673] R13: ffffffff84c673e0 R14: 0000000000000040 R15: 0000000000000004
[19527.937675]  blake2s_compress+0x52/0xa0
[19527.937679]  blake2s_update+0x75/0x150
[19527.937681]  add_timer_randomness+0xac/0x1a0
[19527.937684]  scsi_end_request+0x297/0x3a0 [scsi_mod]
[19527.937697]  scsi_io_completion+0x55/0x6a0 [scsi_mod]
[19527.937707]  blk_done_softirq+0x46/0x60
[19527.937709]  handle_softirqs+0xd0/0x430
[19527.937711]  ? __pfx_smpboot_thread_fn+0x10/0x10
[19527.937713]  run_ksoftirqd+0x47/0x60
[19527.937714]  smpboot_thread_fn+0xec/0x220
[19527.937716]  kthread+0xf7/0x240
[19527.937718]  ? __pfx_kthread+0x10/0x10
[19527.937720]  ret_from_fork+0x28c/0x310
[19527.937721]  ? __pfx_kthread+0x10/0x10
[19527.937722]  ret_from_fork_asm+0x1a/0x30
[19527.937725]  </TASK>
[19527.937725] irq event stamp: 1270
[19527.937726] hardirqs last  enabled at (1269): [<ffffffff830b2b1c>] run_ksoftirqd+0x4c/0x60
[19527.937727] hardirqs last disabled at (1270): [<ffffffff83d90d53>] __schedule+0x853/0x1080
[19527.937729] softirqs last  enabled at (1268): [<ffffffff830b29d6>] handle_softirqs+0x346/0x430
[19527.937730] softirqs last disabled at (1215): [<ffffffff830b2b17>] run_ksoftirqd+0x47/0x60
[19527.937731] ---[ end trace 0000000000000000 ]---

Link: https://lore.kernel.org/xenomai/80a48c1ddc3721eb8022b77548028aa8e7371928.camel@siemens.com/T/
Signed-off-by: Florian Bezdeka <florian.bezdeka@siemens.com>
---
 arch/x86/kernel/fpu/core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 7b87a04c0b6e..949f89bb5cdc 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -1046,6 +1046,9 @@ void fpu__suspend_inband(void)
 	struct fpu *kfpu = this_cpu_read(in_kernel_fpstate);
 	struct task_struct *tsk = current;
 
+	if (tsk->flags & PF_KTHREAD)
+		return;
+
 	/*
 	 * If kernel_fpu_allowed is false, we are dealing with the
 	 * preemption of an inband kernel context currently using the

-- 
2.39.5


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH Dovetail 6.16 2/2] irq_pipeline: Fix pipelining code for level triggered IRQs
  2025-08-19 22:53 [PATCH Dovetail 6.16 0/2] Dovetail 6.16: Fix some regressions introduced during last forward port Florian Bezdeka
  2025-08-19 22:53 ` [PATCH Dovetail 6.16 1/2] x86/fpu: dovetail: Fix FPU corruption Florian Bezdeka
@ 2025-08-19 22:53 ` Florian Bezdeka
  1 sibling, 0 replies; 9+ messages in thread
From: Florian Bezdeka @ 2025-08-19 22:53 UTC (permalink / raw)
  To: xenomai; +Cc: Philippe Gerum, Jan Kiszka, Florian Bezdeka

During the last forward port to 6.16 the pipelining code was
accidentally removed from handle_level_irq().

The result was the following warning, observed on qemu/kvm:

[    0.532822] ------------[ cut here ]------------
[    0.532823] WARNING: CPU: 0 PID: 0 at ./include/linux/interrupt.h:670 update_process_times+0xc8/0xf0
[    0.532832] Modules linked in:
[    0.532836] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.16.0-hyprv-xenomai-0+ #212 PREEMPT(voluntary)
[    0.532840] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[    0.532841] IRQ stage: Linux
[    0.532843] RIP: 0010:update_process_times+0xc8/0xf0
[    0.532846] Code: e8 fd f1 01 00 84 c0 74 be 8b 05 17 5d d8 01 85 c0 74 1a 65 8b 05 30 13 6a 02 85 c0 75 0f 65 8b 05 25 10 6a 02 85 c0 75 04 900
[    0.532848] RSP: 0000:ffffffff82e03cb0 EFLAGS: 00010046
[    0.532851] RAX: 0000000000000000 RBX: ffff88803e64a800 RCX: 0000000000000000
[    0.532852] RDX: 00000000fffedb0a RSI: ffff88803e64cd80 RDI: ffff88803e64e040
[    0.532854] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
[    0.532855] R10: 2ecf57e6e4d1a59e R11: 0000000000000185 R12: 0000000000000000
[    0.532856] R13: ffff888004c52400 R14: 0000000000000000 R15: ffffffff82e277c0
[    0.532885] FS:  0000000000000000(0000) GS:ffff8880babc2000(0000) knlGS:0000000000000000
[    0.532887] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.532888] CR2: ffff888004801000 CR3: 0000000002e38000 CR4: 00000000000000b0
[    0.532889] Call Trace:
[    0.532891]  <TASK>
[    0.532895]  tick_periodic+0x29/0xe0
[    0.532909]  tick_handle_periodic+0x20/0x70
[    0.532914]  timer_interrupt+0x24/0x40
[    0.532918]  __handle_irq_event_percpu+0x9d/0x2e0
[    0.532924]  handle_irq_event+0x58/0xe0
[    0.532928]  handle_level_irq+0x166/0x1e0
[    0.532933]  generic_pipeline_irq_desc+0xc2/0x220
[    0.532937]  arch_handle_irq+0x5d/0x180
[    0.532943]  arch_pipeline_entry+0x38/0xf0
[    0.532950]  asm_common_interrupt+0x22/0x40
[    0.532953] RIP: 0010:unmask_8259A+0x68/0x70
[    0.532956] Code: c1 01 e6 21 0f b6 05 28 13 c1 01 e6 a1 48 c7 c7 b8 4f e4 82 48 89 de e8 06 3a 11 00 48 c7 c7 a0 4f e4 82 e8 da 6b 11 00 55 9d0
[    0.532957] RSP: 0000:ffffffff82e03e90 EFLAGS: 00000246
[    0.532959] RAX: 0000000000000001 RBX: ffffffff81233c31 RCX: ffffffff82e03e54
[    0.532960] RDX: 0000000000000000 RSI: ffffffff82e44fb8 RDI: ffffffff82e44fa0
[    0.532961] RBP: 0000000000000246 R08: 0000000000000001 R09: ffffffff82e286b0
[    0.532962] R10: 5878d26fbb66c2c1 R11: 0000000000000000 R12: ffffffff82e277c0
[    0.532963] R13: ffff88803ffd1f40 R14: ffffffffffffffea R15: ffffffff82e26e30
[    0.532966]  ? unmask_8259A+0x11/0x70
[    0.532976]  ? unmask_8259A+0x11/0x70
[    0.532979]  enable_IR_x2apic+0xf0/0x240
[    0.532986]  x86_64_probe_apic+0xb/0x50
[    0.532990]  apic_intr_mode_init+0x3e/0xf0
[    0.532992]  x86_late_time_init+0x20/0x40
[    0.532995]  start_kernel+0x75a/0x8b0
[    0.533002]  x86_64_start_reservations+0x14/0x30
[    0.533005]  x86_64_start_kernel+0xff/0x110
[    0.533009]  common_startup_64+0x13e/0x148
[    0.533022]  </TASK>

Forward porting the (old) code from 6.15 again to fix this problem.

Signed-off-by: Florian Bezdeka <florian.bezdeka@siemens.com>
---
 kernel/irq/chip.c | 33 +++++++++++++++++++++++++++++++--
 1 file changed, 31 insertions(+), 2 deletions(-)

diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 8a0831193d74..aacacb5869cd 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -846,15 +846,44 @@ static void cond_unmask_irq(struct irq_desc *desc)
  */
 void handle_level_irq(struct irq_desc *desc)
 {
+	int flow;
+
 	guard(hybrid_spinlock)(&desc->lock);
-	mask_ack_irq(desc);
 
-	if (!irq_can_handle(desc))
+	flow = get_flow_step(desc);
+	if (may_start_flow(flow)) {
+		mask_ack_irq(desc);
+
+		if (!irq_can_handle(desc))
+			return;
+	}
+
+	if (should_feed_pipeline(desc, flow)) {
+		if (handle_oob_irq(desc))
+			goto out_unmask;
+		return;
+	}
+
+	if (flow == IRQ_FLOW_FORWARD) {
+		forward_irq_event(desc);
 		return;
+	}
+
+	desc->istate &= ~(IRQS_REPLAY | IRQS_WAITING);
+
+	/*
+	 * If its disabled or no action available keep it masked and get out
+	 * of here
+	 */
+	if (unlikely(!desc->action || irqd_irq_disabled(&desc->irq_data))) {
+		desc->istate |= IRQS_PENDING;
+		return;
+	}
 
 	kstat_incr_irqs_this_cpu(desc);
 	handle_irq_event(desc);
 
+out_unmask:
 	cond_unmask_irq(desc);
 }
 EXPORT_SYMBOL_GPL(handle_level_irq);

-- 
2.39.5


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH Dovetail 6.16 1/2] x86/fpu: dovetail: Fix FPU corruption
  2025-08-19 22:53 ` [PATCH Dovetail 6.16 1/2] x86/fpu: dovetail: Fix FPU corruption Florian Bezdeka
@ 2025-08-20 10:57   ` Philippe Gerum
  2025-08-20 13:00     ` Florian Bezdeka
  2025-08-20 14:21   ` Philippe Gerum
  1 sibling, 1 reply; 9+ messages in thread
From: Philippe Gerum @ 2025-08-20 10:57 UTC (permalink / raw)
  To: Florian Bezdeka; +Cc: xenomai, Jan Kiszka

Florian Bezdeka <florian.bezdeka@siemens.com> writes:

> Align fpu__suspend_inband() and fpu__resume_inband() regarding the
> PF_KTHREAD handling.
>
> The kernel splat looked as follows and could be reproduced by running
> the Xenomai 3 testsuite over a stress-ng produced load.
>
> [19527.937613] ------------[ cut here ]------------
> [19527.937614] WARNING: CPU: 0 PID: 14 at arch/x86/kernel/fpu/core.c:62 x86_task_fpu+0x18/0x30
> [19527.937619] Modules linked in: intel_rapl_msr intel_rapl_common intel_pmc_core pmt_telemetry pmt_class intel_pmc_ssram_telemetry intel_vsec ghash_clmulni_intel sha512_ssse3 sha1_ssse3 aesni_intel gf128mul rapl pcspkr button evdev serio_raw joydev sg loop efi_pstore fuse dm_mod drm dax configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 sr_mod cdrom ata_generic sd_mod hid_generic hid_hyperv ata_piix psmouse hid hv_netvsc hv_storvsc libata scsi_transport_fc floppy i2c_piix4 i2c_smbus scsi_mod scsi_common [last unloaded: xeno_can]
> [19527.937635] CPU: 0 UID: 0 PID: 14 Comm: ksoftirqd/0 Not tainted 6.16.0-hyprv-xenomai-0+ #196 PREEMPT(voluntary)
> [19527.937637] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008  12/07/2018
> [19527.937637] IRQ stage: Xenomai
> [19527.937638] RIP: 0010:x86_task_fpu+0x18/0x30
> [19527.937640] Code: 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 8d 87 00 20 00 00 f6 47 3e 20 75 05 c3 cc cc cc cc 90 <0f> 0b 90 31 c0 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f
> [19527.937641] RSP: 0018:ffffba19c0083b60 EFLAGS: 00010002
> [19527.937641] RAX: ffffa0a90133c3c0 RBX: 0000000000000001 RCX: ffffa0a93fc5a6e8
> [19527.937642] RDX: 0000000000000000 RSI: 0000000000000007 RDI: ffffa0a90133a3c0
> [19527.937643] RBP: ffffa0a93fc5a058 R08: 01364d9cb9685197 R09: ffffffff83e406f8
> [19527.937643] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
> [19527.937643] R13: ffffa0a90133a3c0 R14: ffffba19c0c62008 R15: 0000000000000001
> [19527.937645] FS:  0000000000000000(0000) GS:ffffa0a9ba664000(0000) knlGS:0000000000000000
> [19527.937646] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [19527.937646] CR2: 00007f5c694363c0 CR3: 000000003e038004 CR4: 00000000003706f0
> [19527.937647] Call Trace:
> [19527.937648]  <TASK>
> [19527.937649]  fpu__suspend_inband+0x3c/0x50
> [19527.937651]  dovetail_context_switch+0x130/0x240
> [19527.937652]  ? xntimer_start+0x101/0x240
> [19527.937656]  ___xnsched_run+0x1f9/0x4d0
> [19527.937659]  run_oob_call+0x87/0x140
> [19527.937662]  handle_irq_pipelined_finish+0x188/0x1a0
> [19527.937664]  arch_pipeline_entry+0x43/0xf0
> [19527.937667]  asm_sysvec_hyperv_stimer0+0x16/0x20
> [19527.937669] RIP: 0010:kernel_fpu_end+0x5/0x50
> [19527.937670] Code: 57 02 e8 9e 12 42 00 31 c0 c3 cc cc cc cc 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 <65> 0f b6 05 03 40 57 02 84 c0 75 31 65 c6 05 f7 3f 57 02 01 65 f6
> [19527.937671] RSP: 0018:ffffba19c0083d28 EFLAGS: 00000246
> [19527.937671] RAX: 000000000000000c RBX: 0000000000000001 RCX: ffffffff842f2260
> [19527.937672] RDX: 0000000000000000 RSI: ffffffff84c67450 RDI: ffffffff84c673e0
> [19527.937672] RBP: 0000000000000001 R08: ffffffff842f2260 R09: 0000000000000000
> [19527.937672] R10: ffffffff84a080c0 R11: 0000000000000001 R12: ffffffff84c67410
> [19527.937673] R13: ffffffff84c673e0 R14: 0000000000000040 R15: 0000000000000004
> [19527.937675]  blake2s_compress+0x52/0xa0
> [19527.937679]  blake2s_update+0x75/0x150
> [19527.937681]  add_timer_randomness+0xac/0x1a0
> [19527.937684]  scsi_end_request+0x297/0x3a0 [scsi_mod]
> [19527.937697]  scsi_io_completion+0x55/0x6a0 [scsi_mod]
> [19527.937707]  blk_done_softirq+0x46/0x60
> [19527.937709]  handle_softirqs+0xd0/0x430
> [19527.937711]  ? __pfx_smpboot_thread_fn+0x10/0x10
> [19527.937713]  run_ksoftirqd+0x47/0x60
> [19527.937714]  smpboot_thread_fn+0xec/0x220
> [19527.937716]  kthread+0xf7/0x240
> [19527.937718]  ? __pfx_kthread+0x10/0x10
> [19527.937720]  ret_from_fork+0x28c/0x310
> [19527.937721]  ? __pfx_kthread+0x10/0x10
> [19527.937722]  ret_from_fork_asm+0x1a/0x30
> [19527.937725]  </TASK>
> [19527.937725] irq event stamp: 1270
> [19527.937726] hardirqs last  enabled at (1269): [<ffffffff830b2b1c>] run_ksoftirqd+0x4c/0x60
> [19527.937727] hardirqs last disabled at (1270): [<ffffffff83d90d53>] __schedule+0x853/0x1080
> [19527.937729] softirqs last  enabled at (1268): [<ffffffff830b29d6>] handle_softirqs+0x346/0x430
> [19527.937730] softirqs last disabled at (1215): [<ffffffff830b2b17>] run_ksoftirqd+0x47/0x60
> [19527.937731] ---[ end trace 0000000000000000 ]---
>
> Link: https://lore.kernel.org/xenomai/80a48c1ddc3721eb8022b77548028aa8e7371928.camel@siemens.com/T/
> Signed-off-by: Florian Bezdeka <florian.bezdeka@siemens.com>
> ---
>  arch/x86/kernel/fpu/core.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
> index 7b87a04c0b6e..949f89bb5cdc 100644
> --- a/arch/x86/kernel/fpu/core.c
> +++ b/arch/x86/kernel/fpu/core.c
> @@ -1046,6 +1046,9 @@ void fpu__suspend_inband(void)
>  	struct fpu *kfpu = this_cpu_read(in_kernel_fpstate);
>  	struct task_struct *tsk = current;
>  
> +	if (tsk->flags & PF_KTHREAD)
> +		return;
> +
>  	/*
>  	 * If kernel_fpu_allowed is false, we are dealing with the
>  	 * preemption of an inband kernel context currently using the

If an in-band kernel thread is preempted by an oob task while running
inside a kernel_fpu_begin_mask()<->kernel_fpu_end() section, the change
above would prevent the fpu hw state from being saved
appropriately. fpu__suspend_inband() and fpu__resume_inband() should not
be aligned in this respect, the log is misleading for that reason.

I suspect this change papers over the root issue, which is yet to be
discovered.

-- 
Philippe.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH Dovetail 6.16 1/2] x86/fpu: dovetail: Fix FPU corruption
  2025-08-20 10:57   ` Philippe Gerum
@ 2025-08-20 13:00     ` Florian Bezdeka
  2025-08-20 13:55       ` Philippe Gerum
  0 siblings, 1 reply; 9+ messages in thread
From: Florian Bezdeka @ 2025-08-20 13:00 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai, Jan Kiszka

On Wed, 2025-08-20 at 12:57 +0200, Philippe Gerum wrote:
> Florian Bezdeka <florian.bezdeka@siemens.com> writes:
> 
> > Align fpu__suspend_inband() and fpu__resume_inband() regarding the
> > PF_KTHREAD handling.
> > 
> > The kernel splat looked as follows and could be reproduced by running
> > the Xenomai 3 testsuite over a stress-ng produced load.
> > 
> > [19527.937613] ------------[ cut here ]------------
> > [19527.937614] WARNING: CPU: 0 PID: 14 at arch/x86/kernel/fpu/core.c:62 x86_task_fpu+0x18/0x30
> > [19527.937619] Modules linked in: intel_rapl_msr intel_rapl_common intel_pmc_core pmt_telemetry pmt_class intel_pmc_ssram_telemetry intel_vsec ghash_clmulni_intel sha512_ssse3 sha1_ssse3 aesni_intel gf128mul rapl pcspkr button evdev serio_raw joydev sg loop efi_pstore fuse dm_mod drm dax configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 sr_mod cdrom ata_generic sd_mod hid_generic hid_hyperv ata_piix psmouse hid hv_netvsc hv_storvsc libata scsi_transport_fc floppy i2c_piix4 i2c_smbus scsi_mod scsi_common [last unloaded: xeno_can]
> > [19527.937635] CPU: 0 UID: 0 PID: 14 Comm: ksoftirqd/0 Not tainted 6.16.0-hyprv-xenomai-0+ #196 PREEMPT(voluntary)
> > [19527.937637] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008  12/07/2018
> > [19527.937637] IRQ stage: Xenomai
> > [19527.937638] RIP: 0010:x86_task_fpu+0x18/0x30
> > [19527.937640] Code: 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 8d 87 00 20 00 00 f6 47 3e 20 75 05 c3 cc cc cc cc 90 <0f> 0b 90 31 c0 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f
> > [19527.937641] RSP: 0018:ffffba19c0083b60 EFLAGS: 00010002
> > [19527.937641] RAX: ffffa0a90133c3c0 RBX: 0000000000000001 RCX: ffffa0a93fc5a6e8
> > [19527.937642] RDX: 0000000000000000 RSI: 0000000000000007 RDI: ffffa0a90133a3c0
> > [19527.937643] RBP: ffffa0a93fc5a058 R08: 01364d9cb9685197 R09: ffffffff83e406f8
> > [19527.937643] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
> > [19527.937643] R13: ffffa0a90133a3c0 R14: ffffba19c0c62008 R15: 0000000000000001
> > [19527.937645] FS:  0000000000000000(0000) GS:ffffa0a9ba664000(0000) knlGS:0000000000000000
> > [19527.937646] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [19527.937646] CR2: 00007f5c694363c0 CR3: 000000003e038004 CR4: 00000000003706f0
> > [19527.937647] Call Trace:
> > [19527.937648]  <TASK>
> > [19527.937649]  fpu__suspend_inband+0x3c/0x50
> > [19527.937651]  dovetail_context_switch+0x130/0x240
> > [19527.937652]  ? xntimer_start+0x101/0x240
> > [19527.937656]  ___xnsched_run+0x1f9/0x4d0
> > [19527.937659]  run_oob_call+0x87/0x140
> > [19527.937662]  handle_irq_pipelined_finish+0x188/0x1a0
> > [19527.937664]  arch_pipeline_entry+0x43/0xf0
> > [19527.937667]  asm_sysvec_hyperv_stimer0+0x16/0x20
> > [19527.937669] RIP: 0010:kernel_fpu_end+0x5/0x50
> > [19527.937670] Code: 57 02 e8 9e 12 42 00 31 c0 c3 cc cc cc cc 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 <65> 0f b6 05 03 40 57 02 84 c0 75 31 65 c6 05 f7 3f 57 02 01 65 f6
> > [19527.937671] RSP: 0018:ffffba19c0083d28 EFLAGS: 00000246
> > [19527.937671] RAX: 000000000000000c RBX: 0000000000000001 RCX: ffffffff842f2260
> > [19527.937672] RDX: 0000000000000000 RSI: ffffffff84c67450 RDI: ffffffff84c673e0
> > [19527.937672] RBP: 0000000000000001 R08: ffffffff842f2260 R09: 0000000000000000
> > [19527.937672] R10: ffffffff84a080c0 R11: 0000000000000001 R12: ffffffff84c67410
> > [19527.937673] R13: ffffffff84c673e0 R14: 0000000000000040 R15: 0000000000000004
> > [19527.937675]  blake2s_compress+0x52/0xa0
> > [19527.937679]  blake2s_update+0x75/0x150
> > [19527.937681]  add_timer_randomness+0xac/0x1a0
> > [19527.937684]  scsi_end_request+0x297/0x3a0 [scsi_mod]
> > [19527.937697]  scsi_io_completion+0x55/0x6a0 [scsi_mod]
> > [19527.937707]  blk_done_softirq+0x46/0x60
> > [19527.937709]  handle_softirqs+0xd0/0x430
> > [19527.937711]  ? __pfx_smpboot_thread_fn+0x10/0x10
> > [19527.937713]  run_ksoftirqd+0x47/0x60
> > [19527.937714]  smpboot_thread_fn+0xec/0x220
> > [19527.937716]  kthread+0xf7/0x240
> > [19527.937718]  ? __pfx_kthread+0x10/0x10
> > [19527.937720]  ret_from_fork+0x28c/0x310
> > [19527.937721]  ? __pfx_kthread+0x10/0x10
> > [19527.937722]  ret_from_fork_asm+0x1a/0x30
> > [19527.937725]  </TASK>
> > [19527.937725] irq event stamp: 1270
> > [19527.937726] hardirqs last  enabled at (1269): [<ffffffff830b2b1c>] run_ksoftirqd+0x4c/0x60
> > [19527.937727] hardirqs last disabled at (1270): [<ffffffff83d90d53>] __schedule+0x853/0x1080
> > [19527.937729] softirqs last  enabled at (1268): [<ffffffff830b29d6>] handle_softirqs+0x346/0x430
> > [19527.937730] softirqs last disabled at (1215): [<ffffffff830b2b17>] run_ksoftirqd+0x47/0x60
> > [19527.937731] ---[ end trace 0000000000000000 ]---
> > 
> > Link: https://lore.kernel.org/xenomai/80a48c1ddc3721eb8022b77548028aa8e7371928.camel@siemens.com/T/
> > Signed-off-by: Florian Bezdeka <florian.bezdeka@siemens.com>
> > ---
> >  arch/x86/kernel/fpu/core.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
> > index 7b87a04c0b6e..949f89bb5cdc 100644
> > --- a/arch/x86/kernel/fpu/core.c
> > +++ b/arch/x86/kernel/fpu/core.c
> > @@ -1046,6 +1046,9 @@ void fpu__suspend_inband(void)
> >  	struct fpu *kfpu = this_cpu_read(in_kernel_fpstate);
> >  	struct task_struct *tsk = current;
> >  
> > +	if (tsk->flags & PF_KTHREAD)
> > +		return;
> > +
> >  	/*
> >  	 * If kernel_fpu_allowed is false, we are dealing with the
> >  	 * preemption of an inband kernel context currently using the
> 
> If an in-band kernel thread is preempted by an oob task while running
> inside a kernel_fpu_begin_mask()<->kernel_fpu_end() section, the change
> above would prevent the fpu hw state from being saved
> appropriately. fpu__suspend_inband() and fpu__resume_inband() should not
> be aligned in this respect, the log is misleading for that reason.

Let me add some more lines below:

	if (!this_cpu_read(kernel_fpu_allowed)) {
		save_fpregs_to_fpstate(kfpu);
		__cpu_invalidate_fpregs_state();
		oob_fpu_set_preempt(x86_task_fpu(tsk));
	}

x86_task_fpu(tsk) will deliver NULL in case CONFIG_X86_DEBUG_FPU is
enabled and (tsk->flags & PF_KTHREAD) is true, so we end up in
dereferencing a NULL pointer.

I think you took care of that during the forward port to 6.16 in
fpu__resume_inband() the same way. There is simply no struct fpu
reference where we could set the preempted flag.

Are kthreads really allowed to use the FPU? Could not find any
limitation so far, so we may have to modify x86_task_fpu(tsk) instead?

> 
> I suspect this change papers over the root issue, which is yet to be
> discovered.
> 
> -- 
> Philippe.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH Dovetail 6.16 1/2] x86/fpu: dovetail: Fix FPU corruption
  2025-08-20 13:00     ` Florian Bezdeka
@ 2025-08-20 13:55       ` Philippe Gerum
  2025-08-20 16:45         ` Florian Bezdeka
  0 siblings, 1 reply; 9+ messages in thread
From: Philippe Gerum @ 2025-08-20 13:55 UTC (permalink / raw)
  To: Florian Bezdeka; +Cc: xenomai, Jan Kiszka

Florian Bezdeka <florian.bezdeka@siemens.com> writes:

> On Wed, 2025-08-20 at 12:57 +0200, Philippe Gerum wrote:
>> Florian Bezdeka <florian.bezdeka@siemens.com> writes:
>> 
>> > Align fpu__suspend_inband() and fpu__resume_inband() regarding the
>> > PF_KTHREAD handling.
>> > 
>> > The kernel splat looked as follows and could be reproduced by running
>> > the Xenomai 3 testsuite over a stress-ng produced load.
>> > 
>> > [19527.937613] ------------[ cut here ]------------
>> > [19527.937614] WARNING: CPU: 0 PID: 14 at arch/x86/kernel/fpu/core.c:62 x86_task_fpu+0x18/0x30
>> > [19527.937619] Modules linked in: intel_rapl_msr intel_rapl_common intel_pmc_core pmt_telemetry pmt_class intel_pmc_ssram_telemetry intel_vsec ghash_clmulni_intel sha512_ssse3 sha1_ssse3 aesni_intel gf128mul rapl pcspkr button evdev serio_raw joydev sg loop efi_pstore fuse dm_mod drm dax configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 sr_mod cdrom ata_generic sd_mod hid_generic hid_hyperv ata_piix psmouse hid hv_netvsc hv_storvsc libata scsi_transport_fc floppy i2c_piix4 i2c_smbus scsi_mod scsi_common [last unloaded: xeno_can]
>> > [19527.937635] CPU: 0 UID: 0 PID: 14 Comm: ksoftirqd/0 Not tainted 6.16.0-hyprv-xenomai-0+ #196 PREEMPT(voluntary)
>> > [19527.937637] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008  12/07/2018
>> > [19527.937637] IRQ stage: Xenomai
>> > [19527.937638] RIP: 0010:x86_task_fpu+0x18/0x30
>> > [19527.937640] Code: 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 8d 87 00 20 00 00 f6 47 3e 20 75 05 c3 cc cc cc cc 90 <0f> 0b 90 31 c0 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f
>> > [19527.937641] RSP: 0018:ffffba19c0083b60 EFLAGS: 00010002
>> > [19527.937641] RAX: ffffa0a90133c3c0 RBX: 0000000000000001 RCX: ffffa0a93fc5a6e8
>> > [19527.937642] RDX: 0000000000000000 RSI: 0000000000000007 RDI: ffffa0a90133a3c0
>> > [19527.937643] RBP: ffffa0a93fc5a058 R08: 01364d9cb9685197 R09: ffffffff83e406f8
>> > [19527.937643] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
>> > [19527.937643] R13: ffffa0a90133a3c0 R14: ffffba19c0c62008 R15: 0000000000000001
>> > [19527.937645] FS:  0000000000000000(0000) GS:ffffa0a9ba664000(0000) knlGS:0000000000000000
>> > [19527.937646] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > [19527.937646] CR2: 00007f5c694363c0 CR3: 000000003e038004 CR4: 00000000003706f0
>> > [19527.937647] Call Trace:
>> > [19527.937648]  <TASK>
>> > [19527.937649]  fpu__suspend_inband+0x3c/0x50
>> > [19527.937651]  dovetail_context_switch+0x130/0x240
>> > [19527.937652]  ? xntimer_start+0x101/0x240
>> > [19527.937656]  ___xnsched_run+0x1f9/0x4d0
>> > [19527.937659]  run_oob_call+0x87/0x140
>> > [19527.937662]  handle_irq_pipelined_finish+0x188/0x1a0
>> > [19527.937664]  arch_pipeline_entry+0x43/0xf0
>> > [19527.937667]  asm_sysvec_hyperv_stimer0+0x16/0x20
>> > [19527.937669] RIP: 0010:kernel_fpu_end+0x5/0x50
>> > [19527.937670] Code: 57 02 e8 9e 12 42 00 31 c0 c3 cc cc cc cc 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 <65> 0f b6 05 03 40 57 02 84 c0 75 31 65 c6 05 f7 3f 57 02 01 65 f6
>> > [19527.937671] RSP: 0018:ffffba19c0083d28 EFLAGS: 00000246
>> > [19527.937671] RAX: 000000000000000c RBX: 0000000000000001 RCX: ffffffff842f2260
>> > [19527.937672] RDX: 0000000000000000 RSI: ffffffff84c67450 RDI: ffffffff84c673e0
>> > [19527.937672] RBP: 0000000000000001 R08: ffffffff842f2260 R09: 0000000000000000
>> > [19527.937672] R10: ffffffff84a080c0 R11: 0000000000000001 R12: ffffffff84c67410
>> > [19527.937673] R13: ffffffff84c673e0 R14: 0000000000000040 R15: 0000000000000004
>> > [19527.937675]  blake2s_compress+0x52/0xa0
>> > [19527.937679]  blake2s_update+0x75/0x150
>> > [19527.937681]  add_timer_randomness+0xac/0x1a0
>> > [19527.937684]  scsi_end_request+0x297/0x3a0 [scsi_mod]
>> > [19527.937697]  scsi_io_completion+0x55/0x6a0 [scsi_mod]
>> > [19527.937707]  blk_done_softirq+0x46/0x60
>> > [19527.937709]  handle_softirqs+0xd0/0x430
>> > [19527.937711]  ? __pfx_smpboot_thread_fn+0x10/0x10
>> > [19527.937713]  run_ksoftirqd+0x47/0x60
>> > [19527.937714]  smpboot_thread_fn+0xec/0x220
>> > [19527.937716]  kthread+0xf7/0x240
>> > [19527.937718]  ? __pfx_kthread+0x10/0x10
>> > [19527.937720]  ret_from_fork+0x28c/0x310
>> > [19527.937721]  ? __pfx_kthread+0x10/0x10
>> > [19527.937722]  ret_from_fork_asm+0x1a/0x30
>> > [19527.937725]  </TASK>
>> > [19527.937725] irq event stamp: 1270
>> > [19527.937726] hardirqs last  enabled at (1269): [<ffffffff830b2b1c>] run_ksoftirqd+0x4c/0x60
>> > [19527.937727] hardirqs last disabled at (1270): [<ffffffff83d90d53>] __schedule+0x853/0x1080
>> > [19527.937729] softirqs last  enabled at (1268): [<ffffffff830b29d6>] handle_softirqs+0x346/0x430
>> > [19527.937730] softirqs last disabled at (1215): [<ffffffff830b2b17>] run_ksoftirqd+0x47/0x60
>> > [19527.937731] ---[ end trace 0000000000000000 ]---
>> > 
>> > Link: https://lore.kernel.org/xenomai/80a48c1ddc3721eb8022b77548028aa8e7371928.camel@siemens.com/T/
>> > Signed-off-by: Florian Bezdeka <florian.bezdeka@siemens.com>
>> > ---
>> >  arch/x86/kernel/fpu/core.c | 3 +++
>> >  1 file changed, 3 insertions(+)
>> > 
>> > diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
>> > index 7b87a04c0b6e..949f89bb5cdc 100644
>> > --- a/arch/x86/kernel/fpu/core.c
>> > +++ b/arch/x86/kernel/fpu/core.c
>> > @@ -1046,6 +1046,9 @@ void fpu__suspend_inband(void)
>> >  	struct fpu *kfpu = this_cpu_read(in_kernel_fpstate);
>> >  	struct task_struct *tsk = current;
>> >  
>> > +	if (tsk->flags & PF_KTHREAD)
>> > +		return;
>> > +
>> >  	/*
>> >  	 * If kernel_fpu_allowed is false, we are dealing with the
>> >  	 * preemption of an inband kernel context currently using the
>> 
>> If an in-band kernel thread is preempted by an oob task while running
>> inside a kernel_fpu_begin_mask()<->kernel_fpu_end() section, the change
>> above would prevent the fpu hw state from being saved
>> appropriately. fpu__suspend_inband() and fpu__resume_inband() should not
>> be aligned in this respect, the log is misleading for that reason.
>
> Let me add some more lines below:
>
> 	if (!this_cpu_read(kernel_fpu_allowed)) {
> 		save_fpregs_to_fpstate(kfpu);
> 		__cpu_invalidate_fpregs_state();
> 		oob_fpu_set_preempt(x86_task_fpu(tsk));
> 	}
>
> x86_task_fpu(tsk) will deliver NULL in case CONFIG_X86_DEBUG_FPU is
> enabled and (tsk->flags & PF_KTHREAD) is true, so we end up in
> dereferencing a NULL pointer.
>
> I think you took care of that during the forward port to 6.16 in
> fpu__resume_inband() the same way. There is simply no struct fpu
> reference where we could set the preempted flag.

Yeah, but that early exit in fpu__resume_inband() looks wrong now. This
does silence the warning, but also prevents from reloading the fpu hw
state from the per-cpu in_kernel_fpstate buffer which is designed to
hold the fpu context in the preemption case mentioned above.

>
> Are kthreads really allowed to use the FPU? Could not find any
> limitation so far, so we may have to modify x86_task_fpu(tsk) instead?

in-band kthreads may enter a fpu-enabled section temporarily by calling
kernel_fpu_begin_mask() (RAID support used to do so at some point for
SSE2 IIRC). The explicit check there not to save the hw state when a kthread
is running confirms this. As a result, an oob task may preempt such
context.

I believe that x86_task_fpu() should not be called in either routines in
the first place. The preemption marker should not live in struct fpu.

-- 
Philippe.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH Dovetail 6.16 1/2] x86/fpu: dovetail: Fix FPU corruption
  2025-08-19 22:53 ` [PATCH Dovetail 6.16 1/2] x86/fpu: dovetail: Fix FPU corruption Florian Bezdeka
  2025-08-20 10:57   ` Philippe Gerum
@ 2025-08-20 14:21   ` Philippe Gerum
  1 sibling, 0 replies; 9+ messages in thread
From: Philippe Gerum @ 2025-08-20 14:21 UTC (permalink / raw)
  To: Florian Bezdeka; +Cc: xenomai, Jan Kiszka

Florian Bezdeka <florian.bezdeka@siemens.com> writes:

> Align fpu__suspend_inband() and fpu__resume_inband() regarding the
> PF_KTHREAD handling.
>
> The kernel splat looked as follows and could be reproduced by running
> the Xenomai 3 testsuite over a stress-ng produced load.
>
> [19527.937613] ------------[ cut here ]------------
> [19527.937614] WARNING: CPU: 0 PID: 14 at arch/x86/kernel/fpu/core.c:62 x86_task_fpu+0x18/0x30
> [19527.937619] Modules linked in: intel_rapl_msr intel_rapl_common intel_pmc_core pmt_telemetry pmt_class intel_pmc_ssram_telemetry intel_vsec ghash_clmulni_intel sha512_ssse3 sha1_ssse3 aesni_intel gf128mul rapl pcspkr button evdev serio_raw joydev sg loop efi_pstore fuse dm_mod drm dax configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 sr_mod cdrom ata_generic sd_mod hid_generic hid_hyperv ata_piix psmouse hid hv_netvsc hv_storvsc libata scsi_transport_fc floppy i2c_piix4 i2c_smbus scsi_mod scsi_common [last unloaded: xeno_can]
> [19527.937635] CPU: 0 UID: 0 PID: 14 Comm: ksoftirqd/0 Not tainted 6.16.0-hyprv-xenomai-0+ #196 PREEMPT(voluntary)
> [19527.937637] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008  12/07/2018
> [19527.937637] IRQ stage: Xenomai
> [19527.937638] RIP: 0010:x86_task_fpu+0x18/0x30
> [19527.937640] Code: 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 8d 87 00 20 00 00 f6 47 3e 20 75 05 c3 cc cc cc cc 90 <0f> 0b 90 31 c0 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f
> [19527.937641] RSP: 0018:ffffba19c0083b60 EFLAGS: 00010002
> [19527.937641] RAX: ffffa0a90133c3c0 RBX: 0000000000000001 RCX: ffffa0a93fc5a6e8
> [19527.937642] RDX: 0000000000000000 RSI: 0000000000000007 RDI: ffffa0a90133a3c0
> [19527.937643] RBP: ffffa0a93fc5a058 R08: 01364d9cb9685197 R09: ffffffff83e406f8
> [19527.937643] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
> [19527.937643] R13: ffffa0a90133a3c0 R14: ffffba19c0c62008 R15: 0000000000000001
> [19527.937645] FS:  0000000000000000(0000) GS:ffffa0a9ba664000(0000) knlGS:0000000000000000
> [19527.937646] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [19527.937646] CR2: 00007f5c694363c0 CR3: 000000003e038004 CR4: 00000000003706f0
> [19527.937647] Call Trace:
> [19527.937648]  <TASK>
> [19527.937649]  fpu__suspend_inband+0x3c/0x50
> [19527.937651]  dovetail_context_switch+0x130/0x240
> [19527.937652]  ? xntimer_start+0x101/0x240
> [19527.937656]  ___xnsched_run+0x1f9/0x4d0
> [19527.937659]  run_oob_call+0x87/0x140
> [19527.937662]  handle_irq_pipelined_finish+0x188/0x1a0
> [19527.937664]  arch_pipeline_entry+0x43/0xf0
> [19527.937667]  asm_sysvec_hyperv_stimer0+0x16/0x20
> [19527.937669] RIP: 0010:kernel_fpu_end+0x5/0x50
> [19527.937670] Code: 57 02 e8 9e 12 42 00 31 c0 c3 cc cc cc cc 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 <65> 0f b6 05 03 40 57 02 84 c0 75 31 65 c6 05 f7 3f 57 02 01 65 f6
> [19527.937671] RSP: 0018:ffffba19c0083d28 EFLAGS: 00000246
> [19527.937671] RAX: 000000000000000c RBX: 0000000000000001 RCX: ffffffff842f2260
> [19527.937672] RDX: 0000000000000000 RSI: ffffffff84c67450 RDI: ffffffff84c673e0
> [19527.937672] RBP: 0000000000000001 R08: ffffffff842f2260 R09: 0000000000000000
> [19527.937672] R10: ffffffff84a080c0 R11: 0000000000000001 R12: ffffffff84c67410
> [19527.937673] R13: ffffffff84c673e0 R14: 0000000000000040 R15: 0000000000000004
> [19527.937675]  blake2s_compress+0x52/0xa0
> [19527.937679]  blake2s_update+0x75/0x150
> [19527.937681]  add_timer_randomness+0xac/0x1a0
> [19527.937684]  scsi_end_request+0x297/0x3a0 [scsi_mod]
> [19527.937697]  scsi_io_completion+0x55/0x6a0 [scsi_mod]
> [19527.937707]  blk_done_softirq+0x46/0x60
> [19527.937709]  handle_softirqs+0xd0/0x430
> [19527.937711]  ? __pfx_smpboot_thread_fn+0x10/0x10
> [19527.937713]  run_ksoftirqd+0x47/0x60
> [19527.937714]  smpboot_thread_fn+0xec/0x220
> [19527.937716]  kthread+0xf7/0x240
> [19527.937718]  ? __pfx_kthread+0x10/0x10
> [19527.937720]  ret_from_fork+0x28c/0x310
> [19527.937721]  ? __pfx_kthread+0x10/0x10
> [19527.937722]  ret_from_fork_asm+0x1a/0x30
> [19527.937725]  </TASK>
> [19527.937725] irq event stamp: 1270
> [19527.937726] hardirqs last  enabled at (1269): [<ffffffff830b2b1c>] run_ksoftirqd+0x4c/0x60
> [19527.937727] hardirqs last disabled at (1270): [<ffffffff83d90d53>] __schedule+0x853/0x1080
> [19527.937729] softirqs last  enabled at (1268): [<ffffffff830b29d6>] handle_softirqs+0x346/0x430
> [19527.937730] softirqs last disabled at (1215): [<ffffffff830b2b17>] run_ksoftirqd+0x47/0x60
> [19527.937731] ---[ end trace 0000000000000000 ]---
>
> Link: https://lore.kernel.org/xenomai/80a48c1ddc3721eb8022b77548028aa8e7371928.camel@siemens.com/T/
> Signed-off-by: Florian Bezdeka <florian.bezdeka@siemens.com>
> ---
>  arch/x86/kernel/fpu/core.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
> index 7b87a04c0b6e..949f89bb5cdc 100644
> --- a/arch/x86/kernel/fpu/core.c
> +++ b/arch/x86/kernel/fpu/core.c
> @@ -1046,6 +1046,9 @@ void fpu__suspend_inband(void)
>  	struct fpu *kfpu = this_cpu_read(in_kernel_fpstate);
>  	struct task_struct *tsk = current;
>  
> +	if (tsk->flags & PF_KTHREAD)
> +		return;
> +
>  	/*
>  	 * If kernel_fpu_allowed is false, we are dealing with the
>  	 * preemption of an inband kernel context currently using the

IIRC, in x3, switchtest -s should exercise the code path involving
preemption of a kernel_fpu section by an oob task. This code is guarded
by fp_linux_begin() in the switchtest driver, which enables such test
for x86, disables it for other archs.

-- 
Philippe.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH Dovetail 6.16 1/2] x86/fpu: dovetail: Fix FPU corruption
  2025-08-20 13:55       ` Philippe Gerum
@ 2025-08-20 16:45         ` Florian Bezdeka
  2025-08-20 16:56           ` Philippe Gerum
  0 siblings, 1 reply; 9+ messages in thread
From: Florian Bezdeka @ 2025-08-20 16:45 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai, Jan Kiszka

On Wed, 2025-08-20 at 15:55 +0200, Philippe Gerum wrote:
> Florian Bezdeka <florian.bezdeka@siemens.com> writes:
> 
> > On Wed, 2025-08-20 at 12:57 +0200, Philippe Gerum wrote:
> > > Florian Bezdeka <florian.bezdeka@siemens.com> writes:
> > > 
> > > > Align fpu__suspend_inband() and fpu__resume_inband() regarding the
> > > > PF_KTHREAD handling.
> > > > 
> > > > The kernel splat looked as follows and could be reproduced by running
> > > > the Xenomai 3 testsuite over a stress-ng produced load.
> > > > 
> > > > [19527.937613] ------------[ cut here ]------------
> > > > [19527.937614] WARNING: CPU: 0 PID: 14 at arch/x86/kernel/fpu/core.c:62 x86_task_fpu+0x18/0x30
> > > > [19527.937619] Modules linked in: intel_rapl_msr intel_rapl_common intel_pmc_core pmt_telemetry pmt_class intel_pmc_ssram_telemetry intel_vsec ghash_clmulni_intel sha512_ssse3 sha1_ssse3 aesni_intel gf128mul rapl pcspkr button evdev serio_raw joydev sg loop efi_pstore fuse dm_mod drm dax configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 sr_mod cdrom ata_generic sd_mod hid_generic hid_hyperv ata_piix psmouse hid hv_netvsc hv_storvsc libata scsi_transport_fc floppy i2c_piix4 i2c_smbus scsi_mod scsi_common [last unloaded: xeno_can]
> > > > [19527.937635] CPU: 0 UID: 0 PID: 14 Comm: ksoftirqd/0 Not tainted 6.16.0-hyprv-xenomai-0+ #196 PREEMPT(voluntary)
> > > > [19527.937637] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008  12/07/2018
> > > > [19527.937637] IRQ stage: Xenomai
> > > > [19527.937638] RIP: 0010:x86_task_fpu+0x18/0x30
> > > > [19527.937640] Code: 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 8d 87 00 20 00 00 f6 47 3e 20 75 05 c3 cc cc cc cc 90 <0f> 0b 90 31 c0 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f
> > > > [19527.937641] RSP: 0018:ffffba19c0083b60 EFLAGS: 00010002
> > > > [19527.937641] RAX: ffffa0a90133c3c0 RBX: 0000000000000001 RCX: ffffa0a93fc5a6e8
> > > > [19527.937642] RDX: 0000000000000000 RSI: 0000000000000007 RDI: ffffa0a90133a3c0
> > > > [19527.937643] RBP: ffffa0a93fc5a058 R08: 01364d9cb9685197 R09: ffffffff83e406f8
> > > > [19527.937643] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
> > > > [19527.937643] R13: ffffa0a90133a3c0 R14: ffffba19c0c62008 R15: 0000000000000001
> > > > [19527.937645] FS:  0000000000000000(0000) GS:ffffa0a9ba664000(0000) knlGS:0000000000000000
> > > > [19527.937646] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [19527.937646] CR2: 00007f5c694363c0 CR3: 000000003e038004 CR4: 00000000003706f0
> > > > [19527.937647] Call Trace:
> > > > [19527.937648]  <TASK>
> > > > [19527.937649]  fpu__suspend_inband+0x3c/0x50
> > > > [19527.937651]  dovetail_context_switch+0x130/0x240
> > > > [19527.937652]  ? xntimer_start+0x101/0x240
> > > > [19527.937656]  ___xnsched_run+0x1f9/0x4d0
> > > > [19527.937659]  run_oob_call+0x87/0x140
> > > > [19527.937662]  handle_irq_pipelined_finish+0x188/0x1a0
> > > > [19527.937664]  arch_pipeline_entry+0x43/0xf0
> > > > [19527.937667]  asm_sysvec_hyperv_stimer0+0x16/0x20
> > > > [19527.937669] RIP: 0010:kernel_fpu_end+0x5/0x50
> > > > [19527.937670] Code: 57 02 e8 9e 12 42 00 31 c0 c3 cc cc cc cc 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 <65> 0f b6 05 03 40 57 02 84 c0 75 31 65 c6 05 f7 3f 57 02 01 65 f6
> > > > [19527.937671] RSP: 0018:ffffba19c0083d28 EFLAGS: 00000246
> > > > [19527.937671] RAX: 000000000000000c RBX: 0000000000000001 RCX: ffffffff842f2260
> > > > [19527.937672] RDX: 0000000000000000 RSI: ffffffff84c67450 RDI: ffffffff84c673e0
> > > > [19527.937672] RBP: 0000000000000001 R08: ffffffff842f2260 R09: 0000000000000000
> > > > [19527.937672] R10: ffffffff84a080c0 R11: 0000000000000001 R12: ffffffff84c67410
> > > > [19527.937673] R13: ffffffff84c673e0 R14: 0000000000000040 R15: 0000000000000004
> > > > [19527.937675]  blake2s_compress+0x52/0xa0
> > > > [19527.937679]  blake2s_update+0x75/0x150
> > > > [19527.937681]  add_timer_randomness+0xac/0x1a0
> > > > [19527.937684]  scsi_end_request+0x297/0x3a0 [scsi_mod]
> > > > [19527.937697]  scsi_io_completion+0x55/0x6a0 [scsi_mod]
> > > > [19527.937707]  blk_done_softirq+0x46/0x60
> > > > [19527.937709]  handle_softirqs+0xd0/0x430
> > > > [19527.937711]  ? __pfx_smpboot_thread_fn+0x10/0x10
> > > > [19527.937713]  run_ksoftirqd+0x47/0x60
> > > > [19527.937714]  smpboot_thread_fn+0xec/0x220
> > > > [19527.937716]  kthread+0xf7/0x240
> > > > [19527.937718]  ? __pfx_kthread+0x10/0x10
> > > > [19527.937720]  ret_from_fork+0x28c/0x310
> > > > [19527.937721]  ? __pfx_kthread+0x10/0x10
> > > > [19527.937722]  ret_from_fork_asm+0x1a/0x30
> > > > [19527.937725]  </TASK>
> > > > [19527.937725] irq event stamp: 1270
> > > > [19527.937726] hardirqs last  enabled at (1269): [<ffffffff830b2b1c>] run_ksoftirqd+0x4c/0x60
> > > > [19527.937727] hardirqs last disabled at (1270): [<ffffffff83d90d53>] __schedule+0x853/0x1080
> > > > [19527.937729] softirqs last  enabled at (1268): [<ffffffff830b29d6>] handle_softirqs+0x346/0x430
> > > > [19527.937730] softirqs last disabled at (1215): [<ffffffff830b2b17>] run_ksoftirqd+0x47/0x60
> > > > [19527.937731] ---[ end trace 0000000000000000 ]---
> > > > 
> > > > Link: https://lore.kernel.org/xenomai/80a48c1ddc3721eb8022b77548028aa8e7371928.camel@siemens.com/T/
> > > > Signed-off-by: Florian Bezdeka <florian.bezdeka@siemens.com>
> > > > ---
> > > >  arch/x86/kernel/fpu/core.c | 3 +++
> > > >  1 file changed, 3 insertions(+)
> > > > 
> > > > diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
> > > > index 7b87a04c0b6e..949f89bb5cdc 100644
> > > > --- a/arch/x86/kernel/fpu/core.c
> > > > +++ b/arch/x86/kernel/fpu/core.c
> > > > @@ -1046,6 +1046,9 @@ void fpu__suspend_inband(void)
> > > >  	struct fpu *kfpu = this_cpu_read(in_kernel_fpstate);
> > > >  	struct task_struct *tsk = current;
> > > >  
> > > > +	if (tsk->flags & PF_KTHREAD)
> > > > +		return;
> > > > +
> > > >  	/*
> > > >  	 * If kernel_fpu_allowed is false, we are dealing with the
> > > >  	 * preemption of an inband kernel context currently using the
> > > 
> > > If an in-band kernel thread is preempted by an oob task while running
> > > inside a kernel_fpu_begin_mask()<->kernel_fpu_end() section, the change
> > > above would prevent the fpu hw state from being saved
> > > appropriately. fpu__suspend_inband() and fpu__resume_inband() should not
> > > be aligned in this respect, the log is misleading for that reason.
> > 
> > Let me add some more lines below:
> > 
> > 	if (!this_cpu_read(kernel_fpu_allowed)) {
> > 		save_fpregs_to_fpstate(kfpu);
> > 		__cpu_invalidate_fpregs_state();
> > 		oob_fpu_set_preempt(x86_task_fpu(tsk));
> > 	}
> > 
> > x86_task_fpu(tsk) will deliver NULL in case CONFIG_X86_DEBUG_FPU is
> > enabled and (tsk->flags & PF_KTHREAD) is true, so we end up in
> > dereferencing a NULL pointer.
> > 
> > I think you took care of that during the forward port to 6.16 in
> > fpu__resume_inband() the same way. There is simply no struct fpu
> > reference where we could set the preempted flag.
> 
> Yeah, but that early exit in fpu__resume_inband() looks wrong now. This
> does silence the warning, but also prevents from reloading the fpu hw
> state from the per-cpu in_kernel_fpstate buffer which is designed to
> hold the fpu context in the preemption case mentioned above.
> 
> > 
> > Are kthreads really allowed to use the FPU? Could not find any
> > limitation so far, so we may have to modify x86_task_fpu(tsk) instead?
> 
> in-band kthreads may enter a fpu-enabled section temporarily by calling
> kernel_fpu_begin_mask() (RAID support used to do so at some point for
> SSE2 IIRC). The explicit check there not to save the hw state when a kthread
> is running confirms this. As a result, an oob task may preempt such
> context.
> 
> I believe that x86_task_fpu() should not be called in either routines in
> the first place. The preemption marker should not live in struct fpu.

Where is a better location for it? struct task_struct? I think we have
it in struct fpu for arm64 as well, so in case we decide to move it we
should move it for both archs.

> 
> -- 
> Philippe.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH Dovetail 6.16 1/2] x86/fpu: dovetail: Fix FPU corruption
  2025-08-20 16:45         ` Florian Bezdeka
@ 2025-08-20 16:56           ` Philippe Gerum
  0 siblings, 0 replies; 9+ messages in thread
From: Philippe Gerum @ 2025-08-20 16:56 UTC (permalink / raw)
  To: Florian Bezdeka; +Cc: xenomai, Jan Kiszka

Florian Bezdeka <florian.bezdeka@siemens.com> writes:

> On Wed, 2025-08-20 at 15:55 +0200, Philippe Gerum wrote:
>> Florian Bezdeka <florian.bezdeka@siemens.com> writes:
>> 
>> > On Wed, 2025-08-20 at 12:57 +0200, Philippe Gerum wrote:
>> > > Florian Bezdeka <florian.bezdeka@siemens.com> writes:
>> > > 
>> > > > Align fpu__suspend_inband() and fpu__resume_inband() regarding the
>> > > > PF_KTHREAD handling.
>> > > > 
>> > > > The kernel splat looked as follows and could be reproduced by running
>> > > > the Xenomai 3 testsuite over a stress-ng produced load.
>> > > > 
>> > > > [19527.937613] ------------[ cut here ]------------
>> > > > [19527.937614] WARNING: CPU: 0 PID: 14 at arch/x86/kernel/fpu/core.c:62 x86_task_fpu+0x18/0x30
>> > > > [19527.937619] Modules linked in: intel_rapl_msr intel_rapl_common intel_pmc_core pmt_telemetry pmt_class intel_pmc_ssram_telemetry intel_vsec ghash_clmulni_intel sha512_ssse3 sha1_ssse3 aesni_intel gf128mul rapl pcspkr button evdev serio_raw joydev sg loop efi_pstore fuse dm_mod drm dax configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 sr_mod cdrom ata_generic sd_mod hid_generic hid_hyperv ata_piix psmouse hid hv_netvsc hv_storvsc libata scsi_transport_fc floppy i2c_piix4 i2c_smbus scsi_mod scsi_common [last unloaded: xeno_can]
>> > > > [19527.937635] CPU: 0 UID: 0 PID: 14 Comm: ksoftirqd/0 Not tainted 6.16.0-hyprv-xenomai-0+ #196 PREEMPT(voluntary)
>> > > > [19527.937637] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008  12/07/2018
>> > > > [19527.937637] IRQ stage: Xenomai
>> > > > [19527.937638] RIP: 0010:x86_task_fpu+0x18/0x30
>> > > > [19527.937640] Code: 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 8d 87 00 20 00 00 f6 47 3e 20 75 05 c3 cc cc cc cc 90 <0f> 0b 90 31 c0 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f
>> > > > [19527.937641] RSP: 0018:ffffba19c0083b60 EFLAGS: 00010002
>> > > > [19527.937641] RAX: ffffa0a90133c3c0 RBX: 0000000000000001 RCX: ffffa0a93fc5a6e8
>> > > > [19527.937642] RDX: 0000000000000000 RSI: 0000000000000007 RDI: ffffa0a90133a3c0
>> > > > [19527.937643] RBP: ffffa0a93fc5a058 R08: 01364d9cb9685197 R09: ffffffff83e406f8
>> > > > [19527.937643] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
>> > > > [19527.937643] R13: ffffa0a90133a3c0 R14: ffffba19c0c62008 R15: 0000000000000001
>> > > > [19527.937645] FS:  0000000000000000(0000) GS:ffffa0a9ba664000(0000) knlGS:0000000000000000
>> > > > [19527.937646] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > > > [19527.937646] CR2: 00007f5c694363c0 CR3: 000000003e038004 CR4: 00000000003706f0
>> > > > [19527.937647] Call Trace:
>> > > > [19527.937648]  <TASK>
>> > > > [19527.937649]  fpu__suspend_inband+0x3c/0x50
>> > > > [19527.937651]  dovetail_context_switch+0x130/0x240
>> > > > [19527.937652]  ? xntimer_start+0x101/0x240
>> > > > [19527.937656]  ___xnsched_run+0x1f9/0x4d0
>> > > > [19527.937659]  run_oob_call+0x87/0x140
>> > > > [19527.937662]  handle_irq_pipelined_finish+0x188/0x1a0
>> > > > [19527.937664]  arch_pipeline_entry+0x43/0xf0
>> > > > [19527.937667]  asm_sysvec_hyperv_stimer0+0x16/0x20
>> > > > [19527.937669] RIP: 0010:kernel_fpu_end+0x5/0x50
>> > > > [19527.937670] Code: 57 02 e8 9e 12 42 00 31 c0 c3 cc cc cc cc 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 <65> 0f b6 05 03 40 57 02 84 c0 75 31 65 c6 05 f7 3f 57 02 01 65 f6
>> > > > [19527.937671] RSP: 0018:ffffba19c0083d28 EFLAGS: 00000246
>> > > > [19527.937671] RAX: 000000000000000c RBX: 0000000000000001 RCX: ffffffff842f2260
>> > > > [19527.937672] RDX: 0000000000000000 RSI: ffffffff84c67450 RDI: ffffffff84c673e0
>> > > > [19527.937672] RBP: 0000000000000001 R08: ffffffff842f2260 R09: 0000000000000000
>> > > > [19527.937672] R10: ffffffff84a080c0 R11: 0000000000000001 R12: ffffffff84c67410
>> > > > [19527.937673] R13: ffffffff84c673e0 R14: 0000000000000040 R15: 0000000000000004
>> > > > [19527.937675]  blake2s_compress+0x52/0xa0
>> > > > [19527.937679]  blake2s_update+0x75/0x150
>> > > > [19527.937681]  add_timer_randomness+0xac/0x1a0
>> > > > [19527.937684]  scsi_end_request+0x297/0x3a0 [scsi_mod]
>> > > > [19527.937697]  scsi_io_completion+0x55/0x6a0 [scsi_mod]
>> > > > [19527.937707]  blk_done_softirq+0x46/0x60
>> > > > [19527.937709]  handle_softirqs+0xd0/0x430
>> > > > [19527.937711]  ? __pfx_smpboot_thread_fn+0x10/0x10
>> > > > [19527.937713]  run_ksoftirqd+0x47/0x60
>> > > > [19527.937714]  smpboot_thread_fn+0xec/0x220
>> > > > [19527.937716]  kthread+0xf7/0x240
>> > > > [19527.937718]  ? __pfx_kthread+0x10/0x10
>> > > > [19527.937720]  ret_from_fork+0x28c/0x310
>> > > > [19527.937721]  ? __pfx_kthread+0x10/0x10
>> > > > [19527.937722]  ret_from_fork_asm+0x1a/0x30
>> > > > [19527.937725]  </TASK>
>> > > > [19527.937725] irq event stamp: 1270
>> > > > [19527.937726] hardirqs last  enabled at (1269): [<ffffffff830b2b1c>] run_ksoftirqd+0x4c/0x60
>> > > > [19527.937727] hardirqs last disabled at (1270): [<ffffffff83d90d53>] __schedule+0x853/0x1080
>> > > > [19527.937729] softirqs last  enabled at (1268): [<ffffffff830b29d6>] handle_softirqs+0x346/0x430
>> > > > [19527.937730] softirqs last disabled at (1215): [<ffffffff830b2b17>] run_ksoftirqd+0x47/0x60
>> > > > [19527.937731] ---[ end trace 0000000000000000 ]---
>> > > > 
>> > > > Link: https://lore.kernel.org/xenomai/80a48c1ddc3721eb8022b77548028aa8e7371928.camel@siemens.com/T/
>> > > > Signed-off-by: Florian Bezdeka <florian.bezdeka@siemens.com>
>> > > > ---
>> > > >  arch/x86/kernel/fpu/core.c | 3 +++
>> > > >  1 file changed, 3 insertions(+)
>> > > > 
>> > > > diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
>> > > > index 7b87a04c0b6e..949f89bb5cdc 100644
>> > > > --- a/arch/x86/kernel/fpu/core.c
>> > > > +++ b/arch/x86/kernel/fpu/core.c
>> > > > @@ -1046,6 +1046,9 @@ void fpu__suspend_inband(void)
>> > > >  	struct fpu *kfpu = this_cpu_read(in_kernel_fpstate);
>> > > >  	struct task_struct *tsk = current;
>> > > >  
>> > > > +	if (tsk->flags & PF_KTHREAD)
>> > > > +		return;
>> > > > +
>> > > >  	/*
>> > > >  	 * If kernel_fpu_allowed is false, we are dealing with the
>> > > >  	 * preemption of an inband kernel context currently using the
>> > > 
>> > > If an in-band kernel thread is preempted by an oob task while running
>> > > inside a kernel_fpu_begin_mask()<->kernel_fpu_end() section, the change
>> > > above would prevent the fpu hw state from being saved
>> > > appropriately. fpu__suspend_inband() and fpu__resume_inband() should not
>> > > be aligned in this respect, the log is misleading for that reason.
>> > 
>> > Let me add some more lines below:
>> > 
>> > 	if (!this_cpu_read(kernel_fpu_allowed)) {
>> > 		save_fpregs_to_fpstate(kfpu);
>> > 		__cpu_invalidate_fpregs_state();
>> > 		oob_fpu_set_preempt(x86_task_fpu(tsk));
>> > 	}
>> > 
>> > x86_task_fpu(tsk) will deliver NULL in case CONFIG_X86_DEBUG_FPU is
>> > enabled and (tsk->flags & PF_KTHREAD) is true, so we end up in
>> > dereferencing a NULL pointer.
>> > 
>> > I think you took care of that during the forward port to 6.16 in
>> > fpu__resume_inband() the same way. There is simply no struct fpu
>> > reference where we could set the preempted flag.
>> 
>> Yeah, but that early exit in fpu__resume_inband() looks wrong now. This
>> does silence the warning, but also prevents from reloading the fpu hw
>> state from the per-cpu in_kernel_fpstate buffer which is designed to
>> hold the fpu context in the preemption case mentioned above.
>> 
>> > 
>> > Are kthreads really allowed to use the FPU? Could not find any
>> > limitation so far, so we may have to modify x86_task_fpu(tsk) instead?
>> 
>> in-band kthreads may enter a fpu-enabled section temporarily by calling
>> kernel_fpu_begin_mask() (RAID support used to do so at some point for
>> SSE2 IIRC). The explicit check there not to save the hw state when a kthread
>> is running confirms this. As a result, an oob task may preempt such
>> context.
>> 
>> I believe that x86_task_fpu() should not be called in either routines in
>> the first place. The preemption marker should not live in struct fpu.
>
> Where is a better location for it? struct task_struct? I think we have
> it in struct fpu for arm64 as well, so in case we decide to move it we
> should move it for both archs.
>

thread_info would make sense, we have flags there already. Yes, we need
this for all supported archs, since all of them properly handles
preemption of a kernel-fpu section by an oob task, since you fixed that
for arm64 and ARM some time ago.

-- 
Philippe.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2025-08-20 16:56 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-19 22:53 [PATCH Dovetail 6.16 0/2] Dovetail 6.16: Fix some regressions introduced during last forward port Florian Bezdeka
2025-08-19 22:53 ` [PATCH Dovetail 6.16 1/2] x86/fpu: dovetail: Fix FPU corruption Florian Bezdeka
2025-08-20 10:57   ` Philippe Gerum
2025-08-20 13:00     ` Florian Bezdeka
2025-08-20 13:55       ` Philippe Gerum
2025-08-20 16:45         ` Florian Bezdeka
2025-08-20 16:56           ` Philippe Gerum
2025-08-20 14:21   ` Philippe Gerum
2025-08-19 22:53 ` [PATCH Dovetail 6.16 2/2] irq_pipeline: Fix pipelining code for level triggered IRQs Florian Bezdeka

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.