* x86/amd late microcode thread loading slows down boot
@ 2024-11-06 17:20 Thomas De Schampheleire
2024-11-06 18:50 ` Andrew Cooper
` (3 more replies)
0 siblings, 4 replies; 21+ messages in thread
From: Thomas De Schampheleire @ 2024-11-06 17:20 UTC (permalink / raw)
To: bp; +Cc: linux-kernel, x86
Hi Borislav, all,
I am encountering varying delays in the boot process, bisected to commit:
commit a32b0f0db3f396f1c9be2fe621e77c09ec3d8e7d
Author: Borislav Petkov (AMD) <bp@alien8.de>
Date: 2023-05-02 19:53:50 +0200
x86/microcode/AMD: Load late on both threads too
Do the same as early loading - load on both threads.
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/20230605141332.25948-1-bp@alien8.de
The problem shows up as the initramfs taking unexpectedly long to unpack, as
shown by the timestamp of the 'Freeing initrd memory' message (and confirmed
by additional traces).
Normally, that message appears at about 12 seconds, with only 1-2 seconds of
variation across boots. With the mentioned patch applied, the variation
increases. Most boots see no impact; on some boots the time increases by a
few to tens of seconds, and in extreme cases even by several minutes (!).
In such cases, the hung task daemon will report a kworker to be hung and panic:
[ 246.812329] INFO: task kworker/u34:0:195 blocked for more than 122 seconds.
[ 246.820106] Not tainted 6.6.52 #1
[ 246.824391] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.833134] task:kworker/u34:0 state:D stack:0 pid:195 ppid:113 flags:0x00004000
[ 246.842462] Call Trace:
[ 246.845194] <TASK>
[ 246.847539] __schedule+0x262/0x770
[ 246.851437] ? prepare_kernel_cred+0x28/0x1c0
[ 246.856305] schedule+0x61/0xe0
[ 246.859814] async_synchronize_cookie_domain+0xe8/0x130
[ 246.865649] ? __pfx_autoremove_wake_function+0x10/0x10
[ 246.871485] call_usermodehelper_exec_async+0xc0/0x190
[ 246.877225] ? __pfx_call_usermodehelper_exec_async+0x10/0x10
[ 246.883641] ret_from_fork+0x34/0x50
[ 246.887634] ? __pfx_call_usermodehelper_exec_async+0x10/0x10
[ 246.894051] ret_from_fork_asm+0x1b/0x30
[ 246.898435] </TASK>
[ 246.900876] Kernel panic - not syncing: hung_task: blocked tasks
[ 246.907582] CPU: 8 PID: 131 Comm: khungtaskd Not tainted 6.6.52 #1
I observe this problem on more than 10 systems, with the following parameters:
vendor_id : AuthenticAMD
cpu family : 23
model : 1
model name : AMD EPYC 3251 8-Core Processor
stepping : 2
or
vendor_id : AuthenticAMD
cpu family : 23
model : 1
model name : AMD EPYC 3255 8-Core Processor Industrial Temp
stepping : 2
The original microcode on these boards is 0x800126c.
The microcode I'm updating to is 0x800126e. I also tried 0x800126f but it didn't
make a difference.
During early boot, the microcode update has been done successfully. But, due to
the changes in commit a32b0f0db3f396f1c9be2fe621e77c09ec3d8e7d, an additional
'late' microcode update will be done, even though the reported version already
matches the expected version, to cover the other CPU thread. This reasoning is
described in the commit message of e7ad18d1169c62e6c78c01ff693fd362d9d65278 and
is also discussed in mail thread [1] related to the proposed 'chicken bit'. In
that mail thread, it is claimed that the added late loading does no harm on any
CPU newer than Bulldozer.
Yet, based on my observations, I think this statement may be incorrect.
Filtered dmesg output for an extreme case is at the end of this email (the hung
task daemon was disabled to avoid a panic).
Could you please check this?
Let me know if you need additional information or have changes you would like me
to test, preferably based on 6.6.x.
Thanks,
Thomas
[1] https://lore.kernel.org/all/20230605141332.25948-2-bp@alien8.de/
[ 0.000000] Linux version 6.4.0-rc1 (oe-user@oe-host) (x86_64-vendor-linux-gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.42.0.20240716) #1 SMP PREEMPT_DYNAMIC Tue Nov 5 09:56:34 UTC 2024
[ 0.000000] Command line: console=ttyS0,115200n8 quiet root=/dev/ram0 crashkernel=80M loglevel=8 tsc=reliable
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
[ 0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'compacted' format.
[ 0.000000] signal: max sigframe size: 1776
[ 0.000000] BIOS-provided physical RAM map:
[...]
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] efi: EFI v2.7 by Phoenix Technologies Ltd.
[ 0.000000] efi: ACPI=0x7e9fd000 ACPI 2.0=0x7e9fd014 TPMFinalLog=0x7e831000 SMBIOS=0x79d42000 SMBIOS 3.0=0x79d35000 MEMATTR=0x74572018 ESRT=0x741af000
[ 0.000000] efi: Not removing mem43: MMIO range=[0xfed80000-0xfed80fff] (4KB) from e820 map
[ 0.000000] SMBIOS 3.1.1 present.
[ 0.000000] tsc: Fast TSC calibration using PIT
[ 0.000000] tsc: Detected 2499.992 MHz processor
[ 0.000015] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
[ 0.000018] e820: remove [mem 0x000a0000-0x000fffff] usable
[ 0.000026] last_pfn = 0x47f380 max_arch_pfn = 0x400000000
[ 0.000033] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT
[ 0.000178] last_pfn = 0x7f800 max_arch_pfn = 0x400000000
[ 0.004544] esrt: Reserving ESRT space from 0x00000000741af000 to 0x00000000741af038.
[ 0.004549] e820: update [mem 0x741af000-0x741affff] usable ==> reserved
[ 0.004566] Using GB pages for direct mapping
[ 0.004932] Secure boot disabled
[ 0.004933] RAMDISK: [mem 0x2c173000-0x320b0fff]
[... ACPI ...]
[ 0.005129] No NUMA configuration found
[ 0.005130] Faking a node at [mem 0x0000000000000000-0x000000047f37ffff]
[ 0.005134] NODE_DATA(0) allocated [mem 0x47f37b000-0x47f37ffff]
[ 0.005142] Reserving 80MB of memory at 1696MB for crashkernel (System RAM: 12139MB)
[ 0.005163] Zone ranges:
[ 0.005164] DMA [mem 0x0000000000001000-0x0000000000ffffff]
[ 0.005166] DMA32 [mem 0x0000000001000000-0x00000000ffffffff]
[ 0.005167] Normal [mem 0x0000000100000000-0x000000047f37ffff]
[ 0.005169] Device empty
[ 0.005170] Movable zone start for each node
[ 0.005171] Early memory node ranges
[ 0.005171] node 0: [mem 0x0000000000001000-0x000000000009efff]
[ 0.005173] node 0: [mem 0x0000000000100000-0x0000000001d9ffff]
[ 0.005174] node 0: [mem 0x0000000001f21000-0x0000000076c33fff]
[ 0.005175] node 0: [mem 0x000000007e9fe000-0x000000007f7fffff]
[ 0.005176] node 0: [mem 0x0000000200000000-0x000000047f37ffff]
[ 0.005178] Initmem setup node 0 [mem 0x0000000000001000-0x000000047f37ffff]
[ 0.005183] On node 0, zone DMA: 1 pages in unavailable ranges
[ 0.005206] On node 0, zone DMA: 97 pages in unavailable ranges
[ 0.008926] On node 0, zone DMA32: 385 pages in unavailable ranges
[ 0.009265] On node 0, zone DMA32: 32202 pages in unavailable ranges
[ 0.009585] On node 0, zone Normal: 2048 pages in unavailable ranges
[ 0.009617] On node 0, zone Normal: 3200 pages in unavailable ranges
[ 0.010557] ACPI: PM-Timer IO Port: 0x408
[ 0.010567] ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
[ 0.010569] ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
[ 0.010570] ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
[ 0.010571] ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1])
[ 0.010572] ACPI: LAPIC_NMI (acpi_id[0x04] high edge lint[0x1])
[ 0.010573] ACPI: LAPIC_NMI (acpi_id[0x05] high edge lint[0x1])
[ 0.010574] ACPI: LAPIC_NMI (acpi_id[0x06] high edge lint[0x1])
[ 0.010574] ACPI: LAPIC_NMI (acpi_id[0x07] high edge lint[0x1])
[ 0.010575] ACPI: LAPIC_NMI (acpi_id[0x08] high edge lint[0x1])
[ 0.010576] ACPI: LAPIC_NMI (acpi_id[0x09] high edge lint[0x1])
[ 0.010577] ACPI: LAPIC_NMI (acpi_id[0x0a] high edge lint[0x1])
[ 0.010578] ACPI: LAPIC_NMI (acpi_id[0x0b] high edge lint[0x1])
[ 0.010578] ACPI: LAPIC_NMI (acpi_id[0x0c] high edge lint[0x1])
[ 0.010579] ACPI: LAPIC_NMI (acpi_id[0x0d] high edge lint[0x1])
[ 0.010580] ACPI: LAPIC_NMI (acpi_id[0x0e] high edge lint[0x1])
[ 0.010581] ACPI: LAPIC_NMI (acpi_id[0x0f] high edge lint[0x1])
[ 0.010609] IOAPIC[0]: apic_id 32, version 33, address 0xfec00000, GSI 0-23
[ 0.010614] IOAPIC[1]: apic_id 33, version 33, address 0xfec01000, GSI 24-55
[ 0.010616] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[ 0.010618] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
[ 0.010622] ACPI: Using ACPI (MADT) for SMP configuration information
[ 0.010623] ACPI: HPET id: 0x43538210 base: 0xfed00000
[ 0.010628] smpboot: Allowing 16 CPUs, 0 hotplug CPUs
[ 0.010645] [mem 0x80000000-0xf7ffffff] available for PCI devices
[ 0.010649] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1910969940391419 ns
[ 0.015416] setup_percpu: NR_CPUS:64 nr_cpumask_bits:16 nr_cpu_ids:16 nr_node_ids:1
[ 0.016208] percpu: Embedded 59 pages/cpu s204800 r8192 d28672 u262144
[ 0.016214] pcpu-alloc: s204800 r8192 d28672 u262144 alloc=1*2097152
[ 0.016217] pcpu-alloc: [0] 00 01 02 03 04 05 06 07 [0] 08 09 10 11 12 13 14 15
[ 0.016238] Kernel command line: console=ttyS0,115200n8 quiet root=/dev/ram0 crashkernel=80M loglevel=8 tsc=reliable
[ 0.016392] random: crng init done
[ 0.018490] Dentry cache hash table entries: 2097152 (order: 12, 16777216 bytes, linear)
[ 0.019603] Inode-cache hash table entries: 1048576 (order: 11, 8388608 bytes, linear)
[ 0.019710] Fallback order for Node 0: 0
[ 0.019718] Built 1 zonelists, mobility grouping on. Total pages: 3059076
[ 0.019719] Policy zone: Normal
[ 0.019721] mem auto-init: stack:off, heap alloc:off, heap free:off
[ 0.019792] software IO TLB: area num 16.
[ 0.047969] Memory: 1823812K/12431180K available (14336K kernel code, 5006K rwdata, 5164K rodata, 2692K init, 4516K bss, 528072K reserved, 0K cma-reserved)
[ 0.048108] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=16, Nodes=1
[ 0.048135] ftrace: allocating 38800 entries in 152 pages
[ 0.054657] ftrace: allocated 152 pages with 3 groups
[ 0.054720] Dynamic Preempt: full
[ 0.054768] rcu: Preemptible hierarchical RCU implementation.
[ 0.054768] rcu: RCU restricting CPUs from NR_CPUS=64 to nr_cpu_ids=16.
[ 0.054769] Trampoline variant of Tasks RCU enabled.
[ 0.054770] Rude variant of Tasks RCU enabled.
[ 0.054770] Tracing variant of Tasks RCU enabled.
[ 0.054771] rcu: RCU calculated value of scheduler-enlistment delay is 100 jiffies.
[ 0.054772] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=16
[ 0.057563] NR_IRQS: 4352, nr_irqs: 1096, preallocated irqs: 16
[ 0.057743] rcu: srcu_init: Setting srcu_struct sizes based on contention.
[ 0.057852] Console: colour dummy device 80x25
[ 0.057875] printk: console [ttyS0] enabled
[ 1.360910] ACPI: Core revision 20230331
[ 1.365444] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 133484873504 ns
[ 1.375671] APIC: Switch to symmetric I/O mode setup
[ 1.382289] AMD-Vi: ivrs, add hid:PNPD0040, uid:, rdevid:152
[ 1.388621] AMD-Vi: Using global IVHD EFR:0xf77ef22294ada, EFR2:0x0
[ 1.396087] Switched APIC routing to physical flat.
[ 1.402932] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[ 1.413672] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x24093255d7c, max_idle_ns: 440795319144 ns
[ 1.425426] Calibrating delay loop (skipped), value calculated using timer frequency.. 4999.98 BogoMIPS (lpj=2499992)
[ 1.426424] pid_max: default: 32768 minimum: 301
[ 1.429452] LSM: initializing lsm=capability,yama,integrity
[ 1.430424] Yama: becoming mindful.
[ 1.431470] Mount-cache hash table entries: 32768 (order: 6, 262144 bytes, linear)
[ 1.432450] Mountpoint-cache hash table entries: 32768 (order: 6, 262144 bytes, linear)
[ 1.433674] LVT offset 2 assigned for vector 0xf4
[ 1.434438] process: using mwait in idle threads
[ 1.435425] Last level iTLB entries: 4KB 1024, 2MB 1024, 4MB 512
[ 1.436423] Last level dTLB entries: 4KB 1536, 2MB 1536, 4MB 768, 1GB 0
[...]
[ 1.451511] Freeing SMP alternatives memory: 32K
[ 1.555512] smpboot: CPU0: AMD EPYC 3255 8-Core Processor Industrial Temp (family: 0x17, model: 0x1, stepping: 0x2)
[ 1.556576] cblist_init_generic: Setting adjustable number of callback queues.
[ 1.557423] cblist_init_generic: Setting shift to 4 and lim to 1.
[ 1.558441] cblist_init_generic: Setting shift to 4 and lim to 1.
[ 1.559442] cblist_init_generic: Setting shift to 4 and lim to 1.
[ 1.560439] Performance Events: Fam17h+ core perfctr, AMD PMU driver.
[ 1.561424] ... version: 0
[ 1.562423] ... bit width: 48
[ 1.563423] ... generic registers: 6
[ 1.564425] ... value mask: 0000ffffffffffff
[ 1.565423] ... max period: 00007fffffffffff
[ 1.566423] ... fixed-purpose events: 0
[ 1.567424] ... event mask: 000000000000003f
[ 1.568519] rcu: Hierarchical SRCU implementation.
[ 1.569423] rcu: Max phase no-delay instances is 400.
[ 1.570681] NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
[ 1.571565] smp: Bringing up secondary CPUs ...
[ 1.572530] x86: Booting SMP configuration:
[ 1.573428] .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11 #12 #13 #14 #15
[ 1.581533] smp: Brought up 1 node, 16 CPUs
[ 1.583424] smpboot: Max logical packages: 1
[ 1.584423] smpboot: Total of 16 processors activated (79999.74 BogoMIPS)
[ 1.610760] node 0 deferred pages initialised in 23ms
[ 1.616606] devtmpfs: initialized
[ 1.617480] x86/mm: Memory block size: 128MB
[ 1.622733] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
[ 1.633428] futex hash table entries: 4096 (order: 6, 262144 bytes, linear)
[ 1.641527] pinctrl core: initialized pinctrl subsystem
[ 1.647980] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[ 1.654543] DMA: preallocated 2048 KiB GFP_KERNEL pool for atomic allocations
[ 1.662428] DMA: preallocated 2048 KiB GFP_KERNEL|GFP_DMA pool for atomic allocations
[ 1.671427] DMA: preallocated 2048 KiB GFP_KERNEL|GFP_DMA32 pool for atomic allocations
[ 1.680477] thermal_sys: Registered thermal governor 'fair_share'
[ 1.680477] thermal_sys: Registered thermal governor 'step_wise'
[ 1.687424] thermal_sys: Registered thermal governor 'user_space'
[ 1.694451] cpuidle: using governor menu
[ 1.705504] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
[ 1.712494] PCI: MMCONFIG for domain 0000 [bus 00-3f] at [mem 0xf8000000-0xfbffffff] (base 0xf8000000)
[ 1.723426] PCI: MMCONFIG at [mem 0xf8000000-0xfbffffff] reserved as E820 entry
[ 1.731437] PCI: Using configuration type 1 for base access
[ 1.738464] kprobes: kprobe jump-optimization is enabled. All kprobes are optimized if possible.
[ 1.747451] HugeTLB: registered 1.00 GiB page size, pre-allocated 0 pages
[ 1.755424] HugeTLB: 16380 KiB vmemmap can be freed for a 1.00 GiB page
[ 1.762424] HugeTLB: registered 2.00 MiB page size, pre-allocated 0 pages
[ 1.770424] HugeTLB: 28 KiB vmemmap can be freed for a 2.00 MiB page
[ 1.777469] cryptd: max_cpu_qlen set to 1000
[...]
[ 4.453544] Trying to unpack rootfs image as initramfs...
[...]
[ 5.597938] microcode: microcode updated early to new patch_level=0x0800126e
[ 5.605837] microcode: CPU0: patch_level=0x0800126e
[ 5.605837] microcode: CPU1: patch_level=0x0800126e
[ 5.605839] microcode: CPU2: patch_level=0x0800126e
[ 5.605839] microcode: CPU3: patch_level=0x0800126e
[ 5.605840] microcode: CPU5: patch_level=0x0800126e
[ 5.605840] microcode: CPU4: patch_level=0x0800126e
[ 5.605843] microcode: CPU7: patch_level=0x0800126e
[ 5.605843] microcode: CPU6: patch_level=0x0800126e
[ 5.605846] microcode: CPU8: patch_level=0x0800126e
[ 5.605846] microcode: CPU9: patch_level=0x0800126e
[ 5.605848] microcode: CPU10: patch_level=0x0800126e
[ 5.605848] microcode: CPU12: patch_level=0x0800126e
[ 5.605848] microcode: CPU11: patch_level=0x0800126e
[ 5.605849] microcode: CPU13: patch_level=0x0800126e
[ 5.605851] microcode: CPU14: patch_level=0x0800126e
[ 5.605855] microcode: CPU15: patch_level=0x0800126e
[ 5.606797] microcode: CPU14: new patch_level=0x0800126e
[ 5.606802] microcode: CPU2: new patch_level=0x0800126e
[ 5.606801] microcode: CPU0: new patch_level=0x0800126e
[ 5.606839] microcode: CPU11: new patch_level=0x0800126e
[ 5.606966] microcode: CPU5: new patch_level=0x0800126e
[ 5.606968] microcode: CPU4: new patch_level=0x0800126e
[ 5.606748] microcode: CPU7: new patch_level=0x0800126e
[ 5.606983] microcode: CPU6: new patch_level=0x0800126e
[ 5.607039] microcode: CPU12: new patch_level=0x0800126e
[ 5.607041] microcode: CPU13: new patch_level=0x0800126e
[ 5.607041] microcode: CPU15: new patch_level=0x0800126e
[ 5.606834] microcode: CPU9: new patch_level=0x0800126e
[ 5.607086] microcode: CPU8: new patch_level=0x0800126e
[ 5.607090] microcode: CPU10: new patch_level=0x0800126e
[ 5.617884] microcode: CPU3: new patch_level=0x0800126e
[ 5.716623] usb 1-4: new high-speed USB device number 2 using xhci_hcd
[ 5.717926] microcode: CPU1: new patch_level=0x0800126e
[ 5.795625] tsc: Refined TSC clocksource calibration: 2499.995 MHz
[ 5.804743] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x240934df233, max_idle_ns: 440795202126 ns
[ 5.812239] microcode: Microcode Update Driver: v2.2.
[ 6.113951] clocksource: Switched to clocksource tsc
[ 6.114024] IPI shorthand broadcast: enabled
[ 6.129904] AVX2 version of gcm_enc/dec engaged.
[ 6.132935] usb 1-4: New USB device found, idVendor=0424, idProduct=2240, bcdDevice= 1.98
[ 6.144197] usb 1-4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 6.152168] usb 1-4: Product: Ultra Fast Media
[ 6.157227] usb 1-4: Manufacturer: Generic
[ 6.161803] usb 1-4: SerialNumber: 000000225001
[ 6.166884] AES CTR mode by8 optimization enabled
[ 6.167367] usb-storage 1-4:1.0: USB Mass Storage device detected
[ 6.179082] scsi host0: usb-storage 1-4:1.0
[ 6.210818] sched_clock: Marking stable (4857023098, 1353615596)->(6452406360, -241767666)
[ 6.220186] registered taskstats version 1
[ 7.203607] scsi 0:0:0:0: Direct-Access Generic Ultra HS-COMBO 1.98 PQ: 0 ANSI: 0
[ 7.213623] sd 0:0:0:0: [sda] 15542272 512-byte logical blocks: (7.96 GB/7.41 GiB)
[ 7.222724] sd 0:0:0:0: [sda] Write Protect is off
[ 7.228077] sd 0:0:0:0: [sda] Mode Sense: 23 00 00 00
[ 7.234350] sd 0:0:0:0: [sda] No Caching mode page found
[ 7.240283] sd 0:0:0:0: [sda] Assuming drive cache: write through
[ 7.254292] sda: sda1 sda2 sda3
[ 7.258027] sd 0:0:0:0: [sda] Attached SCSI removable disk
[ 233.542760] Freeing initrd memory: 97528K
[ 233.549574] Key type trusted registered
[ 233.556012] Key type encrypted registered
[ 233.560499] ima: Allocated hash algorithm: sha1
[ 233.580213] ima: No architecture policies found
[ 233.585287] evm: Initialising EVM extended attributes:
[ 233.591025] evm: security.selinux (disabled)
[ 233.595793] evm: security.SMACK64 (disabled)
[ 233.600561] evm: security.SMACK64EXEC (disabled)
[ 233.605716] evm: security.SMACK64TRANSMUTE (disabled)
[ 233.611357] evm: security.SMACK64MMAP (disabled)
[ 233.616512] evm: security.apparmor (disabled)
[ 233.621377] evm: security.ima
[ 233.624691] evm: security.capability
[ 233.628683] evm: HMAC attrs: 0x1
[ 233.717708] RAS: Correctable Errors collector initialized.
[ 233.723891] clk: Disabling unused clocks
[ 233.729671] Freeing unused decrypted memory: 2044K
[ 233.735506] Freeing unused kernel image (initmem) memory: 2692K
[ 233.743617] Write protecting the kernel read-only data: 20480k
[ 233.750415] Freeing unused kernel image (rodata/data gap) memory: 980K
[ 233.757709] Run /init as init process
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: x86/amd late microcode thread loading slows down boot
2024-11-06 17:20 x86/amd late microcode thread loading slows down boot Thomas De Schampheleire
@ 2024-11-06 18:50 ` Andrew Cooper
2024-11-07 15:30 ` Borislav Petkov
2024-11-19 11:21 ` [PATCH 1/2] x86/mm: Carve out INVLPG inline asm for use by others Borislav Petkov
` (2 subsequent siblings)
3 siblings, 1 reply; 21+ messages in thread
From: Andrew Cooper @ 2024-11-06 18:50 UTC (permalink / raw)
To: thomas.de_schampheleire; +Cc: bp, linux-kernel, x86
> Hi Borislav, all,
>
> I am encountering varying delays in the boot process
I recognise those symptoms.
https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=f19a199281a23725beb73bef61eb8964d8e225ce
We found it in Xen in 2019, and were instructed by AMD to insert a TLB
flush immediately after patchloading.
Looking at Linux, there's no such workaround.
It turns out that a side effect of patchloading on some CPUs leaves the
mapping of the blob in the TLB (at whatever granularity) as fully UC.
When this happens to be a 2M/1G superpage, it causes a whole lot of perf
problems in unrelated areas.
I suggest Linux take the same approach.
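To illustrate why the mapping granularity matters, here is a minimal user-space
sketch (not kernel code) that computes the virtual address range covered by a
single TLB entry for a given mapping size. The helper name tlb_entry_range(),
the struct va_range type and the SZ_* constants are assumptions made up for this
example; they do not come from Xen or Linux.

```c
#include <assert.h>
#include <stdint.h>

/* Mapping granularities a TLB entry can have on x86-64. */
#define SZ_4K (1ULL << 12)
#define SZ_2M (1ULL << 21)
#define SZ_1G (1ULL << 30)

struct va_range {
	uint64_t start;	/* first byte covered by the TLB entry */
	uint64_t end;	/* last byte covered by the TLB entry (inclusive) */
};

/*
 * Hypothetical helper: return the virtual address range covered by the
 * TLB entry that maps 'addr', given the mapping size 'granule'.
 */
static struct va_range tlb_entry_range(uint64_t addr, uint64_t granule)
{
	struct va_range r;

	/* The granule must be a non-zero power of two. */
	assert(granule && (granule & (granule - 1)) == 0);

	r.start = addr & ~(granule - 1);
	r.end = r.start + granule - 1;
	return r;
}
```

If the blob sits in a 4K mapping, only that one page would be affected. With a
2M or 1G mapping, the same computation yields a range that can include large
amounts of unrelated data, which matches the "perf problems in unrelated areas"
described above.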
~Andrew
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: x86/amd late microcode thread loading slows down boot
2024-11-06 18:50 ` Andrew Cooper
@ 2024-11-07 15:30 ` Borislav Petkov
2024-11-07 20:58 ` Thomas De Schampheleire
0 siblings, 1 reply; 21+ messages in thread
From: Borislav Petkov @ 2024-11-07 15:30 UTC (permalink / raw)
To: Andrew Cooper; +Cc: thomas.de_schampheleire, linux-kernel, x86
On Wed, Nov 06, 2024 at 06:50:12PM +0000, Andrew Cooper wrote:
> I recognise those symptoms.
>
> https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=f19a199281a23725beb73bef61eb8964d8e225ce
>
> We found it in Xen in 2019, and were instructed by AMD to insert a TLB
> flush immediately after patchloading.
Thanks for saving me some wild goose chasing...
Thomas, do you want a diff to try?
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: x86/amd late microcode thread loading slows down boot
2024-11-07 15:30 ` Borislav Petkov
@ 2024-11-07 20:58 ` Thomas De Schampheleire
2024-11-14 9:56 ` Borislav Petkov
0 siblings, 1 reply; 21+ messages in thread
From: Thomas De Schampheleire @ 2024-11-07 20:58 UTC (permalink / raw)
To: Borislav Petkov; +Cc: Andrew Cooper, linux-kernel, x86
On Thu, Nov 07, 2024 at 04:30:36PM +0100, Borislav Petkov wrote:
>
> Thanks for saving me some wild goose chasing...
>
> Thomas, do you want a diff to try?
>
Thanks Andrew for the suggestion.
I tested with a call to 'flush_tlb_all()' right after native_rdmsr(), inside
__apply_microcode_amd(). This effectively fixes the problem.
Boris, perhaps you can propose a more fine-tuned flushing? I'd be happy to try
that.
Thanks,
Thomas
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: x86/amd late microcode thread loading slows down boot
2024-11-07 20:58 ` Thomas De Schampheleire
@ 2024-11-14 9:56 ` Borislav Petkov
2024-11-14 12:03 ` Andrew Cooper
2024-11-14 20:01 ` Thomas De Schampheleire
0 siblings, 2 replies; 21+ messages in thread
From: Borislav Petkov @ 2024-11-14 9:56 UTC (permalink / raw)
To: Thomas De Schampheleire; +Cc: Andrew Cooper, linux-kernel, x86
On Thu, Nov 07, 2024 at 09:58:12PM +0100, Thomas De Schampheleire wrote:
> Boris, perhaps you can propose a more fine-tuned flushing? I'd be happy to try
> that.
Let's see if that does the deal too.
Thx.
---
diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
index 580636cdc257..4d3c9d00d6b6 100644
--- a/arch/x86/include/asm/tlb.h
+++ b/arch/x86/include/asm/tlb.h
@@ -34,4 +34,8 @@ static inline void __tlb_remove_table(void *table)
free_page_and_swap_cache(table);
}
+static inline void invlpg(unsigned long addr)
+{
+ asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
+}
#endif /* _ASM_X86_TLB_H */
diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
index 31a73715d755..6a73f775ce4c 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -34,6 +34,7 @@
#include <asm/setup.h>
#include <asm/cpu.h>
#include <asm/msr.h>
+#include <asm/tlb.h>
#include "internal.h"
@@ -489,6 +490,9 @@ static int __apply_microcode_amd(struct microcode_amd *mc)
native_wrmsrl(MSR_AMD64_PATCH_LOADER, (u64)(long)&mc->hdr.data_code);
+ if (x86_family(bsp_cpuid_1_eax) == 0x17)
+ invlpg((u64)(long)&mc->hdr.data_code);
+
/* verify patch application was successful */
native_rdmsr(MSR_AMD64_PATCH_LEVEL, rev, dummy);
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 86593d1b787d..b0678d59ebdb 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -20,6 +20,7 @@
#include <asm/cacheflush.h>
#include <asm/apic.h>
#include <asm/perf_event.h>
+#include <asm/tlb.h>
#include "mm_internal.h"
@@ -1140,7 +1141,7 @@ STATIC_NOPV void native_flush_tlb_one_user(unsigned long addr)
bool cpu_pcide;
/* Flush 'addr' from the kernel PCID: */
- asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
+ invlpg(addr);
/* If PTI is off there is no user PCID and nothing to flush. */
if (!static_cpu_has(X86_FEATURE_PTI))
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: x86/amd late microcode thread loading slows down boot
2024-11-14 9:56 ` Borislav Petkov
@ 2024-11-14 12:03 ` Andrew Cooper
2024-11-15 20:51 ` Borislav Petkov
2024-11-18 19:16 ` Borislav Petkov
2024-11-14 20:01 ` Thomas De Schampheleire
1 sibling, 2 replies; 21+ messages in thread
From: Andrew Cooper @ 2024-11-14 12:03 UTC (permalink / raw)
To: Borislav Petkov, Thomas De Schampheleire; +Cc: linux-kernel, x86
On 14/11/2024 9:56 am, Borislav Petkov wrote:
> On Thu, Nov 07, 2024 at 09:58:12PM +0100, Thomas De Schampheleire wrote:
>> Boris, perhaps you can propose a more fine-tuned flushing? I'd be happy to try
>> that.
> Let's see if that does the deal too.
>
> Thx.
>
> ---
> diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
> index 580636cdc257..4d3c9d00d6b6 100644
> --- a/arch/x86/include/asm/tlb.h
> +++ b/arch/x86/include/asm/tlb.h
> @@ -34,4 +34,8 @@ static inline void __tlb_remove_table(void *table)
> free_page_and_swap_cache(table);
> }
>
> +static inline void invlpg(unsigned long addr)
> +{
> + asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
"invlpg %0" :: "m" (*(char *)addr) : "memory"
The compiler can usually do a better job than forcing it into a plain
register.
> +}
> #endif /* _ASM_X86_TLB_H */
> diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
> index 31a73715d755..6a73f775ce4c 100644
> --- a/arch/x86/kernel/cpu/microcode/amd.c
> +++ b/arch/x86/kernel/cpu/microcode/amd.c
> @@ -34,6 +34,7 @@
> #include <asm/setup.h>
> #include <asm/cpu.h>
> #include <asm/msr.h>
> +#include <asm/tlb.h>
>
> #include "internal.h"
>
> @@ -489,6 +490,9 @@ static int __apply_microcode_amd(struct microcode_amd *mc)
>
> native_wrmsrl(MSR_AMD64_PATCH_LOADER, (u64)(long)&mc->hdr.data_code);
>
> + if (x86_family(bsp_cpuid_1_eax) == 0x17)
> + invlpg((u64)(long)&mc->hdr.data_code);
Ok, so it's Fam17h specific. That's good to know. Any formal statement
on the matter from AMD ?
However, these blobs are 3200 bytes long and come with a good chance of
crossing a page boundary. If you're invlpg'ing, you need to issue a
second one for the final byte of the image too.
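The page-boundary point can be sketched in plain user-space C. The helper below
lists every 4K page touched by a buffer, using the address of its final byte to
find the last page. The name pages_touched() and its signature are hypothetical
and exist only for this illustration; in the kernel, each returned page would be
a candidate for an invlpg.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/*
 * Hypothetical helper: store the base address of every 4K page touched by
 * the 'size' bytes starting at 'addr' into 'pages' (at most 'max' entries)
 * and return how many pages the buffer touches in total.
 */
static unsigned int pages_touched(uint64_t addr, uint64_t size,
				  uint64_t *pages, unsigned int max)
{
	uint64_t first, last, page;
	unsigned int n = 0;

	assert(size > 0);

	first = addr & PAGE_MASK;
	last = (addr + size - 1) & PAGE_MASK;	/* page of the final byte */

	for (page = first; page <= last; page += PAGE_SIZE) {
		if (n < max)
			pages[n] = page;
		n++;
	}
	return n;
}
```

For a 3200-byte blob, a start address of 0x1000 gives one page, while a start
address of 0x1800 gives two (0x1000 and 0x2000), so a single invlpg on the start
address would not cover the second case.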
~Andrew
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: x86/amd late microcode thread loading slows down boot
2024-11-14 12:03 ` Andrew Cooper
@ 2024-11-15 20:51 ` Borislav Petkov
2024-11-16 19:32 ` Andrew Cooper
` (2 more replies)
2024-11-18 19:16 ` Borislav Petkov
1 sibling, 3 replies; 21+ messages in thread
From: Borislav Petkov @ 2024-11-15 20:51 UTC (permalink / raw)
To: Andrew Cooper, Thomas De Schampheleire; +Cc: linux-kernel, x86
On Thu, Nov 14, 2024 at 12:03:41PM +0000, Andrew Cooper wrote:
> "invlpg %0" :: "m" (*(char *)addr) : "memory"
>
> The compiler can usually do a better job than forcing it into a plain
> register.
I guess. I'll do that in the final version as the invlpg carve out will be
a separate patch.
> Ok, so it's Fam17h specific. That's good to know. Any formal statement
> on the matter from AMD ?
You can use my commit message for now... I'm working on something more
formal although I have no idea yet what format that should have ...
> However, these blobs are 3200 bytes long and come with a good chance of
> crossing a page boundary. If you're invlpg'ing, you need to issue a
> second one for the final byte of the image too.
Right, see below. It works here, Thomas you could give it a try too.
---
diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
index 580636cdc257..4d3c9d00d6b6 100644
--- a/arch/x86/include/asm/tlb.h
+++ b/arch/x86/include/asm/tlb.h
@@ -34,4 +34,8 @@ static inline void __tlb_remove_table(void *table)
free_page_and_swap_cache(table);
}
+static inline void invlpg(unsigned long addr)
+{
+ asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
+}
#endif /* _ASM_X86_TLB_H */
diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
index c4991226c86b..fdd4f8ef3696 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -34,6 +34,7 @@
#include <asm/setup.h>
#include <asm/cpu.h>
#include <asm/msr.h>
+#include <asm/tlb.h>
#include "internal.h"
@@ -483,11 +484,23 @@ static void scan_containers(u8 *ucode, size_t size, struct cont_desc *desc)
}
}
-static int __apply_microcode_amd(struct microcode_amd *mc)
+static int __apply_microcode_amd(struct microcode_amd *mc, unsigned int psize)
{
+ unsigned long p_addr = (unsigned long)&mc->hdr.data_code;
u32 rev, dummy;
- native_wrmsrl(MSR_AMD64_PATCH_LOADER, (u64)(long)&mc->hdr.data_code);
+ native_wrmsrl(MSR_AMD64_PATCH_LOADER, p_addr);
+
+ if (x86_family(bsp_cpuid_1_eax) == 0x17) {
+ invlpg(p_addr);
+
+ /*
+ * Flush next page too if patch image is crossing a page
+ * boundary.
+ */
+ if (p_addr >> PAGE_SHIFT != (p_addr + psize) >> PAGE_SHIFT)
+ invlpg(p_addr + psize);
+ }
/* verify patch application was successful */
native_rdmsr(MSR_AMD64_PATCH_LEVEL, rev, dummy);
@@ -529,7 +542,7 @@ static bool early_apply_microcode(u32 old_rev, void *ucode, size_t size)
if (old_rev > mc->hdr.patch_id)
return ret;
- return !__apply_microcode_amd(mc);
+ return !__apply_microcode_amd(mc, desc.psize);
}
static bool get_builtin_microcode(struct cpio_data *cp)
@@ -748,7 +761,7 @@ void reload_ucode_amd(unsigned int cpu)
rdmsr(MSR_AMD64_PATCH_LEVEL, rev, dummy);
if (rev < mc->hdr.patch_id) {
- if (!__apply_microcode_amd(mc))
+ if (!__apply_microcode_amd(mc, p->size))
pr_info_once("reload revision: 0x%08x\n", mc->hdr.patch_id);
}
}
@@ -801,7 +814,7 @@ static enum ucode_state apply_microcode_amd(int cpu)
goto out;
}
- if (__apply_microcode_amd(mc_amd)) {
+ if (__apply_microcode_amd(mc_amd, p->size)) {
pr_err("CPU%d: update failed for patch_level=0x%08x\n",
cpu, mc_amd->hdr.patch_id);
return UCODE_ERROR;
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 86593d1b787d..b0678d59ebdb 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -20,6 +20,7 @@
#include <asm/cacheflush.h>
#include <asm/apic.h>
#include <asm/perf_event.h>
+#include <asm/tlb.h>
#include "mm_internal.h"
@@ -1140,7 +1141,7 @@ STATIC_NOPV void native_flush_tlb_one_user(unsigned long addr)
bool cpu_pcide;
/* Flush 'addr' from the kernel PCID: */
- asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
+ invlpg(addr);
/* If PTI is off there is no user PCID and nothing to flush. */
if (!static_cpu_has(X86_FEATURE_PTI))
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: x86/amd late microcode thread loading slows down boot
2024-11-15 20:51 ` Borislav Petkov
@ 2024-11-16 19:32 ` Andrew Cooper
2024-11-17 18:23 ` David Laight
2024-11-18 15:13 ` Thomas De Schampheleire
2 siblings, 0 replies; 21+ messages in thread
From: Andrew Cooper @ 2024-11-16 19:32 UTC (permalink / raw)
To: Borislav Petkov, Thomas De Schampheleire; +Cc: linux-kernel, x86
On 15/11/2024 8:51 pm, Borislav Petkov wrote:
> On Thu, Nov 14, 2024 at 12:03:41PM +0000, Andrew Cooper wrote:
>> Ok, so it's Fam17h specific. That's good to know. Any formal statement
>> on the matter from AMD ?
> You can use my commit message for now... I'm working on something more
> formal although I have no idea yet what format that should have ...
Best would be a new erratum in the revision guides.
Thanks,
~Andrew
^ permalink raw reply [flat|nested] 21+ messages in thread
* RE: x86/amd late microcode thread loading slows down boot
2024-11-15 20:51 ` Borislav Petkov
2024-11-16 19:32 ` Andrew Cooper
@ 2024-11-17 18:23 ` David Laight
2024-11-17 19:52 ` Borislav Petkov
2024-11-18 15:13 ` Thomas De Schampheleire
2 siblings, 1 reply; 21+ messages in thread
From: David Laight @ 2024-11-17 18:23 UTC (permalink / raw)
To: 'Borislav Petkov', Andrew Cooper, Thomas De Schampheleire
Cc: linux-kernel@vger.kernel.org, x86@kernel.org
...
> + /*
> + * Flush next page too if patch image is crossing a page
> + * boundary.
> + */
> + if (p_addr >> PAGE_SHIFT != (p_addr + psize) >> PAGE_SHIFT)
> + invlpg(p_addr + psize);
> + }
Shouldn't that be 'psize - 1' ?
David
^ permalink raw reply [flat|nested] 21+ messages in thread
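The 'psize - 1' question can be checked with a small user-space sketch. It assumes 4 KiB pages; spans_two_pages(), spans_two_pages_off_by_one() and SKETCH_PAGE_SHIFT are hypothetical names used only for illustration and are not kernel code. The last byte of an image of psize bytes starting at p_addr sits at p_addr + psize - 1, so comparing against p_addr + psize reports a crossing for an image that ends exactly on a page boundary.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* True if the bytes [p_addr, p_addr + psize) touch two pages; psize > 0. */
static bool spans_two_pages(uintptr_t p_addr, uint32_t psize)
{
	uintptr_t last = p_addr + psize - 1; /* address of the final byte */

	return (p_addr >> SKETCH_PAGE_SHIFT) != (last >> SKETCH_PAGE_SHIFT);
}

/* The variant being questioned: tests the first byte past the image. */
static bool spans_two_pages_off_by_one(uintptr_t p_addr, uint32_t psize)
{
	uintptr_t past = p_addr + psize;

	return (p_addr >> SKETCH_PAGE_SHIFT) != (past >> SKETCH_PAGE_SHIFT);
}
```

For a page-aligned p_addr of 0x1000 and a psize of 4096, the first helper reports no crossing while the second one does. In the patch that would most likely mean one unneeded flush of the following page rather than a missed one.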
* Re: x86/amd late microcode thread loading slows down boot
2024-11-15 20:51 ` Borislav Petkov
2024-11-16 19:32 ` Andrew Cooper
2024-11-17 18:23 ` David Laight
@ 2024-11-18 15:13 ` Thomas De Schampheleire
2024-11-18 15:28 ` Borislav Petkov
2 siblings, 1 reply; 21+ messages in thread
From: Thomas De Schampheleire @ 2024-11-18 15:13 UTC (permalink / raw)
To: Borislav Petkov; +Cc: Andrew Cooper, linux-kernel, x86
On Fri, Nov 15, 2024 at 09:51:14PM +0100, Borislav Petkov wrote:
>
> On Thu, Nov 14, 2024 at 12:03:41PM +0000, Andrew Cooper wrote:
[...]
> > However, these blobs are 3200 bytes long and come with a good chance of
> > crossing a page boundary. If you're invlpg'ing, you need to issue a
> > second one for the final byte of the image too.
>
> Right, see below. It works here, Thomas you could give it a try too.
Thanks, I tested this version successfully.
I hadn't included the 'size - 1' fix yet but I don't think this could influence
my test negatively.
Please go ahead with the final patch.
Will it be backported to the 6.6.x branch?
Thanks,
Thomas
^ permalink raw reply [flat|nested] 21+ messages in thread
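To put a rough number on the "good chance" of a 3200-byte blob crossing a page boundary, the sketch below counts the start offsets within a page for which the final byte of the image lands on the next page. It assumes 4 KiB pages and treats every byte offset as a possible start, which real, aligned buffers need not satisfy; crossing_offsets() and SKETCH_PAGE_SIZE are hypothetical names.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/*
 * Count the start offsets within one page for which an image of psize
 * bytes has its last byte on the following page.
 */
static unsigned int crossing_offsets(uint32_t psize)
{
	unsigned int off, count = 0;

	if (psize == 0)
		return 0; /* an empty image occupies no page at all */

	for (off = 0; off < SKETCH_PAGE_SIZE; off++) {
		/* off + psize - 1 is the page-relative offset of the last byte */
		if (off + psize - 1 >= SKETCH_PAGE_SIZE)
			count++;
	}
	return count;
}
```

Under those assumptions crossing_offsets(3200) counts 3199 of the 4096 possible offsets, roughly 78%, which is why a single INVLPG on the start address is not enough in general.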
* Re: x86/amd late microcode thread loading slows down boot
2024-11-18 15:13 ` Thomas De Schampheleire
@ 2024-11-18 15:28 ` Borislav Petkov
2024-11-18 15:58 ` Thomas De Schampheleire
2024-11-19 10:46 ` Thomas De Schampheleire
0 siblings, 2 replies; 21+ messages in thread
From: Borislav Petkov @ 2024-11-18 15:28 UTC (permalink / raw)
To: Thomas De Schampheleire; +Cc: Andrew Cooper, linux-kernel, x86
On Mon, Nov 18, 2024 at 04:13:52PM +0100, Thomas De Schampheleire wrote:
> Thanks, I tested this version successfully.
> I hadn't included the 'size - 1' fix yet but I don't think this could influence
> my test negatively.
Thanks for testing, want me to add your Reported-by and Tested-by tag?
> Please go ahead with the final patch.
> Will it be backported to the 6.6.x branch?
Sure, I'll Cc stable.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: x86/amd late microcode thread loading slows down boot
2024-11-18 15:28 ` Borislav Petkov
@ 2024-11-18 15:58 ` Thomas De Schampheleire
2024-11-19 10:46 ` Thomas De Schampheleire
1 sibling, 0 replies; 21+ messages in thread
From: Thomas De Schampheleire @ 2024-11-18 15:58 UTC (permalink / raw)
To: Borislav Petkov; +Cc: Andrew Cooper, linux-kernel, x86
On Mon, Nov 18, 2024 at 04:28:59PM +0100, Borislav Petkov wrote:
>
> Thanks for testing, want me to add your Reported-by and Tested-by tag?
Yes please:
Reported-by: Thomas De Schampheleire <thomas.de_schampheleire@nokia.com>
Tested-by: Thomas De Schampheleire <thomas.de_schampheleire@nokia.com>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: x86/amd late microcode thread loading slows down boot
2024-11-18 15:28 ` Borislav Petkov
2024-11-18 15:58 ` Thomas De Schampheleire
@ 2024-11-19 10:46 ` Thomas De Schampheleire
2024-11-19 11:17 ` Borislav Petkov
1 sibling, 1 reply; 21+ messages in thread
From: Thomas De Schampheleire @ 2024-11-19 10:46 UTC (permalink / raw)
To: Borislav Petkov; +Cc: Andrew Cooper, linux-kernel, x86
On Mon, Nov 18, 2024 at 04:28:59PM +0100, Borislav Petkov wrote:
> On Mon, Nov 18, 2024 at 04:13:52PM +0100, Thomas De Schampheleire wrote:
> > Please go ahead with the final patch.
> > Will it be backported to the 6.6.x branch?
>
> Sure, I'll Cc stable.
Note that neither 6.11.x nor 6.6.x has the global bsp_cpuid_1_eax yet, which
your patch currently relies on.
~Thomas
^ permalink raw reply [flat|nested] 21+ messages in thread
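For readers following the backport question: the patch uses bsp_cpuid_1_eax only to decide whether the boot CPU is family 0x17. A rough user-space sketch of that decode, following the CPUID leaf 1 EAX layout (base family in bits 11:8, extended family in bits 27:20, added only when the base family is 0xF), is shown below. sketch_x86_family() is a hypothetical name, and how a stable tree would obtain the signature without the cached global is deliberately left out.

```c
#include <stdint.h>

/* Decode the CPU family from a CPUID(1).EAX signature value. */
static unsigned int sketch_x86_family(uint32_t sig)
{
	unsigned int fam = (sig >> 8) & 0xf;	/* base family, bits 11:8 */

	if (fam == 0xf)
		fam += (sig >> 20) & 0xff;	/* extended family, bits 27:20 */

	return fam;
}
```

As an illustrative input (not a value taken from this thread), a signature of 0x00800F12 has base family 0xF and extended family 0x08, so it decodes to 0x17 and would take the flushing path.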
* Re: x86/amd late microcode thread loading slows down boot
2024-11-14 12:03 ` Andrew Cooper
2024-11-15 20:51 ` Borislav Petkov
@ 2024-11-18 19:16 ` Borislav Petkov
1 sibling, 0 replies; 21+ messages in thread
From: Borislav Petkov @ 2024-11-18 19:16 UTC (permalink / raw)
To: Andrew Cooper; +Cc: Thomas De Schampheleire, linux-kernel, x86
On Thu, Nov 14, 2024 at 12:03:41PM +0000, Andrew Cooper wrote:
> > +static inline void invlpg(unsigned long addr)
> > +{
> > + asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
>
> "invlpg %0" :: "m" (*(char *)addr) : "memory"
>
> The compiler can usually do a better job than forcing it into a plain
> register.
I think it is pretty smart and DTRT regardless.
The diff is only comments - insns are the same.
--- /tmp/before 2024-11-18 20:11:08.942464511 +0100
+++ /tmp/after 2024-11-18 20:10:37.722620293 +0100
@@ -3,27 +3,27 @@
movl %ebp, %esi # psize, psize
# arch/x86/kernel/cpu/microcode/amd.c:495: unsigned long p_addr_end = p_addr + psize - 1;
leaq -1(%rbx,%rsi), %rax #, p_addr_end
-# ./arch/x86/include/asm/tlb.h:39: asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
+# ./arch/x86/include/asm/tlb.h:39: asm volatile("invlpg %0" ::"m" (*(char *)addr) : "memory");
#APP
# 39 "./arch/x86/include/asm/tlb.h" 1
- invlpg (%rbx) # mc
+ invlpg (%rbx) # MEM[(char *)_1]
# 0 "" 2
# arch/x86/kernel/cpu/microcode/amd.c:503: if (p_addr >> PAGE_SHIFT != p_addr_end >> PAGE_SHIFT)
#NO_APP
- movq %rbx, %rcx # mc, tmp110
+ movq %rbx, %rcx # mc, tmp111
# arch/x86/kernel/cpu/microcode/amd.c:503: if (p_addr >> PAGE_SHIFT != p_addr_end >> PAGE_SHIFT)
- movq %rax, %rdx # p_addr_end, tmp111
+ movq %rax, %rdx # p_addr_end, tmp112
# arch/x86/kernel/cpu/microcode/amd.c:503: if (p_addr >> PAGE_SHIFT != p_addr_end >> PAGE_SHIFT)
- shrq $12, %rcx #, tmp110
+ shrq $12, %rcx #, tmp111
# arch/x86/kernel/cpu/microcode/amd.c:503: if (p_addr >> PAGE_SHIFT != p_addr_end >> PAGE_SHIFT)
- shrq $12, %rdx #, tmp111
+ shrq $12, %rdx #, tmp112
# arch/x86/kernel/cpu/microcode/amd.c:503: if (p_addr >> PAGE_SHIFT != p_addr_end >> PAGE_SHIFT)
- cmpq %rdx, %rcx # tmp111, tmp110
+ cmpq %rdx, %rcx # tmp112, tmp111
je .L5 #,
-# ./arch/x86/include/asm/tlb.h:39: asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
+# ./arch/x86/include/asm/tlb.h:39: asm volatile("invlpg %0" ::"m" (*(char *)addr) : "memory");
#APP
# 39 "./arch/x86/include/asm/tlb.h" 1
- invlpg (%rax) # p_addr_end
+ invlpg (%rax) # *addr.16_25
# 0 "" 2
# ./arch/x86/include/asm/tlb.h:40: }
#NO_APP
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 21+ messages in thread
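INVLPG is privileged, so the two operand styles compared above cannot be tried directly from user space. As a stand-in, the sketch below uses CLFLUSH, which also takes a memory operand but is unprivileged on x86-64. The function names are hypothetical, and on other architectures or compilers both helpers fall back to no-ops. The "r" form passes the address in a register and spells the dereference in the template; the "m" form lets the compiler pick the addressing mode for the pointed-to byte.

```c
#if defined(__x86_64__) && defined(__GNUC__)
/* Register constraint: address in a register, dereferenced in the template. */
static inline void flush_line_reg(const void *addr)
{
	__asm__ __volatile__("clflush (%0)" :: "r" (addr) : "memory");
}

/* Memory constraint: the compiler chooses the addressing mode for *addr. */
static inline void flush_line_mem(const void *addr)
{
	__asm__ __volatile__("clflush %0" :: "m" (*(const char *)addr) : "memory");
}
#else
/* Fallback for non-x86-64 builds: nothing to demonstrate, do nothing. */
static inline void flush_line_reg(const void *addr) { (void)addr; }
static inline void flush_line_mem(const void *addr) { (void)addr; }
#endif
```

Compiling both helpers with -S and comparing the output is one way to repeat the before/after comparison from the message above. When the address already lives in a register, the two forms would be expected to produce the same instruction.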
* Re: x86/amd late microcode thread loading slows down boot
2024-11-14 9:56 ` Borislav Petkov
2024-11-14 12:03 ` Andrew Cooper
@ 2024-11-14 20:01 ` Thomas De Schampheleire
1 sibling, 0 replies; 21+ messages in thread
From: Thomas De Schampheleire @ 2024-11-14 20:01 UTC (permalink / raw)
To: Borislav Petkov; +Cc: Andrew Cooper, linux-kernel, x86
On Thu, Nov 14, 2024 at 10:56:39AM +0100, Borislav Petkov wrote:
>
> On Thu, Nov 07, 2024 at 09:58:12PM +0100, Thomas De Schampheleire wrote:
> > Boris, perhaps you can propose a more fine-tuned flushing? I'd be happy to try
> > that.
>
> Let's see if that does the deal too.
Thanks, I tested your patch (ported to 6.6.x) and confirm it also fixes the
issue.
This test of course does not invalidate any of the comments from Andrew Cooper.
Best regards,
Thomas
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 1/2] x86/mm: Carve out INVLPG inline asm for use by others
2024-11-06 17:20 x86/amd late microcode thread loading slows down boot Thomas De Schampheleire
2024-11-06 18:50 ` Andrew Cooper
@ 2024-11-19 11:21 ` Borislav Petkov
2024-11-19 11:21 ` [PATCH 2/2] x86/microcode/AMD: Flush patch buffer mapping after application Borislav Petkov
2024-11-25 11:04 ` [tip: x86/urgent] " tip-bot2 for Borislav Petkov (AMD)
2024-11-25 11:04 ` [tip: x86/urgent] x86/mm: Carve out INVLPG inline asm for use by others tip-bot2 for Borislav Petkov (AMD)
3 siblings, 1 reply; 21+ messages in thread
From: Borislav Petkov @ 2024-11-19 11:21 UTC (permalink / raw)
To: X86 ML; +Cc: LKML, Thomas De Schampheleire, Andrew Cooper,
Borislav Petkov (AMD)
From: "Borislav Petkov (AMD)" <bp@alien8.de>
No functional changes.
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
---
arch/x86/include/asm/tlb.h | 4 ++++
arch/x86/mm/tlb.c | 3 ++-
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
index 580636cdc257..4d3c9d00d6b6 100644
--- a/arch/x86/include/asm/tlb.h
+++ b/arch/x86/include/asm/tlb.h
@@ -34,4 +34,8 @@ static inline void __tlb_remove_table(void *table)
free_page_and_swap_cache(table);
}
+static inline void invlpg(unsigned long addr)
+{
+ asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
+}
#endif /* _ASM_X86_TLB_H */
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 86593d1b787d..b0678d59ebdb 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -20,6 +20,7 @@
#include <asm/cacheflush.h>
#include <asm/apic.h>
#include <asm/perf_event.h>
+#include <asm/tlb.h>
#include "mm_internal.h"
@@ -1140,7 +1141,7 @@ STATIC_NOPV void native_flush_tlb_one_user(unsigned long addr)
bool cpu_pcide;
/* Flush 'addr' from the kernel PCID: */
- asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
+ invlpg(addr);
/* If PTI is off there is no user PCID and nothing to flush. */
if (!static_cpu_has(X86_FEATURE_PTI))
--
2.43.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 2/2] x86/microcode/AMD: Flush patch buffer mapping after application
2024-11-19 11:21 ` [PATCH 1/2] x86/mm: Carve out INVLPG inline asm for use by others Borislav Petkov
@ 2024-11-19 11:21 ` Borislav Petkov
0 siblings, 0 replies; 21+ messages in thread
From: Borislav Petkov @ 2024-11-19 11:21 UTC (permalink / raw)
To: X86 ML
Cc: LKML, Thomas De Schampheleire, Andrew Cooper,
Borislav Petkov (AMD), stable
From: "Borislav Petkov (AMD)" <bp@alien8.de>
Due to specific requirements while applying microcode patches on Zen1
and 2, the patch buffer mapping needs to be flushed from the TLB after
application. Do so.
If not, unnecessary and unnatural delays happen in the boot process.
Reported-by: Thomas De Schampheleire <thomas.de_schampheleire@nokia.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Thomas De Schampheleire <thomas.de_schampheleire@nokia.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/ZyulbYuvrkshfsd2@antipodes
---
arch/x86/kernel/cpu/microcode/amd.c | 25 ++++++++++++++++++++-----
1 file changed, 20 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
index 31a73715d755..fb5d0c67fbab 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -34,6 +34,7 @@
#include <asm/setup.h>
#include <asm/cpu.h>
#include <asm/msr.h>
+#include <asm/tlb.h>
#include "internal.h"
@@ -483,11 +484,25 @@ static void scan_containers(u8 *ucode, size_t size, struct cont_desc *desc)
}
}
-static int __apply_microcode_amd(struct microcode_amd *mc)
+static int __apply_microcode_amd(struct microcode_amd *mc, unsigned int psize)
{
+ unsigned long p_addr = (unsigned long)&mc->hdr.data_code;
u32 rev, dummy;
- native_wrmsrl(MSR_AMD64_PATCH_LOADER, (u64)(long)&mc->hdr.data_code);
+ native_wrmsrl(MSR_AMD64_PATCH_LOADER, p_addr);
+
+ if (x86_family(bsp_cpuid_1_eax) == 0x17) {
+ unsigned long p_addr_end = p_addr + psize - 1;
+
+ invlpg(p_addr);
+
+ /*
+ * Flush next page too if patch image is crossing a page
+ * boundary.
+ */
+ if (p_addr >> PAGE_SHIFT != p_addr_end >> PAGE_SHIFT)
+ invlpg(p_addr_end);
+ }
/* verify patch application was successful */
native_rdmsr(MSR_AMD64_PATCH_LEVEL, rev, dummy);
@@ -529,7 +544,7 @@ static bool early_apply_microcode(u32 old_rev, void *ucode, size_t size)
if (old_rev > mc->hdr.patch_id)
return ret;
- return !__apply_microcode_amd(mc);
+ return !__apply_microcode_amd(mc, desc.psize);
}
static bool get_builtin_microcode(struct cpio_data *cp)
@@ -745,7 +760,7 @@ void reload_ucode_amd(unsigned int cpu)
rdmsr(MSR_AMD64_PATCH_LEVEL, rev, dummy);
if (rev < mc->hdr.patch_id) {
- if (!__apply_microcode_amd(mc))
+ if (!__apply_microcode_amd(mc, p->size))
pr_info_once("reload revision: 0x%08x\n", mc->hdr.patch_id);
}
}
@@ -798,7 +813,7 @@ static enum ucode_state apply_microcode_amd(int cpu)
goto out;
}
- if (__apply_microcode_amd(mc_amd)) {
+ if (__apply_microcode_amd(mc_amd, p->size)) {
pr_err("CPU%d: update failed for patch_level=0x%08x\n",
cpu, mc_amd->hdr.patch_id);
return UCODE_ERROR;
--
2.43.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [tip: x86/urgent] x86/microcode/AMD: Flush patch buffer mapping after application
2024-11-06 17:20 x86/amd late microcode thread loading slows down boot Thomas De Schampheleire
2024-11-06 18:50 ` Andrew Cooper
2024-11-19 11:21 ` [PATCH 1/2] x86/mm: Carve out INVLPG inline asm for use by others Borislav Petkov
@ 2024-11-25 11:04 ` tip-bot2 for Borislav Petkov (AMD)
2024-11-25 11:04 ` [tip: x86/urgent] x86/mm: Carve out INVLPG inline asm for use by others tip-bot2 for Borislav Petkov (AMD)
3 siblings, 0 replies; 21+ messages in thread
From: tip-bot2 for Borislav Petkov (AMD) @ 2024-11-25 11:04 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas De Schampheleire, Borislav Petkov (AMD), stable, x86,
linux-kernel
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: c809b0d0e52d01c30066367b2952c4c4186b1047
Gitweb: https://git.kernel.org/tip/c809b0d0e52d01c30066367b2952c4c4186b1047
Author: Borislav Petkov (AMD) <bp@alien8.de>
AuthorDate: Tue, 19 Nov 2024 12:21:33 +01:00
Committer: Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Mon, 25 Nov 2024 11:43:21 +01:00
x86/microcode/AMD: Flush patch buffer mapping after application
Due to specific requirements while applying microcode patches on Zen1
and 2, the patch buffer mapping needs to be flushed from the TLB after
application. Do so.
If not, unnecessary and unnatural delays happen in the boot process.
Reported-by: Thomas De Schampheleire <thomas.de_schampheleire@nokia.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Thomas De Schampheleire <thomas.de_schampheleire@nokia.com>
Cc: <stable@kernel.org> # f1d84b59cbb9 ("x86/mm: Carve out INVLPG inline asm for use by others")
Link: https://lore.kernel.org/r/ZyulbYuvrkshfsd2@antipodes
---
arch/x86/kernel/cpu/microcode/amd.c | 25 ++++++++++++++++++++-----
1 file changed, 20 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
index 31a7371..fb5d0c6 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -34,6 +34,7 @@
#include <asm/setup.h>
#include <asm/cpu.h>
#include <asm/msr.h>
+#include <asm/tlb.h>
#include "internal.h"
@@ -483,11 +484,25 @@ static void scan_containers(u8 *ucode, size_t size, struct cont_desc *desc)
}
}
-static int __apply_microcode_amd(struct microcode_amd *mc)
+static int __apply_microcode_amd(struct microcode_amd *mc, unsigned int psize)
{
+ unsigned long p_addr = (unsigned long)&mc->hdr.data_code;
u32 rev, dummy;
- native_wrmsrl(MSR_AMD64_PATCH_LOADER, (u64)(long)&mc->hdr.data_code);
+ native_wrmsrl(MSR_AMD64_PATCH_LOADER, p_addr);
+
+ if (x86_family(bsp_cpuid_1_eax) == 0x17) {
+ unsigned long p_addr_end = p_addr + psize - 1;
+
+ invlpg(p_addr);
+
+ /*
+ * Flush next page too if patch image is crossing a page
+ * boundary.
+ */
+ if (p_addr >> PAGE_SHIFT != p_addr_end >> PAGE_SHIFT)
+ invlpg(p_addr_end);
+ }
/* verify patch application was successful */
native_rdmsr(MSR_AMD64_PATCH_LEVEL, rev, dummy);
@@ -529,7 +544,7 @@ static bool early_apply_microcode(u32 old_rev, void *ucode, size_t size)
if (old_rev > mc->hdr.patch_id)
return ret;
- return !__apply_microcode_amd(mc);
+ return !__apply_microcode_amd(mc, desc.psize);
}
static bool get_builtin_microcode(struct cpio_data *cp)
@@ -745,7 +760,7 @@ void reload_ucode_amd(unsigned int cpu)
rdmsr(MSR_AMD64_PATCH_LEVEL, rev, dummy);
if (rev < mc->hdr.patch_id) {
- if (!__apply_microcode_amd(mc))
+ if (!__apply_microcode_amd(mc, p->size))
pr_info_once("reload revision: 0x%08x\n", mc->hdr.patch_id);
}
}
@@ -798,7 +813,7 @@ static enum ucode_state apply_microcode_amd(int cpu)
goto out;
}
- if (__apply_microcode_amd(mc_amd)) {
+ if (__apply_microcode_amd(mc_amd, p->size)) {
pr_err("CPU%d: update failed for patch_level=0x%08x\n",
cpu, mc_amd->hdr.patch_id);
return UCODE_ERROR;
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [tip: x86/urgent] x86/mm: Carve out INVLPG inline asm for use by others
2024-11-06 17:20 x86/amd late microcode thread loading slows down boot Thomas De Schampheleire
` (2 preceding siblings ...)
2024-11-25 11:04 ` [tip: x86/urgent] " tip-bot2 for Borislav Petkov (AMD)
@ 2024-11-25 11:04 ` tip-bot2 for Borislav Petkov (AMD)
3 siblings, 0 replies; 21+ messages in thread
From: tip-bot2 for Borislav Petkov (AMD) @ 2024-11-25 11:04 UTC (permalink / raw)
To: linux-tip-commits; +Cc: Borislav Petkov (AMD), x86, linux-kernel
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: f1d84b59cbb9547c243d93991acf187fdbe9fbe9
Gitweb: https://git.kernel.org/tip/f1d84b59cbb9547c243d93991acf187fdbe9fbe9
Author: Borislav Petkov (AMD) <bp@alien8.de>
AuthorDate: Tue, 19 Nov 2024 12:21:32 +01:00
Committer: Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Mon, 25 Nov 2024 11:28:02 +01:00
x86/mm: Carve out INVLPG inline asm for use by others
No functional changes.
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/ZyulbYuvrkshfsd2@antipodes
---
arch/x86/include/asm/tlb.h | 4 ++++
arch/x86/mm/tlb.c | 3 ++-
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
index 580636c..4d3c9d0 100644
--- a/arch/x86/include/asm/tlb.h
+++ b/arch/x86/include/asm/tlb.h
@@ -34,4 +34,8 @@ static inline void __tlb_remove_table(void *table)
free_page_and_swap_cache(table);
}
+static inline void invlpg(unsigned long addr)
+{
+ asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
+}
#endif /* _ASM_X86_TLB_H */
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index b0d5a64..a2becb8 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -20,6 +20,7 @@
#include <asm/cacheflush.h>
#include <asm/apic.h>
#include <asm/perf_event.h>
+#include <asm/tlb.h>
#include "mm_internal.h"
@@ -1140,7 +1141,7 @@ STATIC_NOPV void native_flush_tlb_one_user(unsigned long addr)
bool cpu_pcide;
/* Flush 'addr' from the kernel PCID: */
- asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
+ invlpg(addr);
/* If PTI is off there is no user PCID and nothing to flush. */
if (!static_cpu_has(X86_FEATURE_PTI))
^ permalink raw reply related [flat|nested] 21+ messages in thread