* Linux 6.11-rc1
@ 2024-07-28 21:40 Linus Torvalds
2024-07-29 9:28 ` Build regressions/improvements in v6.11-rc1 Geert Uytterhoeven
2024-07-29 15:29 ` Linux 6.11-rc1 Guenter Roeck
0 siblings, 2 replies; 59+ messages in thread
From: Linus Torvalds @ 2024-07-28 21:40 UTC
To: Linux Kernel Mailing List
The merge window felt pretty normal, and the stats all look pretty
normal too. I was expecting things to be quieter because of summer
vacations, but that (still) doesn't actually seem to have been the
case.
There's 12k+ regular commits (and another 850 merge commits), so as
always the summary of this all is just my merge log. The diffstats are
also (once again) dominated by some big hardware descriptions (another
AMD GPU register dump accounts for ~45% of the lines in the diff, and
some more perf event JSON descriptor files account for another 5%).
But if you ignore those HW dumps, the diff too looks perfectly
regular: drivers account for a bit over half (even when not counting
the AMD register description noise). The rest is roughly one third
architecture updates (lots of it is dts files, so I guess I could have
lumped that in with "more hw descriptor tables"), one third tooling
and documentation, and one third "core kernel" (filesystems,
networking, VM and kernel). Very roughly.
If you want more details, you should get the git tree, and then narrow
things down based on interests.
Linus
---
Al Viro (1):
struct file leak fixes
Alex Williamson (1):
VFIO updates
Alexandre Belloni (2):
RTC updates
i3c updates
Andreas Gruenbacher (1):
gfs2 updates
Andreas Larsson (1):
sparc updates
Andrew Morton (3):
MM updates
non-MM updates
misc hotfixes
Anna Schumaker (1):
NFS client updates
Ard Biesheuvel (2):
EFI updates
EFI fixes
Arnd Bergmann (5):
SoC driver updates
SoC dt updates
SoC defconfig updates
arm SoC platform updates
asm-generic updates
Bartosz Golaszewski (4):
power sequencing updates
gpio updates
power sequencing fixes
gpio fix
Benjamin Tissoires (1):
HID updates
Bjorn Andersson (3):
hwspinlock updates
remoteproc updates
rpmsg updates
Bjorn Helgaas (1):
pci updates
Borislav Petkov (14):
EDAC updates
RAS updates
x86 alternatives updates
x86 boot updates
x86 cleanups
x86 confidential computing updates
x86 uaccess update
x86 build update
misc x86 updates
x86 vmware updates
x86 cpu mitigation updates
x86 cpu model updates
x86 resource control updates
x86 SEV updates
Casey Schaufler (1):
smack updates
Catalin Marinas (1):
arm64 updates
Chandan Babu (1):
xfs updates
Christian Brauner (13):
misc vfs updates
PG_error removal updates
vfs module description updates
vfs casefolding updates
vfs mount API updates
vfs inode / dentry updates
vfs mount query updates
namespace-fs updates
pidfs updates
iomap updates
vfs fixes x 3
Christoph Hellwig (2):
dma-mapping updates
dma-mapping fix
Chuck Lever (1):
nfsd updates
Corey Minyard (1):
IPMI updates
Damien Le Moal (1):
zonefs update
Daniel Thompson (1):
kgdb updates
Dave Airlie (3):
drm fixes
drm updates
drm fixes
Dave Jiang (1):
CXL updates
David Kleikamp (1):
jfs updates
David Sterba (3):
affs updates
btrfs updates
btrfs fix
David Teigland (1):
dlm updates
Dipen Patel (1):
hardware timestamp update
Dmitry Torokhov (1):
input updates
Dominik Brodowski (1):
PCMCIA updates
Gabriel Krisman Bertazi (1):
unicode update
Gao Xiang (2):
erofs updates
more erofs updates
Geert Uytterhoeven (2):
m68k updates
auxdisplay updates
Greg KH (5):
tty / serial updates
USB / Thunderbolt updates
staging driver updates
char / misc and other driver updates
driver core updates
Guenter Roeck (1):
hwmon updates
Helge Deller (2):
fbdev updates
parisc updates
Herbert Xu (1):
crypto update
Huacai Chen (1):
LoongArch updates
Ilpo Järvinen (1):
x86 platform driver updates
Ilya Dryomov (1):
ceph updates
Ingo Molnar (5):
locking updates
objtool updates
scheduler updates
performance events updates
x86 percpu updates
Ira Weiny (1):
libnvdimm updates
Jaegeuk Kim (1):
f2fs updates
Jakub Kicinski (2):
networking updates
networking fixes
James Bottomley (1):
SCSI updates
Jan Kara (2):
fsnotify fix
udf, ext2, isofs fixes and cleanups
Jarkko Sakkinen (3):
tpm updates
keys updates
tpm fix
Jason Donenfeld (1):
random number generator updates
Jason Gunthorpe (2):
iommufd updates
rdma updates
Jassi Brar (1):
mailbox updates
Jens Axboe (7):
io_uring updates
block updates
block integrity mapping updates
more block updates
io_uring fixes
io_uring fixes
block fixes
Joel Granados (2):
sysctl updates
sysctl constification
John Johansen (1):
apparmor updates
John Paul Adrian Glaubitz (1):
sh updates
Jonathan Corbet (1):
documentation updates
Juergen Gross (2):
xen updates
xen fixes
Kees Cook (5):
execve updates
seccomp updates
pstore updates
hardening updates
execve fix
Kent Overstreet (2):
bcachefs updates
bcachefs fixes
Konstantin Komarov (1):
ntfs3 updates
Lee Jones (3):
MFD updates
backlight updates
LED updates
Len Brown (1):
turbostat updates
Linus Walleij (1):
pin control updates
Luis Chamberlain (1):
module update
Mark Brown (6):
regmap updates
regulator updates
spi updates
regmap fix
regulator fixes
spi fixes
Masahiro Yamada (2):
Kbuild updates
Kbuild fixes
Masami Hiramatsu (3):
probes updates
bootconfig update
uprobe fix
Mauro Carvalho Chehab (1):
media updates
Michael Ellerman (1):
powerpc updates
Michael Tsirkin (1):
virtio updates
Mickaël Salaün (2):
landlock updates
landlock fix
Miguel Ojeda (1):
Rust updates
Mike Rapoport (1):
memblock updates
Mikulas Patocka (1):
device mapper updates
Miquel Raynal (1):
MTD updates
Namhyung Kim (2):
perf tools updates
perf tools fixes
Namjae Jeon (1):
exfat updates
Niklas Cassel (1):
ata updates
Palmer Dabbelt (2):
RISC-V updates
more RISC-V updates
Paolo Abeni (1):
networking fixes
Paolo Bonzini (1):
kvm updates
Paul McKenney (6):
arm byte cmpxchg
memory model updates
RCU updates
torture-test updates
KCSAN updates
nolibc updates
Paul Moore (2):
selinux update
lsm updates
Petr Mladek (2):
livepatching update
printk updates
Rafael Wysocki (5):
thermal control updates
power management updates
ACPI updates
thermal control fix
thermal control fix
Richard Weinberger (2):
UML updates
UBI and UBIFS updates
Rob Herring (2):
devicetree updates
more devicetree updates
Sebastian Reichel (2):
HSI update
power supply and reset updates
Shuah Khan (2):
KUnit updates
kselftest updates
Stephen Boyd (2):
clk updates
clk fixes
Steve French (3):
smb client fixes
smb server fixes
more smb client updates
Steven Rostedt (4):
tracing updates
ftrace updates
tracing tools updates
tracing CREDITS file update
Takashi Iwai (2):
sound updates
sound fixes
Takashi Sakamoto (2):
firewire updates
firewire fixes
Ted Ts'o (1):
ext4 updates
Tejun Heo (3):
cgroup updates
workqueue updates
workqueue fix
Thomas Bogendoerfer (2):
MIPS updates
MIPS updates
Thomas Gleixner (6):
debugobjects update
CPU hotplug updates
timer updates
interrupt subsystem updates
MSI interrupt updates
timer migration updates
Tzung-Bi Shih (2):
chrome platform updates
chrome platform firmware update
Ulf Hansson (2):
pmdomain updates
MMC updates
Uwe Kleine-König (1):
pwm updates
Vasily Gorbik (2):
s390 updates
more s390 updates
Vinod Koul (3):
dmaengine updates
soundwire updates
phy updates
Vlastimil Babka (1):
slab updates
Will Deacon (3):
iommu updates
arm64 fixes
iommu fixes
Wim Van Sebroeck (1):
watchdog updates
Wolfram Sang (2):
i2c fixes
more i2c updates
Yury Norov (1):
bitmap updates
* Build regressions/improvements in v6.11-rc1
2024-07-28 21:40 Linux 6.11-rc1 Linus Torvalds
@ 2024-07-29 9:28 ` Geert Uytterhoeven
2024-07-29 9:35 ` Geert Uytterhoeven
2024-07-29 15:29 ` Linux 6.11-rc1 Guenter Roeck
1 sibling, 1 reply; 59+ messages in thread
From: Geert Uytterhoeven @ 2024-07-29 9:28 UTC
To: linux-kernel
Below is the list of build error/warning regressions/improvements in
v6.11-rc1[1] compared to v6.10[2].
Summarized:
- build errors: +7/-22
- build warnings: +4/-19
Happy fixing! ;-)
Thanks to the linux-next team for providing the build service.
[1] http://kisskb.ellerman.id.au/kisskb/branch/linus/head/8400291e289ee6b2bf9779ff1c83a291501f017b/ (all 132 configs)
[2] http://kisskb.ellerman.id.au/kisskb/branch/linus/head/0c3836482481200ead7b416ca80c68a29cfdaabd/ (all 132 configs)
*** ERRORS ***
7 error regressions:
+ /kisskb/src/arch/mips/sgi-ip22/ip22-gio.c: error: initialization of 'int (*)(struct device *, const struct device_driver *)' from incompatible pointer type 'int (*)(struct device *, struct device_driver *)' [-Werror=incompatible-pointer-types]: => 384:14
+ /kisskb/src/drivers/md/dm-integrity.c: error: logical not is only applied to the left hand side of comparison [-Werror=logical-not-parentheses]: => 4718:45
+ /kisskb/src/fs/btrfs/inode.c: error: 'location.objectid' may be used uninitialized in this function [-Werror=maybe-uninitialized]: => 5603:9
+ /kisskb/src/fs/btrfs/inode.c: error: 'location.type' may be used uninitialized in this function [-Werror=maybe-uninitialized]: => 5674:5
+ /kisskb/src/include/linux/compiler_types.h: error: call to '__compiletime_assert_933' declared with attribute error: FIELD_GET: mask is not constant: => 510:38
+ /kisskb/src/include/linux/compiler_types.h: error: call to '__compiletime_assert_934' declared with attribute error: FIELD_GET: mask is not constant: => 510:38
+ /kisskb/src/kernel/fork.c: error: #warning clone3() entry point is missing, please fix [-Werror=cpp]: => 3072:2
22 error improvements:
- /kisskb/src/arch/sparc/include/asm/floppy_64.h: error: no previous prototype for 'sparc_floppy_irq' [-Werror=missing-prototypes]: 200:13 =>
- /kisskb/src/arch/sparc/include/asm/floppy_64.h: error: no previous prototype for 'sun_pci_fd_dma_callback' [-Werror=missing-prototypes]: 437:6 =>
- /kisskb/src/arch/sparc/power/hibernate.c: error: no previous prototype for 'pfn_is_nosave' [-Werror=missing-prototypes]: 22:5 =>
- /kisskb/src/arch/sparc/power/hibernate.c: error: no previous prototype for 'restore_processor_state' [-Werror=missing-prototypes]: 35:6 =>
- /kisskb/src/arch/sparc/power/hibernate.c: error: no previous prototype for 'save_processor_state' [-Werror=missing-prototypes]: 30:6 =>
- /kisskb/src/arch/sparc/prom/misc_64.c: error: no previous prototype for 'prom_get_mmu_ihandle' [-Werror=missing-prototypes]: 165:5 =>
- /kisskb/src/arch/sparc/prom/p1275.c: error: no previous prototype for 'prom_cif_init' [-Werror=missing-prototypes]: 52:6 =>
- /kisskb/src/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c: error: unknown option after '#pragma GCC diagnostic' kind [-Werror=pragmas]: 16:9 =>
- /kisskb/src/drivers/gpu/drm/msm/adreno/adreno_gen7_0_0_snapshot.h: error: 'gen7_0_0_external_core_regs' defined but not used [-Werror=unused-variable]: 924:19 =>
- /kisskb/src/drivers/gpu/drm/msm/adreno/adreno_gen7_2_0_snapshot.h: error: 'gen7_2_0_external_core_regs' defined but not used [-Werror=unused-variable]: 748:19 =>
- /kisskb/src/drivers/gpu/drm/msm/adreno/adreno_gen7_9_0_snapshot.h: error: 'gen7_9_0_external_core_regs' defined but not used [-Werror=unused-variable]: 1438:19 =>
- /kisskb/src/drivers/gpu/drm/msm/adreno/adreno_gen7_9_0_snapshot.h: error: 'gen7_9_0_sptp_clusters' defined but not used [-Werror=unused-variable]: 1188:43 =>
- /kisskb/src/fs/bcachefs/data_update.c: error: the frame size of 1028 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]: 338:1 =>
- error: arch/sparc/kernel/process_32.o: relocation truncated to fit: R_SPARC_WDISP22 against `.text': (.fixup+0xc), (.fixup+0x4) =>
- error: arch/sparc/kernel/signal_32.o: relocation truncated to fit: R_SPARC_WDISP22 against `.text': (.fixup+0x8), (.fixup+0x10), (.fixup+0x0), (.fixup+0x20), (.fixup+0x18) =>
- error: relocation truncated to fit: R_SPARC_WDISP22 against `.init.text': (.head.text+0x5040), (.head.text+0x5100) =>
- error: relocation truncated to fit: R_SPARC_WDISP22 against symbol `leon_smp_cpu_startup' defined in .text section in arch/sparc/kernel/trampoline_32.o: (.init.text+0xa4) =>
- {standard input}: Error: displacement to undefined symbol .L137 overflows 8-bit field : 1105, 1031 =>
- {standard input}: Error: displacement to undefined symbol .L158 overflows 8-bit field : 1110 =>
- {standard input}: Error: pcrel too far: 1096, 1126, 1254, 1022, 1074, 1095, 1255, 1020, 1021 => 1397
- {standard input}: Error: unknown pseudo-op: `.al': 1270 =>
- {standard input}: Error: unknown pseudo-op: `.siz': 1273 =>
*** WARNINGS ***
4 warning regressions:
+ /kisskb/src/fs/btrfs/fiemap.c: warning: 'last_extent_end' may be used uninitialized in this function [-Wmaybe-uninitialized]: => 822:19
+ /kisskb/src/fs/btrfs/inode.c: warning: 'location.objectid' may be used uninitialized in this function [-Wmaybe-uninitialized]: => 5603:9
+ /kisskb/src/fs/btrfs/inode.c: warning: 'location.type' may be used uninitialized in this function [-Wmaybe-uninitialized]: => 5674:5
+ /kisskb/src/kernel/fork.c: warning: #warning clone3() entry point is missing, please fix [-Wcpp]: => 3072:2
19 warning improvements:
- ./.config.32r1_defconfig: warning: override: CPU_BIG_ENDIAN changes choice state: 93 =>
- ./.config.32r2_defconfig: warning: override: CPU_BIG_ENDIAN changes choice state: 93 =>
- ./.config.32r6_defconfig: warning: override: CPU_BIG_ENDIAN changes choice state: 95 =>
- ./.config.64r1_defconfig: warning: override: CPU_BIG_ENDIAN changes choice state: 96 =>
- ./.config.64r2_defconfig: warning: override: CPU_BIG_ENDIAN changes choice state: 96 =>
- ./.config.64r6_defconfig: warning: override: CPU_BIG_ENDIAN changes choice state: 98 =>
- ./.config.micro32r2_defconfig: warning: override: CPU_BIG_ENDIAN changes choice state: 94 =>
- .config: warning: override: ARCH_RV32I changes choice state: 6414 =>
- .config: warning: override: CPU_BIG_ENDIAN changes choice state: 92, 95, 97, 93, 94 =>
- /kisskb/src/arch/mips/sgi-ip22/ip22-berr.c: warning: no previous prototype for 'ip22_be_init' [-Wmissing-prototypes]: 113:13 =>
- /kisskb/src/arch/mips/sgi-ip22/ip22-berr.c: warning: no previous prototype for 'ip22_be_interrupt' [-Wmissing-prototypes]: 89:6 =>
- /kisskb/src/arch/mips/sgi-ip22/ip22-gio.c: warning: no previous prototype for 'ip22_gio_init' [-Wmissing-prototypes]: 398:12 =>
- /kisskb/src/arch/mips/sgi-ip22/ip22-gio.c: warning: no previous prototype for 'ip22_gio_set_64bit' [-Wmissing-prototypes]: 249:6 =>
- /kisskb/src/arch/mips/sgi-ip22/ip22-time.c: warning: no previous prototype for 'indy_8254timer_irq' [-Wmissing-prototypes]: 119:18 =>
- /kisskb/src/arch/sparc/prom/misc_64.c: warning: no previous prototype for 'prom_get_mmu_ihandle' [-Wmissing-prototypes]: 165:5 =>
- /kisskb/src/arch/sparc/prom/p1275.c: warning: no previous prototype for 'prom_cif_init' [-Wmissing-prototypes]: 52:6 =>
- /kisskb/src/drivers/base/regmap/regcache-maple.c: warning: 'lower_index' is used uninitialized [-Wuninitialized]: 113:23 =>
- /kisskb/src/drivers/base/regmap/regcache-maple.c: warning: 'lower_last' is used uninitialized [-Wuninitialized]: 113:36 =>
- /kisskb/src/fs/btrfs/extent_io.c: warning: 'last_extent_end' may be used uninitialized in this function [-Wmaybe-uninitialized]: 3285:19 =>
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
* Re: Build regressions/improvements in v6.11-rc1
2024-07-29 9:28 ` Build regressions/improvements in v6.11-rc1 Geert Uytterhoeven
@ 2024-07-29 9:35 ` Geert Uytterhoeven
2024-07-29 9:54 ` Arnd Bergmann
0 siblings, 1 reply; 59+ messages in thread
From: Geert Uytterhoeven @ 2024-07-29 9:35 UTC
To: linux-kernel
Cc: Greg Kroah-Hartman, linux-mips, dm-devel, linuxppc-dev,
linux-btrfs, intel-xe, Arnd Bergmann, linux-sh, sparclinux
On Mon, 29 Jul 2024, Geert Uytterhoeven wrote:
> Below is the list of build error/warning regressions/improvements in
> v6.11-rc1[1] compared to v6.10[2].
>
> Summarized:
> - build errors: +7/-22
> [1] http://kisskb.ellerman.id.au/kisskb/branch/linus/head/8400291e289ee6b2bf9779ff1c83a291501f017b/ (all 132 configs)
> 7 error regressions:
> + /kisskb/src/arch/mips/sgi-ip22/ip22-gio.c: error: initialization of 'int (*)(struct device *, const struct device_driver *)' from incompatible pointer type 'int (*)(struct device *, struct device_driver *)' [-Werror=incompatible-pointer-types]: => 384:14
mips-gcc8/ip22_defconfig
> + /kisskb/src/drivers/md/dm-integrity.c: error: logical not is only applied to the left hand side of comparison [-Werror=logical-not-parentheses]: => 4718:45
powerpc-gcc5/powerpc-all{mod,yes}config
powerpc-gcc5/ppc64le_allmodconfig
> + /kisskb/src/fs/btrfs/inode.c: error: 'location.objectid' may be used uninitialized in this function [-Werror=maybe-uninitialized]: => 5603:9
> + /kisskb/src/fs/btrfs/inode.c: error: 'location.type' may be used uninitialized in this function [-Werror=maybe-uninitialized]: => 5674:5
m68k-gcc8/m68k-allmodconfig
mips-gcc8/mips-allmodconfig
powerpc-gcc5/powerpc-all{mod,yes}config
powerpc-gcc5/ppc64_defconfig
> + /kisskb/src/include/linux/compiler_types.h: error: call to '__compiletime_assert_933' declared with attribute error: FIELD_GET: mask is not constant: => 510:38
> + /kisskb/src/include/linux/compiler_types.h: error: call to '__compiletime_assert_934' declared with attribute error: FIELD_GET: mask is not constant: => 510:38
inlined from 'xe_oa_set_prop_oa_format' at /kisskb/src/drivers/gpu/drm/xe/xe_oa.c:1664:6:
powerpc-gcc5/powerpc-all{yes,mod}config
powerpc-gcc5/powerpc-allmodconfig
powerpc-gcc5/ppc64le_allmodconfig
(fix sent)
> + /kisskb/src/kernel/fork.c: error: #warning clone3() entry point is missing, please fix [-Werror=cpp]: => 3072:2
sh4-gcc13/se{7619,7750}_defconfig
sh4-gcc13/sh-all{mod,no,yes}config
sh4-gcc13/sh-defconfig
sparc64-gcc5/sparc-allnoconfig
sparc64-gcc{5,13}/sparc32_defconfig
sparc64-gcc{5,13}/sparc64-{allno,def}config
sparc64-gcc13/sparc-all{mod,no}config
sparc64-gcc13/sparc64-allmodconfig
Gr{oetje,eeting}s,
Geert
* Re: Build regressions/improvements in v6.11-rc1
2024-07-29 9:35 ` Geert Uytterhoeven
@ 2024-07-29 9:54 ` Arnd Bergmann
2024-07-29 10:07 ` Geert Uytterhoeven
0 siblings, 1 reply; 59+ messages in thread
From: Arnd Bergmann @ 2024-07-29 9:54 UTC
To: Geert Uytterhoeven, linux-kernel
Cc: Greg Kroah-Hartman, linux-mips, dm-devel, linuxppc-dev,
linux-btrfs, intel-xe, linux-sh, sparclinux, linux-hexagon
On Mon, Jul 29, 2024, at 11:35, Geert Uytterhoeven wrote:
>
>> + /kisskb/src/kernel/fork.c: error: #warning clone3() entry point is missing, please fix [-Werror=cpp]: => 3072:2
>
> sh4-gcc13/se{7619,7750}_defconfig
> sh4-gcc13/sh-all{mod,no,yes}config
> sh4-gcc13/sh-defconfig
> sparc64-gcc5/sparc-allnoconfig
> sparc64-gcc{5,13}/sparc32_defconfig
> sparc64-gcc{5,13}/sparc64-{allno,def}config
> sparc64-gcc13/sparc-all{mod,no}config
> sparc64-gcc13/sparc64-allmodconfig
Hexagon and NIOS2 as well, but this is expected. I really just
moved the warning into the actual implementation, the warning
is the same as before. hexagon and sh look like they should be
trivial, it's just that nobody seems to care. I'm sure the
patches were posted before and never applied.
sparc and nios2 do need some real work to write and test
the wrappers.
It does look like CONFIG_WERROR did not fail the build before
505d66d1abfb ("clone3: drop __ARCH_WANT_SYS_CLONE3 macro")
as it probably was intended.
Arnd
* Re: Build regressions/improvements in v6.11-rc1
2024-07-29 9:54 ` Arnd Bergmann
@ 2024-07-29 10:07 ` Geert Uytterhoeven
0 siblings, 0 replies; 59+ messages in thread
From: Geert Uytterhoeven @ 2024-07-29 10:07 UTC
To: Arnd Bergmann
Cc: linux-kernel, Greg Kroah-Hartman, linux-mips, dm-devel,
linuxppc-dev, linux-btrfs, intel-xe, linux-sh, sparclinux,
linux-hexagon
Hi Arnd,
On Mon, Jul 29, 2024 at 11:55 AM Arnd Bergmann <arnd@arndb.de> wrote:
> On Mon, Jul 29, 2024, at 11:35, Geert Uytterhoeven wrote:
> >> + /kisskb/src/kernel/fork.c: error: #warning clone3() entry point is missing, please fix [-Werror=cpp]: => 3072:2
> >
> > sh4-gcc13/se{7619,7750}_defconfig
> > sh4-gcc13/sh-all{mod,no,yes}config
> > sh4-gcc13/sh-defconfig
> > sparc64-gcc5/sparc-allnoconfig
> > sparc64-gcc{5,13}/sparc32_defconfig
> > sparc64-gcc{5,13}/sparc64-{allno,def}config
> > sparc64-gcc13/sparc-all{mod,no}config
> > sparc64-gcc13/sparc64-allmodconfig
>
> Hexagon and NIOS2 as well, but this is expected. I really just
> moved the warning into the actual implementation, the warning
> is the same as before. hexagon and sh look like they should be
> trivial, it's just that nobody seems to care. I'm sure the
> patches were posted before and never applied.
>
> sparc and nios2 do need some real work to write and test
> the wrappers.
>
> It does look like CONFIG_WERROR did not fail the build before
> 505d66d1abfb ("clone3: drop __ARCH_WANT_SYS_CLONE3 macro")
> as it probably was intended.
Indeed. The actual regression is that this turned into a fatal error
with -Werror.
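To illustrate the point above, here is a minimal userspace sketch of how a
"#warning" inside a function body behaves. The names DEMO_ARCH_BROKEN_CLONE3
and demo_sys_clone3() are hypothetical and are not taken from kernel/fork.c;
only the warning text is copied from the build log.

```c
#include <errno.h>

/*
 * Hypothetical marker: an architecture without a clone3() wrapper would
 * define this. It is left undefined here, so the block builds cleanly.
 */
/* #define DEMO_ARCH_BROKEN_CLONE3 */

/* Hypothetical stand-in for the syscall implementation. */
long demo_sys_clone3(void)
{
#ifdef DEMO_ARCH_BROKEN_CLONE3
	/* The diagnostic sits next to the stub that it explains. */
#warning clone3() entry point is missing, please fix
	return -ENOSYS;
#else
	return 0;	/* pretend the real implementation ran */
#endif
}
```

Built with GCC and -DDEMO_ARCH_BROKEN_CLONE3, this only prints the warning.
Adding -Werror (or -Werror=cpp) on top turns the same line into a build
failure, which matches the "[-Werror=cpp]" tag in the report.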
Gr{oetje,eeting}s,
Geert
* Re: Linux 6.11-rc1
2024-07-28 21:40 Linux 6.11-rc1 Linus Torvalds
2024-07-29 9:28 ` Build regressions/improvements in v6.11-rc1 Geert Uytterhoeven
@ 2024-07-29 15:29 ` Guenter Roeck
2024-07-29 19:23 ` Linus Torvalds
` (2 more replies)
1 sibling, 3 replies; 59+ messages in thread
From: Guenter Roeck @ 2024-07-29 15:29 UTC
To: Linus Torvalds; +Cc: Linux Kernel Mailing List
On Sun, Jul 28, 2024 at 02:40:01PM -0700, Linus Torvalds wrote:
> The merge window felt pretty normal, and the stats all look pretty
> normal too. I was expecting things to be quieter because of summer
> vacations, but that (still) doesn't actually seem to have been the
> case.
>
> There's 12k+ regular commits (and another 850 merge commits), so as
> always the summary of this all is just my merge log. The diffstats are
> also (once again) dominated by some big hardware descriptions (another
> AMD GPU register dump accounts for ~45% of the lines in the diff, and
> some more perf event JSON descriptor files account for another 5%).
>
> But if you ignore those HW dumps, the diff too looks perfectly
> regular: drivers account for a bit over half (even when not counting
> the AMD register description noise). The rest is roughly one third
> architecture updates (lots of it is dts files, so I guess I could have
> lumped that in with "more hw descriptor tables"), one third tooling
> and documentation, and one third "core kernel" (filesystems,
> networking, VM and kernel). Very roughly.
>
> If you want more details, you should get the git tree, and then narrow
> things down based on interests.
>
Build results:
total: 158 pass: 139 fail: 19
Failed builds:
alpha:allmodconfig
alpha:tinyconfig
arcv2:tinyconfig
arm:tinyconfig
csky:tinyconfig
hexagon:tinyconfig
loongarch:tinyconfig
m68k:tinyconfig
microblaze:tinyconfig
mips:tinyconfig
nios2:tinyconfig
openrisc:tinyconfig
parisc:tinyconfig
powerpc:tinyconfig
riscv32:tinyconfig
riscv64:tinyconfig
sparc32:tinyconfig
sparc64:tinyconfig
xtensa:tinyconfig
Qemu test results:
total: 533 pass: 493 fail: 40
Failed tests:
arm:versatilepb:versatile_defconfig:aeabi:pci:scsi:mem128:net=default:versatile-pb:ext2
arm:versatilepb:versatile_defconfig:aeabi:pci:flash64:mem128:net=default:versatile-pb:ext2
arm:versatilepb:versatile_defconfig:aeabi:pci:mem128:net=default:versatile-pb:initrd
arm:versatileab:versatile_defconfig:mem128:net=default:versatile-ab:initrd
microblaze:petalogix-s3adsp1800:initrd
microblaze:petalogix-s3adsp1800:rootfs
microblaze:petalogix-ml605:initrd
microblaze:petalogix-ml605:rootfs
microblazeel:petalogix-s3adsp1800:initrd
microblazeel:petalogix-s3adsp1800:rootfs
microblazeel:petalogix-ml605:initrd
microblazeel:petalogix-ml605:rootfs
ppc:mpc8544ds:mpc85xx_defconfig:net=e1000:initrd
ppc:mpc8544ds:mpc85xx_defconfig:scsi[53C895A]:net=ne2k_pci:btrfs
ppc:mpc8544ds:mpc85xx_defconfig:sata-sii3112:net=rtl8139:ext2
ppc:mpc8544ds:mpc85xx_defconfig:sdhci-mmc:net=usb-ohci:ext2
ppc:mpc8544ds:mpc85xx_smp_defconfig:net=e1000:initrd
ppc:mpc8544ds:mpc85xx_smp_defconfig:scsi[DC395]:net=i82550:ext2
ppc:mpc8544ds:mpc85xx_smp_defconfig:scsi[53C895A]:net=usb-ohci:btrfs
ppc:mpc8544ds:mpc85xx_smp_defconfig:sata-sii3112:net=ne2k_pci:ext2
ppc:ppce500:corenet32_smp_defconfig:e500:net=rtl8139:initrd
ppc:ppce500:corenet32_smp_defconfig:e500:net=virtio-net:nvme:btrfs
ppc:ppce500:corenet32_smp_defconfig:e500:net=eTSEC:sdhci-mmc:ext2
ppc:ppce500:corenet32_smp_defconfig:e500:net=e1000:mmc:cramfs
ppc:ppce500:corenet32_smp_defconfig:e500:net=tulip:scsi[53C895A]:ext2
ppc:ppce500:corenet32_smp_defconfig:e500:net=i82562:sata-sii3112:ext2
riscv32:virt:rv32_defconfig:nofs:noscsi:net=e1000:initrd
riscv32:virt:rv32,zbb=no:rv32_defconfig:nofs:noscsi:net=e1000e:virtio-blk:ext2
riscv32:virt:rv32_defconfig:nofs:noscsi:net=i82801:virtio:ext2
riscv32:virt:rv32,zbb=no:rv32_defconfig:nofs:noscsi:net=i82550:virtio-pci:ext2
riscv32:virt:rv32_defconfig:nofs:noscsi:tpm-tis-device:net=e1000-82544gc:sdhci-mmc:ext2
riscv32:virt:rv32,zbb=no:rv32_defconfig:nofs:noscsi:net=usb-ohci:nvme:ext2
riscv32:virt:rv32_defconfig:nofs:noscsi:net=virtio-net-device:usb-ohci:ext2
riscv32:virt:rv32,zbb=no:rv32_defconfig:nofs:noscsi:net=pcnet:usb-ehci:ext2
riscv32:virt:rv32_defconfig:nofs:noscsi:net=virtio-net-pci:usb-xhci:ext2
riscv32:virt:rv32,zbb=no:rv32_defconfig:nofs:noscsi:net=i82557a:usb-uas-ehci:ext2
riscv32:virt:rv32_defconfig:nofs:noscsi:net=i82558a:usb-uas-xhci:ext2
riscv32:virt:rv32_defconfig:nofs:noscsi:net=i82557b:scsi[virtio]:ext2
riscv32:virt:rv32,zbb=no:rv32_defconfig:nofs:noscsi:net=i82557c:scsi[virtio-pci]:ext2
i386:q35:pentium3:defconfig:pae:nosmp:net=ne2k_pci:initrd
Unit test results:
pass: 316946 fail: 0
In summary, quite impressive in a negative sense. At least some of the
problems (such as the tinyconfig build failures, and some of the test
failures) have already been reported. I simply don't have the time for a
detailed analysis. Logs are available at https://kerneltests.org/builders,
in the "master" column, for those with time to track things down.
Guenter
* Re: Linux 6.11-rc1
2024-07-29 15:29 ` Linux 6.11-rc1 Guenter Roeck
@ 2024-07-29 19:23 ` Linus Torvalds
2024-07-29 19:50 ` Linus Torvalds
` (2 more replies)
2024-07-30 17:04 ` Guenter Roeck
2024-08-02 17:35 ` Linus Walleij
2 siblings, 3 replies; 59+ messages in thread
From: Linus Torvalds @ 2024-07-29 19:23 UTC
To: Guenter Roeck, Peter Zijlstra, Sebastian Andrzej Siewior,
Ingo Molnar
Cc: Linux Kernel Mailing List
On Mon, 29 Jul 2024 at 08:29, Guenter Roeck <linux@roeck-us.net> wrote:
>
> In summary, quite impressive in a negative sense.
Grr. I think a lot of the build failures end up being due to commit
466e4d801cd4 ("task_work: Add TWA_NMI_CURRENT as an additional notify
mode") depending on IRQ_WORK, and that not existing everywhere.
I pushed out a tentative fix as commit cec6937dd1aa ("task_work: make
TWA_NMI_CURRENT handling conditional on IRQ_WORK"). I haven't set up a
build environment for those tiny targets, but it looked fairly
straightforward.
I think that explains at least most of the 'tinyconfig' build failures.
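As a rough illustration of what "conditional on IRQ_WORK" can mean, the
sketch below guards one notify mode behind a build-time switch. It is a
userspace model, not the code from commit cec6937dd1aa: DEMO_HAVE_IRQ_WORK
stands in for the config option, every demo_* name is hypothetical, and
returning -EINVAL for the unsupported mode is an assumption.

```c
#include <errno.h>

/* Hypothetical stand-in for the config option; build with
 * -DDEMO_HAVE_IRQ_WORK=1 to model a configuration that has it. */
#ifndef DEMO_HAVE_IRQ_WORK
#define DEMO_HAVE_IRQ_WORK 0
#endif

enum demo_notify_mode {
	DEMO_TWA_NONE,
	DEMO_TWA_RESUME,
	DEMO_TWA_NMI_CURRENT,
};

#if DEMO_HAVE_IRQ_WORK
/* Only exists when the facility is built in. */
int demo_irq_work_queued;

void demo_irq_work_queue(void)
{
	demo_irq_work_queued++;
}
#endif

/* Hypothetical model of adding task work with a notify mode. */
int demo_task_work_add(enum demo_notify_mode mode)
{
	switch (mode) {
	case DEMO_TWA_NONE:
	case DEMO_TWA_RESUME:
		return 0;
	case DEMO_TWA_NMI_CURRENT:
#if DEMO_HAVE_IRQ_WORK
		demo_irq_work_queue();
		return 0;
#else
		/* No reference to the missing facility is compiled in. */
		return -EINVAL;
#endif
	}
	return -EINVAL;
}
```

The point of the pattern is that a configuration without the facility never
references its symbols, so the build cannot fail on them.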
Not super-happy about how people apparently were discussing the build
failures for a long time, and didn't even bother mentioning them in
the pull requests. That broken commit came in through the perf-core
pull from Ingo.
And that fix (if it fixes it - I think it will) still leaves the alpha
allmodconfig build and all the failed tests.
I'll take a look.
Linus
* Re: Linux 6.11-rc1
2024-07-29 19:23 ` Linus Torvalds
@ 2024-07-29 19:50 ` Linus Torvalds
2024-07-29 21:34 ` Arnd Bergmann
2024-07-30 7:54 ` Peter Zijlstra
2024-07-31 15:45 ` Guenter Roeck
2 siblings, 1 reply; 59+ messages in thread
From: Linus Torvalds @ 2024-07-29 19:50 UTC
To: Guenter Roeck, Peter Zijlstra, Sebastian Andrzej Siewior,
Ingo Molnar, Arnd Bergmann
Cc: Linux Kernel Mailing List
On Mon, 29 Jul 2024 at 12:23, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> And that fix (if it fixes it - I think it will) still leaves the alpha
> allmodconfig build and all the failed tests.
>
> I'll take a look.
Well, the alpha allmodconfig case is apparently
ERROR: modpost: "iowrite64be" [drivers/crypto/caam/caam_jr.ko] undefined!
which I suspect is just a result of commit beba3771d9e0 ("crypto:
caam: Make CRYPTO_DEV_FSL_CAAM dependent of COMPILE_TEST").
IOW, that is almost certainly simply due to better build test
coverage, not a new bug.
But I didn't look into *why* it would fail. We have a comment about
iowrite64be saying
* These get provided from <asm-generic/iomap.h> since alpha does not
* select GENERIC_IOMAP.
and I'm not sure why that isn't correct.
I get a feeling that lib/iomap.c is missing a couple of functions, but
didn't look into it a lot.
I suspect Arnd may be the right person to ask. Arnd?
Linus
* Re: Linux 6.11-rc1
2024-07-29 19:50 ` Linus Torvalds
@ 2024-07-29 21:34 ` Arnd Bergmann
2024-07-29 23:47 ` Linus Torvalds
0 siblings, 1 reply; 59+ messages in thread
From: Arnd Bergmann @ 2024-07-29 21:34 UTC
To: Linus Torvalds, Guenter Roeck, Peter Zijlstra,
Sebastian Andrzej Siewior, Ingo Molnar, Johannes Berg
Cc: Linux Kernel Mailing List
On Mon, Jul 29, 2024, at 21:50, Linus Torvalds wrote:
> On Mon, 29 Jul 2024 at 12:23, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> And that fix (if it fixes it - I think it will) still leaves the alpha
>> allmodconfig build and all the failed tests.
>>
>> I'll take a look.
>
> Well, the alpha allmodconfig case is apparently
>
> ERROR: modpost: "iowrite64be" [drivers/crypto/caam/caam_jr.ko] undefined!
>
> which I suspect is just a result of commit beba3771d9e0 ("crypto:
> caam: Make CRYPTO_DEV_FSL_CAAM dependent of COMPILE_TEST").
>
> IOW, that is almost certainly simply due to better build test
> coverage, not a new bug.
>
> But I didn't look into *why* it would fail. We have a comment about
> iowrite64be saying
>
> * These get provided from <asm-generic/iomap.h> since alpha does not
> * select GENERIC_IOMAP.
>
> and I'm not sure why that isn't correct.
>
> I get a feeling that lib/iomap.c is missing a couple of functions, but
> didn't look into it a lot.
>
> I suspect Arnd may be the right person to ask. Arnd?
Yes, I noticed this problem a few weeks ago with another
driver as we tried to fix the usage of iowrite64() on 32-bit
architectures. We actually have two old bugs here and still
need to make a decision about how to fix that properly:
- ioread64()/iowrite64() and their variants are defined
  differently on architectures depending on whether they use
  CONFIG_GENERIC_IOMAP (x86, um, and a few rare configs
  elsewhere) or not. On GENERIC_IOMAP architectures, there
  is no 64-bit PIO, so lib/iomap.c only provides the
  iowrite64_hi_lo()/iowrite64_lo_hi() etc wrappers that do
  a pair of 32-bit accesses for PIO but native 64-bit
  MMIO. On other 64-bit architectures, iowrite64() is the
  same as writeq() and it can operate on PCI I/O space as
  well. Drivers with big-endian registers tend to use
  iowriteXXbe() in order to get the correct byteswap in the
  absence of writeX_be().
- Alpha (and I think parisc) uses the asm-generic/iomap.h
  header that is meant for GENERIC_IOMAP but then provides
  its own functions. It never had iowrite64be() and we
  didn't notice this in the absence of users. The caam driver
  includes include/linux/io-64-nonatomic-lo-hi.h, which
  then redirects iowrite64be() to iowrite64be_lo_hi()
  on x86 (since it does not define iowrite64be()) and
  on 32-bit architectures, but uses iowrite64be() from
  include/asm-generic/io.h on most other 64-bit
  architectures. On alpha it uses the incorrect
  prototype.
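As a rough illustration of the "pair of 32-bit accesses" idea, here is a
userspace sketch of a lo_hi and a hi_lo 64-bit write. Everything named
fake_* is made up for this example and is not kernel API; the real
wrappers live in lib/iomap.c and the io-64-nonatomic headers. The model
only covers the little-endian (non-"be") variants and assumes the low
word sits at offset 0.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit device register seen as two 32-bit slots. */
struct fake_reg64 {
	uint32_t slot[2];	/* slot[0] is offset 0, slot[1] is offset 4 */
	int first_slot;		/* which slot was written first, -1 if none yet */
};

/* Model of a 32-bit accessor; real kernel code would call iowrite32(). */
static inline void fake_iowrite32(struct fake_reg64 *r, uint32_t v,
				  unsigned int off)
{
	if (r->first_slot < 0)
		r->first_slot = (int)(off / 4);
	r->slot[off / 4] = v;
}

/* 64-bit write split into two 32-bit writes, low word first. */
static inline void fake_iowrite64_lo_hi(struct fake_reg64 *r, uint64_t v)
{
	fake_iowrite32(r, (uint32_t)v, 0);
	fake_iowrite32(r, (uint32_t)(v >> 32), 4);
}

/* Same split, but the high word is written first. */
static inline void fake_iowrite64_hi_lo(struct fake_reg64 *r, uint64_t v)
{
	fake_iowrite32(r, (uint32_t)(v >> 32), 4);
	fake_iowrite32(r, (uint32_t)v, 0);
}
```

Both variants leave the same value in the register; they differ only in
which half the device sees first, which matters for hardware that
latches on one of the two halves.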
I suspect we can fix the alpha issue with the trivial
change below (haven't tested yet), but the way we are
inconsistent about these will likely keep biting us
unless we come up with a better way to handle them
across architectures.
Arnd
diff --git a/arch/alpha/include/asm/io.h b/arch/alpha/include/asm/io.h
index 2bb8cbeedf91..52212e47e917 100644
--- a/arch/alpha/include/asm/io.h
+++ b/arch/alpha/include/asm/io.h
@@ -534,8 +534,10 @@ extern inline void writeq(u64 b, volatile void __iomem *addr)
#define ioread16be(p) swab16(ioread16(p))
#define ioread32be(p) swab32(ioread32(p))
+#define ioread64be(p) swab64(ioread64(p))
#define iowrite16be(v,p) iowrite16(swab16(v), (p))
#define iowrite32be(v,p) iowrite32(swab32(v), (p))
+#define iowrite64be(v,p) iowrite64(swab64(v), (p))
#define inb_p inb
#define inw_p inw
@@ -634,8 +637,6 @@ extern void outsl (unsigned long port, const void *src, unsigned long count);
*/
#define ioread64 ioread64
#define iowrite64 iowrite64
-#define ioread64be ioread64be
-#define iowrite64be iowrite64be
#define ioread8_rep ioread8_rep
#define ioread16_rep ioread16_rep
#define ioread32_rep ioread32_rep
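To show what the two macros added in the patch above compute, here is a
userspace model. The model_* names are invented for this sketch; it
assumes the underlying 64-bit accessor stores the value in little-endian
byte order (as on alpha), and it uses a plain byte array in place of a
device register.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's swab64(): reverse the byte order. */
static inline uint64_t model_swab64(uint64_t v)
{
	v = ((v & 0x00ff00ff00ff00ffULL) << 8) |
	    ((v >> 8) & 0x00ff00ff00ff00ffULL);
	v = ((v & 0x0000ffff0000ffffULL) << 16) |
	    ((v >> 16) & 0x0000ffff0000ffffULL);
	return (v << 32) | (v >> 32);
}

/* Model of a little-endian 64-bit register write: byte 0 gets the LSB. */
static inline void model_iowrite64(uint64_t v, uint8_t *reg)
{
	for (int i = 0; i < 8; i++)
		reg[i] = (uint8_t)(v >> (8 * i));
}

/* Model of a little-endian 64-bit register read. */
static inline uint64_t model_ioread64(const uint8_t *reg)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v |= (uint64_t)reg[i] << (8 * i);
	return v;
}

/* Same shape as the macros in the patch, built on the model accessors. */
#define model_ioread64be(p)	model_swab64(model_ioread64(p))
#define model_iowrite64be(v, p)	model_iowrite64(model_swab64(v), (p))
```

Writing 0x0102030405060708 through model_iowrite64be() leaves byte 0x01
at the lowest address, which is the big-endian layout a driver such as
caam expects from iowrite64be().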
^ permalink raw reply related [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-29 21:34 ` Arnd Bergmann
@ 2024-07-29 23:47 ` Linus Torvalds
2024-07-30 15:47 ` Arnd Bergmann
0 siblings, 1 reply; 59+ messages in thread
From: Linus Torvalds @ 2024-07-29 23:47 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Guenter Roeck, Peter Zijlstra, Sebastian Andrzej Siewior,
Ingo Molnar, Johannes Berg, Linux Kernel Mailing List
On Mon, 29 Jul 2024 at 14:35, Arnd Bergmann <arnd@arndb.de> wrote:
>
> I suspect we can fix the alpha issue with the trivial
> change below (haven't tested yet), but the way we are
> inconsistent about these will likely keep biting us
> unless we come up with a better way to handle them
> across architectures.
Well, looking around, the other functions (ie things like
iowrite64be_lo_hi() etc) do end up being handled by lib/iomap.c, and
parisc does seem to implement its own versions.
So this may in fact be the only such case.
Knock wood.
Your suggested patch looks ObviouslyCorrect(tm) to me. I assume I'll
get it through the normal channels after testing?
Linus
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-29 19:23 ` Linus Torvalds
2024-07-29 19:50 ` Linus Torvalds
@ 2024-07-30 7:54 ` Peter Zijlstra
2024-07-31 15:45 ` Guenter Roeck
2 siblings, 0 replies; 59+ messages in thread
From: Peter Zijlstra @ 2024-07-30 7:54 UTC (permalink / raw)
To: Linus Torvalds
Cc: Guenter Roeck, Sebastian Andrzej Siewior, Ingo Molnar,
Linux Kernel Mailing List
On Mon, Jul 29, 2024 at 12:23:01PM -0700, Linus Torvalds wrote:
> Not super-happy about how people apparently were discussing the build
> failures for a long time, and didn't even bother mentioning them in
> the pull requests. That broken commit came in through the perf-core
> pull from Ingo.
My bad, sorry. That issue seems to have completely slipped my mind :-(
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-29 23:47 ` Linus Torvalds
@ 2024-07-30 15:47 ` Arnd Bergmann
0 siblings, 0 replies; 59+ messages in thread
From: Arnd Bergmann @ 2024-07-30 15:47 UTC (permalink / raw)
To: Linus Torvalds
Cc: Guenter Roeck, Peter Zijlstra, Sebastian Andrzej Siewior,
Ingo Molnar, Johannes Berg, Linux Kernel Mailing List
On Tue, Jul 30, 2024, at 01:47, Linus Torvalds wrote:
> On Mon, 29 Jul 2024 at 14:35, Arnd Bergmann <arnd@arndb.de> wrote:
>>
>> I suspect we can fix the alpha issue with the trivial
>> change below (haven't tested yet), but the way we are
>> inconsistent about these will likely keep biting us
>> unless we come up with a better way to handle them
>> across architectures.
>
> Well, looking around, the other functions (ie things like
> iowrite64be_lo_hi() etc) do end up being handled by lib/iomap.c, and
> parisc does seem to implement its own versions.
>
> So this may in fact be the only such case.
>
> Knock wood.
>
> Your suggested patch looks ObviouslyCorrect(tm) to me. I assume I'll
> get it through the normal channels after testing?
Yes, I've sent it with a proper description to the alpha
maintainers for feedback now and queued it up in the
asm-generic tree:
https://lore.kernel.org/lkml/20240730152744.2813600-1-arnd@kernel.org/T/#u
I also sent a fix for the uretprobe syscall number mess, will
send both once we have agreed on how to do that.
Arnd
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-29 15:29 ` Linux 6.11-rc1 Guenter Roeck
2024-07-29 19:23 ` Linus Torvalds
@ 2024-07-30 17:04 ` Guenter Roeck
2024-07-30 17:20 ` Jens Axboe
2024-07-30 18:53 ` Linus Torvalds
2024-08-02 17:35 ` Linus Walleij
2 siblings, 2 replies; 59+ messages in thread
From: Guenter Roeck @ 2024-07-30 17:04 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Linux Kernel Mailing List, Jens Axboe
On Mon, Jul 29, 2024 at 08:29:20AM -0700, Guenter Roeck wrote:
> On Sun, Jul 28, 2024 at 02:40:01PM -0700, Linus Torvalds wrote:
> > The merge window felt pretty normal, and the stats all look pretty
> > normal too. I was expecting things to be quieter because of summer
> > vacations, but that (still) doesn't actually seem to have been the
> > case.
> >
> > There's 12k+ regular commits (and another 850 merge commits), so as
> > always the summary of this all is just my merge log. The diffstats are
> > also (once again) dominated by some big hardware descriptions (another
> > AMD GPU register dump accounts for ~45% of the lines in the diff, and
> > some more perf event JSON descriptor files account for another 5%).
> >
> > But if you ignore those HW dumps, the diff too looks perfectly
> > regular: drivers account for a bit over half (even when not counting
> > the AMD register description noise). The rest is roughly one third
> > architecture updates (lots of it is dts files, so I guess I could have
> > lumped that in with "more hw descriptor tables"), one third tooling
> > and documentation, and one third "core kernel" (filesystems,
> > networking, VM and kernel). Very roughly.
> >
> > If you want more details, you should get the git tree, and then narrow
> > things down based on interests.
> >
>
> Build results:
> total: 158 pass: 139 fail: 19
> Failed builds:
...
> i386:q35:pentium3:defconfig:pae:nosmp:net=ne2k_pci:initrd
This failure bisects to commit 0256994887d7 ("Merge tag
'for-6.11/block-post-20240722' of git://git.kernel.dk/linux"). I have no
idea why that would be the case, but it is easy to reproduce. Maybe it is
coincidental. Either case, copying Jens in case he has an idea.
From the crash log:
[ 3.605247] sr 2:0:0:0: Attached scsi generic sg0 type 5
[ 3.764508] sched_clock: Marking stable (3740032902, 23766486)->(3766853760, -3054372)
[ 3.768164] registered taskstats version 1
[ 3.768271] Loading compiled-in X.509 certificates
[ 3.990683] Btrfs loaded, zoned=no, fsverity=no
[ 4.005012] cryptomgr_test (68) used greatest stack depth: 6136 bytes left
[ 4.029889] traps: PANIC: double fault, error_code: 0x0
[ 4.030257] Oops: double fault: 0000 [#1] PREEMPT PTI
[ 4.030456] CPU: 0 UID: 0 PID: 70 Comm: modprobe Not tainted 6.11.0-rc1-00043-g94ede2a3e913 #1
[ 4.030523] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 4.030613] EIP: asm_exc_page_fault+0x0/0x10
[ 4.030886] Code: bf 3e c8 e9 23 06 00 00 66 90 8d 76 00 fc 6a 00 68 f0 bd 3e c8 e9 11 06 00 00 8d 76 00 fc 6a 00 68 54 c5 3e c8 e9 01 06 00 00 <8d> 76 00 fc 68 b0 e9 3e c8 e9 f3 05 00 00 66 90 8d 76 00 fc 6a 00
[ 4.030949] EAX: 028af000 EBX: ffa03fbc ECX: 00000000 EDX: 00000000
[ 4.030963] ESI: c2b51ff8 EDI: ffa04000 EBP: 42b51fb4 ESP: ffa0300c
[ 4.030980] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00000006
[ 4.031007] CR0: 80050033 CR2: ffa02ffc CR3: 08dc6000 CR4: 000006f0
[ 4.031064] Call Trace:
[ 4.031187] <#DF>
[ 4.031249] ? show_regs+0x50/0x58
[ 4.031296] ? die+0x2f/0x90
[ 4.031302] ? vprintk+0x25/0x38
[ 4.031315] ? exc_double_fault+0x6d/0x7c
[ 4.031327] ? doublefault_shim+0x10a/0x118
[ 4.031342] ? asm_exc_int3+0x10/0x10
[ 4.031353] ? asm_exc_double_fault+0xa/0x10
[ 4.031370] </#DF>
[ 4.031389] <ENTRY_TRAMPOLINE>
[ 4.031392] ? asm_exc_int3+0x10/0x10
...
[ 4.033360] ? asm_exc_int3+0x10/0x10
[ 4.033368] ? restore_all_switch_stack+0x65/0xe6
[ 4.033386] </ENTRY_TRAMPOLINE>
[ 4.033415] Modules linked in:
[ 4.033685] ---[ end trace 0000000000000000 ]---
[ 4.033741] EIP: asm_exc_page_fault+0x0/0x10
[ 4.033750] Code: bf 3e c8 e9 23 06 00 00 66 90 8d 76 00 fc 6a 00 68 f0 bd 3e c8 e9 11 06 00 00 8d 76 00 fc 6a 00 68 54 c5 3e c8 e9 01 06 00 00 <8d> 76 00 fc 68 b0 e9 3e c8 e9 f3 05 00 00 66 90 8d 76 00 fc 6a 00
[ 4.033757] EAX: 028af000 EBX: ffa03fbc ECX: 00000000 EDX: 00000000
[ 4.033762] ESI: c2b51ff8 EDI: ffa04000 EBP: 42b51fb4 ESP: ffa0300c
[ 4.033767] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00000006
[ 4.033772] CR0: 80050033 CR2: ffa02ffc CR3: 08dc6000 CR4: 000006f0
[ 4.033838] Kernel panic - not syncing: Fatal exception in interrupt
[ 4.033980] Kernel Offset: disabled
Guenter
---
Bisect log:
# bad: [8400291e289ee6b2bf9779ff1c83a291501f017b] Linux 6.11-rc1
# good: [2c9b3512402ed192d1f43f4531fb5da947e72bd0] Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
git bisect start 'v6.11-rc1' '2c9b3512402e'
# bad: [6dc2e98d5f1de162d1777aee97e59d75d70d07c5] s390: Remove protvirt and kvm config guards for uv code
git bisect bad 6dc2e98d5f1de162d1777aee97e59d75d70d07c5
# good: [30d77b7eef019fa4422980806e8b7cdc8674493e] mm/mglru: fix ineffective protection calculation
git bisect good 30d77b7eef019fa4422980806e8b7cdc8674493e
# good: [527eff227d4321c6ea453db1083bc4fdd4d3a3e8] Merge tag 'mm-nonmm-stable-2024-07-21-15-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
git bisect good 527eff227d4321c6ea453db1083bc4fdd4d3a3e8
# bad: [a362ade892e3e4de69296cddb1a23a1efe701428] Merge tag 'loongarch-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
git bisect bad a362ade892e3e4de69296cddb1a23a1efe701428
# good: [dd018c238b8489b6dd8c06f6b962ea75d79115ff] Merge tag 'bcachefs-2024-07-22' of https://evilpiepirate.org/git/bcachefs
git bisect good dd018c238b8489b6dd8c06f6b962ea75d79115ff
# good: [89ed6c9ac69ec398ccb648f5f675b43e8ca679ca] blk-cgroup: move congestion_count to struct blkcg
git bisect good 89ed6c9ac69ec398ccb648f5f675b43e8ca679ca
# good: [3892b11eac5aaaeefbf717f1953288b77759d9e2] LoongArch: Check TIF_LOAD_WATCH to enable user space watchpoint
git bisect good 3892b11eac5aaaeefbf717f1953288b77759d9e2
# bad: [0256994887d7c89c2a41d872aac67605bda8f115] Merge tag 'for-6.11/block-post-20240722' of git://git.kernel.dk/linux
git bisect bad 0256994887d7c89c2a41d872aac67605bda8f115
# good: [bf4c89fc8797f5c0964a0c3d561fbe7e8483b62f] block: don't call bio_uninit from bio_endio
git bisect good bf4c89fc8797f5c0964a0c3d561fbe7e8483b62f
# good: [85253bac4d02b1f95d6109c221aeccd7a262ec4d] block: don't free submitter owned integrity payload on I/O completion
git bisect good 85253bac4d02b1f95d6109c221aeccd7a262ec4d
# good: [74cc150282e41c6c0704cd305c9a4392dc64ef4d] block: don't free the integrity payload in bio_integrity_unmap_free_user
git bisect good 74cc150282e41c6c0704cd305c9a4392dc64ef4d
# first bad commit: [0256994887d7c89c2a41d872aac67605bda8f115] Merge tag 'for-6.11/block-post-20240722' of git://git.kernel.dk/linux
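For readers less familiar with bisection, the log above is in essence a
binary search. The sketch below is a simplified model, not how git
implements it: it assumes a linear history indexed 0..n and a failure
that, once introduced, shows up in every later commit. Real history is a
graph with merges, and a failure that depends on compiler or code layout
can break the second assumption. The names first_bad() and
bad_from_threshold() are made up for this example.

```c
#include <stddef.h>

/* Hypothetical test hook: nonzero if commit index i shows the failure. */
typedef int (*is_bad_fn)(size_t i, void *ctx);

/*
 * Given a known-good index and a known-bad index (good < bad), narrow the
 * range until the two are adjacent and return the first bad index.
 */
static inline size_t first_bad(size_t good, size_t bad,
			       is_bad_fn is_bad, void *ctx)
{
	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		if (is_bad(mid, ctx))
			bad = mid;	/* failure is at mid or earlier */
		else
			good = mid;	/* failure was introduced after mid */
	}
	return bad;
}

/* Example predicate: every commit at or after *ctx fails. */
static inline int bad_from_threshold(size_t i, void *ctx)
{
	return i >= *(size_t *)ctx;
}
```

If the predicate is not monotonic, the search still terminates but the
commit it reports need not be the real cause.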
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-30 17:04 ` Guenter Roeck
@ 2024-07-30 17:20 ` Jens Axboe
2024-07-30 18:22 ` Guenter Roeck
2024-07-30 18:53 ` Linus Torvalds
1 sibling, 1 reply; 59+ messages in thread
From: Jens Axboe @ 2024-07-30 17:20 UTC (permalink / raw)
To: Guenter Roeck, Linus Torvalds; +Cc: Linux Kernel Mailing List
On 7/30/24 11:04 AM, Guenter Roeck wrote:
> On Mon, Jul 29, 2024 at 08:29:20AM -0700, Guenter Roeck wrote:
>> Build results:
>> total: 158 pass: 139 fail: 19
>> Failed builds:
> ...
>> i386:q35:pentium3:defconfig:pae:nosmp:net=ne2k_pci:initrd
>
> This failure bisects to commit 0256994887d7 ("Merge tag
> 'for-6.11/block-post-20240722' of git://git.kernel.dk/linux"). I have no
> idea why that would be the case, but it is easy to reproduce. Maybe it is
> coincidental. Either case, copying Jens in case he has an idea.
I can take a look, but please post some details on what is actually
being run here so I can attempt to reproduce it. I looked at your
initial email too, and there's a link in there to:
https://kerneltests.org/builders
but I'm still not sure what's being run.
--
Jens Axboe
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-30 17:20 ` Jens Axboe
@ 2024-07-30 18:22 ` Guenter Roeck
2024-07-30 18:35 ` Jens Axboe
0 siblings, 1 reply; 59+ messages in thread
From: Guenter Roeck @ 2024-07-30 18:22 UTC (permalink / raw)
To: Jens Axboe, Linus Torvalds; +Cc: Linux Kernel Mailing List
On 7/30/24 10:20, Jens Axboe wrote:
> On 7/30/24 11:04 AM, Guenter Roeck wrote:
>> On Mon, Jul 29, 2024 at 08:29:20AM -0700, Guenter Roeck wrote:
>>> Build results:
>>> total: 158 pass: 139 fail: 19
>>> Failed builds:
>> ...
>>> i386:q35:pentium3:defconfig:pae:nosmp:net=ne2k_pci:initrd
>>
>> This failure bisects to commit 0256994887d7 ("Merge tag
>> 'for-6.11/block-post-20240722' of git://git.kernel.dk/linux"). I have no
>> idea why that would be the case, but it is easy to reproduce. Maybe it is
>> coincidental. Either case, copying Jens in case he has an idea.
>
> I can take a look, but please post some details on what is actually
> being run here so I can attempt to reproduce it. I looked at your
> initial email too, and there's a link in there to:
>
> https://kerneltests.org/builders
>
> but I'm still not sure what's being run.
>
Please see http://server.roeck-us.net/qemu/x86-nosmp/
Thanks,
Guenter
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-30 18:22 ` Guenter Roeck
@ 2024-07-30 18:35 ` Jens Axboe
2024-07-30 18:54 ` Jens Axboe
0 siblings, 1 reply; 59+ messages in thread
From: Jens Axboe @ 2024-07-30 18:35 UTC (permalink / raw)
To: Guenter Roeck, Linus Torvalds; +Cc: Linux Kernel Mailing List
On 7/30/24 12:22 PM, Guenter Roeck wrote:
> On 7/30/24 10:20, Jens Axboe wrote:
>> On 7/30/24 11:04 AM, Guenter Roeck wrote:
>>> On Mon, Jul 29, 2024 at 08:29:20AM -0700, Guenter Roeck wrote:
>>>> Build results:
>>>> total: 158 pass: 139 fail: 19
>>>> Failed builds:
>>> ...
>>>> i386:q35:pentium3:defconfig:pae:nosmp:net=ne2k_pci:initrd
>>>
>>> This failure bisects to commit 0256994887d7 ("Merge tag
>>> 'for-6.11/block-post-20240722' of git://git.kernel.dk/linux"). I have no
>>> idea why that would be the case, but it is easy to reproduce. Maybe it is
>>> coincidental. Either case, copying Jens in case he has an idea.
>>
>> I can take a look, but please post some details on what is actually
>> being run here so I can attempt to reproduce it. I looked at your
>> initial email too, and there's a link in there to:
>>
>> https://kerneltests.org/builders
>>
>> but I'm still not sure what's being run.
>>
>
> Please see http://server.roeck-us.net/qemu/x86-nosmp/
Works fine for me on current master, boots and runs self tests and
then shuts down. Tried it 5 times now.
axboe@r7625 ~/g/linux-vm (master)> qemu-system-i386 --version
QEMU emulator version 8.2.4 (Debian 1:8.2.4+ds-1)
Copyright (c) 2003-2023 Fabrice Bellard and the QEMU Project developers
Then tried 6.11-rc1 10 times in a loop, and also didn't see any failures.
I then switched to using gcc-11 as that seems to be what you are using,
and then it does indeed bomb during boot. Funky. I'll check the post
branch and see if it's anything from there.
--
Jens Axboe
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-30 17:04 ` Guenter Roeck
2024-07-30 17:20 ` Jens Axboe
@ 2024-07-30 18:53 ` Linus Torvalds
2024-07-30 19:22 ` Peter Zijlstra
2024-07-31 10:33 ` Peter Zijlstra
1 sibling, 2 replies; 59+ messages in thread
From: Linus Torvalds @ 2024-07-30 18:53 UTC (permalink / raw)
To: Guenter Roeck, Andy Lutomirski, Ingo Molnar, Peter Anvin
Cc: Linux Kernel Mailing List, Jens Axboe, the arch/x86 maintainers
[ Adding x86-32 entry code people, more context at the thread in:
https://lore.kernel.org/all/3f65bfad-bd04-4651-bbe3-e2b1925f1a13@kernel.dk/
for people who were dragged in late ]
On Tue, 30 Jul 2024 at 10:04, Guenter Roeck <linux@roeck-us.net> wrote:
>
> From the crash log:
The full log is more informative, at
http://server.roeck-us.net/qemu/x86-nosmp/
which has that config too.
> [ 3.605247] sr 2:0:0:0: Attached scsi generic sg0 type 5
> [ 3.764508] sched_clock: Marking stable (3740032902, 23766486)->(3766853760, -3054372)
> [ 3.768164] registered taskstats version 1
> [ 3.768271] Loading compiled-in X.509 certificates
> [ 3.990683] Btrfs loaded, zoned=no, fsverity=no
> [ 4.005012] cryptomgr_test (68) used greatest stack depth: 6136 bytes left
> [ 4.029889] traps: PANIC: double fault, error_code: 0x0
Double faults are bad bad juju. Nasty to debug, because it means
something went wrong at a horribly bad time.
> [ 4.030613] EIP: asm_exc_page_fault+0x0/0x10
Sadly, this mainly says that taking a page fault was part of the
horribly bad time.
> [ 4.031389] <ENTRY_TRAMPOLINE>
> [ 4.031392] ? asm_exc_int3+0x10/0x10
> ...
> [ 4.033360] ? asm_exc_int3+0x10/0x10
> [ 4.033368] ? restore_all_switch_stack+0x65/0xe6
> [ 4.033386] </ENTRY_TRAMPOLINE>
Yeah "restore_all_switch_stack" is also part of "horribly bad time".
And from the full log, I see that the "..." is a *lot* of asm_exc_int3+0x10.
Which makes me think it's asm_exc_int3 just recursively failing.
Which will cause a stack overflow, and then - after a time - a double fault.
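One quick sanity check of the stack overflow reading uses the ESP and
CR2 values from the register dump in the crash log (ESP: ffa0300c, CR2:
ffa02ffc). The helper below only tests whether the faulting address lies
in the page directly below the page holding the stack pointer; whether
that lower page is an unmapped guard page is an assumption, not
something the log proves. The function and macro names are made up for
this sketch.

```c
#include <stdint.h>

#define MODEL_PAGE_SIZE 0x1000u	/* 4 KiB pages, as on i386 */

/* Round an address down to the start of its page. */
static inline uint32_t model_page_base(uint32_t addr)
{
	return addr & ~(MODEL_PAGE_SIZE - 1);
}

/*
 * Nonzero if the faulting address (cr2) is in the page immediately below
 * the page that contains the stack pointer (esp).
 */
static inline int fault_just_below_stack_page(uint32_t esp, uint32_t cr2)
{
	return model_page_base(cr2) + MODEL_PAGE_SIZE == model_page_base(esp);
}
```

With the values from the log the check holds: the stack pointer is 12
bytes above a page boundary and the fault address is 4 bytes below it,
which is what running off the bottom of a stack page would look like.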
[ Time passes, I build the i386 kernel image with your config just to
get an image that looks like yours ]
Hmm. I think the stack dump output confused me. Because
"asm_exc_int3+0x10/0x10" doesn't end up making much sense, but it
turns out that "asm_exc_int3+0x10" is actually the same as
'asm_exc_page_fault'.
So it smells like we're taking a page fault, but somehow the page
fault text address has been unmapped, so taking a page fault causes a
page fault and then we end up finally in that same "no more stack,
double fault" situation.
Either page table corruption, or some issue with the page table mitigation.
The fact that it started happening with the block merge may be because
the block code causes some major corruption, or may just be random bad
luck and it just changed some alignment somewhere, and exposed a
hidden but pre-existing issue.
Jens separately said that he can see it with gcc-11, but not his
regular compiler, so regardless it seems to be compiler-dependent.
Let's see if x86 people have some idea, but that
restore_all_switch_stack+0x65/0xe6
and doing an objdump to see the code generation, it is literally here:
0f 20 d8 mov %cr3,%eax
0d 00 10 00 00 or $0x1000,%eax
0f 22 d8 mov %eax,%cr3
eb 16 jmp <restore_all_switch_stack+0x7d>
with that "jmp" instruction being the restore_all_switch_stack+0x65 address.
So the infinite page faults seem to literally happen right after the
"mov %eax,%cr3".
Definitely something wrong with the page tables. But where that
wrongness comes from, I have no idea.
Linus
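The three instructions in the objdump excerpt above read CR3, set bit 12
and write it back. A small C model of that arithmetic is below. The
model_* names are invented for this sketch; the interpretation that the
resulting value selects a second set of page tables located one page
above the first is an assumption based on how page table isolation is
generally described, and is not taken from the log.

```c
#include <stdint.h>

/* Bit 12, i.e. one 4 KiB page; mirrors the "or $0x1000,%eax" above. */
#define MODEL_CR3_SWITCH_MASK 0x1000u

/* Value that would be written back by "mov %eax,%cr3". */
static inline uint32_t model_switched_cr3(uint32_t cr3)
{
	return cr3 | MODEL_CR3_SWITCH_MASK;
}

/* Nonzero if the given CR3 value already has the switch bit set. */
static inline int model_cr3_is_switched(uint32_t cr3)
{
	return (cr3 & MODEL_CR3_SWITCH_MASK) != 0;
}
```

For example, the CR3 value shown in the crash log, 08dc6000, has the bit
clear, and the sequence would turn it into 08dc7000.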
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-30 18:35 ` Jens Axboe
@ 2024-07-30 18:54 ` Jens Axboe
0 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2024-07-30 18:54 UTC (permalink / raw)
To: Guenter Roeck, Linus Torvalds; +Cc: Linux Kernel Mailing List
On 7/30/24 12:35 PM, Jens Axboe wrote:
> On 7/30/24 12:22 PM, Guenter Roeck wrote:
>> On 7/30/24 10:20, Jens Axboe wrote:
>>> On 7/30/24 11:04 AM, Guenter Roeck wrote:
>>>> On Mon, Jul 29, 2024 at 08:29:20AM -0700, Guenter Roeck wrote:
>>>>> Build results:
>>>>> total: 158 pass: 139 fail: 19
>>>>> Failed builds:
>>>> ...
>>>>> i386:q35:pentium3:defconfig:pae:nosmp:net=ne2k_pci:initrd
>>>>
>>>> This failure bisects to commit 0256994887d7 ("Merge tag
>>>> 'for-6.11/block-post-20240722' of git://git.kernel.dk/linux"). I have no
>>>> idea why that would be the case, but it is easy to reproduce. Maybe it is
>>>> coincidental. Either case, copying Jens in case he has an idea.
>>>
>>> I can take a look, but please post some details on what is actually
>>> being run here so I can attempt to reproduce it. I looked at your
>>> initial email too, and there's a link in there to:
>>>
>>> https://kerneltests.org/builders
>>>
>>> but I'm still not sure what's being run.
>>>
>>
>> Please see http://server.roeck-us.net/qemu/x86-nosmp/
>
> Works fine for me on current master, boots and runs self tests and
> then shuts down. Tried it 5 times now.
>
> axboe@r7625 ~/g/linux-vm (master)> qemu-system-i386 --version
> QEMU emulator version 8.2.4 (Debian 1:8.2.4+ds-1)
> Copyright (c) 2003-2023 Fabrice Bellard and the QEMU Project developers
>
> Then tried 6.11-rc1 10 times in a loop, and also didn't see any failures.
>
> I then switched to using gcc-11 as that seems to be what you are using,
> and then it does indeed bomb during boot. Funky. I'll check the post
> branch and see if it's anything from there.
I can fully revert that for-6.11/block-post merge and it still crashes
in the same way for me. So I don't believe that's the culprit. It
consistently crashes with a double fault when starting cryptomgr, so
that may be a clue.
FWIW, if I disable KFENCE, then it boots just fine with gcc-11. Or if I
use gcc 13 or 14 it works just fine regardless of whether KFENCE is set
or not.
--
Jens Axboe
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-30 18:53 ` Linus Torvalds
@ 2024-07-30 19:22 ` Peter Zijlstra
2024-07-30 19:31 ` Jens Axboe
2024-07-31 10:33 ` Peter Zijlstra
1 sibling, 1 reply; 59+ messages in thread
From: Peter Zijlstra @ 2024-07-30 19:22 UTC (permalink / raw)
To: Linus Torvalds
Cc: Guenter Roeck, Andy Lutomirski, Ingo Molnar, Peter Anvin,
Linux Kernel Mailing List, Jens Axboe, the arch/x86 maintainers
On Tue, Jul 30, 2024 at 11:53:31AM -0700, Linus Torvalds wrote:
> Which makes me think it's asm_exc_int3 just recursively failing.
Sounds like text_poke() going sideways, there's a jump_label fail out
there:
https://lkml.kernel.org/r/20240730132626.GV26599@noisy.programming.kicks-ass.net
> Let's see if x86 people have some idea, but that
>
> restore_all_switch_stack+0x65/0xe6
>
> and doing an objdump to see the code generation, it is literally here:
>
> 0f 20 d8 mov %cr3,%eax
> 0d 00 10 00 00 or $0x1000,%eax
> 0f 22 d8 mov %eax,%cr3
That looks like SWITCH_TO_USER_CR3
> eb 16 jmp <restore_all_switch_stack+0x7d>
>
> with that "jmp" instruction being the restore_all_switch_stack+0x65 address.
This would make this BUG_IF_WRONG_CR3, which starts with an ALTERNATIVE
jmp. I think we landed a pile of ALTERNATIVE patches this merge window.
That said, Boris did spend an awful lot of time testing them... but this
is 32bit so who knows how much time that got.
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-30 19:22 ` Peter Zijlstra
@ 2024-07-30 19:31 ` Jens Axboe
2024-07-30 19:34 ` Jens Axboe
2024-07-30 19:38 ` Peter Zijlstra
0 siblings, 2 replies; 59+ messages in thread
From: Jens Axboe @ 2024-07-30 19:31 UTC (permalink / raw)
To: Peter Zijlstra, Linus Torvalds
Cc: Guenter Roeck, Andy Lutomirski, Ingo Molnar, Peter Anvin,
Linux Kernel Mailing List, the arch/x86 maintainers
On 7/30/24 1:22 PM, Peter Zijlstra wrote:
> On Tue, Jul 30, 2024 at 11:53:31AM -0700, Linus Torvalds wrote:
>
>> Which makes me think it's asm_exc_int3 just recursively failing.
>
> Sounds like text_poke() going sideways, there's a jump_label fail out
> there:
>
> https://lkml.kernel.org/r/20240730132626.GV26599@noisy.programming.kicks-ass.net
No change with this applied...
Also not sure if you read my link, but a few things to note:
- It only happens with gcc-11 here. I tried 12/13/14 and those
are fine, don't have anything older
- It only happens with KFENCE enabled.
>> Let's see if x86 people have some idea, but that
>>
>> restore_all_switch_stack+0x65/0xe6
>>
>> and doing an objdump to see the code generation, it is literally here:
>>
>> 0f 20 d8 mov %cr3,%eax
>> 0d 00 10 00 00 or $0x1000,%eax
>> 0f 22 d8 mov %eax,%cr3
>
> That looks like SWITCH_TO_USER_CR3
>
>> eb 16 jmp <restore_all_switch_stack+0x7d>
>>
>> with that "jmp" instruction being the restore_all_switch_stack+0x65 address.
>
> This would make this BUG_IF_WRONG_CR3, which starts with an ALTERNATIVE
> jmp. I think we landed a pile of ALTERNATIVE patches this merge window.
>
> That said, Boris did spend an awful lot of time testing them... but this
> is 32bit so who knows how much time that got.
Since I got this setup with Guenter's setup, it literally takes me seconds
to compile and test anything. So feel free to toss anything at it and we'll
see what sticks.
--
Jens Axboe
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-30 19:31 ` Jens Axboe
@ 2024-07-30 19:34 ` Jens Axboe
2024-07-30 19:38 ` Peter Zijlstra
1 sibling, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2024-07-30 19:34 UTC (permalink / raw)
To: Peter Zijlstra, Linus Torvalds
Cc: Guenter Roeck, Andy Lutomirski, Ingo Molnar, Peter Anvin,
Linux Kernel Mailing List, the arch/x86 maintainers
On 7/30/24 1:31 PM, Jens Axboe wrote:
>> This would make this BUG_IF_WRONG_CR3, which starts with an ALTERNATIVE
>> jmp. I think we landed a pile of ALTERNATIVE patches this merge window.
>>
>> That said, Boris did spend an awful lot of time testing them... but this
>> is 32bit so who knows how much time that got.
>
> Since I got this setup with Guenter's setup, it literally takes me seconds
> to compile and test anything. So feel free to toss anything at it and we'll
> see what sticks.
I reverted all the alternative changes, still crashes in the same way.
This is range 1467b49869df..208c6772d38392 fwiw.
--
Jens Axboe
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-30 19:31 ` Jens Axboe
2024-07-30 19:34 ` Jens Axboe
@ 2024-07-30 19:38 ` Peter Zijlstra
2024-07-30 19:41 ` Linus Torvalds
` (2 more replies)
1 sibling, 3 replies; 59+ messages in thread
From: Peter Zijlstra @ 2024-07-30 19:38 UTC (permalink / raw)
To: Jens Axboe
Cc: Linus Torvalds, Guenter Roeck, Andy Lutomirski, Ingo Molnar,
Peter Anvin, Linux Kernel Mailing List, the arch/x86 maintainers
On Tue, Jul 30, 2024 at 01:31:18PM -0600, Jens Axboe wrote:
> On 7/30/24 1:22 PM, Peter Zijlstra wrote:
> > On Tue, Jul 30, 2024 at 11:53:31AM -0700, Linus Torvalds wrote:
> >
> >> Which makes me think it's asm_exc_int3 just recursively failing.
> >
> > Sounds like text_poke() going sideways, there's a jump_label fail out
> > there:
> >
> > https://lkml.kernel.org/r/20240730132626.GV26599@noisy.programming.kicks-ass.net
>
> No change with this applied...
>
> Also not sure if you read my link, but a few things to note:
>
> - It only happens with gcc-11 here. I tried 12/13/14 and those
> are fine, don't have anything older
One of my test boxes has 4.4 4.6 4.8 4.9 5 6 8 9 10 11 12 13
(now I gotta go figure out wth 7 went :-) And yeah, we don't support
most of those versions anymore (phew).
So if it's easy to set up, I could try older GCCs.
> - It only happens with KFENCE enabled.
I missed the KFENCE bit. Happen to have the .config handy? I couldn't
make much sense of Guenter's website in a hurry.
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-30 19:38 ` Peter Zijlstra
@ 2024-07-30 19:41 ` Linus Torvalds
2024-07-30 20:04 ` Guenter Roeck
2024-07-30 20:24 ` Guenter Roeck
2 siblings, 0 replies; 59+ messages in thread
From: Linus Torvalds @ 2024-07-30 19:41 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Jens Axboe, Guenter Roeck, Andy Lutomirski, Ingo Molnar,
Peter Anvin, Linux Kernel Mailing List, the arch/x86 maintainers
On Tue, 30 Jul 2024 at 12:38, Peter Zijlstra <peterz@infradead.org> wrote:
>
>
> I missed the KFENCE bit. Happen to have the .config handy? I couldn't
> make much sense of Guenter's website in a hurry.
This is what you want to use:
http://server.roeck-us.net/qemu/x86-nosmp/
It has that kernel config in there, along with the oops etc.
Linus
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-30 19:38 ` Peter Zijlstra
2024-07-30 19:41 ` Linus Torvalds
@ 2024-07-30 20:04 ` Guenter Roeck
2024-07-30 20:09 ` Peter Zijlstra
2024-07-30 20:13 ` Linus Torvalds
2024-07-30 20:24 ` Guenter Roeck
2 siblings, 2 replies; 59+ messages in thread
From: Guenter Roeck @ 2024-07-30 20:04 UTC (permalink / raw)
To: Peter Zijlstra, Jens Axboe
Cc: Linus Torvalds, Andy Lutomirski, Ingo Molnar, Peter Anvin,
Linux Kernel Mailing List, the arch/x86 maintainers
On 7/30/24 12:38, Peter Zijlstra wrote:
> On Tue, Jul 30, 2024 at 01:31:18PM -0600, Jens Axboe wrote:
>> On 7/30/24 1:22 PM, Peter Zijlstra wrote:
>>> On Tue, Jul 30, 2024 at 11:53:31AM -0700, Linus Torvalds wrote:
>>>
>>>> Which makes me think it's asm_exc_int3 just recursively failing.
>>>
>>> Sounds like text_poke() going sideways, there's a jump_label fail out
>>> there:
>>>
>>> https://lkml.kernel.org/r/20240730132626.GV26599@noisy.programming.kicks-ass.net
>>
>> No change with this applied...
>>
>> Also not sure if you read my link, but a few things to note:
>>
>> - It only happens with gcc-11 here. I tried 12/13/14 and those
>> are fine, don't have anything older
>
> One of my test boxes has 4.4 4.6 4.8 4.9 5 6 8 9 10 11 12 13
>
> (now I gotta go figure out wth 7 went :-) And yeah, we don't support
> most of those versions anymore (phew).
>
> So if it's easy to set up, I could try older GCCs.
>
WFM with gcc 9.4, 10.3, 12.4, and 13.3. gcc 11.4 and 11.5 both fail.
Maybe I should just switch to a more recent version of gcc and call it a day,
in the hope that it is a compiler (or qemu) problem and doesn't just hide
the problem.
Thoughts ?
Guenter
>> - It only happens with KFENCE enabled.
>
> I missed the KFENCE bit. Happen to have the .config handy? I couldn't
> make much sense of Guenter's website in a hurry.
>
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-30 20:04 ` Guenter Roeck
@ 2024-07-30 20:09 ` Peter Zijlstra
2024-07-30 21:12 ` Peter Zijlstra
2024-07-30 23:29 ` Guenter Roeck
2024-07-30 20:13 ` Linus Torvalds
1 sibling, 2 replies; 59+ messages in thread
From: Peter Zijlstra @ 2024-07-30 20:09 UTC (permalink / raw)
To: Guenter Roeck
Cc: Jens Axboe, Linus Torvalds, Andy Lutomirski, Ingo Molnar,
Peter Anvin, Linux Kernel Mailing List, the arch/x86 maintainers
On Tue, Jul 30, 2024 at 01:04:49PM -0700, Guenter Roeck wrote:
> On 7/30/24 12:38, Peter Zijlstra wrote:
> > On Tue, Jul 30, 2024 at 01:31:18PM -0600, Jens Axboe wrote:
> > > On 7/30/24 1:22 PM, Peter Zijlstra wrote:
> > > > On Tue, Jul 30, 2024 at 11:53:31AM -0700, Linus Torvalds wrote:
> > > >
> > > > > Which makes me think it's asm_exc_int3 just recursively failing.
> > > >
> > > > Sounds like text_poke() going sideways, there's a jump_label fail out
> > > > there:
> > > >
> > > > https://lkml.kernel.org/r/20240730132626.GV26599@noisy.programming.kicks-ass.net
> > >
> > > No change with this applied...
> > >
> > > Also not sure if you read my link, but a few things to note:
> > >
> > > - It only happens with gcc-11 here. I tried 12/13/14 and those
> > > are fine, don't have anything older
> >
> > One of my test boxes has 4.4 4.6 4.8 4.9 5 6 8 9 10 11 12 13
> >
> > (now I gotta go figure out wth 7 went :-) And yeah, we don't support
> > most of those versions anymore (phew).
> >
> > So if it's easy to set up, I could try older GCCs.
> >
>
> WFM with gcc 9.4, 10.3, 12.4, and 13.3. gcc 11.4 and 11.5 both fail.
10.5 and 13.2 worked for me, and I can confirm 11.4 makes it go boom.
> Maybe I should just switch to a more recent version of gcc and call it a day,
> in the hope that it is a compiler (or qemu) problem and doesn't just hide
> the problem.
>
> Thoughts ?
Tempting, but I think it would be good to figure out what in GCC-11
makes it sad, gcc-11 is still well within the supported range of GCCs
afaik.
Let's see if it's something that wants to be bisected.
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-30 20:04 ` Guenter Roeck
2024-07-30 20:09 ` Peter Zijlstra
@ 2024-07-30 20:13 ` Linus Torvalds
1 sibling, 0 replies; 59+ messages in thread
From: Linus Torvalds @ 2024-07-30 20:13 UTC (permalink / raw)
To: Guenter Roeck
Cc: Peter Zijlstra, Jens Axboe, Andy Lutomirski, Ingo Molnar,
Peter Anvin, Linux Kernel Mailing List, the arch/x86 maintainers
On Tue, 30 Jul 2024 at 13:04, Guenter Roeck <linux@roeck-us.net> wrote:
>
> Maybe I should just switch to a more recent version of gcc and call it a day,
> in the hope that it is a compiler (or qemu) problem and doesn't just hide
> the problem.
Well, if it's a gcc-11 problem, I think we still really want to know
what is going on. We are *not* all that close to dropping support for
gcc-11 yet.
And honestly, while it's often very convenient to blame the compiler,
compiler bugs are still very rare.
It's *much* more common that bad code just happens to work with a good
compiler than that good code happens to break with a bad compiler.
Yes, we obviously do hit real compiler bugs, but still ... We'd need
to actually see what goes wrong in the code generation before blaming
a compiler bug.
Linus
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-30 19:38 ` Peter Zijlstra
2024-07-30 19:41 ` Linus Torvalds
2024-07-30 20:04 ` Guenter Roeck
@ 2024-07-30 20:24 ` Guenter Roeck
2024-07-31 12:20 ` Peter Zijlstra
2 siblings, 1 reply; 59+ messages in thread
From: Guenter Roeck @ 2024-07-30 20:24 UTC (permalink / raw)
To: Peter Zijlstra, Jens Axboe
Cc: Linus Torvalds, Andy Lutomirski, Ingo Molnar, Peter Anvin,
Linux Kernel Mailing List, the arch/x86 maintainers
On 7/30/24 12:38, Peter Zijlstra wrote:
> On Tue, Jul 30, 2024 at 01:31:18PM -0600, Jens Axboe wrote:
>> On 7/30/24 1:22 PM, Peter Zijlstra wrote:
>>> On Tue, Jul 30, 2024 at 11:53:31AM -0700, Linus Torvalds wrote:
>>>
>>>> Which makes me think it's asm_exc_int3 just recursively failing.
>>>
>>> Sounds like text_poke() going sideways, there's a jump_label fail out
>>> there:
>>>
>>> https://lkml.kernel.org/r/20240730132626.GV26599@noisy.programming.kicks-ass.net
>>
>> No change with this applied...
>>
>> Also not sure if you read my link, but a few things to note:
>>
>> - It only happens with gcc-11 here. I tried 12/13/14 and those
>> are fine, don't have anything older
>
> One of my test boxes has 4.4 4.6 4.8 4.9 5 6 8 9 10 11 12 13
>
> (now I gotta go figure out wth 7 went :-) And yeah, we don't support
> most of those versions anymore (phew).
>
> So if it's easy to set up, I could try older GCCs.
>
>> - It only happens with KFENCE enabled.
>
> I missed the KFENCE bit. Happen to have the .config handy? I couldn't
> make much sense of Guenter's website in a hurry.
>
An interesting bit of information: The problem is seen with many,
but not all CPUs. For example, I don't see it with athlon, n270, Dhyana,
or EPYC. qemu32 is affected, but qemu64 is fine. But on the other side
both kvm32 and kvm64 are affected.
Guenter
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-30 20:09 ` Peter Zijlstra
@ 2024-07-30 21:12 ` Peter Zijlstra
2024-07-30 23:29 ` Guenter Roeck
1 sibling, 0 replies; 59+ messages in thread
From: Peter Zijlstra @ 2024-07-30 21:12 UTC (permalink / raw)
To: Guenter Roeck
Cc: Jens Axboe, Linus Torvalds, Andy Lutomirski, Ingo Molnar,
Peter Anvin, Linux Kernel Mailing List, the arch/x86 maintainers
On Tue, Jul 30, 2024 at 10:09:47PM +0200, Peter Zijlstra wrote:
> Let's see if it's something that wants to be bisected.
Complete failure.. something along the way must've changed a critical
CONFIG symbol. The .config I ended up with at v6.11-rc1 no longer
reproduced the problem.
I'll try again tomorrow if nobody beats me to it.
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-30 20:09 ` Peter Zijlstra
2024-07-30 21:12 ` Peter Zijlstra
@ 2024-07-30 23:29 ` Guenter Roeck
2024-07-30 23:54 ` Linus Torvalds
1 sibling, 1 reply; 59+ messages in thread
From: Guenter Roeck @ 2024-07-30 23:29 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Jens Axboe, Linus Torvalds, Andy Lutomirski, Ingo Molnar,
Peter Anvin, Linux Kernel Mailing List, the arch/x86 maintainers
On 7/30/24 13:09, Peter Zijlstra wrote:
> On Tue, Jul 30, 2024 at 01:04:49PM -0700, Guenter Roeck wrote:
>> On 7/30/24 12:38, Peter Zijlstra wrote:
>>> On Tue, Jul 30, 2024 at 01:31:18PM -0600, Jens Axboe wrote:
>>>> On 7/30/24 1:22 PM, Peter Zijlstra wrote:
>>>>> On Tue, Jul 30, 2024 at 11:53:31AM -0700, Linus Torvalds wrote:
>>>>>
>>>>>> Which makes me think it's asm_exc_int3 just recursively failing.
>>>>>
>>>>> Sounds like text_poke() going sideways, there's a jump_label fail out
>>>>> there:
>>>>>
>>>>> https://lkml.kernel.org/r/20240730132626.GV26599@noisy.programming.kicks-ass.net
>>>>
>>>> No change with this applied...
>>>>
>>>> Also not sure if you read my link, but a few things to note:
>>>>
>>>> - It only happens with gcc-11 here. I tried 12/13/14 and those
>>>> are fine, don't have anything older
>>>
>>> One of my test boxes has 4.4 4.6 4.8 4.9 5 6 8 9 10 11 12 13
>>>
>>> (now I gotta go figure out wth 7 went :-) And yeah, we don't support
>>> most of those versions anymore (phew).
>>>
>>> So if it's easy to set up, I could try older GCCs.
>>>
>>
>> WFM with gcc 9.4, 10.3, 12.4, and 13.3. gcc 11.4 and 11.5 both fail.
>
> 10.5 and 13.2 worked for me, and I can confirm 11.4 makes it go boom.
>
>> Maybe I should just switch to a more recent version of gcc and call it a day,
>> in the hope that it is a compiler (or qemu) problem and doesn't just hide
>> the problem.
>>
>> Thoughts ?
>
> Tempting, but I think it would be good to figure out what in GCC-11
> makes it sad, gcc-11 is still well within the supported range of GCCs
> afaik.
>
> Let's see if it's something that wants to be bisected.
I tried bisecting several ways, but it always ends up at commit 0256994887d7
("Merge tag 'for-6.11/block-post-20240722' of git://git.kernel.dk/linux").
Manual build confirmed that 0256994887d7 fails but 0256994887d7~1,
which is commit dd018c238b84 ("Merge tag 'bcachefs-2024-07-22' of
https://evilpiepirate.org/git/bcachefs") is fine, at least for me.
I then rebased 'for-6.11/block-post-20240722' on top of
dd018c238b84 and tried again. Result is below.
However, reverting this patch as well as the subsequent patches does not
fix the problem, and reverting the entire merge from the mainline kernel
doesn't fix it either.
The next step was to bisect starting from 0256994887d7, reverting the block merges
at each step. That points to the io_uring merge (second set of bisect results).
However, reverting that merge doesn't help, and neither does reverting both
the block and the io_uring merges.
On the other side, reverting nothing but enabling CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=y
makes the problem disappear. But that doesn't really help, especially since reverting
the patches touching CONFIG_CRYPTO_MANAGER_DISABLE_TESTS does _not_ help.
Baffled. Is it possible that the crashing code catches some page boundary ?
Guenter
---
# bad: [a9dd34ab77277f0fb7fa41a3edb8f0a71f7d791f] block: don't free the integrity payload in bio_integrity_unmap_free_user
# good: [dd018c238b8489b6dd8c06f6b962ea75d79115ff] Merge tag 'bcachefs-2024-07-22' of https://evilpiepirate.org/git/bcachefs
git bisect start 'HEAD' 'dd018c238b84'
# bad: [113799f9042573ba197de7a78a1e450cb40573ac] block: don't call bio_uninit from bio_endio
git bisect bad 113799f9042573ba197de7a78a1e450cb40573ac
# good: [473252aab8bf1a86e4266cb65f7baac1c10a70d9] block: also return bio_integrity_payload * from stubs
git bisect good 473252aab8bf1a86e4266cb65f7baac1c10a70d9
# first bad commit: [113799f9042573ba197de7a78a1e450cb40573ac] block: don't call bio_uninit from bio_endio
---
# bad: [8400291e289ee6b2bf9779ff1c83a291501f017b] Linux 6.11-rc1
# good: [0256994887d7c89c2a41d872aac67605bda8f115] Merge tag 'for-6.11/block-post-20240722' of git://git.kernel.dk/linux
git bisect start 'v6.11-rc1' '0256994887d7'
# good: [b2eed73360dffea91ea64e8f19330c950dd42ebb] Merge tag 'linux-watchdog-6.11-rc1' of git://www.linux-watchdog.org/g
git bisect good b2eed73360dffea91ea64e8f19330c950dd42ebb
# good: [0ba9b1551185a8b42003b708b6a9c25a9808701e] Merge tag 'drm-next-2024-07-26' of https://gitlab.freedesktop.org/drl
git bisect good 0ba9b1551185a8b42003b708b6a9c25a9808701e
# good: [8e333791d4605dbce611c22f71a86721c9afc336] Merge tag 'gpio-fixes-for-v6.11-rc1' of git://git.kernel.org/pub/scmx
git bisect good 8e333791d4605dbce611c22f71a86721c9afc336
# bad: [5437f30d3458ad36e83ab96088d490ebfee844d8] Merge tag '6.11-rc-smb-client-fixes-part2' of git://git.samba.org/sfr6
git bisect bad 5437f30d3458ad36e83ab96088d490ebfee844d8
# good: [910bfc26d16d07df5a2bfcbc63f0aa9d1397e2ef] Merge tag 'rust-6.11' of https://github.com/Rust-for-Linux/linux
git bisect good 910bfc26d16d07df5a2bfcbc63f0aa9d1397e2ef
# bad: [8c9307474333d8d100870b45af00bfeb1872c836] Merge tag 'io_uring-6.11-20240726' of git://git.kernel.dk/linux
git bisect bad 8c9307474333d8d100870b45af00bfeb1872c836
# good: [29d63b94036e561a016ec8878b44aad6650d23e2] io_uring: align iowq and task request error handling
git bisect good 29d63b94036e561a016ec8878b44aad6650d23e2
# good: [358169617602f6f71b31e5c9532a09b95a34b043] io_uring/napi: pass ktime to io_napi_adjust_timeout
git bisect good 358169617602f6f71b31e5c9532a09b95a34b043
# good: [ef9ca17ca458ac7253ae71b552e601e49311fc48] hostfs: fix the host directory parse when mounting.
git bisect good ef9ca17ca458ac7253ae71b552e601e49311fc48
# good: [bc4eee85ca6ce5335efe314215841712b5531449] Merge tag 'vfs-6.11-rc1.fixes.3' of git://git.kernel.org/pub/scm/lins
git bisect good bc4eee85ca6ce5335efe314215841712b5531449
# first bad commit: [8c9307474333d8d100870b45af00bfeb1872c836] Merge tag 'io_uring-6.11-20240726' of git://git.kernel.dx
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-30 23:29 ` Guenter Roeck
@ 2024-07-30 23:54 ` Linus Torvalds
2024-07-31 8:21 ` Borislav Petkov
2024-07-31 13:24 ` Jens Axboe
0 siblings, 2 replies; 59+ messages in thread
From: Linus Torvalds @ 2024-07-30 23:54 UTC (permalink / raw)
To: Guenter Roeck
Cc: Peter Zijlstra, Jens Axboe, Andy Lutomirski, Ingo Molnar,
Peter Anvin, Linux Kernel Mailing List, the arch/x86 maintainers
On Tue, 30 Jul 2024 at 16:29, Guenter Roeck <linux@roeck-us.net> wrote:
>
> Baffled. Is it possible that the crashing code catches some page boundary ?
We've definitely seen things like that before. Some alignment change
makes something cross a cacheline or page boundary, and it magically
causes a huge regression.
Usually it's about performance, though, not this kind of thing.
But I could imagine that some odd instruction rewriting thing goes
wrong only when the instruction crosses a page boundary, and that
we've never happened to hit that case, and then some kernel config
just moves the affected code around just enough.
That would then indirectly also explain why only some compiler
versions hit it - because it all depends on hitting that exact page
crosser.
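For illustration, the page-crossing condition described above can be
checked with a few lines of C. This is a user-space sketch, not kernel
code; the 4 KiB page size and the helper name crosses_page() are
assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumes 4 KiB pages */

/*
 * Hypothetical helper: true if the len bytes starting at addr span
 * more than one page. len must be non-zero.
 */
static bool crosses_page(uint32_t addr, uint32_t len)
{
	uint32_t first = addr >> SKETCH_PAGE_SHIFT;
	uint32_t last = (addr + len - 1) >> SKETCH_PAGE_SHIFT;

	return first != last;
}
```

A 3-byte instruction at an address ending in 0xffe would cross into the
next page under this check, while the same instruction at an address
ending in 0x350 would not.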
You also seemed to say that it only happened with some CPU selections.
Maybe there's something wrong with the ALTERNATIVE() cleanups - I'm
looking at that new "nested alternatives macros" thing, and the odd
games we play with the origin and replacement lengths etc.
That all looks entirely crazy. That file was hard to read before, now
it's just incomprehensible to me.
Linus
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-30 23:54 ` Linus Torvalds
@ 2024-07-31 8:21 ` Borislav Petkov
2024-07-31 9:11 ` Peter Zijlstra
2024-07-31 14:37 ` Guenter Roeck
2024-07-31 13:24 ` Jens Axboe
1 sibling, 2 replies; 59+ messages in thread
From: Borislav Petkov @ 2024-07-31 8:21 UTC (permalink / raw)
To: Linus Torvalds
Cc: Guenter Roeck, Peter Zijlstra, Jens Axboe, Andy Lutomirski,
Ingo Molnar, Peter Anvin, Linux Kernel Mailing List,
the arch/x86 maintainers
On Tue, Jul 30, 2024 at 04:54:43PM -0700, Linus Torvalds wrote:
> You also seemed to say that it only happened with some CPU selections.
> Maybe there's something wrong with the ALTERNATIVE() cleanups - I'm
> looking at that new "nested alternatives macros" thing, and the odd
> games we play with the origin and replacement lengths etc.
>
> That all looks entirely crazy. That file was hard to read before, now
> it's just incomprehensible to me.
I'm sorry to hear that. The reason we did it is because it was starting to
become really unwieldy to add yet another alternative choice N in an
ALTERNATIVE_N call...
Anyway, I'll try to reproduce here. In the meantime, can anyone who can
reproduce - Guenter, Jens - boot that failing kernel with
debug-alternative=-1
and copy dmesg and vmlinux somewhere for me?
It is a lot of output so make sure to catch it all.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-31 8:21 ` Borislav Petkov
@ 2024-07-31 9:11 ` Peter Zijlstra
2024-07-31 10:02 ` Borislav Petkov
2024-07-31 14:37 ` Guenter Roeck
1 sibling, 1 reply; 59+ messages in thread
From: Peter Zijlstra @ 2024-07-31 9:11 UTC (permalink / raw)
To: Borislav Petkov
Cc: Linus Torvalds, Guenter Roeck, Jens Axboe, Andy Lutomirski,
Ingo Molnar, Peter Anvin, Linux Kernel Mailing List,
the arch/x86 maintainers
On Wed, Jul 31, 2024 at 10:21:11AM +0200, Borislav Petkov wrote:
> On Tue, Jul 30, 2024 at 04:54:43PM -0700, Linus Torvalds wrote:
> > You also seemed to say that it only happened with some CPU selections.
> > Maybe there's something wrong with the ALTERNATIVE() cleanups - I'm
> > looking at that new "nested alternatives macros" thing, and the odd
> > games we play with the origin and replacement lengths etc.
> >
> > That all looks entirely crazy. That file was hard to read before, now
> > it's just incomprehensible to me.
>
> I'm sorry to hear that. The reason we did it is because it was starting to
> become really unwieldy to add yet another alternative choice N in an
> ALTERNATIVE_N call...
>
> Anyway, I'll try to reproduce here. In the meantime, can anyone who can
> reproduce - Guenter, Jens - boot that failing kernel with
>
> debug-alternative=-1
>
> and copy dmesg and vmlinux somewhere for me?
>
> It is a lot of output so make sure to catch it all.
So what I did instead is add nokaslr to CMDLINE and -S -s to qemu and
am staring at the failing kernel in gdb.
So far all the alternatives in the affected paths look just fine.
Not that any of it is making sense, notably:
Code: bf 1e c2 e9 23 06 00 00 66 90 8d 76 00 fc 6a 00 68 f0 bd 1e c2 e9 11 06 00 00 8d 76 00 fc 6a 00 68 54 c5 1e c2 e9 01 06 00 00 <8d> 76 00 fc 68 b0 e9 1e c2 e9 f3 05 00 00 66 90 8d 76 00 fc 6a 00
decodes to:
0: bf 1e c2 e9 23 mov $0x23e9c21e,%edi
5: 06 (bad)
6: 00 00 add %al,(%rax)
8: 66 90 xchg %ax,%ax
asm_exc_invalid_op:
a: 8d 76 00 lea 0x0(%rsi),%esi
d: fc cld
e: 6a 00 push $0x0
10: 68 f0 bd 1e c2 push $0xffffffffc21ebdf0
15: e9 11 06 00 00 jmp 0x62b
asm_exc_int3:
1a: 8d 76 00 lea 0x0(%rsi),%esi
1d: fc cld
1e: 6a 00 push $0x0
20: 68 54 c5 1e c2 push $0xffffffffc21ec554
25: e9 01 06 00 00 jmp 0x62b
asm_exc_page_fault:
2a:* 8d 76 00 lea 0x0(%rsi),%esi <-- trapping instruction
2d: fc cld
2e: 68 b0 e9 1e c2 push $0xffffffffc21ee9b0
33: e9 f3 05 00 00 jmp 0x62b
38: 66 90 xchg %ax,%ax
asm_exc_machine_check:
3a: 8d 76 00 lea 0x0(%rsi),%esi
3d: fc cld
3e: 6a 00 push $0x0
And that trapping instruction is the CLAC nop (still a nop in the
faulting kernel image):
(gdb) disassemble asm_exc_page_fault
Dump of assembler code for function asm_exc_page_fault:
0xc2200350 <+0>: lea 0x0(%esi),%esi
0xc2200353 <+3>: cld
0xc2200354 <+4>: push $0xc21ee9b0
0xc2200359 <+9>: jmp 0xc2200951 <handle_exception>
End of assembler dump.
And then we have the endless stream of:
asm_exc_int3+0x10/0x10
which really is: asm_exc_page_fault+0x0/0x10, but that cannot be,
because then we'd have #DF much sooner.
The restore_all_switch_stack+0x65/0xe6 thing looks like so in the live
kernel image:
(gdb) disassemble restore_all_switch_stack
Dump of assembler code for function entry_INT80_32:
...
0xc22008c5 <+353>: mov %cr3,%eax
0xc22008c8 <+356>: or $0x1000,%eax
0xc22008cd <+361>: mov %eax,%cr3
0xc22008d0 <+364>: mov %esi,%esi <--- here
0xc22008d2 <+366>: testl $0x2,0x34(%esp)
0xc22008da <+374>: je 0xc22008e8 <entry_INT80_32+388>
0xc22008dc <+376>: mov %cr3,%eax
0xc22008df <+379>: test $0x1000,%eax
0xc22008e4 <+384>: jne 0xc22008e8 <entry_INT80_32+388>
0xc22008e6 <+386>: ud2
0xc22008e8 <+388>: pop %ebx
...
So that is indeed BUG_IF_WRONG_CR3 and the JMP got patched to a NOP2.
Nothing strange there.
So yeah, no clue still.
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-31 9:11 ` Peter Zijlstra
@ 2024-07-31 10:02 ` Borislav Petkov
0 siblings, 0 replies; 59+ messages in thread
From: Borislav Petkov @ 2024-07-31 10:02 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Linus Torvalds, Guenter Roeck, Jens Axboe, Andy Lutomirski,
Ingo Molnar, Peter Anvin, Linux Kernel Mailing List,
the arch/x86 maintainers
Just a data point:
gcc-11 (Debian 11.2.0-19) 11.2.0 - does NOT repro.
Upgrading to
gcc-11 (Debian 11.5.0-1) 11.5.0
*does* repro.
Fun.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-30 18:53 ` Linus Torvalds
2024-07-30 19:22 ` Peter Zijlstra
@ 2024-07-31 10:33 ` Peter Zijlstra
2024-07-31 14:15 ` Peter Zijlstra
1 sibling, 1 reply; 59+ messages in thread
From: Peter Zijlstra @ 2024-07-31 10:33 UTC (permalink / raw)
To: Linus Torvalds
Cc: Guenter Roeck, Andy Lutomirski, Ingo Molnar, Peter Anvin,
Linux Kernel Mailing List, Jens Axboe, the arch/x86 maintainers
On Tue, Jul 30, 2024 at 11:53:31AM -0700, Linus Torvalds wrote:
> Definitely something wrong with the page tables. But where that
> wrongness comes from, I have no idea.
[ 10.231081] CR0: 80050033 CR2: ffa02ffc CR3: 02bc6000 CR4: 000006f0
See CR3 being a user address.... but yeah, million dollar question is
how the fuck did that happen?
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-30 20:24 ` Guenter Roeck
@ 2024-07-31 12:20 ` Peter Zijlstra
2024-07-31 13:03 ` Thomas Gleixner
0 siblings, 1 reply; 59+ messages in thread
From: Peter Zijlstra @ 2024-07-31 12:20 UTC (permalink / raw)
To: Guenter Roeck
Cc: Jens Axboe, Linus Torvalds, Andy Lutomirski, Ingo Molnar,
Peter Anvin, Linux Kernel Mailing List, the arch/x86 maintainers
On Tue, Jul 30, 2024 at 01:24:34PM -0700, Guenter Roeck wrote:
> An interesting bit of information: The problem is seen with many,
> but not all CPUs. For example, I don't see it with athlon, n270, Dhyana,
> or EPYC. qemu32 is affected, but qemu64 is fine. But on the other side
> both kvm32 and kvm64 are affected.
pti=off makes it go away, could be those CPU models don't have meltdown
and as such don't enable PTI.
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-31 12:20 ` Peter Zijlstra
@ 2024-07-31 13:03 ` Thomas Gleixner
2024-07-31 15:55 ` Peter Zijlstra
0 siblings, 1 reply; 59+ messages in thread
From: Thomas Gleixner @ 2024-07-31 13:03 UTC (permalink / raw)
To: Peter Zijlstra, Guenter Roeck
Cc: Jens Axboe, Linus Torvalds, Andy Lutomirski, Ingo Molnar,
Peter Anvin, Linux Kernel Mailing List, the arch/x86 maintainers
On Wed, Jul 31 2024 at 14:20, Peter Zijlstra wrote:
> On Tue, Jul 30, 2024 at 01:24:34PM -0700, Guenter Roeck wrote:
>> An interesting bit of information: The problem is seen with many,
>> but not all CPUs. For example, I don't see it with athlon, n270, Dhyana,
>> or EPYC. qemu32 is affected, but qemu64 is fine. But on the other side
>> both kvm32 and kvm64 are affected.
>
> pti=off makes it go away, could be those CPU models don't have meltdown
> and as such don't enable PTI.
The AMD ones don't have meltdown and neither does n270 which is an
in-order atom.
Thanks,
tglx
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-30 23:54 ` Linus Torvalds
2024-07-31 8:21 ` Borislav Petkov
@ 2024-07-31 13:24 ` Jens Axboe
1 sibling, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2024-07-31 13:24 UTC (permalink / raw)
To: Linus Torvalds, Guenter Roeck
Cc: Peter Zijlstra, Andy Lutomirski, Ingo Molnar, Peter Anvin,
Linux Kernel Mailing List, the arch/x86 maintainers
On 7/30/24 5:54 PM, Linus Torvalds wrote:
> You also seemed to say that it only happened with some CPU selections.
> Maybe there's something wrong with the ALTERNATIVE() cleanups - I'm
> looking at that new "nested alternatives macros" thing, and the odd
> games we play with the origin and replacement lengths etc.
>
> That all looks entirely crazy. That file was hard to read before, now
> it's just incomprehensible to me.
As I reported earlier, I already tried with the alternative cleanups
reverted, and it made zero difference - it still goes boom in very much
the same way.
--
Jens Axboe
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-31 10:33 ` Peter Zijlstra
@ 2024-07-31 14:15 ` Peter Zijlstra
0 siblings, 0 replies; 59+ messages in thread
From: Peter Zijlstra @ 2024-07-31 14:15 UTC (permalink / raw)
To: Linus Torvalds
Cc: Guenter Roeck, Andy Lutomirski, Ingo Molnar, Peter Anvin,
Linux Kernel Mailing List, Jens Axboe, the arch/x86 maintainers
On Wed, Jul 31, 2024 at 12:33:32PM +0200, Peter Zijlstra wrote:
> On Tue, Jul 30, 2024 at 11:53:31AM -0700, Linus Torvalds wrote:
>
> > Definitely something wrong with the page tables. But where that
> > wrongness comes from, I have no idea.
>
> [ 10.231081] CR0: 80050033 CR2: ffa02ffc CR3: 02bc6000 CR4: 000006f0
>
> See CR3 being a user address.... but yeah, million dollar question is
> how the fuck did that happen?
Thomas just reminded me that CR3 is physical... duh.
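A quick way to sanity-check that value, as a user-space sketch: the
entry code quoted earlier in the thread switches to the user page table
with "or $0x1000,%eax", and BUG_IF_WRONG_CR3 tests the same bit.
Assuming bit 12 is the only thing that distinguishes the two CR3 values
(the macro and helper names below are made up for the example), the CR3
from the oops can be decoded like this:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit 12 selects the user copy of the page tables under PTI. */
#define SKETCH_USER_PGTABLE_BIT 0x1000u

/* Hypothetical helper: does this CR3 value point at the user page table? */
static bool cr3_is_user(uint32_t cr3)
{
	return (cr3 & SKETCH_USER_PGTABLE_BIT) != 0;
}

/* What the "or $0x1000,%eax" sequence computes before writing CR3. */
static uint32_t cr3_switch_to_user(uint32_t cr3)
{
	return cr3 | SKETCH_USER_PGTABLE_BIT;
}
```

With the value from the oops, cr3_is_user(0x02bc6000) is false and
cr3_switch_to_user(0x02bc6000) gives 0x02bc7000, so under that
assumption the oops was taken on the kernel page table.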
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-31 8:21 ` Borislav Petkov
2024-07-31 9:11 ` Peter Zijlstra
@ 2024-07-31 14:37 ` Guenter Roeck
1 sibling, 0 replies; 59+ messages in thread
From: Guenter Roeck @ 2024-07-31 14:37 UTC (permalink / raw)
To: Borislav Petkov, Linus Torvalds
Cc: Peter Zijlstra, Jens Axboe, Andy Lutomirski, Ingo Molnar,
Peter Anvin, Linux Kernel Mailing List, the arch/x86 maintainers
On 7/31/24 01:21, Borislav Petkov wrote:
> On Tue, Jul 30, 2024 at 04:54:43PM -0700, Linus Torvalds wrote:
>> You also seemed to say that it only happened with some CPU selections.
>> Maybe there's something wrong with the ALTERNATIVE() cleanups - I'm
>> looking at that new "nested alternatives macros" thing, and the odd
>> games we play with the origin and replacement lengths etc.
>>
>> That all looks entirely crazy. That file was hard to read before, now
>> it's just incomprehensible to me.
>
> I'm sorry to hear that. The reason we did it is because it was starting to
> become really unwieldy to add yet another alternative choice N in an
> ALTERNATIVE_N call...
>
> Anyway, I'll try to reproduce here. In the meantime, can anyone who can
> reproduce - Guenter, Jens - boot that failing kernel with
>
> debug-alternative=-1
>
> and copy dmesg and vmlinux somewhere for me?
>
> It is a lot of output so make sure to catch it all.
>
> Thx.
>
See http://server.roeck-us.net/qemu/x86-nosmp/ for images; I copied
vmlinux there as well. Various logs are in
http://server.roeck-us.net/qemu/x86-nosmp/logs/; relevant
log-n270-good boots
log-pentium2-bad crashes
cpu-list List of tested CPUs, with results
Note that Opteron_G4 and Opteron_G5 are
broken in upstream qemu since qemu v6.1.
Hope this helps,
Guenter
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-29 19:23 ` Linus Torvalds
2024-07-29 19:50 ` Linus Torvalds
2024-07-30 7:54 ` Peter Zijlstra
@ 2024-07-31 15:45 ` Guenter Roeck
2 siblings, 0 replies; 59+ messages in thread
From: Guenter Roeck @ 2024-07-31 15:45 UTC (permalink / raw)
To: Linus Torvalds, Peter Zijlstra, Sebastian Andrzej Siewior,
Ingo Molnar
Cc: Linux Kernel Mailing List
On 7/29/24 12:23, Linus Torvalds wrote:
> On Mon, 29 Jul 2024 at 08:29, Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> In summary, quite impressive in a negative sense.
>
> Grr. I think a lot of the build failures end up being due to commit
> 466e4d801cd4 ("task_work: Add TWA_NMI_CURRENT as an additional notify
> mode") depending on IRQ_WORK, and that not existing everywhere.
>
> I pushed out a tentative fix as commit cec6937dd1aa ("task_work: make
> TWA_NMI_CURRENT handling conditional on IRQ_WORK"). I haven't set up a
> build environment for those tiny targets, but it looked fairly
> straightforward.
>
> I think that explains at least most of the 'tinyconfig' build failures.
>
All "tinyconfig" build tests pass with v6.11-rc1-43-g94ede2a3e913,
so that problem has been fixed.
Thanks,
Guenter
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-31 13:03 ` Thomas Gleixner
@ 2024-07-31 15:55 ` Peter Zijlstra
2024-07-31 16:17 ` Linus Torvalds
0 siblings, 1 reply; 59+ messages in thread
From: Peter Zijlstra @ 2024-07-31 15:55 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Guenter Roeck, Jens Axboe, Linus Torvalds, Andy Lutomirski,
Ingo Molnar, Peter Anvin, Linux Kernel Mailing List,
the arch/x86 maintainers
On Wed, Jul 31, 2024 at 03:03:33PM +0200, Thomas Gleixner wrote:
> On Wed, Jul 31 2024 at 14:20, Peter Zijlstra wrote:
> > On Tue, Jul 30, 2024 at 01:24:34PM -0700, Guenter Roeck wrote:
> >> An interesting bit of information: The problem is seen with many,
> >> but not all CPUs. For example, I don't see it with athlon, n270, Dhyana,
> >> or EPYC. qemu32 is affected, but qemu64 is fine. But on the other side
> >> both kvm32 and kvm64 are affected.
> >
> > pti=off makes it go away, could be those CPU models don't have meltdown
> > and as such don't enable PTI.
>
> The AMD ones don't have meltdown and neither does n270 which is an
> in-order atom.
Right, so Thomas found that i386-pti fails to map the entire entry text.
Specifically pti_clone_pgtable() hard relies -- and does not verify --
that the start address is aligned to the given granularity.
Now, i386 does not align __entry_text_start, and so the termination
condition goes sideways and pti_clone_entry_text() does not always work
right and it becomes a game of code layout roulette.
Still trying to figure out what the right fix is. I've tried page
aligning the section and using PTE cloning, and that works -- mostly. If
you hit a source PMD the clone logic still does a PMD level clone and
that might not be what we want, see the alignment thing again.
Also, should we just kill PTI on 32bit perhaps?
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-31 15:55 ` Peter Zijlstra
@ 2024-07-31 16:17 ` Linus Torvalds
2024-07-31 16:31 ` Peter Zijlstra
2024-07-31 16:49 ` Linux 6.11-rc1 Guenter Roeck
0 siblings, 2 replies; 59+ messages in thread
From: Linus Torvalds @ 2024-07-31 16:17 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Thomas Gleixner, Guenter Roeck, Jens Axboe, Andy Lutomirski,
Ingo Molnar, Peter Anvin, Linux Kernel Mailing List,
the arch/x86 maintainers
On Wed, 31 Jul 2024 at 08:55, Peter Zijlstra <peterz@infradead.org> wrote:
>
> Right, so Thomas found that i386-pti fails to map the entire entry text.
> Specifically pti_clone_pgtable() hard relies -- and does not verify --
> that the start address is aligned to the given granularity.
>
> Now, i386 does not align __entry_text_start, and so the termination
> condition goes sideways and pti_clone_entry_text() does not always work
> right and it becomes a game of code layout roulette.
Lovely.
> Also, should we just kill PTI on 32bit perhaps?
I don't think there's much technical reason to keep it - I can't
imagine any security-conscious people actually use 32-bit x86 any more
- but apart from fixing this bug I wonder how much of a maintenance
burden it is? I think most of the code is shared with 64-bit, isn't
it? The 32-bit case in many ways is simpler, even if it happened to
hit this odd alignment issue because it's obviously also a lot less
tested.
I'd rather kill highmem and X86_PAE, but I also suspect that horror
has a much larger chance of still being used.
The day we finally get rid of HIGHMEM I will dance on its grave. I
have hated that thing for a long long time.
Linus
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-31 16:17 ` Linus Torvalds
@ 2024-07-31 16:31 ` Peter Zijlstra
2024-07-31 16:50 ` Guenter Roeck
` (3 more replies)
2024-07-31 16:49 ` Linux 6.11-rc1 Guenter Roeck
1 sibling, 4 replies; 59+ messages in thread
From: Peter Zijlstra @ 2024-07-31 16:31 UTC (permalink / raw)
To: Linus Torvalds
Cc: Thomas Gleixner, Guenter Roeck, Jens Axboe, Andy Lutomirski,
Ingo Molnar, Peter Anvin, Linux Kernel Mailing List,
the arch/x86 maintainers
On Wed, Jul 31, 2024 at 09:17:44AM -0700, Linus Torvalds wrote:
> On Wed, 31 Jul 2024 at 08:55, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > Right, so Thomas found that i386-pti fails to map the entire entry text.
> > Specifically pti_clone_pgtable() hard relies -- and does not verify --
> > that the start address is aligned to the given granularity.
> >
> > Now, i386 does not align __entry_text_start, and so the termination
> > condition goes sideways and pti_clone_entry_text() does not always work
> > right and it becomes a game of code layout roulette.
>
> Lovely.
:-)
This fixes the alignment assumptions and makes it all go again.
diff --git a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c
index 2e69abf4f852..bfdf5f45b137 100644
--- a/arch/x86/mm/pti.c
+++ b/arch/x86/mm/pti.c
@@ -374,14 +374,14 @@ pti_clone_pgtable(unsigned long start, unsigned long end,
*/
*target_pmd = *pmd;
- addr += PMD_SIZE;
+ addr = round_up(addr + 1, PMD_SIZE);
} else if (level == PTI_CLONE_PTE) {
/* Walk the page-table down to the pte level */
pte = pte_offset_kernel(pmd, addr);
if (pte_none(*pte)) {
- addr += PAGE_SIZE;
+ addr = round_up(addr + 1, PAGE_SIZE);
continue;
}
@@ -401,7 +401,7 @@ pti_clone_pgtable(unsigned long start, unsigned long end,
/* Clone the PTE */
*target_pte = *pte;
- addr += PAGE_SIZE;
+ addr = round_up(addr + 1, PAGE_SIZE);
} else {
BUG();
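To illustrate why the increment form matters, here is a minimal userspace
sketch, not kernel code. The toy_* names, the 2 MiB entry size and the
addresses are assumptions chosen for illustration, and toy_round_up() is a
simplified stand-in for the kernel's round_up() that only handles
power-of-two sizes. Both helpers count how many page-table entries a
"for (addr = start; addr < end;)" style walk would visit.

```c
/* Toy stand-in for PMD_SIZE; 2 MiB is an assumed value for illustration. */
#define TOY_PMD_SIZE 0x200000UL

/* Simplified round_up(): size must be a power of two. */
static unsigned long toy_round_up(unsigned long x, unsigned long size)
{
	return (x + size - 1) & ~(size - 1);
}

/* Old stepping: fixed stride, only correct when start is size-aligned. */
static unsigned int toy_visits_stride(unsigned long start, unsigned long end,
				      unsigned long size)
{
	unsigned int visits = 0;
	unsigned long addr;

	for (addr = start; addr < end; addr += size)
		visits++;
	return visits;
}

/* New stepping: always advance to the next size-aligned boundary. */
static unsigned int toy_visits_round_up(unsigned long start, unsigned long end,
					unsigned long size)
{
	unsigned int visits = 0;
	unsigned long addr;

	for (addr = start; addr < end; addr = toy_round_up(addr + 1, size))
		visits++;
	return visits;
}
```

With start = 0x300000 and end = 0x480000 the range touches two 2 MiB
entries. The fixed stride steps from 0x300000 straight to 0x500000, which is
already past end, so the second entry is never visited; the rounded step
lands on 0x400000 first and visits both. With an aligned start the two
forms agree.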
^ permalink raw reply related [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-31 16:17 ` Linus Torvalds
2024-07-31 16:31 ` Peter Zijlstra
@ 2024-07-31 16:49 ` Guenter Roeck
2024-07-31 17:19 ` Thomas Gleixner
1 sibling, 1 reply; 59+ messages in thread
From: Guenter Roeck @ 2024-07-31 16:49 UTC (permalink / raw)
To: Linus Torvalds, Peter Zijlstra
Cc: Thomas Gleixner, Jens Axboe, Andy Lutomirski, Ingo Molnar,
Peter Anvin, Linux Kernel Mailing List, the arch/x86 maintainers
On 7/31/24 09:17, Linus Torvalds wrote:
> On Wed, 31 Jul 2024 at 08:55, Peter Zijlstra <peterz@infradead.org> wrote:
>>
>> Right, so Thomas found that i386-pti fails to map the entire entry text.
>> Specifically pti_clone_pgtable() hard relies -- and does not verify --
>> that the start address is aligned to the given granularity.
>>
>> Now, i386 does not align __entry_text_start, and so the termination
>> condition goes sideways and pti_clone_entry_text() does not always work
>> right and it becomes a game of code layout roulette.
>
> Lovely.
>
>> Also, should we just kill PTI on 32bit perhaps?
>
> I don't think there's much technical reason to keep it - I can't
> imagine any security-conscious people actually use 32-bit x86 any more
> - but apart from fixing this bug I wonder how much of a maintenance
> burden it is? I think most of the code is shared with 64-bit, isn't
> it? The 32-bit case in many ways is simpler, even if it happened to
> hit this odd alignment issue because it's obviously also a lot less
> tested.
>
> I'd rather kill highmem and X86_PAE, but I also suspect that horror
> has a much larger chance of still being used.
>
I guess there is at least one user - me with my annoying boot tests ;-).
But seriously the question is: How likely is it for that code to find
potential problems in the 64-bit code ? pti_clone_pgtable() doesn't
seem to be 32-bit specific.
Guenter
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-31 16:31 ` Peter Zijlstra
@ 2024-07-31 16:50 ` Guenter Roeck
2024-07-31 16:51 ` Peter Zijlstra
` (2 subsequent siblings)
3 siblings, 0 replies; 59+ messages in thread
From: Guenter Roeck @ 2024-07-31 16:50 UTC (permalink / raw)
To: Peter Zijlstra, Linus Torvalds
Cc: Thomas Gleixner, Jens Axboe, Andy Lutomirski, Ingo Molnar,
Peter Anvin, Linux Kernel Mailing List, the arch/x86 maintainers
On 7/31/24 09:31, Peter Zijlstra wrote:
> On Wed, Jul 31, 2024 at 09:17:44AM -0700, Linus Torvalds wrote:
>> On Wed, 31 Jul 2024 at 08:55, Peter Zijlstra <peterz@infradead.org> wrote:
>>>
>>> Right, so Thomas found that i386-pti fails to map the entire entry text.
>>> Specifically pti_clone_pgtable() hard relies -- and does not verify --
>>> that the start address is aligned to the given granularity.
>>>
>>> Now, i386 does not align __entry_text_start, and so the termination
>>> condition goes sideways and pti_clone_entry_text() does not always work
>>> right and it becomes a game of code layout roulette.
>>
>> Lovely.
>
> :-)
>
> This fixes the alignment assumptions and makes it all go again.
>
Confirmed.
Tested-by: Guenter Roeck <linux@roeck-us.net>
Thanks,
Guenter
> diff --git a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c
> index 2e69abf4f852..bfdf5f45b137 100644
> --- a/arch/x86/mm/pti.c
> +++ b/arch/x86/mm/pti.c
> @@ -374,14 +374,14 @@ pti_clone_pgtable(unsigned long start, unsigned long end,
> */
> *target_pmd = *pmd;
>
> - addr += PMD_SIZE;
> + addr = round_up(addr + 1, PMD_SIZE);
>
> } else if (level == PTI_CLONE_PTE) {
>
> /* Walk the page-table down to the pte level */
> pte = pte_offset_kernel(pmd, addr);
> if (pte_none(*pte)) {
> - addr += PAGE_SIZE;
> + addr = round_up(addr + 1, PAGE_SIZE);
> continue;
> }
>
> @@ -401,7 +401,7 @@ pti_clone_pgtable(unsigned long start, unsigned long end,
> /* Clone the PTE */
> *target_pte = *pte;
>
> - addr += PAGE_SIZE;
> + addr = round_up(addr + 1, PAGE_SIZE);
>
> } else {
> BUG();
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-31 16:31 ` Peter Zijlstra
2024-07-31 16:50 ` Guenter Roeck
@ 2024-07-31 16:51 ` Peter Zijlstra
2024-07-31 17:26 ` Thomas Gleixner
2024-08-01 10:55 ` [tip: x86/urgent] x86/mm: Fix pti_clone_pgtable() alignment assumption tip-bot2 for Peter Zijlstra
2024-08-01 13:03 ` tip-bot2 for Peter Zijlstra
3 siblings, 1 reply; 59+ messages in thread
From: Peter Zijlstra @ 2024-07-31 16:51 UTC (permalink / raw)
To: Linus Torvalds
Cc: Thomas Gleixner, Guenter Roeck, Jens Axboe, Andy Lutomirski,
Ingo Molnar, Peter Anvin, Linux Kernel Mailing List,
the arch/x86 maintainers
On Wed, Jul 31, 2024 at 06:31:05PM +0200, Peter Zijlstra wrote:
> On Wed, Jul 31, 2024 at 09:17:44AM -0700, Linus Torvalds wrote:
> > On Wed, 31 Jul 2024 at 08:55, Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > Right, so Thomas found that i386-pti fails to map the entire entry text.
> > > Specifically pti_clone_pgtable() hard relies -- and does not verify --
> > > that the start address is aligned to the given granularity.
> > >
> > > Now, i386 does not align __entry_text_start, and so the termination
> > > condition goes sideways and pti_clone_entry_text() does not always work
> > > right and it becomes a game of code layout roulette.
> >
> > Lovely.
>
> :-)
>
> This fixes the alignment assumptions and makes it all go again.
Thomas, this all still relies on the full text section being PMD mapped,
and since we don't have ALIGN_ENTRY_TEXT_END and _etext has PAGE_SIZE
alignment, can't we have a PAGE mapped tail which then doesn't get cloned?
Do we want to make pti_clone_entry_text() use PTI_LEVEL_KERNEL_IMAGE
such that it will clone whatever it has?
> diff --git a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c
> index 2e69abf4f852..bfdf5f45b137 100644
> --- a/arch/x86/mm/pti.c
> +++ b/arch/x86/mm/pti.c
> @@ -374,14 +374,14 @@ pti_clone_pgtable(unsigned long start, unsigned long end,
> */
> *target_pmd = *pmd;
>
> - addr += PMD_SIZE;
> + addr = round_up(addr + 1, PMD_SIZE);
>
> } else if (level == PTI_CLONE_PTE) {
>
> /* Walk the page-table down to the pte level */
> pte = pte_offset_kernel(pmd, addr);
> if (pte_none(*pte)) {
> - addr += PAGE_SIZE;
> + addr = round_up(addr + 1, PAGE_SIZE);
> continue;
> }
>
> @@ -401,7 +401,7 @@ pti_clone_pgtable(unsigned long start, unsigned long end,
> /* Clone the PTE */
> *target_pte = *pte;
>
> - addr += PAGE_SIZE;
> + addr = round_up(addr + 1, PAGE_SIZE);
>
> } else {
> BUG();
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-31 16:49 ` Linux 6.11-rc1 Guenter Roeck
@ 2024-07-31 17:19 ` Thomas Gleixner
0 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2024-07-31 17:19 UTC (permalink / raw)
To: Guenter Roeck, Linus Torvalds, Peter Zijlstra
Cc: Jens Axboe, Andy Lutomirski, Ingo Molnar, Peter Anvin,
Linux Kernel Mailing List, the arch/x86 maintainers
On Wed, Jul 31 2024 at 09:49, Guenter Roeck wrote:
> On 7/31/24 09:17, Linus Torvalds wrote:
> I guess there is at least one user - me with my annoying boot tests ;-).
>
> But seriously the question is: How likely is it for that code to find
> potential problems in the 64-bit code ? pti_clone_pgtable() doesn't
> seem to be 32-bit specific.
64-bit does not have the problem because everything is PMD aligned.
Thanks,
tglx
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-31 16:51 ` Peter Zijlstra
@ 2024-07-31 17:26 ` Thomas Gleixner
2024-07-31 21:20 ` Peter Zijlstra
0 siblings, 1 reply; 59+ messages in thread
From: Thomas Gleixner @ 2024-07-31 17:26 UTC (permalink / raw)
To: Peter Zijlstra, Linus Torvalds
Cc: Guenter Roeck, Jens Axboe, Andy Lutomirski, Ingo Molnar,
Peter Anvin, Linux Kernel Mailing List, the arch/x86 maintainers
On Wed, Jul 31 2024 at 18:51, Peter Zijlstra wrote:
> On Wed, Jul 31, 2024 at 06:31:05PM +0200, Peter Zijlstra wrote:
> Thomas, this all still relies on the full text section being PMD mapped,
> and since we don't have ALIGN_ENTRY_TEXT_END and _etext has PAGE_SIZE
> alignment, can't we have a PAGE mapped tail which then doesn't get cloned?
>
> Do we want to make pti_clone_entry_text() use PTI_LEVEL_KERNEL_IMAGE
> such that it will clone whatever it has?
Yes, I think so.
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-31 17:26 ` Thomas Gleixner
@ 2024-07-31 21:20 ` Peter Zijlstra
2024-07-31 21:23 ` Linus Torvalds
2024-07-31 22:22 ` Guenter Roeck
0 siblings, 2 replies; 59+ messages in thread
From: Peter Zijlstra @ 2024-07-31 21:20 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Linus Torvalds, Guenter Roeck, Jens Axboe, Andy Lutomirski,
Ingo Molnar, Peter Anvin, Linux Kernel Mailing List,
the arch/x86 maintainers
On Wed, Jul 31, 2024 at 07:26:04PM +0200, Thomas Gleixner wrote:
> On Wed, Jul 31 2024 at 18:51, Peter Zijlstra wrote:
> > On Wed, Jul 31, 2024 at 06:31:05PM +0200, Peter Zijlstra wrote:
> > Thomas, this all still relies on the full text section being PMD mapped,
> > and since we don't have ALIGN_ENTRY_TEXT_END and _etext has PAGE_SIZE
> > alignment, can't we have a PAGE mapped tail which then doesn't get cloned?
> >
> > Do we want to make pti_clone_entry_text() use PTI_LEVEL_KERNEL_IMAGE
> > such that it will clone whatever it has?
>
> Yes, I think so.
The alternative is ripping that level thing out entirely, and simply
duplicate anything we find in the page-tables.
We could add something like:
WARN_ON_ONCE(IS_ENABLED(CONFIG_X86_64));
in the PTE path, but do we really care?
---
--- a/arch/x86/mm/pti.c
+++ b/arch/x86/mm/pti.c
@@ -47,16 +47,6 @@
#define __GFP_NOTRACK 0
#endif
-/*
- * Define the page-table levels we clone for user-space on 32
- * and 64 bit.
- */
-#ifdef CONFIG_X86_64
-#define PTI_LEVEL_KERNEL_IMAGE PTI_CLONE_PMD
-#else
-#define PTI_LEVEL_KERNEL_IMAGE PTI_CLONE_PTE
-#endif
-
static void __init pti_print_if_insecure(const char *reason)
{
if (boot_cpu_has_bug(X86_BUG_CPU_MELTDOWN))
@@ -294,14 +284,7 @@ static void __init pti_setup_vsyscall(vo
static void __init pti_setup_vsyscall(void) { }
#endif
-enum pti_clone_level {
- PTI_CLONE_PMD,
- PTI_CLONE_PTE,
-};
-
-static void
-pti_clone_pgtable(unsigned long start, unsigned long end,
- enum pti_clone_level level)
+static void pti_clone_pgtable(unsigned long start, unsigned long end)
{
unsigned long addr;
@@ -341,7 +324,7 @@ pti_clone_pgtable(unsigned long start, u
continue;
}
- if (pmd_leaf(*pmd) || level == PTI_CLONE_PMD) {
+ if (pmd_leaf(*pmd)) {
target_pmd = pti_user_pagetable_walk_pmd(addr);
if (WARN_ON(!target_pmd))
return;
@@ -375,37 +358,33 @@ pti_clone_pgtable(unsigned long start, u
*target_pmd = *pmd;
addr = round_up(addr + 1, PMD_SIZE);
+ continue;
+ }
- } else if (level == PTI_CLONE_PTE) {
-
- /* Walk the page-table down to the pte level */
- pte = pte_offset_kernel(pmd, addr);
- if (pte_none(*pte)) {
- addr = round_up(addr + 1, PAGE_SIZE);
- continue;
- }
-
- /* Only clone present PTEs */
- if (WARN_ON(!(pte_flags(*pte) & _PAGE_PRESENT)))
- return;
+ /* Walk the page-table down to the pte level */
+ pte = pte_offset_kernel(pmd, addr);
+ if (pte_none(*pte)) {
+ addr = round_up(addr + 1, PAGE_SIZE);
+ continue;
+ }
- /* Allocate PTE in the user page-table */
- target_pte = pti_user_pagetable_walk_pte(addr);
- if (WARN_ON(!target_pte))
- return;
+ /* Only clone present PTEs */
+ if (WARN_ON(!(pte_flags(*pte) & _PAGE_PRESENT)))
+ return;
- /* Set GLOBAL bit in both PTEs */
- if (boot_cpu_has(X86_FEATURE_PGE))
- *pte = pte_set_flags(*pte, _PAGE_GLOBAL);
+ /* Allocate PTE in the user page-table */
+ target_pte = pti_user_pagetable_walk_pte(addr);
+ if (WARN_ON(!target_pte))
+ return;
- /* Clone the PTE */
- *target_pte = *pte;
+ /* Set GLOBAL bit in both PTEs */
+ if (boot_cpu_has(X86_FEATURE_PGE))
+ *pte = pte_set_flags(*pte, _PAGE_GLOBAL);
- addr = round_up(addr + 1, PAGE_SIZE);
+ /* Clone the PTE */
+ *target_pte = *pte;
- } else {
- BUG();
- }
+ addr = round_up(addr + 1, PAGE_SIZE);
}
}
@@ -475,7 +454,7 @@ static void __init pti_clone_user_shared
start = CPU_ENTRY_AREA_BASE;
end = start + (PAGE_SIZE * CPU_ENTRY_AREA_PAGES);
- pti_clone_pgtable(start, end, PTI_CLONE_PMD);
+ pti_clone_pgtable(start, end);
}
#endif /* CONFIG_X86_64 */
@@ -495,8 +474,7 @@ static void __init pti_setup_espfix64(vo
static void pti_clone_entry_text(void)
{
pti_clone_pgtable((unsigned long) __entry_text_start,
- (unsigned long) __entry_text_end,
- PTI_CLONE_PMD);
+ (unsigned long) __entry_text_end);
}
/*
@@ -571,7 +549,7 @@ static void pti_clone_kernel_text(void)
* pti_set_kernel_image_nonglobal() did to clear the
* global bit.
*/
- pti_clone_pgtable(start, end_clone, PTI_LEVEL_KERNEL_IMAGE);
+ pti_clone_pgtable(start, end_clone);
/*
* pti_clone_pgtable() will set the global bit in any PMDs
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-31 21:20 ` Peter Zijlstra
@ 2024-07-31 21:23 ` Linus Torvalds
2024-07-31 21:26 ` Peter Zijlstra
2024-07-31 22:22 ` Guenter Roeck
1 sibling, 1 reply; 59+ messages in thread
From: Linus Torvalds @ 2024-07-31 21:23 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Thomas Gleixner, Guenter Roeck, Jens Axboe, Andy Lutomirski,
Ingo Molnar, Peter Anvin, Linux Kernel Mailing List,
the arch/x86 maintainers
On Wed, 31 Jul 2024 at 14:20, Peter Zijlstra <peterz@infradead.org> wrote:
>
> The alternative is ripping that level thing out entirely, and simply
> duplicate anything we find in the page-tables.
That looks clean to me, and don't we want to clone the minimal range
anyway - even on x86-64?
Linus
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-31 21:23 ` Linus Torvalds
@ 2024-07-31 21:26 ` Peter Zijlstra
2024-07-31 21:41 ` Linus Torvalds
0 siblings, 1 reply; 59+ messages in thread
From: Peter Zijlstra @ 2024-07-31 21:26 UTC (permalink / raw)
To: Linus Torvalds
Cc: Thomas Gleixner, Guenter Roeck, Jens Axboe, Andy Lutomirski,
Ingo Molnar, Peter Anvin, Linux Kernel Mailing List,
the arch/x86 maintainers
On Wed, Jul 31, 2024 at 02:23:02PM -0700, Linus Torvalds wrote:
> On Wed, 31 Jul 2024 at 14:20, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > The alternative is ripping that level thing out entirely, and simply
> > duplicate anything we find in the page-tables.
>
> That looks clean to me, and don't we want to clone the minimal range
> anyway - even on x86-64?
x86_64 has everything PMD aligned. It *should* never encounter a PTE.
Also, this thing blindly clones the format the kernel page-tables have,
it will not split a PMD into multiple PTE entries just to clone a
smaller range. It is super simple.
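The "clone whatever format the kernel page-tables have" behaviour can be
sketched with a toy two-level table. This is a hypothetical userspace model,
not the kernel implementation: struct toy_pmd, toy_clone() and the
four-entry table size are all assumptions made for illustration, and
addresses are replaced by plain page indices.

```c
#include <stdbool.h>
#include <stddef.h>

/* Tiny table for illustration; real tables hold 512 or 1024 entries. */
#define TOY_PTRS_PER_PTE 4

struct toy_pmd {
	bool present;
	bool leaf;				/* one large mapping, no PTEs */
	bool pte_present[TOY_PTRS_PER_PTE];	/* only used when !leaf */
};

struct toy_clone_stats {
	unsigned int pmds_copied;
	unsigned int ptes_copied;
};

/* Clone page indices [first_page, end_page) from src into dst. */
static struct toy_clone_stats toy_clone(const struct toy_pmd *src,
					struct toy_pmd *dst, size_t npmd,
					size_t first_page, size_t end_page)
{
	struct toy_clone_stats st = { 0, 0 };
	size_t page = first_page;

	while (page < end_page) {
		size_t pi = page / TOY_PTRS_PER_PTE;
		size_t ti = page % TOY_PTRS_PER_PTE;
		size_t next_pmd = (pi + 1) * TOY_PTRS_PER_PTE;

		if (pi >= npmd)
			break;
		if (!src[pi].present) {
			page = next_pmd;	/* nothing mapped, skip the entry */
			continue;
		}
		if (src[pi].leaf) {
			/* Large mapping: copied whole, never split into PTEs. */
			dst[pi] = src[pi];
			st.pmds_copied++;
			page = next_pmd;
			continue;
		}
		if (src[pi].pte_present[ti]) {
			dst[pi].present = true;
			dst[pi].pte_present[ti] = true;
			st.ptes_copied++;
		}
		page++;
	}
	return st;
}
```

In this model a range that starts in the middle of a leaf entry still copies
that whole entry, while an entry that is already split is copied one page at
a time and only for the pages inside the range.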
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-31 21:26 ` Peter Zijlstra
@ 2024-07-31 21:41 ` Linus Torvalds
2024-07-31 21:47 ` Thomas Gleixner
0 siblings, 1 reply; 59+ messages in thread
From: Linus Torvalds @ 2024-07-31 21:41 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Thomas Gleixner, Guenter Roeck, Jens Axboe, Andy Lutomirski,
Ingo Molnar, Peter Anvin, Linux Kernel Mailing List,
the arch/x86 maintainers
On Wed, 31 Jul 2024 at 14:26, Peter Zijlstra <peterz@infradead.org> wrote:
>
> x86_64 has everything PMD aligned. It *should* never encounter a PTE.
Ahh. I thought it only aligned the beginning, but yeah, I see that
ALIGN_ENTRY_TEXT_END is also PMD_SIZE aligned.
That smells of wasted memory, but I guess the TLB advantage is worth it.
Linus
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-31 21:41 ` Linus Torvalds
@ 2024-07-31 21:47 ` Thomas Gleixner
0 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2024-07-31 21:47 UTC (permalink / raw)
To: Linus Torvalds, Peter Zijlstra
Cc: Guenter Roeck, Jens Axboe, Andy Lutomirski, Ingo Molnar,
Peter Anvin, Linux Kernel Mailing List, the arch/x86 maintainers
On Wed, Jul 31 2024 at 14:41, Linus Torvalds wrote:
> On Wed, 31 Jul 2024 at 14:26, Peter Zijlstra <peterz@infradead.org> wrote:
>>
>> x86_64 has everything PMD aligned. It *should* never encounter a PTE.
>
> Ahh. I thought it only aligned the beginning, but yeah, I see that
> ALIGN_ENTRY_TEXT_END is also PMD_SIZE aligned.
>
> That smells of wasted memory, but I guess the TLB advantage is worth it.
It definitely is.
Thanks,
tglx
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-31 21:20 ` Peter Zijlstra
2024-07-31 21:23 ` Linus Torvalds
@ 2024-07-31 22:22 ` Guenter Roeck
2024-08-01 8:54 ` Peter Zijlstra
1 sibling, 1 reply; 59+ messages in thread
From: Guenter Roeck @ 2024-07-31 22:22 UTC (permalink / raw)
To: Peter Zijlstra, Thomas Gleixner
Cc: Linus Torvalds, Jens Axboe, Andy Lutomirski, Ingo Molnar,
Peter Anvin, Linux Kernel Mailing List, the arch/x86 maintainers
On 7/31/24 14:20, Peter Zijlstra wrote:
> On Wed, Jul 31, 2024 at 07:26:04PM +0200, Thomas Gleixner wrote:
>> On Wed, Jul 31 2024 at 18:51, Peter Zijlstra wrote:
>>> On Wed, Jul 31, 2024 at 06:31:05PM +0200, Peter Zijlstra wrote:
>>> Thomas, this all still relies on the full text section being PMD mapped,
>>> and since we don't have ALIGN_ENTRY_TEXT_END and _etext has PAGE_SIZE
>>> alignment, can't we have a PAGE mapped tail which then doesn't get cloned?
>>>
>>> Do we want to make pti_clone_entry_text() use PTI_LEVEL_KERNEL_IMAGE
>>> such that it will clone whatever it has?
>>
>> Yes, I think so.
>
> The alternative is ripping that level thing out entirely, and simply
> duplicate anything we find in the page-tables.
>
The patch below (on top of the previous one, because otherwise it doesn't
apply) causes qemu to bail out hard, with
...
[ 3.658327] sr 2:0:0:0: Attached scsi generic sg0 type 5
[ 3.858040] sched_clock: Marking stable (3834034034, 23728553)->(3865222956, -7460369)
[ 3.861469] registered taskstats version 1
[ 3.861584] Loading compiled-in X.509 certificates
[ 4.082031] Btrfs loaded, zoned=no, fsverity=no
[ 4.096034] cryptomgr_test (69) used greatest stack depth: 6136 bytes left
No backtrace or other message, it just exits immediately.
Guenter
> We could add something like:
>
> WARN_ON_ONCE(IS_ENABLED(CONFIG_X86_64));
>
> in the PTE path, but do we really care?
>
> ---
> --- a/arch/x86/mm/pti.c
> +++ b/arch/x86/mm/pti.c
> @@ -47,16 +47,6 @@
> #define __GFP_NOTRACK 0
> #endif
>
> -/*
> - * Define the page-table levels we clone for user-space on 32
> - * and 64 bit.
> - */
> -#ifdef CONFIG_X86_64
> -#define PTI_LEVEL_KERNEL_IMAGE PTI_CLONE_PMD
> -#else
> -#define PTI_LEVEL_KERNEL_IMAGE PTI_CLONE_PTE
> -#endif
> -
> static void __init pti_print_if_insecure(const char *reason)
> {
> if (boot_cpu_has_bug(X86_BUG_CPU_MELTDOWN))
> @@ -294,14 +284,7 @@ static void __init pti_setup_vsyscall(vo
> static void __init pti_setup_vsyscall(void) { }
> #endif
>
> -enum pti_clone_level {
> - PTI_CLONE_PMD,
> - PTI_CLONE_PTE,
> -};
> -
> -static void
> -pti_clone_pgtable(unsigned long start, unsigned long end,
> - enum pti_clone_level level)
> +static void pti_clone_pgtable(unsigned long start, unsigned long end)
> {
> unsigned long addr;
>
> @@ -341,7 +324,7 @@ pti_clone_pgtable(unsigned long start, u
> continue;
> }
>
> - if (pmd_leaf(*pmd) || level == PTI_CLONE_PMD) {
> + if (pmd_leaf(*pmd)) {
> target_pmd = pti_user_pagetable_walk_pmd(addr);
> if (WARN_ON(!target_pmd))
> return;
> @@ -375,37 +358,33 @@ pti_clone_pgtable(unsigned long start, u
> *target_pmd = *pmd;
>
> addr = round_up(addr + 1, PMD_SIZE);
> + continue;
> + }
>
> - } else if (level == PTI_CLONE_PTE) {
> -
> - /* Walk the page-table down to the pte level */
> - pte = pte_offset_kernel(pmd, addr);
> - if (pte_none(*pte)) {
> - addr = round_up(addr + 1, PAGE_SIZE);
> - continue;
> - }
> -
> - /* Only clone present PTEs */
> - if (WARN_ON(!(pte_flags(*pte) & _PAGE_PRESENT)))
> - return;
> + /* Walk the page-table down to the pte level */
> + pte = pte_offset_kernel(pmd, addr);
> + if (pte_none(*pte)) {
> + addr = round_up(addr + 1, PAGE_SIZE);
> + continue;
> + }
>
> - /* Allocate PTE in the user page-table */
> - target_pte = pti_user_pagetable_walk_pte(addr);
> - if (WARN_ON(!target_pte))
> - return;
> + /* Only clone present PTEs */
> + if (WARN_ON(!(pte_flags(*pte) & _PAGE_PRESENT)))
> + return;
>
> - /* Set GLOBAL bit in both PTEs */
> - if (boot_cpu_has(X86_FEATURE_PGE))
> - *pte = pte_set_flags(*pte, _PAGE_GLOBAL);
> + /* Allocate PTE in the user page-table */
> + target_pte = pti_user_pagetable_walk_pte(addr);
> + if (WARN_ON(!target_pte))
> + return;
>
> - /* Clone the PTE */
> - *target_pte = *pte;
> + /* Set GLOBAL bit in both PTEs */
> + if (boot_cpu_has(X86_FEATURE_PGE))
> + *pte = pte_set_flags(*pte, _PAGE_GLOBAL);
>
> - addr = round_up(addr + 1, PAGE_SIZE);
> + /* Clone the PTE */
> + *target_pte = *pte;
>
> - } else {
> - BUG();
> - }
> + addr = round_up(addr + 1, PAGE_SIZE);
> }
> }
>
> @@ -475,7 +454,7 @@ static void __init pti_clone_user_shared
> start = CPU_ENTRY_AREA_BASE;
> end = start + (PAGE_SIZE * CPU_ENTRY_AREA_PAGES);
>
> - pti_clone_pgtable(start, end, PTI_CLONE_PMD);
> + pti_clone_pgtable(start, end);
> }
> #endif /* CONFIG_X86_64 */
>
> @@ -495,8 +474,7 @@ static void __init pti_setup_espfix64(vo
> static void pti_clone_entry_text(void)
> {
> pti_clone_pgtable((unsigned long) __entry_text_start,
> - (unsigned long) __entry_text_end,
> - PTI_CLONE_PMD);
> + (unsigned long) __entry_text_end);
> }
>
> /*
> @@ -571,7 +549,7 @@ static void pti_clone_kernel_text(void)
> * pti_set_kernel_image_nonglobal() did to clear the
> * global bit.
> */
> - pti_clone_pgtable(start, end_clone, PTI_LEVEL_KERNEL_IMAGE);
> + pti_clone_pgtable(start, end_clone);
>
> /*
> * pti_clone_pgtable() will set the global bit in any PMDs
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: Linux 6.11-rc1
2024-07-31 22:22 ` Guenter Roeck
@ 2024-08-01 8:54 ` Peter Zijlstra
0 siblings, 0 replies; 59+ messages in thread
From: Peter Zijlstra @ 2024-08-01 8:54 UTC (permalink / raw)
To: Guenter Roeck
Cc: Thomas Gleixner, Linus Torvalds, Jens Axboe, Andy Lutomirski,
Ingo Molnar, Peter Anvin, Linux Kernel Mailing List,
the arch/x86 maintainers
On Wed, Jul 31, 2024 at 03:22:53PM -0700, Guenter Roeck wrote:
> On 7/31/24 14:20, Peter Zijlstra wrote:
> > On Wed, Jul 31, 2024 at 07:26:04PM +0200, Thomas Gleixner wrote:
> > > On Wed, Jul 31 2024 at 18:51, Peter Zijlstra wrote:
> > > > On Wed, Jul 31, 2024 at 06:31:05PM +0200, Peter Zijlstra wrote:
> > > > Thomas, this all still relies on the full text section being PMD mapped,
> > > > and since we don't have ALIGN_ENTRY_TEXT_END and _etext has PAGE_SIZE
> > > > alignment, can't we have a PAGE mapped tail which then doesn't get cloned?
> > > >
> > > > Do we want to make pti_clone_entry_text() use PTI_LEVEL_KERNEL_IMAGE
> > > > such that it will clone whatever it has?
> > >
> > > Yes, I think so.
> >
> > The alternative is ripping that level thing out entirely, and simply
> > duplicate anything we find in the page-tables.
> >
>
> The patch below (on top of the previous one, because otherwise it doesn't
> apply) causes qemu to bail out hard, with
>
> ...
> [ 3.658327] sr 2:0:0:0: Attached scsi generic sg0 type 5
> [ 3.858040] sched_clock: Marking stable (3834034034, 23728553)->(3865222956, -7460369)
> [ 3.861469] registered taskstats version 1
> [ 3.861584] Loading compiled-in X.509 certificates
> [ 4.082031] Btrfs loaded, zoned=no, fsverity=no
> [ 4.096034] cryptomgr_test (69) used greatest stack depth: 6136 bytes left
>
> No backtrace or other message, it just exits immediately.
Ha, I hadn't even compiled the thing :-) I was just wondering aloud and
in patch form if the whole level thing was worth having in the first
place.
If it lives, I'll make sure to test it.
Thanks!
^ permalink raw reply [flat|nested] 59+ messages in thread
* [tip: x86/urgent] x86/mm: Fix pti_clone_pgtable() alignment assumption
2024-07-31 16:31 ` Peter Zijlstra
2024-07-31 16:50 ` Guenter Roeck
2024-07-31 16:51 ` Peter Zijlstra
@ 2024-08-01 10:55 ` tip-bot2 for Peter Zijlstra
2024-08-01 13:03 ` tip-bot2 for Peter Zijlstra
3 siblings, 0 replies; 59+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2024-08-01 10:55 UTC (permalink / raw)
To: linux-tip-commits
Cc: Guenter Roeck, Thomas Gleixner, Peter Zijlstra (Intel), x86,
linux-kernel
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 36e2dcf9840019de70e517a9df890fff316dd522
Gitweb: https://git.kernel.org/tip/36e2dcf9840019de70e517a9df890fff316dd522
Author: Peter Zijlstra <peterz@infradead.org>
AuthorDate: Wed, 31 Jul 2024 18:31:05 +02:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 01 Aug 2024 12:48:22 +02:00
x86/mm: Fix pti_clone_pgtable() alignment assumption
Guenter reported dodgy crashes on an i386-nosmp build using GCC-11
that had the form of endless traps until entry stack exhaust and then
#DF from the stack guard.
It turned out that pti_clone_pgtable() had alignment assumptions on
the start address, notably it hard assumes start is PMD aligned. This
is true on x86_64, but very much not true on i386.
These assumptions can cause the end condition to malfunction, leading
to a 'short' clone. Guess what happens when the user mapping has a
short copy of the entry text?
Use the correct increment form for addr to avoid alignment
assumptions.
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20240731163105.GG33588@noisy.programming.kicks-ass.net
---
arch/x86/mm/pti.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c
index 2e69abf..48c5032 100644
--- a/arch/x86/mm/pti.c
+++ b/arch/x86/mm/pti.c
@@ -374,14 +374,14 @@ pti_clone_pgtable(unsigned long start, unsigned long end,
*/
*target_pmd = *pmd;
- addr += PMD_SIZE;
+ addr = round_up(addr + 1, PMD_SIZE);
} else if (level == PTI_CLONE_PTE) {
/* Walk the page-table down to the pte level */
pte = pte_offset_kernel(pmd, addr);
if (pte_none(*pte)) {
- addr += PAGE_SIZE;
+ addr = round_up(addr + 1, PAGE_SIZE);
continue;
}
@@ -401,7 +401,7 @@ pti_clone_pgtable(unsigned long start, unsigned long end,
/* Clone the PTE */
*target_pte = *pte;
- addr += PAGE_SIZE;
+ addr = round_up(addr + 1, PAGE_SIZE);
} else {
BUG();
^ permalink raw reply related [flat|nested] 59+ messages in thread
* [tip: x86/urgent] x86/mm: Fix pti_clone_pgtable() alignment assumption
2024-07-31 16:31 ` Peter Zijlstra
` (2 preceding siblings ...)
2024-08-01 10:55 ` [tip: x86/urgent] x86/mm: Fix pti_clone_pgtable() alignment assumption tip-bot2 for Peter Zijlstra
@ 2024-08-01 13:03 ` tip-bot2 for Peter Zijlstra
3 siblings, 0 replies; 59+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2024-08-01 13:03 UTC (permalink / raw)
To: linux-tip-commits
Cc: Guenter Roeck, Thomas Gleixner, Peter Zijlstra (Intel), x86,
linux-kernel
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 41e71dbb0e0a0fe214545fe64af031303a08524c
Gitweb: https://git.kernel.org/tip/41e71dbb0e0a0fe214545fe64af031303a08524c
Author: Peter Zijlstra <peterz@infradead.org>
AuthorDate: Wed, 31 Jul 2024 18:31:05 +02:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 01 Aug 2024 14:52:56 +02:00
x86/mm: Fix pti_clone_pgtable() alignment assumption
Guenter reported dodgy crashes on an i386-nosmp build using GCC-11
that had the form of endless traps until entry stack exhaust and then
#DF from the stack guard.
It turned out that pti_clone_pgtable() had alignment assumptions on
the start address, notably it hard assumes start is PMD aligned. This
is true on x86_64, but very much not true on i386.
These assumptions can cause the end condition to malfunction, leading
to a 'short' clone. Guess what happens when the user mapping has a
short copy of the entry text?
Use the correct increment form for addr to avoid alignment
assumptions.
Fixes: 16a3fe634f6a ("x86/mm/pti: Clone kernel-image on PTE level for 32 bit")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20240731163105.GG33588@noisy.programming.kicks-ass.net
---
arch/x86/mm/pti.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c
index 2e69abf..48c5032 100644
--- a/arch/x86/mm/pti.c
+++ b/arch/x86/mm/pti.c
@@ -374,14 +374,14 @@ pti_clone_pgtable(unsigned long start, unsigned long end,
*/
*target_pmd = *pmd;
- addr += PMD_SIZE;
+ addr = round_up(addr + 1, PMD_SIZE);
} else if (level == PTI_CLONE_PTE) {
/* Walk the page-table down to the pte level */
pte = pte_offset_kernel(pmd, addr);
if (pte_none(*pte)) {
- addr += PAGE_SIZE;
+ addr = round_up(addr + 1, PAGE_SIZE);
continue;
}
@@ -401,7 +401,7 @@ pti_clone_pgtable(unsigned long start, unsigned long end,
/* Clone the PTE */
*target_pte = *pte;
- addr += PAGE_SIZE;
+ addr = round_up(addr + 1, PAGE_SIZE);
} else {
BUG();
* Re: Linux 6.11-rc1
2024-07-29 15:29 ` Linux 6.11-rc1 Guenter Roeck
2024-07-29 19:23 ` Linus Torvalds
2024-07-30 17:04 ` Guenter Roeck
@ 2024-08-02 17:35 ` Linus Walleij
2024-08-02 19:40 ` Guenter Roeck
2 siblings, 1 reply; 59+ messages in thread
From: Linus Walleij @ 2024-08-02 17:35 UTC (permalink / raw)
To: Guenter Roeck, Rob Herring; +Cc: Linus Torvalds, Linux Kernel Mailing List
On Mon, Jul 29, 2024 at 5:29 PM Guenter Roeck <linux@roeck-us.net> wrote:
> Failed tests:
> arm:versatilepb:versatile_defconfig:aeabi:pci:scsi:mem128:net=default:versatile-pb:ext2
> arm:versatilepb:versatile_defconfig:aeabi:pci:flash64:mem128:net=default:versatile-pb:ext2
> arm:versatilepb:versatile_defconfig:aeabi:pci:mem128:net=default:versatile-pb:initrd
> arm:versatileab:versatile_defconfig:mem128:net=default:versatile-ab:initrd
I traced these fails down to:
commit 04f08ef291d4b8d76f8d198bf2929ad43b96eecf
"arm/arm64: dts: arm: Use generic clock and regulator nodenames"
The following oneliner fixes it:
diff --git a/arch/arm/boot/dts/arm/versatile-ab.dts b/arch/arm/boot/dts/arm/versatile-ab.dts
index 6fe6b49f5d8e..289c3d093579 100644
--- a/arch/arm/boot/dts/arm/versatile-ab.dts
+++ b/arch/arm/boot/dts/arm/versatile-ab.dts
@@ -157,7 +157,7 @@ timclk: clock-1000000 {
clocks = <&xtal24mhz>;
};
- pclk: clock-24000000 {
+ pclk: pclk@24M {
#clock-cells = <0>;
compatible = "fixed-factor-clock";
clock-div = <1>;
(versatile-ab is included by versatile-pb hence it regresses)
The problem is: I don't know why.
Rob: any ideas? (Perhaps some uglyhack of mine, I don't know.)
If nothing comes up I'll send an "unknown cause" oneliner revert.
Yours,
Linus Walleij
* Re: Linux 6.11-rc1
2024-08-02 17:35 ` Linus Walleij
@ 2024-08-02 19:40 ` Guenter Roeck
0 siblings, 0 replies; 59+ messages in thread
From: Guenter Roeck @ 2024-08-02 19:40 UTC (permalink / raw)
To: Linus Walleij, Rob Herring; +Cc: Linus Torvalds, Linux Kernel Mailing List
On 8/2/24 10:35, Linus Walleij wrote:
> On Mon, Jul 29, 2024 at 5:29 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
>> Failed tests:
>> arm:versatilepb:versatile_defconfig:aeabi:pci:scsi:mem128:net=default:versatile-pb:ext2
>> arm:versatilepb:versatile_defconfig:aeabi:pci:flash64:mem128:net=default:versatile-pb:ext2
>> arm:versatilepb:versatile_defconfig:aeabi:pci:mem128:net=default:versatile-pb:initrd
>> arm:versatileab:versatile_defconfig:mem128:net=default:versatile-ab:initrd
>
> I traced these fails down to:
> commit 04f08ef291d4b8d76f8d198bf2929ad43b96eecf
> "arm/arm64: dts: arm: Use generic clock and regulator nodenames"
>
> The following oneliner fixes it:
>
> diff --git a/arch/arm/boot/dts/arm/versatile-ab.dts b/arch/arm/boot/dts/arm/versatile-ab.dts
> index 6fe6b49f5d8e..289c3d093579 100644
> --- a/arch/arm/boot/dts/arm/versatile-ab.dts
> +++ b/arch/arm/boot/dts/arm/versatile-ab.dts
> @@ -157,7 +157,7 @@ timclk: clock-1000000 {
> clocks = <&xtal24mhz>;
> };
>
> - pclk: clock-24000000 {
> + pclk: pclk@24M {
> #clock-cells = <0>;
> compatible = "fixed-factor-clock";
> clock-div = <1>;
>
> (versatile-ab is included by versatile-pb hence it regresses)
>
> The problem is: I don't know why.
>
> Rob: any ideas? (Perhaps some uglyhack of mine, I don't know.)
>
Rob already sent a patch fixing the problem.
https://lore.kernel.org/r/20240730210030.2150467-2-robh@kernel.org
Guenter
Thread overview: 59+ messages
2024-07-28 21:40 Linux 6.11-rc1 Linus Torvalds
2024-07-29 9:28 ` Build regressions/improvements in v6.11-rc1 Geert Uytterhoeven
2024-07-29 9:35 ` Geert Uytterhoeven
2024-07-29 9:54 ` Arnd Bergmann
2024-07-29 10:07 ` Geert Uytterhoeven
2024-07-29 15:29 ` Linux 6.11-rc1 Guenter Roeck
2024-07-29 19:23 ` Linus Torvalds
2024-07-29 19:50 ` Linus Torvalds
2024-07-29 21:34 ` Arnd Bergmann
2024-07-29 23:47 ` Linus Torvalds
2024-07-30 15:47 ` Arnd Bergmann
2024-07-30 7:54 ` Peter Zijlstra
2024-07-31 15:45 ` Guenter Roeck
2024-07-30 17:04 ` Guenter Roeck
2024-07-30 17:20 ` Jens Axboe
2024-07-30 18:22 ` Guenter Roeck
2024-07-30 18:35 ` Jens Axboe
2024-07-30 18:54 ` Jens Axboe
2024-07-30 18:53 ` Linus Torvalds
2024-07-30 19:22 ` Peter Zijlstra
2024-07-30 19:31 ` Jens Axboe
2024-07-30 19:34 ` Jens Axboe
2024-07-30 19:38 ` Peter Zijlstra
2024-07-30 19:41 ` Linus Torvalds
2024-07-30 20:04 ` Guenter Roeck
2024-07-30 20:09 ` Peter Zijlstra
2024-07-30 21:12 ` Peter Zijlstra
2024-07-30 23:29 ` Guenter Roeck
2024-07-30 23:54 ` Linus Torvalds
2024-07-31 8:21 ` Borislav Petkov
2024-07-31 9:11 ` Peter Zijlstra
2024-07-31 10:02 ` Borislav Petkov
2024-07-31 14:37 ` Guenter Roeck
2024-07-31 13:24 ` Jens Axboe
2024-07-30 20:13 ` Linus Torvalds
2024-07-30 20:24 ` Guenter Roeck
2024-07-31 12:20 ` Peter Zijlstra
2024-07-31 13:03 ` Thomas Gleixner
2024-07-31 15:55 ` Peter Zijlstra
2024-07-31 16:17 ` Linus Torvalds
2024-07-31 16:31 ` Peter Zijlstra
2024-07-31 16:50 ` Guenter Roeck
2024-07-31 16:51 ` Peter Zijlstra
2024-07-31 17:26 ` Thomas Gleixner
2024-07-31 21:20 ` Peter Zijlstra
2024-07-31 21:23 ` Linus Torvalds
2024-07-31 21:26 ` Peter Zijlstra
2024-07-31 21:41 ` Linus Torvalds
2024-07-31 21:47 ` Thomas Gleixner
2024-07-31 22:22 ` Guenter Roeck
2024-08-01 8:54 ` Peter Zijlstra
2024-08-01 10:55 ` [tip: x86/urgent] x86/mm: Fix pti_clone_pgtable() alignment assumption tip-bot2 for Peter Zijlstra
2024-08-01 13:03 ` tip-bot2 for Peter Zijlstra
2024-07-31 16:49 ` Linux 6.11-rc1 Guenter Roeck
2024-07-31 17:19 ` Thomas Gleixner
2024-07-31 10:33 ` Peter Zijlstra
2024-07-31 14:15 ` Peter Zijlstra
2024-08-02 17:35 ` Linus Walleij
2024-08-02 19:40 ` Guenter Roeck