* [REGRESSION] Unable to unlock encrypted disk starting with kernel 5.19-rc1+
@ 2022-06-28 5:13 Alexandre Messier
2022-06-28 9:20 ` Borislav Petkov
0 siblings, 1 reply; 7+ messages in thread
From: Alexandre Messier @ 2022-06-28 5:13 UTC (permalink / raw)
To: linux-kernel
Cc: tglx, Andrew.Cooper3, mingo, bp, dave.hansen, x86, regressions,
Alexandre Messier
Hello,
I tested 5.19-rc4 on my system that is currently running 5.18.0, and came
across an issue when unlocking the encrypted rootfs disk at startup. The error
message is:
device-mapper: reload ioctl on nvme0n1p3_crypt (254:0) failed: No such file or directory
The kernel log shows:
device-mapper: table: 254:0: crypt: Error allocating crypto tfm (-ENOENT)
device-mapper: ioctl: error adding target to table
I tested the previous 5.19-rcX, and the issue started happening with 5.19-rc1.
A bisection between 5.18.0 and 5.19-rc1 identifies the following commit:
8ad7e8f69695 ("x86/fpu/xsave: Support XSAVEC in the kernel")
I reverted that commit on top of 5.19-rc4, and unlocking the encrypted disk
works again.
Some more information about the system:
- CPU is AMD Ryzen 7 5700G
- Userspace is Debian Sid
- The encrypted disk setup is a default encrypted rootfs, as configured by the
standard Debian installer
Please let me know if more information is needed, or if some tests are needed
to be run.
Thanks,
Alex
#regzbot introduced 8ad7e8f69695
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [REGRESSION] Unable to unlock encrypted disk starting with kernel 5.19-rc1+
2022-06-28 5:13 [REGRESSION] Unable to unlock encrypted disk starting with kernel 5.19-rc1+ Alexandre Messier
@ 2022-06-28 9:20 ` Borislav Petkov
2022-06-28 16:52 ` Dave Hansen
2022-06-28 21:31 ` Alexandre Messier
0 siblings, 2 replies; 7+ messages in thread
From: Borislav Petkov @ 2022-06-28 9:20 UTC (permalink / raw)
To: Alexandre Messier
Cc: linux-kernel, tglx, Andrew.Cooper3, mingo, dave.hansen, x86, regressions

On Tue, Jun 28, 2022 at 01:13:30AM -0400, Alexandre Messier wrote:
> Please let me know if more information is needed, or if some tests are needed
> to be run.

Yeah, pls send /proc/cpuinfo and full dmesg - privately is fine too.

Also, it would be lovely if I were able to reproduce this on a machine
here but mine doesn't have a crypto rootfs.

Perhaps you can point me to the exact instructions you're running to
decrypt your rootfs and I can try to create a usb crypto disk and try to
reproduce it with them...

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette
* Re: [REGRESSION] Unable to unlock encrypted disk starting with kernel 5.19-rc1+
2022-06-28 9:20 ` Borislav Petkov
@ 2022-06-28 16:52 ` Dave Hansen
2022-06-28 21:31 ` Alexandre Messier
1 sibling, 0 replies; 7+ messages in thread
From: Dave Hansen @ 2022-06-28 16:52 UTC (permalink / raw)
To: Borislav Petkov, Alexandre Messier
Cc: linux-kernel, tglx, Andrew.Cooper3, mingo, dave.hansen, x86, regressions

First of all, thank you for bisecting this! I know those are a lot of
work.

That XSAVEC patch modifies the AVX register save/restore code. There is
a set of x86 AES acceleration instructions called AES-NI. Those
instructions use the AVX registers. So, it's at least a plausible
connection between that patch and your symptoms.

But, I don't think anyone's been able to reproduce what you're seeing
yet. The kernel XSAVE buffer formats also differ slightly between AMD
and Intel. That *should* be OK, but it might explain why I can't
reproduce this.

If you get a chance, could you apply this (ugly hackish) patch to the
userspace 'cryptsetup' utility and run it?

	https://sr71.net/~dave/intel/cryptsetup-memcmp.patch

On Ubuntu at least, it was as simple as:

	apt-get source cryptsetup
	apt-get build-dep cryptsetup
	cd cryptsetup-1.6.6
	./configure
	make

Then I could run:

	./src/cryptsetup benchmark --cipher=aes-xts --key-size=512

and

	./src/cryptsetup benchmark --cipher=aes-xts --key-size=256

With that patch applied, you should see some output like:

	# ./src/cryptsetup benchmark --cipher=aes-xts --key-size=512
	# Tests are approximate using memory only (no storage IO).
	memcmp12: 0
	memcmp23: 0
	memcmp13: 0
	memcmp12: -173
	memcmp23: 173
	memcmp13: 0
	# Algorithm | Key | Encryption | Decryption
	aes-xts 512b 4592.2 MiB/s 4192.0 MiB/s

The "memcmp13:" lines should both be 0. That means that an encryption
and decryption cycle didn't change the data. You *might* have to run
this in a loop if there's some kind of bad timing involved in triggering
the bug.

If you see a "memcmp13:" with something other than 0, that will narrow
things down and means we'll have a pretty quick reproducer that doesn't
involve luks which should speed things along.
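The suggestion above, running the patched benchmark in a loop and watching the "memcmp13:" lines, could be scripted roughly as follows. This is a minimal sketch and not part of the original patch: the function names `check_memcmp13` and `run_benchmark_loop` and the default of 100 runs are assumptions made for illustration, and the "memcmp13:" lines only exist when the debugging patch has been applied to cryptsetup.

```shell
#!/bin/sh
# check_memcmp13 (hypothetical helper): reads benchmark output on stdin.
# Exit status 0: every "memcmp13:" line reported 0.
# Exit status 1: at least one "memcmp13:" line was non-zero.
# Exit status 2: no "memcmp13:" line was seen (patch missing or run failed).
check_memcmp13() {
    awk '
        $1 == "memcmp13:" { seen = 1; if ($2 != 0) bad = 1 }
        END { if (!seen) exit 2; exit (bad ? 1 : 0) }
    '
}

# run_benchmark_loop (hypothetical helper): repeat the benchmark from the
# message above and stop at the first run that does not check out.
# Assumes it is started from the patched cryptsetup build directory.
run_benchmark_loop() {
    runs="${1:-100}"
    i=1
    while [ "$i" -le "$runs" ]; do
        if ! ./src/cryptsetup benchmark --cipher=aes-xts --key-size=512 | check_memcmp13; then
            echo "run $i: memcmp13 mismatch or missing output" >&2
            return 1
        fi
        i=$((i + 1))
    done
}
```

Calling `run_benchmark_loop 500` would then repeat the 512-bit benchmark up to 500 times. Treating missing output as a failure is a deliberate choice here, so that a benchmark that aborts early is not mistaken for a clean run.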
* Re: [REGRESSION] Unable to unlock encrypted disk starting with kernel 5.19-rc1+
2022-06-28 9:20 ` Borislav Petkov
2022-06-28 16:52 ` Dave Hansen
@ 2022-06-28 21:31 ` Alexandre Messier
2022-06-28 22:59 ` Thomas Gleixner
1 sibling, 1 reply; 7+ messages in thread
From: Alexandre Messier @ 2022-06-28 21:31 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-kernel, tglx, Andrew.Cooper3, mingo, dave.hansen, x86, regressions

On 2022-06-28 05:20, Borislav Petkov wrote:
> On Tue, Jun 28, 2022 at 01:13:30AM -0400, Alexandre Messier wrote:
>> Please let me know if more information is needed, or if some tests are needed
>> to be run.
>
> Yeah, pls send /proc/cpuinfo and full dmesg - privately is fine too.

Here is the cpuinfo output:

processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 25
model           : 80
model name      : AMD Ryzen 7 5700G with Radeon Graphics
stepping        : 0
microcode       : 0xa50000c
cpu MHz         : 3514.072
cache size      : 512 KB
physical id     : 0
siblings        : 16
core id         : 0
cpu cores       : 8
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 16
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm
bugs            : sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass
bogomips        : 7585.33
TLB size        : 2560 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]

And here is the dmesg output of 5.19-rc4 without the revert (taken from the
initramfs). I put it on a paste service since it is too big for email:

https://paste.debian.net/1245491/

>
> Also, it would be lovely if I were able to reproduce this on a machine
> here but mine doesn't have a crypto rootfs.
>
> Perhaps you can point me to the exact instructions you're running to
> decrypt your rootfs and I can try to create a usb crypto disk and try to
> reproduce it with them...

I set up an unencrypted Debian installation on another drive to be able to run
cryptsetup commands in userspace while using rc4, and was able to see the
issue. In an up-to-date Debian Sid installation (important, more on this below),
running these commands makes it possible to reproduce the issue:

	dd if=/dev/zero bs=1M count=20 of=./test.img
	sudo cryptsetup luksFormat ./test.img
	sudo cryptsetup luksOpen ./test.img test_crypt

The "luksOpen" will fail with the same error message I get on my main system.

It seems using the latest Debian Sid is important. At first, I was trying with
Debian Bullseye, but everything was working, even unlocking my main drive.

Could it be a difference due to the cryptsetup version? Sid is using 2.4.3,
while Bullseye is based on 2.3.7. I will try to compile cryptsetup 2.4.3 and
use it in a Bullseye system with kernel 5.19-rc4, to see if the issue occurs
in that setup.

Thanks,
Alex

>
> Thx.
>
* Re: [REGRESSION] Unable to unlock encrypted disk starting with kernel 5.19-rc1+
2022-06-28 21:31 ` Alexandre Messier
@ 2022-06-28 22:59 ` Thomas Gleixner
2022-06-28 23:24 ` Alexandre Messier
0 siblings, 1 reply; 7+ messages in thread
From: Thomas Gleixner @ 2022-06-28 22:59 UTC (permalink / raw)
To: Alexandre Messier, Borislav Petkov
Cc: linux-kernel, Andrew.Cooper3, mingo, dave.hansen, x86, regressions

Alexandre,

On Tue, Jun 28 2022 at 17:31, Alexandre Messier wrote:
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
> pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext
> fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl
> nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq
> monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave
> avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm
> sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce
> topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb
> cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall
> fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed
> adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1
> xsaves cqm_llc cqm_occup_llc cqm_mbm_total
> cqm_mbm_local

So this CPU supports XSAVEC and XSAVES which means the kernel uses
XSAVES as the kernel before that.

> And here is the dmesg output of 5.19-rc4 without the revert (taken from the
> initramfs). I put it on a paste service since it is too big for email:
>
> https://paste.debian.net/1245491/

[ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x200: 'Protection Keys User registers'
[ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
[ 0.000000] x86/fpu: xstate_offset[9]: 832, xstate_sizes[9]: 8
[ 0.000000] x86/fpu: Enabled xstate features 0x207, context size is 840 bytes, using 'compacted' format.

This is correct. Is there any difference on a 5.18 kernel or on 5.19-rc
with the commit reverted? I doubt that.

I'm completely puzzled and stared at the commit in question on and off,
but I can't spot the fail.

> I set up an unencrypted Debian installation on another drive to be able to run
> cryptsetup commands in userspace while using rc4, and was able to see the
> issue. In an up-to-date Debian Sid installation (important, more on this below),
> running these commands makes it possible to reproduce the issue:
>
> dd if=/dev/zero bs=1M count=20 of=./test.img
> sudo cryptsetup luksFormat ./test.img
> sudo cryptsetup luksOpen ./test.img test_crypt
>
> The "luksOpen" will fail with the same error message I get on my main system.
>
> It seems using the latest Debian Sid is important. At first, I was trying with
> Debian Bullseye, but everything was working, even unlocking my main drive.
>
> Could it be a difference due to the cryptsetup version? Sid is using 2.4.3,
> while Bullseye is based on 2.3.7. I will try to compile cryptsetup 2.4.3 and
> use it in a Bullseye system with kernel 5.19-rc4, to see if the issue occurs
> in that setup.

It might use a different crypto algorithm.

Still confused....

I'll have another look tomorrow morning with brain awake.

Thanks,

tglx
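One way to answer the question raised above (does the x86/fpu setup differ between 5.18 and 5.19-rc?) is to compare the "x86/fpu:" lines of two saved dmesg captures. The following is a minimal sketch under stated assumptions: the helper names `fpu_lines` and `same_fpu_setup` and the capture file names are hypothetical, and the logs are assumed to have been saved beforehand, for example with `dmesg > dmesg-5.18.txt` on each kernel.

```shell
#!/bin/sh
# fpu_lines FILE (hypothetical helper): print the x86/fpu lines of a saved
# dmesg capture with the leading "[ timestamp]" removed, so that two boots
# can be compared even though their timestamps differ.
fpu_lines() {
    grep 'x86/fpu:' "$1" | sed 's/^\[[^]]*] *//'
}

# same_fpu_setup FILE_A FILE_B (hypothetical helper): succeeds when both
# captures report the same features, offsets, context size and format.
same_fpu_setup() {
    a=$(fpu_lines "$1")
    b=$(fpu_lines "$2")
    [ "$a" = "$b" ]
}
```

For instance, `same_fpu_setup dmesg-5.18.txt dmesg-5.19-rc4.txt || echo "x86/fpu setup differs"` would flag a change in the reported xstate features or buffer format between the two boots.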
* Re: [REGRESSION] Unable to unlock encrypted disk starting with kernel 5.19-rc1+
2022-06-28 22:59 ` Thomas Gleixner
@ 2022-06-28 23:24 ` Alexandre Messier
2022-06-29 15:24 ` Dave Hansen
0 siblings, 1 reply; 7+ messages in thread
From: Alexandre Messier @ 2022-06-28 23:24 UTC (permalink / raw)
To: Thomas Gleixner, Borislav Petkov
Cc: linux-kernel, Andrew.Cooper3, mingo, dave.hansen, x86, regressions

On 2022-06-28 18:59, Thomas Gleixner wrote:
> Alexandre,
>
> On Tue, Jun 28 2022 at 17:31, Alexandre Messier wrote:
>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
>> pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext
>> fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl
>> nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq
>> monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave
>> avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm
>> sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce
>> topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb
>> cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall
>> fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed
>> adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1
>> xsaves cqm_llc cqm_occup_llc cqm_mbm_total
>> cqm_mbm_local
>
> So this CPU supports XSAVEC and XSAVES which means the kernel uses
> XSAVES as the kernel before that.
>
>> And here is the dmesg output of 5.19-rc4 without the revert (taken from the
>> initramfs). I put it on a paste service since it is too big for email:
>>
>> https://paste.debian.net/1245491/
>
> [ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
> [ 0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
> [ 0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
> [ 0.000000] x86/fpu: Supporting XSAVE feature 0x200: 'Protection Keys User registers'
> [ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
> [ 0.000000] x86/fpu: xstate_offset[9]: 832, xstate_sizes[9]: 8
> [ 0.000000] x86/fpu: Enabled xstate features 0x207, context size is 840 bytes, using 'compacted' format.
>
> This is correct. Is there any difference on a 5.18 kernel or on 5.19-rc
> with the commit reverted? I doubt that.
>
> I'm completely puzzled and stared at the commit in question on and off,
> but I can't spot the fail.
>
>> I set up an unencrypted Debian installation on another drive to be able to run
>> cryptsetup commands in userspace while using rc4, and was able to see the
>> issue. In an up-to-date Debian Sid installation (important, more on this below),
>> running these commands makes it possible to reproduce the issue:
>>
>> dd if=/dev/zero bs=1M count=20 of=./test.img
>> sudo cryptsetup luksFormat ./test.img
>> sudo cryptsetup luksOpen ./test.img test_crypt
>>
>> The "luksOpen" will fail with the same error message I get on my main system.
>>
>> It seems using the latest Debian Sid is important. At first, I was trying with
>> Debian Bullseye, but everything was working, even unlocking my main drive.
>>
>> Could it be a difference due to the cryptsetup version? Sid is using 2.4.3,
>> while Bullseye is based on 2.3.7. I will try to compile cryptsetup 2.4.3 and
>> use it in a Bullseye system with kernel 5.19-rc4, to see if the issue occurs
>> in that setup.
>
> It might use a different crypto algorithm.
>
> Still confused....
>
> I'll have another look tomorrow morning with brain awake.

Thomas, Borislav,

Well this is embarrassing...

I ran the test Dave sent in his email, and when running it on that unencrypted
Debian Sid installation with kernel 5.19-rc4, it failed too, but indicated that
"aes-xts" was not available...

It was right. I forgot to mention I am using a custom kernel config, and indeed
CRYPTO_XTS was not enabled. When I enabled it, the cryptsetup benchmark worked,
along with the test that previously failed with the test file.

So I enabled that option too on my main installation and I am now able to unlock
the drive like before. I don't know why it is needed now, but that fixed the issue.

Sorry again for the trouble, this was not a kernel regression, but my error.

Thanks,
Alex

#regzbot invalid: Missing kernel config, not kernel regression

>
> Thanks,
>
> tglx
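The root cause described above, a custom kernel config without CRYPTO_XTS, can be checked for before rebooting into a new kernel. The sketch below is a hedged illustration: the helper names `has_config_option` and `has_crypto_alg` are hypothetical, and it assumes a kernel config file is available (commonly `/boot/config-$(uname -r)` on Debian, or `/proc/config.gz` when the kernel exposes it).

```shell
#!/bin/sh
# has_config_option FILE OPTION (hypothetical helper): succeeds when OPTION
# is built in (=y) or built as a module (=m) in the kernel config FILE.
has_config_option() {
    grep -Eq "^$2=(y|m)$" "$1"
}

# has_crypto_alg FILE NAME (hypothetical helper): succeeds when FILE, in
# /proc/crypto format, contains an entry "name : NAME". Assumption to keep
# in mind: an algorithm that is built as a module, or only instantiated on
# first use, may be missing from /proc/crypto and still be available.
has_crypto_alg() {
    awk -v want="$2" '
        $1 == "name" && $3 == want { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$1"
}
```

Typical calls would be `has_config_option "/boot/config-$(uname -r)" CONFIG_CRYPTO_XTS` and `has_crypto_alg /proc/crypto 'xts(aes)'`. The config check is the more reliable of the two, for the reason given in the comment.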
* Re: [REGRESSION] Unable to unlock encrypted disk starting with kernel 5.19-rc1+
2022-06-28 23:24 ` Alexandre Messier
@ 2022-06-29 15:24 ` Dave Hansen
0 siblings, 0 replies; 7+ messages in thread
From: Dave Hansen @ 2022-06-29 15:24 UTC (permalink / raw)
To: Alexandre Messier, Thomas Gleixner, Borislav Petkov
Cc: linux-kernel, Andrew.Cooper3, mingo, dave.hansen, x86, regressions

On 6/28/22 16:24, Alexandre Messier wrote:
> Sorry again for the trouble, this was not a kernel regression, but my error.

Been there, done that!

I'm just glad we don't have anything to fix. :)
end of thread, other threads:[~2022-06-29 15:25 UTC | newest]

Thread overview: 7+ messages
2022-06-28 5:13 [REGRESSION] Unable to unlock encrypted disk starting with kernel 5.19-rc1+ Alexandre Messier
2022-06-28 9:20 ` Borislav Petkov
2022-06-28 16:52   ` Dave Hansen
2022-06-28 21:31   ` Alexandre Messier
2022-06-28 22:59     ` Thomas Gleixner
2022-06-28 23:24       ` Alexandre Messier
2022-06-29 15:24         ` Dave Hansen