* Re: [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective) [not found] ` <675a8893-429d-05be-b647-089b249c814c@leemhuis.info> @ 2023-06-22 12:36 ` Michael Ellerman 2023-06-22 14:38 ` Limonciello, Mario 0 siblings, 1 reply; 7+ messages in thread From: Michael Ellerman @ 2023-06-22 12:36 UTC (permalink / raw) To: Linux regressions mailing list, Sachin Sant Cc: open list, linuxppc-dev, jarkko, mario.limonciello, linux-integrity "Linux regression tracking (Thorsten Leemhuis)" <regressions@leemhuis.info> writes: > Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting > for once, to make this easily accessible to everyone. > > As Linus will likely release 6.4 on this or the following Sunday a quick > question: is there any hope this regression might be fixed any time > soon? No. I have added the author of the commit to Cc, maybe they can help? The immediate question is, is it expected for chip->ops to be NULL in this path? Obviously on actual AMD systems that isn't the case, otherwise the code would crash there. But is the fact that chip->ops is NULL a bug in the ibmvtpm driver, or a possibility that has been overlooked by the checking code. cheers > Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat) > -- > Everything you wanna know about Linux kernel regression tracking: > https://linux-regtracking.leemhuis.info/about/#tldr > If I did something stupid, please tell me, as explained on that page. > > #regzbot poke > > On 15.06.23 06:57, Sachin Sant wrote: >> >>>> [ 34.381788] Code: 5463063e 408201c8 38210080 4e800020 60000000 60000000 60000000 7c0802a6 fbe10078 7c7f1b78 f8010090 e9230728 <e9890050> 2c2c0000 41820020 7d8903a6 >>> >>> 2c: 28 07 23 e9 ld r9,1832(r3) >>> 30: 50 00 89 e9 ld r12,80(r9) >>> >>> Where r3 is *chip. >>> r9 is NULL, and 80 = 0x50. 
>>> >>> Looks like a NULL chip->ops, which oopses in: >>> >>> static int tpm_request_locality(struct tpm_chip *chip) >>> { >>> int rc; >>> >>> if (!chip->ops->request_locality) >>> >>> >>> Can you test the patch below? >>> >> >> It proceeds further but then run into following crash >> >> [ 103.269574] Kernel attempted to read user page (18) - exploit attempt? (uid: 0) >> [ 103.269589] BUG: Kernel NULL pointer dereference on read at 0x00000018 >> [ 103.269595] Faulting instruction address: 0xc0000000009dcf34 >> [ 103.269599] Oops: Kernel access of bad area, sig: 11 [#1] >> [ 103.269602] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries >> [ 103.269606] Modules linked in: dm_mod(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bonding(E) tls(E) rfkill(E) ip_set(E) sunrpc(E) nf_tables(E) nfnetlink(E) pseries_rng(E) aes_gcm_p10_crypto(E) drm(E) drm_panel_orientation_quirks(E) xfs(E) libcrc32c(E) sd_mod(E) sr_mod(E) t10_pi(E) crc64_rocksoft_generic(E) cdrom(E) crc64_rocksoft(E) crc64(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) vmx_crypto(E) fuse(E) >> [ 103.269644] CPU: 18 PID: 6872 Comm: kexec Kdump: loaded Tainted: G E 6.4.0-rc6-dirty #8 >> [ 103.269649] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries >> [ 103.269653] NIP: c0000000009dcf34 LR: c0000000009dd2bc CTR: c0000000009eaa60 >> [ 103.269656] REGS: c0000000a113f510 TRAP: 0300 Tainted: G E (6.4.0-rc6-dirty) >> [ 103.269660] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 88484886 XER: 00000001 >> [ 103.269669] CFAR: c0000000009dd2b8 DAR: 0000000000000018 DSISR: 40000000 IRQMASK: 0 [ 103.269669] GPR00: c0000000009dd2bc c0000000a113f7b0 c0000000014a1500 c000000090310000 [ 103.269669] GPR04: c00000009f770000 0000000000000016 0000060000007a01 0000000000000016 [ 
103.269669] GPR08: c00000009f770000 0000000000000000 0000000000000000 0000000000008000 [ 103.269669] GPR12: c0000000009eaa60 c00000135fab7f00 0000000000000000 0000000000000000 [ 103.269669] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 103.269669] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 103.269669] GPR24: 0000000000000000 0000000000000016 c000000090310000 0000000000001000 [ 103.269669] GPR28: c00000009f770000 000000007a010000 c00000009f770000 c000000090310000 [ 103.269707] NIP [c0000000009dcf34] tpm_try_transmit+0x74/0x300 >> [ 103.269713] LR [c0000000009dd2bc] tpm_transmit+0xfc/0x190 >> [ 103.269717] Call Trace: >> [ 103.269718] [c0000000a113f7b0] [c0000000a113f880] 0xc0000000a113f880 (unreliable) >> [ 103.269724] [c0000000a113f840] [c0000000009dd2bc] tpm_transmit+0xfc/0x190 >> [ 103.269727] [c0000000a113f900] [c0000000009dd398] tpm_transmit_cmd+0x48/0x110 >> [ 103.269731] [c0000000a113f980] [c0000000009df1b0] tpm2_get_tpm_pt+0x140/0x230 >> [ 103.269736] [c0000000a113fa20] [c0000000009db208] tpm_amd_is_rng_defective+0xb8/0x250 >> [ 103.269739] [c0000000a113faa0] [c0000000009db828] tpm_chip_unregister+0x138/0x160 >> [ 103.269743] [c0000000a113fae0] [c0000000009eaa94] tpm_ibmvtpm_remove+0x34/0x130 >> [ 103.269748] [c0000000a113fb50] [c000000000115738] vio_bus_remove+0x58/0xd0 >> [ 103.269754] [c0000000a113fb90] [c000000000a01dcc] device_shutdown+0x21c/0x39c >> [ 103.269758] [c0000000a113fc20] [c0000000001a2684] kernel_restart_prepare+0x54/0x70 >> [ 103.269762] [c0000000a113fc40] [c000000000292c48] kernel_kexec+0xa8/0x100 >> [ 103.269766] [c0000000a113fcb0] [c0000000001a2cd4] __do_sys_reboot+0x214/0x2c0 >> [ 103.269770] [c0000000a113fe10] [c000000000034adc] system_call_exception+0x13c/0x340 >> [ 103.269776] [c0000000a113fe50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec >> [ 103.269781] --- interrupt: 3000 at 0x7fff805459f0 >> [ 103.269784] NIP: 00007fff805459f0 LR: 0000000000000000 
CTR: 0000000000000000 >> [ 103.269786] REGS: c0000000a113fe80 TRAP: 3000 Tainted: G E (6.4.0-rc6-dirty) >> [ 103.269790] MSR: 800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 42422884 XER: 00000000 >> [ 103.269799] IRQMASK: 0 [ 103.269799] GPR00: 0000000000000058 00007fffc07a68c0 0000000110437f00 fffffffffee1dead [ 103.269799] GPR04: 0000000028121969 0000000045584543 0000000000000000 0000000000000003 [ 103.269799] GPR08: 0000000000100000 0000000000000000 0000000000000000 0000000000000000 [ 103.269799] GPR12: 0000000000000000 00007fff8089b2c0 000000011042f598 0000000000000000 [ 103.269799] GPR16: ffffffffffffffff 0000000000000000 000000011040fcc0 0000000000000000 [ 103.269799] GPR20: 0000000000008913 0000000000008914 0000000149c61020 0000000000000003 [ 103.269799] GPR24: 0000000000000000 0000000000000001 0000000000000003 00007fffc07a6a40 [ 103.269799] GPR28: 0000000110409f10 00007fff806419c0 0000000149c61080 0000000149c61040 [ 103.269833] NIP [00007fff805459f0] 0x7fff805459f0 >> [ 103.269836] LR [0000000000000000] 0x0 >> [ 103.269838] --- interrupt: 3000 >> [ 103.269839] Code: 83a40006 2c090000 41820208 7c0802a6 79250020 7c25d840 f80100a0 41810224 fbe10088 f8410018 7c7f1b78 e9230728 <e9890018> 7d8903a6 4e800421 e8410018 [ 103.269852] ---[ end trace 0000000000000000 ]--- >> >> - Sachin ^ permalink raw reply [flat|nested] 7+ messages in thread
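The quoted disassembly loads chip->ops into r9 and then reads a member at a fixed offset from it; with r9 NULL, the faulting address is simply that member's offset (0x50 in the first oops, 0x18 in the second). A minimal sketch of that arithmetic follows. The struct demo_ops table and the helper name are hypothetical, chosen only for illustration, and are not the kernel's struct tpm_class_ops layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ops table for illustration only; member names and layout
 * are assumptions, not the kernel's struct tpm_class_ops. */
struct demo_ops {
	int (*first)(void);
	int (*second)(void);
	int (*third)(void);
};

/* Address a load touches when it reads a member located `offset` bytes
 * into an object whose base address is `base`. With base == 0 (a NULL
 * pointer) the result is just the member offset, which is what the oops
 * reports as the faulting data address. */
static uintptr_t member_load_address(uintptr_t base, size_t offset)
{
	return base + (uintptr_t)offset;
}
```

For example, member_load_address(0, offsetof(struct demo_ops, third)) gives the address a read of ops->third would fault on when ops is NULL.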
* Re: [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective) 2023-06-22 12:36 ` [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective) Michael Ellerman @ 2023-06-22 14:38 ` Limonciello, Mario 2023-06-23 2:52 ` Sachin Sant 2023-06-29 17:06 ` Jerry Snitselaar 0 siblings, 2 replies; 7+ messages in thread From: Limonciello, Mario @ 2023-06-22 14:38 UTC (permalink / raw) To: Michael Ellerman, Linux regressions mailing list, Sachin Sant Cc: open list, linuxppc-dev, jarkko, linux-integrity On 6/22/2023 7:36 AM, Michael Ellerman wrote: > "Linux regression tracking (Thorsten Leemhuis)" <regressions@leemhuis.info> writes: >> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting >> for once, to make this easily accessible to everyone. >> >> As Linus will likely release 6.4 on this or the following Sunday a quick >> question: is there any hope this regression might be fixed any time >> soon? > No. > > I have added the author of the commit to Cc, maybe they can help? > > The immediate question is, is it expected for chip->ops to be NULL in > this path? Obviously on actual AMD systems that isn't the case, > otherwise the code would crash there. But is the fact that chip->ops is > NULL a bug in the ibmvtpm driver, or a possibility that has been > overlooked by the checking code. > > cheers All that code assumes that the TPM is still functional which seems not to be the case for your TPM. 
This should fix it: diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c index 5be91591cb3b..7082b031741e 100644 --- a/drivers/char/tpm/tpm-chip.c +++ b/drivers/char/tpm/tpm-chip.c @@ -525,6 +525,9 @@ static bool tpm_amd_is_rng_defective(struct tpm_chip *chip) u64 version; int ret; + if (!chip->ops) + return false; + if (!(chip->flags & TPM_CHIP_FLAG_TPM2)) return false; >> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat) >> -- >> Everything you wanna know about Linux kernel regression tracking: >> https://linux-regtracking.leemhuis.info/about/#tldr >> If I did something stupid, please tell me, as explained on that page. >> >> #regzbot poke >> >> On 15.06.23 06:57, Sachin Sant wrote: >>>>> [ 34.381788] Code: 5463063e 408201c8 38210080 4e800020 60000000 60000000 60000000 7c0802a6 fbe10078 7c7f1b78 f8010090 e9230728 <e9890050> 2c2c0000 41820020 7d8903a6 >>>> 2c: 28 07 23 e9 ld r9,1832(r3) >>>> 30: 50 00 89 e9 ld r12,80(r9) >>>> >>>> Where r3 is *chip. >>>> r9 is NULL, and 80 = 0x50. >>>> >>>> Looks like a NULL chip->ops, which oopses in: >>>> >>>> static int tpm_request_locality(struct tpm_chip *chip) >>>> { >>>> int rc; >>>> >>>> if (!chip->ops->request_locality) >>>> >>>> >>>> Can you test the patch below? >>>> >>> It proceeds further but then run into following crash >>> >>> [ 103.269574] Kernel attempted to read user page (18) - exploit attempt? 
(uid: 0) >>> [ 103.269589] BUG: Kernel NULL pointer dereference on read at 0x00000018 >>> [ 103.269595] Faulting instruction address: 0xc0000000009dcf34 >>> [ 103.269599] Oops: Kernel access of bad area, sig: 11 [#1] >>> [ 103.269602] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries >>> [ 103.269606] Modules linked in: dm_mod(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bonding(E) tls(E) rfkill(E) ip_set(E) sunrpc(E) nf_tables(E) nfnetlink(E) pseries_rng(E) aes_gcm_p10_crypto(E) drm(E) drm_panel_orientation_quirks(E) xfs(E) libcrc32c(E) sd_mod(E) sr_mod(E) t10_pi(E) crc64_rocksoft_generic(E) cdrom(E) crc64_rocksoft(E) crc64(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) vmx_crypto(E) fuse(E) >>> [ 103.269644] CPU: 18 PID: 6872 Comm: kexec Kdump: loaded Tainted: G E 6.4.0-rc6-dirty #8 >>> [ 103.269649] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries >>> [ 103.269653] NIP: c0000000009dcf34 LR: c0000000009dd2bc CTR: c0000000009eaa60 >>> [ 103.269656] REGS: c0000000a113f510 TRAP: 0300 Tainted: G E (6.4.0-rc6-dirty) >>> [ 103.269660] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 88484886 XER: 00000001 >>> [ 103.269669] CFAR: c0000000009dd2b8 DAR: 0000000000000018 DSISR: 40000000 IRQMASK: 0 [ 103.269669] GPR00: c0000000009dd2bc c0000000a113f7b0 c0000000014a1500 c000000090310000 [ 103.269669] GPR04: c00000009f770000 0000000000000016 0000060000007a01 0000000000000016 [ 103.269669] GPR08: c00000009f770000 0000000000000000 0000000000000000 0000000000008000 [ 103.269669] GPR12: c0000000009eaa60 c00000135fab7f00 0000000000000000 0000000000000000 [ 103.269669] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 103.269669] GPR20: 0000000000000000 0000000000000000 0000000000000000 
0000000000000000 [ 103.269669] GPR24: 0000000000000000 0000000000000016 c000000090310000 0000000000001000 [ 103.269669] GPR28: c00000009f770000 000000007a010000 c00000009f770000 c000000090310000 [ 103.269707] NIP [c0000000009dcf34] tpm_try_transmit+0x74/0x300 >>> [ 103.269713] LR [c0000000009dd2bc] tpm_transmit+0xfc/0x190 >>> [ 103.269717] Call Trace: >>> [ 103.269718] [c0000000a113f7b0] [c0000000a113f880] 0xc0000000a113f880 (unreliable) >>> [ 103.269724] [c0000000a113f840] [c0000000009dd2bc] tpm_transmit+0xfc/0x190 >>> [ 103.269727] [c0000000a113f900] [c0000000009dd398] tpm_transmit_cmd+0x48/0x110 >>> [ 103.269731] [c0000000a113f980] [c0000000009df1b0] tpm2_get_tpm_pt+0x140/0x230 >>> [ 103.269736] [c0000000a113fa20] [c0000000009db208] tpm_amd_is_rng_defective+0xb8/0x250 >>> [ 103.269739] [c0000000a113faa0] [c0000000009db828] tpm_chip_unregister+0x138/0x160 >>> [ 103.269743] [c0000000a113fae0] [c0000000009eaa94] tpm_ibmvtpm_remove+0x34/0x130 >>> [ 103.269748] [c0000000a113fb50] [c000000000115738] vio_bus_remove+0x58/0xd0 >>> [ 103.269754] [c0000000a113fb90] [c000000000a01dcc] device_shutdown+0x21c/0x39c >>> [ 103.269758] [c0000000a113fc20] [c0000000001a2684] kernel_restart_prepare+0x54/0x70 >>> [ 103.269762] [c0000000a113fc40] [c000000000292c48] kernel_kexec+0xa8/0x100 >>> [ 103.269766] [c0000000a113fcb0] [c0000000001a2cd4] __do_sys_reboot+0x214/0x2c0 >>> [ 103.269770] [c0000000a113fe10] [c000000000034adc] system_call_exception+0x13c/0x340 >>> [ 103.269776] [c0000000a113fe50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec >>> [ 103.269781] --- interrupt: 3000 at 0x7fff805459f0 >>> [ 103.269784] NIP: 00007fff805459f0 LR: 0000000000000000 CTR: 0000000000000000 >>> [ 103.269786] REGS: c0000000a113fe80 TRAP: 3000 Tainted: G E (6.4.0-rc6-dirty) >>> [ 103.269790] MSR: 800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 42422884 XER: 00000000 >>> [ 103.269799] IRQMASK: 0 [ 103.269799] GPR00: 0000000000000058 00007fffc07a68c0 0000000110437f00 
fffffffffee1dead [ 103.269799] GPR04: 0000000028121969 0000000045584543 0000000000000000 0000000000000003 [ 103.269799] GPR08: 0000000000100000 0000000000000000 0000000000000000 0000000000000000 [ 103.269799] GPR12: 0000000000000000 00007fff8089b2c0 000000011042f598 0000000000000000 [ 103.269799] GPR16: ffffffffffffffff 0000000000000000 000000011040fcc0 0000000000000000 [ 103.269799] GPR20: 0000000000008913 0000000000008914 0000000149c61020 0000000000000003 [ 103.269799] GPR24: 0000000000000000 0000000000000001 0000000000000003 00007fffc07a6a40 [ 103.269799] GPR28: 0000000110409f10 00007fff806419c0 0000000149c61080 0000000149c61040 [ 103.269833] NIP [00007fff805459f0] 0x7fff805459f0 >>> [ 103.269836] LR [0000000000000000] 0x0 >>> [ 103.269838] --- interrupt: 3000 >>> [ 103.269839] Code: 83a40006 2c090000 41820208 7c0802a6 79250020 7c25d840 f80100a0 41810224 fbe10088 f8410018 7c7f1b78 e9230728 <e9890018> 7d8903a6 4e800421 e8410018 [ 103.269852] ---[ end trace 0000000000000000 ]--- >>> >>> - Sachin ^ permalink raw reply related [flat|nested] 7+ messages in thread
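To illustrate why the early return in the patch above avoids the second oops, here is a small user-space sketch of the same guard ordering. Everything in it is a simplified stand-in: the two structs are cut down to the members the example needs, the flag value is illustrative, and demo_is_rng_defective() and demo_send_ok() are hypothetical names, not kernel functions. The "defective" result for a live TPM 2.0 chip is likewise a placeholder for the real firmware version check.

```c
#include <stdbool.h>
#include <stddef.h>

struct tpm_chip;

/* Simplified stand-in; the kernel's ops table has many more members. */
struct tpm_class_ops {
	int (*send)(struct tpm_chip *chip, unsigned char *buf, size_t len);
};

/* Simplified stand-in for the kernel's struct tpm_chip. */
struct tpm_chip {
	const struct tpm_class_ops *ops; /* NULL once the chip has been shut down */
	unsigned int flags;
};

/* Illustrative bit value; the kernel defines its own flag bits. */
#define TPM_CHIP_FLAG_TPM2 (1u << 1)

/* Hypothetical transport that always succeeds. */
static int demo_send_ok(struct tpm_chip *chip, unsigned char *buf, size_t len)
{
	(void)chip;
	(void)buf;
	(void)len;
	return 0;
}

static const struct tpm_class_ops demo_ops = { .send = demo_send_ok };

/* Same guard order as the patch: bail out before anything touches
 * chip->ops, then apply the TPM 2.0 check, and only then talk to the
 * device through the ops table. */
static bool demo_is_rng_defective(struct tpm_chip *chip)
{
	unsigned char cmd[1] = { 0 };

	if (!chip->ops)
		return false;

	if (!(chip->flags & TPM_CHIP_FLAG_TPM2))
		return false;

	/* Placeholder for the version query: the demo reports "defective"
	 * whenever the command could be sent. */
	return chip->ops->send(chip, cmd, sizeof(cmd)) == 0;
}
```

With ops set to NULL the function returns false without ever reaching the send call, which corresponds to the path that crashed in tpm_try_transmit.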
* Re: [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective) 2023-06-22 14:38 ` Limonciello, Mario @ 2023-06-23 2:52 ` Sachin Sant 2023-06-29 17:06 ` Jerry Snitselaar 1 sibling, 0 replies; 7+ messages in thread From: Sachin Sant @ 2023-06-23 2:52 UTC (permalink / raw) To: Limonciello, Mario Cc: Michael Ellerman, Linux regressions mailing list, open list, linuxppc-dev, jarkko, linux-integrity, Aneesh Kumar K.V > On 22-Jun-2023, at 8:08 PM, Limonciello, Mario <Mario.Limonciello@amd.com> wrote: > > > On 6/22/2023 7:36 AM, Michael Ellerman wrote: >> "Linux regression tracking (Thorsten Leemhuis)" <regressions@leemhuis.info> writes: >>> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting >>> for once, to make this easily accessible to everyone. >>> >>> As Linus will likely release 6.4 on this or the following Sunday a quick >>> question: is there any hope this regression might be fixed any time >>> soon? >> No. >> >> I have added the author of the commit to Cc, maybe they can help? >> >> The immediate question is, is it expected for chip->ops to be NULL in >> this path? Obviously on actual AMD systems that isn't the case, >> otherwise the code would crash there. But is the fact that chip->ops is >> NULL a bug in the ibmvtpm driver, or a possibility that has been >> overlooked by the checking code. >> >> cheers > > All that code assumes that the TPM is still functional which > seems not to be the case for your TPM. > > This should fix it: Yes, with this change kexec works correctly. Since Aneesh first reported this problem, I am including a Reported-by credit for him: Reported-by: Aneesh Kumar K. V <aneesh.kumar@linux.ibm.com> Reported-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: Sachin Sant <sachinp@linux.ibm.com> -Sachin ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective) 2023-06-22 14:38 ` Limonciello, Mario 2023-06-23 2:52 ` Sachin Sant @ 2023-06-29 17:06 ` Jerry Snitselaar 2023-06-29 17:28 ` Limonciello, Mario 1 sibling, 1 reply; 7+ messages in thread From: Jerry Snitselaar @ 2023-06-29 17:06 UTC (permalink / raw) To: Limonciello, Mario Cc: Michael Ellerman, Linux regressions mailing list, Sachin Sant, open list, linuxppc-dev, jarkko, linux-integrity On Thu, Jun 22, 2023 at 09:38:04AM -0500, Limonciello, Mario wrote: > > On 6/22/2023 7:36 AM, Michael Ellerman wrote: > > "Linux regression tracking (Thorsten Leemhuis)" <regressions@leemhuis.info> writes: > > > Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting > > > for once, to make this easily accessible to everyone. > > > > > > As Linus will likely release 6.4 on this or the following Sunday a quick > > > question: is there any hope this regression might be fixed any time > > > soon? > > No. > > > > I have added the author of the commit to Cc, maybe they can help? > > > > The immediate question is, is it expected for chip->ops to be NULL in > > this path? Obviously on actual AMD systems that isn't the case, > > otherwise the code would crash there. But is the fact that chip->ops is > > NULL a bug in the ibmvtpm driver, or a possibility that has been > > overlooked by the checking code. > > > > cheers > > All that code assumes that the TPM is still functional which > seems not to be the case for your TPM. 
> > This should fix it: > > diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c > index 5be91591cb3b..7082b031741e 100644 > --- a/drivers/char/tpm/tpm-chip.c > +++ b/drivers/char/tpm/tpm-chip.c > @@ -525,6 +525,9 @@ static bool tpm_amd_is_rng_defective(struct tpm_chip > *chip) > u64 version; > int ret; > > + if (!chip->ops) > + return false; > + > if (!(chip->flags & TPM_CHIP_FLAG_TPM2)) > return false; Should tpm_amd_is_rng_defective compile to nothing on non-x86 architectures? This code is all about working around an issue with the AMD fTPM, right? Regards, Jerry ^ permalink raw reply [flat|nested] 7+ messages in thread
* RE: [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective) 2023-06-29 17:06 ` Jerry Snitselaar @ 2023-06-29 17:28 ` Limonciello, Mario 2023-06-29 17:43 ` Jerry Snitselaar 0 siblings, 1 reply; 7+ messages in thread From: Limonciello, Mario @ 2023-06-29 17:28 UTC (permalink / raw) To: Jerry Snitselaar Cc: Michael Ellerman, Linux regressions mailing list, Sachin Sant, open list, linuxppc-dev, jarkko@kernel.org, linux-integrity@vger.kernel.org [Public] > -----Original Message----- > From: Jerry Snitselaar <jsnitsel@redhat.com> > Sent: Thursday, June 29, 2023 12:07 PM > To: Limonciello, Mario <Mario.Limonciello@amd.com> > Cc: Michael Ellerman <mpe@ellerman.id.au>; Linux regressions mailing list > <regressions@lists.linux.dev>; Sachin Sant <sachinp@linux.ibm.com>; open > list <linux-kernel@vger.kernel.org>; linuxppc-dev <linuxppc- > dev@lists.ozlabs.org>; jarkko@kernel.org; linux-integrity@vger.kernel.org > Subject: Re: [6.4-rc6] Crash during a kexec operation > (tpm_amd_is_rng_defective) > > On Thu, Jun 22, 2023 at 09:38:04AM -0500, Limonciello, Mario wrote: > > > > On 6/22/2023 7:36 AM, Michael Ellerman wrote: > > > "Linux regression tracking (Thorsten Leemhuis)" > <regressions@leemhuis.info> writes: > > > > Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting > > > > for once, to make this easily accessible to everyone. > > > > > > > > As Linus will likely release 6.4 on this or the following Sunday a quick > > > > question: is there any hope this regression might be fixed any time > > > > soon? > > > No. > > > > > > I have added the author of the commit to Cc, maybe they can help? > > > > > > The immediate question is, is it expected for chip->ops to be NULL in > > > this path? Obviously on actual AMD systems that isn't the case, > > > otherwise the code would crash there. But is the fact that chip->ops is > > > NULL a bug in the ibmvtpm driver, or a possibility that has been > > > overlooked by the checking code. 
> > > > > > cheers > > > > All that code assumes that the TPM is still functional which > > seems not to be the case for your TPM. > > > > This should fix it: > > > > diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c > > index 5be91591cb3b..7082b031741e 100644 > > --- a/drivers/char/tpm/tpm-chip.c > > +++ b/drivers/char/tpm/tpm-chip.c > > @@ -525,6 +525,9 @@ static bool tpm_amd_is_rng_defective(struct > tpm_chip > > *chip) > > u64 version; > > int ret; > > > > + if (!chip->ops) > > + return false; > > + > > if (!(chip->flags & TPM_CHIP_FLAG_TPM2)) > > return false; > > > Should tpm_amd_is_rng_defective compile to nothing on non-x86 > architectures? This code is all about > working around an issue with the AMD fTPM, right? > That's a good point. Yes it could and that would also solve this problem. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective) 2023-06-29 17:28 ` Limonciello, Mario @ 2023-06-29 17:43 ` Jerry Snitselaar 2023-06-29 17:45 ` Limonciello, Mario 0 siblings, 1 reply; 7+ messages in thread From: Jerry Snitselaar @ 2023-06-29 17:43 UTC (permalink / raw) To: Limonciello, Mario Cc: Michael Ellerman, Linux regressions mailing list, Sachin Sant, open list, linuxppc-dev, jarkko@kernel.org, linux-integrity@vger.kernel.org On Thu, Jun 29, 2023 at 05:28:58PM +0000, Limonciello, Mario wrote: > [Public] > > > -----Original Message----- > > From: Jerry Snitselaar <jsnitsel@redhat.com> > > Sent: Thursday, June 29, 2023 12:07 PM > > To: Limonciello, Mario <Mario.Limonciello@amd.com> > > Cc: Michael Ellerman <mpe@ellerman.id.au>; Linux regressions mailing list > > <regressions@lists.linux.dev>; Sachin Sant <sachinp@linux.ibm.com>; open > > list <linux-kernel@vger.kernel.org>; linuxppc-dev <linuxppc- > > dev@lists.ozlabs.org>; jarkko@kernel.org; linux-integrity@vger.kernel.org > > Subject: Re: [6.4-rc6] Crash during a kexec operation > > (tpm_amd_is_rng_defective) > > > > On Thu, Jun 22, 2023 at 09:38:04AM -0500, Limonciello, Mario wrote: > > > > > > On 6/22/2023 7:36 AM, Michael Ellerman wrote: > > > > "Linux regression tracking (Thorsten Leemhuis)" > > <regressions@leemhuis.info> writes: > > > > > Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting > > > > > for once, to make this easily accessible to everyone. > > > > > > > > > > As Linus will likely release 6.4 on this or the following Sunday a quick > > > > > question: is there any hope this regression might be fixed any time > > > > > soon? > > > > No. > > > > > > > > I have added the author of the commit to Cc, maybe they can help? > > > > > > > > The immediate question is, is it expected for chip->ops to be NULL in > > > > this path? Obviously on actual AMD systems that isn't the case, > > > > otherwise the code would crash there. 
But is the fact that chip->ops is > > > > NULL a bug in the ibmvtpm driver, or a possibility that has been > > > > overlooked by the checking code. > > > > > > > > cheers > > > > > > All that code assumes that the TPM is still functional which > > > seems not to be the case for your TPM. > > > > > > This should fix it: > > > > > > diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c > > > index 5be91591cb3b..7082b031741e 100644 > > > --- a/drivers/char/tpm/tpm-chip.c > > > +++ b/drivers/char/tpm/tpm-chip.c > > > @@ -525,6 +525,9 @@ static bool tpm_amd_is_rng_defective(struct > > tpm_chip > > > *chip) > > > u64 version; > > > int ret; > > > > > > + if (!chip->ops) > > > + return false; > > > + > > > if (!(chip->flags & TPM_CHIP_FLAG_TPM2)) > > > return false; > > > > > > Should tpm_amd_is_rng_defective compile to nothing on non-x86 > > architectures? This code is all about > > working around an issue with the AMD fTPM, right? > > > > That's a good point. Yes it could and that would also solve this problem. > Or I guess more accurately for non-x86 it should be: static bool tpm_amd_is_rng_defective(struct tpm_chip *chip) { return false; } ^ permalink raw reply [flat|nested] 7+ messages in thread
* RE: [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective) 2023-06-29 17:43 ` Jerry Snitselaar @ 2023-06-29 17:45 ` Limonciello, Mario 0 siblings, 0 replies; 7+ messages in thread From: Limonciello, Mario @ 2023-06-29 17:45 UTC (permalink / raw) To: Jerry Snitselaar Cc: Michael Ellerman, Linux regressions mailing list, Sachin Sant, open list, linuxppc-dev, jarkko@kernel.org, linux-integrity@vger.kernel.org [AMD Official Use Only - General] > -----Original Message----- > From: Jerry Snitselaar <jsnitsel@redhat.com> > Sent: Thursday, June 29, 2023 12:43 PM > To: Limonciello, Mario <Mario.Limonciello@amd.com> > Cc: Michael Ellerman <mpe@ellerman.id.au>; Linux regressions mailing list > <regressions@lists.linux.dev>; Sachin Sant <sachinp@linux.ibm.com>; open > list <linux-kernel@vger.kernel.org>; linuxppc-dev <linuxppc- > dev@lists.ozlabs.org>; jarkko@kernel.org; linux-integrity@vger.kernel.org > Subject: Re: [6.4-rc6] Crash during a kexec operation > (tpm_amd_is_rng_defective) > > On Thu, Jun 29, 2023 at 05:28:58PM +0000, Limonciello, Mario wrote: > > [Public] > > > > > -----Original Message----- > > > From: Jerry Snitselaar <jsnitsel@redhat.com> > > > Sent: Thursday, June 29, 2023 12:07 PM > > > To: Limonciello, Mario <Mario.Limonciello@amd.com> > > > Cc: Michael Ellerman <mpe@ellerman.id.au>; Linux regressions mailing list > > > <regressions@lists.linux.dev>; Sachin Sant <sachinp@linux.ibm.com>; > open > > > list <linux-kernel@vger.kernel.org>; linuxppc-dev <linuxppc- > > > dev@lists.ozlabs.org>; jarkko@kernel.org; linux-integrity@vger.kernel.org > > > Subject: Re: [6.4-rc6] Crash during a kexec operation > > > (tpm_amd_is_rng_defective) > > > > > > On Thu, Jun 22, 2023 at 09:38:04AM -0500, Limonciello, Mario wrote: > > > > > > > > On 6/22/2023 7:36 AM, Michael Ellerman wrote: > > > > > "Linux regression tracking (Thorsten Leemhuis)" > > > <regressions@leemhuis.info> writes: > > > > > > Hi, Thorsten here, the Linux kernel's regression tracker. 
Top-posting > > > > > > for once, to make this easily accessible to everyone. > > > > > > > > > > > > As Linus will likely release 6.4 on this or the following Sunday a quick > > > > > > question: is there any hope this regression might be fixed any time > > > > > > soon? > > > > > No. > > > > > > > > > > I have added the author of the commit to Cc, maybe they can help? > > > > > > > > > > The immediate question is, is it expected for chip->ops to be NULL in > > > > > this path? Obviously on actual AMD systems that isn't the case, > > > > > otherwise the code would crash there. But is the fact that chip->ops is > > > > > NULL a bug in the ibmvtpm driver, or a possibility that has been > > > > > overlooked by the checking code. > > > > > > > > > > cheers > > > > > > > > All that code assumes that the TPM is still functional which > > > > seems not to be the case for your TPM. > > > > > > > > This should fix it: > > > > > > > > diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c > > > > index 5be91591cb3b..7082b031741e 100644 > > > > --- a/drivers/char/tpm/tpm-chip.c > > > > +++ b/drivers/char/tpm/tpm-chip.c > > > > @@ -525,6 +525,9 @@ static bool tpm_amd_is_rng_defective(struct > > > tpm_chip > > > > *chip) > > > > u64 version; > > > > int ret; > > > > > > > > + if (!chip->ops) > > > > + return false; > > > > + > > > > if (!(chip->flags & TPM_CHIP_FLAG_TPM2)) > > > > return false; > > > > > > > > > Should tpm_amd_is_rng_defective compile to nothing on non-x86 > > > architectures? This code is all about > > > working around an issue with the AMD fTPM, right? > > > > > > > That's a good point. Yes it could and that would also solve this problem. > > > Or I guess more accurately for non-x86 it should be: > > static bool tpm_amd_is_rng_defective(struct tpm_chip *chip) > { > return false; > } Right, but it should be inline. Would you mind sending something out for your cleaner idea to supersede my other solution, which still hasn't been merged?
^ permalink raw reply [flat|nested] 7+ messages in thread
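The stub discussed in the last two messages could look roughly like the sketch below. It is an illustration of the idea only, not the patch that was eventually sent: the use of CONFIG_X86 as the guard is an assumption taken from the "non-x86" wording in the thread, and the x86 branch is left as a bare declaration because the real AMD fTPM check lives in the kernel source.

```c
#include <stdbool.h>
#include <stddef.h>

struct tpm_chip; /* opaque here; the TPM core defines it in the kernel */

#ifdef CONFIG_X86
/* On x86 the full AMD fTPM check would be built (body not shown here). */
bool tpm_amd_is_rng_defective(struct tpm_chip *chip);
#else
/* Everywhere else the check collapses to a constant, so callers such as
 * the unregister path never issue a TPM command from here. */
static inline bool tpm_amd_is_rng_defective(struct tpm_chip *chip)
{
	(void)chip; /* unused in the stub */
	return false;
}
#endif
```

Because the stub is static inline and returns a constant, a compiler can fold the call away at each call site, which is the effect of "compile to nothing" asked about earlier in the thread.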