* Re: WARNING: CPU: 1 PID: 83 at arch/x86/kernel/cpu/sgx/main.c:446 ksgxd+0x1b7/0x1d0
From: Paul Menzel @ 2022-08-19 16:02 UTC
To: Jarkko Sakkinen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86
Cc: linux-sgx, LKML

[Cc: +linux-sgx@vger.kernel.org]

On 19.08.22 at 15:19, Paul Menzel wrote:
> Dear Linux folks,
>
>
> On the Dell XPS 13 9370, Linux 5.18.16 prints the warning below:
>
> ```
> [    0.000000] Linux version 5.18.0-4-amd64 (debian-kernel@lists.debian.org) (gcc-11 (Debian 11.3.0-5) 11.3.0, GNU ld (GNU Binutils for Debian) 2.38.90.20220713) #1 SMP PREEMPT_DYNAMIC Debian 5.18.16-1 (2022-08-10)
> [    0.000000] Command line: BOOT_IMAGE=/vmlinuz-5.18.0-4-amd64 root=UUID=56f398e0-1e25-4fda-aa9f-611dece4b333 ro quiet
> […]
> [    0.000000] DMI: Dell Inc. XPS 13 9370/0RMYH9, BIOS 1.21.0 07/06/2022
> […]
> [    0.235418] sgx: EPC section 0x40200000-0x45f7ffff
> [    0.235853] ------------[ cut here ]------------
> [    0.235855] WARNING: CPU: 1 PID: 83 at arch/x86/kernel/cpu/sgx/main.c:446 ksgxd+0x1b7/0x1d0
> [    0.235861] Modules linked in:
> [    0.235862] CPU: 1 PID: 83 Comm: ksgxd Not tainted 5.18.0-4-amd64 #1 Debian 5.18.16-1
> [    0.235865] Hardware name: Dell Inc. XPS 13 9370/0RMYH9, BIOS 1.21.0 07/06/2022
> [    0.235866] RIP: 0010:ksgxd+0x1b7/0x1d0
> [    0.235869] Code: ff e9 f2 fe ff ff 48 89 df e8 55 56 0d 00 84 c0 0f 84 c3 fe ff ff 31 ff e8 c6 56 0d 00 84 c0 0f 85 94 fe ff ff e9 af fe ff ff <0f> 0b e9 7f fe ff ff e8 3d dd 93 00 66 66 2e 0f 1f 84 00 00 00 00
> [    0.235870] RSP: 0000:ffffaaed0097bed8 EFLAGS: 00010287
> [    0.235872] RAX: ffffaaed00431890 RBX: ffff9a323ccc8000 RCX: 0000000000000000
> [    0.235873] RDX: 0000000080000000 RSI: ffffaaed00431850 RDI: 00000000ffffffff
> [    0.235875] RBP: ffff9a31416ca080 R08: ffff9a31416cae40 R09: ffff9a31416cae40
> [    0.235876] R10: 0000000000000000 R11: 0000000000000001 R12: ffffaaed0006bce0
> [    0.235877] R13: ffff9a3140e9c480 R14: ffffffff9825ee60 R15: 0000000000000000
> [    0.235878] FS: 0000000000000000(0000) GS:ffff9a32e6640000(0000) knlGS:0000000000000000
> [    0.235880] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    0.235881] CR2: 0000000000000000 CR3: 00000001fbe10001 CR4: 00000000003706e0
> [    0.235882] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [    0.235883] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [    0.235884] Call Trace:
> [    0.235893] <TASK>
> [    0.235895] ? _raw_spin_lock_irqsave+0x24/0x60
> [    0.235900] ? _raw_spin_unlock_irqrestore+0x23/0x40
> [    0.235902] ? __kthread_parkme+0x36/0x90
> [    0.235905] kthread+0xe5/0x110
> [    0.235907] ? kthread_complete_and_exit+0x20/0x20
> [    0.235909] ret_from_fork+0x1f/0x30
> [    0.235914] </TASK>
> [    0.235915] ---[ end trace 0000000000000000 ]---
> ```
>
>
> Kind regards,
>
> Paul

* Re: WARNING: CPU: 1 PID: 83 at arch/x86/kernel/cpu/sgx/main.c:446 ksgxd+0x1b7/0x1d0
From: Dave Hansen @ 2022-08-19 18:28 UTC
To: Paul Menzel, Jarkko Sakkinen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86, Chatre, Reinette
Cc: linux-sgx, LKML

On 8/19/22 09:02, Paul Menzel wrote:
> On the Dell XPS 13 9370, Linux 5.18.16 prints the warning below:
>
> ```
> [    0.000000] Linux version 5.18.0-4-amd64
> (debian-kernel@lists.debian.org) (gcc-11 (Debian 11.3.0-5) 11.3.0, GNU
> ld (GNU Binutils for Debian) 2.38.90.20220713) #1 SMP PREEMPT_DYNAMIC
> Debian 5.18.16-1 (2022-08-10)
> [    0.000000] Command line: BOOT_IMAGE=/vmlinuz-5.18.0-4-amd64
> root=UUID=56f398e0-1e25-4fda-aa9f-611dece4b333 ro quiet
> […]
> [    0.000000] DMI: Dell Inc. XPS 13 9370/0RMYH9, BIOS 1.21.0 07/06/2022
> […]
> [    0.235418] sgx: EPC section 0x40200000-0x45f7ffff

Hi Paul,

Would you be able to send the entire dmesg, along with:

	cat /proc/iomem # (as root)
and
	cpuid -1 --raw

I suspect a BIOS problem. Reinette (cc'd) also thought this
might be a case of the SGX initialization getting a bit too far along
when it should have been disabled.

We had some bugs where we didn't stop fast enough after spitting out the
"SGX Launch Control is locked..." errors.

* Re: WARNING: CPU: 1 PID: 83 at arch/x86/kernel/cpu/sgx/main.c:446 ksgxd+0x1b7/0x1d0
From: Paul Menzel @ 2022-08-20 6:13 UTC
To: Dave Hansen, Jarkko Sakkinen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86, Reinette Chatre
Cc: linux-sgx, LKML

Dear Dave,


Thank you for your quick reply.

On 19.08.22 at 20:28, Dave Hansen wrote:
> On 8/19/22 09:02, Paul Menzel wrote:
>> On the Dell XPS 13 9370, Linux 5.18.16 prints the warning below:
>>
>> ```
>> [    0.000000] Linux version 5.18.0-4-amd64 (debian-kernel@lists.debian.org) (gcc-11 (Debian 11.3.0-5) 11.3.0, GNU ld (GNU Binutils for Debian) 2.38.90.20220713) #1 SMP PREEMPT_DYNAMIC Debian 5.18.16-1 (2022-08-10)
>> [    0.000000] Command line: BOOT_IMAGE=/vmlinuz-5.18.0-4-amd64 root=UUID=56f398e0-1e25-4fda-aa9f-611dece4b333 ro quiet
>> […]
>> [    0.000000] DMI: Dell Inc. XPS 13 9370/0RMYH9, BIOS 1.21.0 07/06/2022
>> […]
>> [    0.235418] sgx: EPC section 0x40200000-0x45f7ffff

> Would you be able to send the entire dmesg, along with:

The log messages are attached to the first message, where I forgot to
carbon-copy linux-sgx@ [1].

> 	cat /proc/iomem # (as root)
> and
> 	cpuid -1 --raw

I am going to provide that next week. (Side note, Intel might have some
Dell XPS 9370 test machines in some QA lab.)

> I suspect a BIOS problem. Reinette (cc'd) also thought this
> might be a case of the SGX initialization getting a bit too far along
> when it should have been disabled.
>
> We had some bugs where we didn't stop fast enough after spitting out the
> "SGX Launch Control is locked..." errors.


Kind regards,

Paul


[1]: https://lore.kernel.org/lkml/ce0b4d26-3a6e-7c5a-5f66-44cba05f9f35@molgen.mpg.de/

* Re: WARNING: CPU: 1 PID: 83 at arch/x86/kernel/cpu/sgx/main.c:446 ksgxd+0x1b7/0x1d0
From: Paul Menzel @ 2022-08-23 13:48 UTC
To: Dave Hansen, Jarkko Sakkinen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86, Reinette Chatre
Cc: linux-sgx, LKML

Dear Dave,


On 20.08.22 at 08:13, Paul Menzel wrote:

> On 19.08.22 at 20:28, Dave Hansen wrote:
>> On 8/19/22 09:02, Paul Menzel wrote:
>>> On the Dell XPS 13 9370, Linux 5.18.16 prints the warning below:
>>>
>>> ```
>>> [    0.000000] Linux version 5.18.0-4-amd64 (debian-kernel@lists.debian.org) (gcc-11 (Debian 11.3.0-5) 11.3.0, GNU ld (GNU Binutils for Debian) 2.38.90.20220713) #1 SMP PREEMPT_DYNAMIC Debian 5.18.16-1 (2022-08-10)
>>> [    0.000000] Command line: BOOT_IMAGE=/vmlinuz-5.18.0-4-amd64 root=UUID=56f398e0-1e25-4fda-aa9f-611dece4b333 ro quiet
>>> […]
>>> [    0.000000] DMI: Dell Inc. XPS 13 9370/0RMYH9, BIOS 1.21.0 07/06/2022
>>> […]
>>> [    0.235418] sgx: EPC section 0x40200000-0x45f7ffff
>
>> Would you be able to send the entire dmesg, along with:
>
> The log messages are attached to the first message, where I forgot to
> carbon-copy linux-sgx@ [1].
>
>> 	cat /proc/iomem # (as root)
>> and
>> 	cpuid -1 --raw
>
> I am going to provide that next week. (Side note, Intel might have some
> Dell XPS 9370 test machines in some QA lab.)

Please find both outputs at the end of this message.

>> I suspect a BIOS problem. Reinette (cc'd) also thought this
>> might be a case of the SGX initialization getting a bit too far along
>> when it should have been disabled.
>>
>> We had some bugs where we didn't stop fast enough after spitting out the
>> "SGX Launch Control is locked..." errors.

Let’s hope it’s something known to you.


Kind regards,

Paul


> [1]: https://lore.kernel.org/lkml/ce0b4d26-3a6e-7c5a-5f66-44cba05f9f35@molgen.mpg.de/


PS:

$ sudo cat /proc/iomem
[sudo] password for molgenit:
00000000-00000fff : Reserved
00001000-00057fff : System RAM
00058000-00058fff : Reserved
00059000-0009dfff : System RAM
0009e000-000fffff : Reserved
00000000-00000000 : PCI Bus 0000:00
00000000-00000000 : PCI Bus 0000:00
00000000-00000000 : PCI Bus 0000:00
00000000-00000000 : PCI Bus 0000:00
00000000-00000000 : PCI Bus 0000:00
00000000-00000000 : PCI Bus 0000:00
00000000-00000000 : PCI Bus 0000:00
00000000-00000000 : PCI Bus 0000:00
000a0000-000dffff : PCI Bus 0000:00
000c0000-000dffff : 0000:00:02.0
000f0000-000fffff : System ROM
00100000-2d6c4fff : System RAM
2d6c5000-2d6c5fff : ACPI Non-volatile Storage
2d6c6000-2d6c6fff : Reserved
2d6c7000-3b6acfff : System RAM
3b6ad000-3b720fff : Reserved
3b721000-3ecf1fff : System RAM
3ecf2000-3f0b1fff : Reserved
3f0b2000-3f0fefff : ACPI Tables
3f0ff000-3f7b6fff : ACPI Non-volatile Storage
3f798000-3f798fff : USBC000:00
3f7b7000-3ff25fff : Reserved
3ff26000-3fffefff : Unknown E820 type
3ffff000-3fffffff : System RAM
40000000-47ffffff : Reserved
40200000-45f7ffff : INT0E0C:00
48000000-48dfffff : System RAM
48e00000-4f7fffff : Reserved
4b800000-4f7fffff : Graphics Stolen Memory
4f800000-dfffffff : PCI Bus 0000:00
50000000-5fffffff : 0000:00:02.0
60000000-a9ffffff : PCI Bus 0000:03
ac000000-da0fffff : PCI Bus 0000:03
db000000-dbffffff : 0000:00:02.0
dc000000-dc0fffff : PCI Bus 0000:6e
dc000000-dc003fff : 0000:6e:00.0
dc000000-dc003fff : nvme
dc100000-dc1fffff : PCI Bus 0000:02
dc100000-dc101fff : 0000:02:00.0
dc100000-dc101fff : iwlwifi
dc200000-dc2fffff : PCI Bus 0000:01
dc200000-dc200fff : 0000:01:00.0
dc200000-dc200fff : rtsx_pci
dc300000-dc30ffff : 0000:00:1f.3
dc310000-dc31ffff : 0000:00:14.0
dc310000-dc31ffff : xhci-hcd
dc318070-dc31846f : intel_xhci_usb_sw
dc320000-dc327fff : 0000:00:04.0
dc320000-dc327fff : proc_thermal
dc328000-dc32bfff : 0000:00:1f.3
dc328000-dc32bfff : ICH HD audio
dc32c000-dc32ffff : 0000:00:1f.2
dc330000-dc3300ff : 0000:00:1f.4
dc331000-dc331fff : 0000:00:16.3
dc332000-dc332fff : 0000:00:16.0
dc332000-dc332fff : mei_me
dc333000-dc333fff : 0000:00:15.1
dc333000-dc3331ff : lpss_dev
dc333000-dc3331ff : i2c_designware.1 lpss_dev
dc333200-dc3332ff : lpss_priv
dc333800-dc333fff : idma64.1
dc333800-dc333fff : idma64.1 idma64.1
dc334000-dc334fff : 0000:00:15.0
dc334000-dc3341ff : lpss_dev
dc334000-dc3341ff : i2c_designware.0 lpss_dev
dc334200-dc3342ff : lpss_priv
dc334800-dc334fff : idma64.0
dc334800-dc334fff : idma64.0 idma64.0
dc335000-dc335fff : 0000:00:14.2
dc335000-dc335fff : Intel PCH thermal driver
dffe0000-dfffffff : pnp 00:05
e0000000-efffffff : PCI MMCONFIG 0000 [bus 00-ff]
e0000000-efffffff : Reserved
e0000000-efffffff : pnp 00:05
fd000000-fe7fffff : PCI Bus 0000:00
fd000000-fdabffff : pnp 00:06
fdac0000-fdacffff : INT344B:00
fdac0000-fdacffff : INT344B:00 INT344B:00
fdad0000-fdadffff : pnp 00:06
fdae0000-fdaeffff : INT344B:00
fdae0000-fdaeffff : INT344B:00 INT344B:00
fdaf0000-fdafffff : INT344B:00
fdaf0000-fdafffff : INT344B:00 INT344B:00
fdb00000-fdffffff : pnp 00:06
fdc6000c-fdc6000f : iTCO_wdt
fdc6000c-fdc6000f : iTCO_wdt iTCO_wdt
fe000000-fe010fff : Reserved
fe028000-fe028fff : pnp 00:08
fe029000-fe029fff : pnp 00:08
fe036000-fe03bfff : pnp 00:06
fe03d000-fe3fffff : pnp 00:06
fe410000-fe7fffff : pnp 00:06
fec00000-fec00fff : Reserved
fec00000-fec003ff : IOAPIC 0
fed00000-fed003ff : HPET 0
fed00000-fed003ff : PNP0103:00
fed10000-fed17fff : pnp 00:05
fed18000-fed18fff : pnp 00:05
fed19000-fed19fff : pnp 00:05
fed20000-fed3ffff : pnp 00:05
fed40000-fed44fff : MSFT0101:00
fed40000-fed44fff : MSFT0101:00 MSFT0101:00
fed45000-fed8ffff : pnp 00:05
fed90000-fed90fff : dmar0
fed91000-fed91fff : dmar1
fee00000-fee00fff : Local APIC
fee00000-fee00fff : Reserved
ff000000-ffffffff : Reserved
ff000000-ffffffff : INT0800:00
ff000000-ffffffff : pnp 00:05
100000000-2ae7fffff : System RAM
190c00000-191801987 : Kernel code
191a00000-19225ffff : Kernel rodata
192400000-1926b57bf : Kernel data
192d2b000-1931fffff : Kernel bss
2ae800000-2afffffff : RAM buffer

$ sudo cpuid -1 --raw
CPU:
   0x00000000 0x00: eax=0x00000016 ebx=0x756e6547 ecx=0x6c65746e edx=0x49656e69
   0x00000001 0x00: eax=0x000806ea ebx=0x00100800 ecx=0x7ffafbff edx=0xbfebfbff
   0x00000002 0x00: eax=0x76036301 ebx=0x00f0b5ff ecx=0x00000000 edx=0x00c30000
   0x00000003 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000004 0x00: eax=0x1c004121 ebx=0x01c0003f ecx=0x0000003f edx=0x00000000
   0x00000004 0x01: eax=0x1c004122 ebx=0x01c0003f ecx=0x0000003f edx=0x00000000
   0x00000004 0x02: eax=0x1c004143 ebx=0x00c0003f ecx=0x000003ff edx=0x00000000
   0x00000004 0x03: eax=0x1c03c163 ebx=0x02c0003f ecx=0x00001fff edx=0x00000006
   0x00000004 0x04: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000005 0x00: eax=0x00000040 ebx=0x00000040 ecx=0x00000003 edx=0x11142120
   0x00000006 0x00: eax=0x000027f7 ebx=0x00000002 ecx=0x00000009 edx=0x00000000
   0x00000007 0x00: eax=0x00000000 ebx=0x029c67af ecx=0x00000000 edx=0xbc002e00
   0x00000008 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000009 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x0000000a 0x00: eax=0x07300404 ebx=0x00000000 ecx=0x00000000 edx=0x00000603
   0x0000000b 0x00: eax=0x00000001 ebx=0x00000002 ecx=0x00000100 edx=0x00000000
   0x0000000b 0x01: eax=0x00000004 ebx=0x00000008 ecx=0x00000201 edx=0x00000000
   0x0000000b 0x02: eax=0x00000000 ebx=0x00000000 ecx=0x00000002 edx=0x00000000
   0x0000000c 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x0000000d 0x00: eax=0x0000001f ebx=0x00000440 ecx=0x00000440 edx=0x00000000
   0x0000000d 0x01: eax=0x0000000f ebx=0x000003c0 ecx=0x00000100 edx=0x00000000
   0x0000000d 0x02: eax=0x00000100 ebx=0x00000240 ecx=0x00000000 edx=0x00000000
   0x0000000d 0x03: eax=0x00000040 ebx=0x000003c0 ecx=0x00000000 edx=0x00000000
   0x0000000d 0x04: eax=0x00000040 ebx=0x00000400 ecx=0x00000000 edx=0x00000000
   0x0000000d 0x08: eax=0x00000080 ebx=0x00000000 ecx=0x00000001 edx=0x00000000
   0x0000000e 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x0000000f 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000010 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000011 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000012 0x00: eax=0x00000001 ebx=0x00000000 ecx=0x00000000 edx=0x0000241f
   0x00000012 0x01: eax=0x00000036 ebx=0x00000000 ecx=0x0000001f edx=0x00000000
   0x00000012 0x02: eax=0x40200001 ebx=0x00000000 ecx=0x05d80001 edx=0x00000000
   0x00000012 0x03: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000013 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000014 0x00: eax=0x00000001 ebx=0x0000000f ecx=0x00000007 edx=0x00000000
   0x00000014 0x01: eax=0x02490002 ebx=0x003f3fff ecx=0x00000000 edx=0x00000000
   0x00000015 0x00: eax=0x00000002 ebx=0x0000009e ecx=0x00000000 edx=0x00000000
   0x00000016 0x00: eax=0x0000076c ebx=0x00000e10 ecx=0x00000064 edx=0x00000000
   0x20000000 0x00: eax=0x0000076c ebx=0x00000e10 ecx=0x00000064 edx=0x00000000
   0x80000000 0x00: eax=0x80000008 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x80000001 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000121 edx=0x2c100800
   0x80000002 0x00: eax=0x65746e49 ebx=0x2952286c ecx=0x726f4320 edx=0x4d542865
   0x80000003 0x00: eax=0x35692029 ebx=0x3533382d ecx=0x43205530 edx=0x40205550
   0x80000004 0x00: eax=0x372e3120 ebx=0x7a484730 ecx=0x00000000 edx=0x00000000
   0x80000005 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x80000006 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x01006040 edx=0x00000000
   0x80000007 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000100
   0x80000008 0x00: eax=0x00003027 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x80860000 0x00: eax=0x0000076c ebx=0x00000e10 ecx=0x00000064 edx=0x00000000
   0xc0000000 0x00: eax=0x0000076c ebx=0x00000e10 ecx=0x00000064 edx=0x00000000
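
The raw CPUID values can be cross-checked against the "sgx: EPC section" line in the boot log. The following is a minimal sketch in Python, assuming the usual layout of CPUID leaf 0x12 sub-leaves (low nibble of EAX is the sub-leaf type, with 1 meaning a valid EPC section; base and size are split across EAX/EBX and ECX/EDX) and of leaf 7 (EBX bit 2 is SGX, ECX bit 30 is SGX_LC). The function name `decode_epc_subleaf` is made up for illustration and is not part of any tool mentioned in this thread.

```python
def decode_epc_subleaf(eax, ebx, ecx, edx):
    """Decode one CPUID.(EAX=0x12, ECX>=2) sub-leaf into (base, size).

    Returns None when the sub-leaf type (EAX bits 3:0) is not 1,
    i.e. when the sub-leaf does not describe an EPC section.
    """
    if eax & 0xF != 1:
        return None
    base = (eax & 0xFFFFF000) | ((ebx & 0xFFFFF) << 32)
    size = (ecx & 0xFFFFF000) | ((edx & 0xFFFFF) << 32)
    return base, size


# Values copied from the `cpuid -1 --raw` output (leaf 0x12, sub-leaf 2).
base, size = decode_epc_subleaf(0x40200001, 0x00000000, 0x05D80001, 0x00000000)
end = base + size - 1
print(f"EPC section {base:#x}-{end:#x}, {size // 4096} pages")

# Leaf 7, sub-leaf 0, also copied from the raw output.
leaf7_ebx, leaf7_ecx = 0x029C67AF, 0x00000000
has_sgx = bool(leaf7_ebx & (1 << 2))      # SGX supported
has_sgx_lc = bool(leaf7_ecx & (1 << 30))  # SGX Launch Control
print(f"SGX={has_sgx} SGX_LC={has_sgx_lc}")
```

Under these assumptions the decoded range is 0x40200000-0x45f7ffff, which matches both the boot log and the `INT0E0C:00` entry in /proc/iomem, and the CPU reports SGX without SGX_LC.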

* Re: WARNING: CPU: 1 PID: 83 at arch/x86/kernel/cpu/sgx/main.c:446 ksgxd+0x1b7/0x1d0
From: Dave Hansen @ 2022-08-23 16:32 UTC
To: Paul Menzel, Jarkko Sakkinen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86, Reinette Chatre
Cc: linux-sgx, LKML

On 8/23/22 06:48, Paul Menzel wrote:
>>> I suspect a BIOS problem. Reinette (cc'd) also thought this
>>> might be a case of the SGX initialization getting a bit too far along
>>> when it should have been disabled.
>>>
>>> We had some bugs where we didn't stop fast enough after spitting out the
>>> "SGX Launch Control is locked..." errors.
>
> Let’s hope it’s something known to you.

Thanks for the extra debug info. Unfortunately, nothing is really
sticking out as an obvious problem.

The EREMOVE return codes would be interesting to know, as well as which
physical addresses fail and the _counts_ of how many pages get
sanitized versus fail.

But, I don't really have a theory about what could be going on yet.

* Re: WARNING: CPU: 1 PID: 83 at arch/x86/kernel/cpu/sgx/main.c:446 ksgxd+0x1b7/0x1d0
From: Paul Menzel @ 2022-08-23 22:33 UTC
To: Dave Hansen, Jarkko Sakkinen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86, Reinette Chatre
Cc: linux-sgx, LKML

Dear Dave,


Thank you for your reply.

On 23.08.22 at 18:32, Dave Hansen wrote:
> On 8/23/22 06:48, Paul Menzel wrote:
>>>> I suspect a BIOS problem. Reinette (cc'd) also thought this
>>>> might be a case of the SGX initialization getting a bit too far along
>>>> when it should have been disabled.
>>>>
>>>> We had some bugs where we didn't stop fast enough after spitting out the
>>>> "SGX Launch Control is locked..." errors.
>>
>> Let’s hope it’s something known to you.
>
> Thanks for the extra debug info. Unfortunately, nothing is really
> sticking out as an obvious problem.
>
> The EREMOVE return codes would be interesting to know, as well as an
> idea what the physical addresses are that fail and the _counts_ of how
> many pages get sanitized versus fail.

Is there a knob to print out this information? Or a way to get this
information using ftrace? I’d like to avoid rebuilding the Linux kernel.

> But, I don't really have a theory about what could be going on yet.


Kind regards,

Paul

* Re: WARNING: CPU: 1 PID: 83 at arch/x86/kernel/cpu/sgx/main.c:446 ksgxd+0x1b7/0x1d0
From: Dave Hansen @ 2022-08-24 18:39 UTC
To: Paul Menzel, Jarkko Sakkinen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86, Reinette Chatre
Cc: linux-sgx, LKML

On 8/23/22 15:33, Paul Menzel wrote:
>> Thanks for the extra debug info. Unfortunately, nothing is really
>> sticking out as an obvious problem.
>>
>> The EREMOVE return codes would be interesting to know, as well as an
>> idea what the physical addresses are that fail and the _counts_ of how
>> many pages get sanitized versus fail.
>
> Is there a knob to print out this information? Or a way to get this
> information using ftrace? I’d like to avoid rebuilding the Linux kernel.

You can probably do it with a kprobe and ftrace, but it's a little bit
of a pain since the ENCL* instructions are all inlined and don't get
wrapped in actual function calls.

I'd just rebuild the kernel if it were me.

Maybe we should just uninline all of the ENCL* instructions so that we
*can* more easily trace them. It's not like they are performance sensitive.

* Re: WARNING: CPU: 1 PID: 83 at arch/x86/kernel/cpu/sgx/main.c:446 ksgxd+0x1b7/0x1d0
From: Jarkko Sakkinen @ 2022-08-25 5:27 UTC
To: Paul Menzel
Cc: Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86, Reinette Chatre, linux-sgx, LKML

On Wed, Aug 24, 2022 at 12:33:07AM +0200, Paul Menzel wrote:
> Dear Dave,
>
>
> Thank you for your reply.
>
> On 23.08.22 at 18:32, Dave Hansen wrote:
> > On 8/23/22 06:48, Paul Menzel wrote:
> > > > > I suspect a BIOS problem. Reinette (cc'd) also thought this
> > > > > might be a case of the SGX initialization getting a bit too far along
> > > > > when it should have been disabled.
> > > > >
> > > > > We had some bugs where we didn't stop fast enough after spitting out the
> > > > > "SGX Launch Control is locked..." errors.
> > >
> > > Let’s hope it’s something known to you.
> >
> > Thanks for the extra debug info. Unfortunately, nothing is really
> > sticking out as an obvious problem.
> >
> > The EREMOVE return codes would be interesting to know, as well as an
> > idea what the physical addresses are that fail and the _counts_ of how
> > many pages get sanitized versus fail.
>
> Is there a knob to print out this information? Or a way to get this
> information using ftrace? I’d like to avoid rebuilding the Linux kernel.

Since __sgx_sanitize_pages() is a local symbol, it's not possible to
attach a kprobe to it, so we actually do require a code change to see
inside.

BR, Jarkko

* Re: WARNING: CPU: 1 PID: 83 at arch/x86/kernel/cpu/sgx/main.c:446 ksgxd+0x1b7/0x1d0
From: Haitao Huang @ 2022-08-25 2:12 UTC
To: Dave Hansen, Jarkko Sakkinen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86, Reinette Chatre, Paul Menzel
Cc: linux-sgx, LKML

Hi Paul

On Tue, 23 Aug 2022 08:48:52 -0500, Paul Menzel <pmenzel@molgen.mpg.de> wrote:

> […]
>
>>> Would you be able to send the entire dmesg, along with:
>> The log message are attached to the first message, where I missed to
>> carbon-copy linux-sgx@ [1].
>>
>>> cat /proc/iomem # (as root)
>>> and
>>> cpuid -1 --raw
>> I am going to provide that next week. (Side note, Intel might have
>> some Dell XPS 9370 test machines in some QA lab.)
>
> Please find both outputs at the end of the file.
>

Could you also check the output of "sudo rdmsr -x 0x3a"?
Also, was CONFIG_X86_SGX_KVM set?

If CONFIG_X86_SGX_KVM is not set and bit 17 (SGX_LC) of MSR 0x3A is not set,
then I think the following sequence during sgx_init is possible:

sgx_page_cache_init -> sgx_setup_epc_section
    -> put all physical EPC pages in sgx_dirty_page_list.
Kick off ksgxd.
Later, sgx_drv_init returns non-zero due to this check:
	if (!cpu_feature_enabled(X86_FEATURE_SGX_LC))
		return -ENODEV;
sgx_vepc_init also returns non-zero if CONFIG_X86_SGX_KVM was not set.

And sgx_init will call kthread_stop(ksgxd_tsk):
	ret = sgx_drv_init();

	if (sgx_vepc_init() && ret)
		goto err_provision;
	...
err_provision:
	misc_deregister(&sgx_dev_provision);

err_kthread:
	kthread_stop(ksgxd_tsk);


That makes __sgx_sanitize_pages return early due to these lines:
	/* dirty_page_list is thread-local, no need for a lock: */
	while (!list_empty(dirty_page_list)) {
		if (kthread_should_stop())
			return;

And that would trigger (depending on timing?) the warning in ksgxd due to a
non-empty sgx_dirty_page_list at that moment.

Thanks
Haitao
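
The sequence described above can be illustrated with a toy model. This is only a sketch in Python, not kernel code: the dirty list is a `deque`, `kthread_should_stop()` is replaced by a callback, and the helper `stop_after` (a made-up name) stands in for `kthread_stop()` arriving while the first sanitization pass is still running. The two-pass structure and the final check on the list are modeled on the snippets quoted in the message; everything else is an assumption made for illustration.

```python
from collections import deque


def sanitize_pages(dirty, should_stop, eremove):
    """Toy version of __sgx_sanitize_pages(): drain `dirty` unless told to stop."""
    retry = deque()
    while dirty:
        if should_stop():
            return  # early return: unprocessed pages stay on `dirty`
        page = dirty.popleft()
        if eremove(page) != 0:
            retry.append(page)  # not clean yet, keep it for the next pass
    dirty.extend(retry)


def ksgxd(dirty, should_stop, eremove=lambda page: 0):
    """Two sanitization passes, then the sanity check that fired in the report."""
    sanitize_pages(dirty, should_stop, eremove)
    sanitize_pages(dirty, should_stop, eremove)
    return "WARN" if dirty else "ok"


def stop_after(n_checks):
    """Model a stop request that becomes visible after `n_checks` polls."""
    calls = 0

    def should_stop():
        nonlocal calls
        calls += 1
        return calls > n_checks

    return should_stop


EPC_PAGES = 23936  # 0x05d80000 bytes / 4 KiB, from the EPC section in the log

# Nobody stops the thread: the list drains and the check passes.
print(ksgxd(deque(range(EPC_PAGES)), lambda: False))
# The stop request arrives mid-pass: pages are left over and the check fires.
print(ksgxd(deque(range(EPC_PAGES)), stop_after(1000)))
```

In this model the warning depends only on whether the stop request lands before the list is empty, which is consistent with the "depending on timing?" remark: a stop that arrives after sanitization has finished leaves nothing on the list.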

* Re: WARNING: CPU: 1 PID: 83 at arch/x86/kernel/cpu/sgx/main.c:446 ksgxd+0x1b7/0x1d0
From: Jarkko Sakkinen @ 2022-08-25 5:49 UTC
To: Haitao Huang
Cc: Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86, Reinette Chatre, Paul Menzel, linux-sgx, LKML

On Wed, Aug 24, 2022 at 09:12:06PM -0500, Haitao Huang wrote:
> Hi Paul
>
> On Tue, 23 Aug 2022 08:48:52 -0500, Paul Menzel <pmenzel@molgen.mpg.de>
> wrote:
>
> > […]
> >
> > Please find both outputs at the end of the file.
> >
>
> Could you also check output of "sudo rdmsr -x 0x3a"?
> Also was CONFIG_X86_SGX_KVM set?
>
> If CONFIG_X86_SGX_KVM is not set and bit 17 (SGX_LC) of MSR 0x3A is not set,
> then I think the following sequence during sgx_init is possible:
>
> sgx_page_cache_init -> sgx_setup_epc_section
>     -> put all physical EPC pages in sgx_dirty_page_list.
> Kick off ksgxd.
> Later, sgx_drv_init returns non-zero due to this check:
> 	if (!cpu_feature_enabled(X86_FEATURE_SGX_LC))
> 		return -ENODEV;
> sgx_vepc_init also returns non-zero if CONFIG_X86_SGX_KVM was not set.
>
> And sgx_init will call kthread_stop(ksgxd_tsk):
> 	ret = sgx_drv_init();
>
> 	if (sgx_vepc_init() && ret)
> 		goto err_provision;
> 	...
> err_provision:
> 	misc_deregister(&sgx_dev_provision);
>
> err_kthread:
> 	kthread_stop(ksgxd_tsk);
>
>
> That makes __sgx_sanitize_pages return early due to these lines:
> 	/* dirty_page_list is thread-local, no need for a lock: */
> 	while (!list_empty(dirty_page_list)) {
> 		if (kthread_should_stop())
> 			return;
>
> And that would trigger (depending on timing?) the warning in ksgxd due to a
> non-empty sgx_dirty_page_list at that moment.

You're correct, and it's not a bug but completely legit behaviour.

And given that a non-empty dirty page list is legit behaviour, WARN_ON()
is not what should be used here.

Fix coming in a bit.

BR, Jarkko

* Re: WARNING: CPU: 1 PID: 83 at arch/x86/kernel/cpu/sgx/main.c:446 ksgxd+0x1b7/0x1d0
From: Jarkko Sakkinen @ 2022-08-25 8:34 UTC
To: Haitao Huang
Cc: Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86, Reinette Chatre, Paul Menzel, linux-sgx, LKML

On Thu, Aug 25, 2022 at 08:49:53AM +0300, Jarkko Sakkinen wrote:
> On Wed, Aug 24, 2022 at 09:12:06PM -0500, Haitao Huang wrote:
> > Hi Paul
> >
> > On Tue, 23 Aug 2022 08:48:52 -0500, Paul Menzel <pmenzel@molgen.mpg.de>
> > wrote:
> >
> > > […]
> > >
> > > Please find both outputs at the end of the file.
> > >
> >
> > Could you also check output of "sudo rdmsr -x 0x3a"?
> > Also was CONFIG_X86_SGX_KVM set?
> >
> > If CONFIG_X86_SGX_KVM is not set and bit 17 (SGX_LC) of the MSR 3A not set,
> > then I think following sequence during sgx_init is possible:
> >
> > […]
> >
> > And that would trigger (depends on timing?) the warning in ksgxd due to
> > non-empty sgx_dirty_page_list
> > at that moment.
>
> You're correct, and it's not a bug but completely legit behaviour.
>
> And given that non-empty dirty page list is legit behavior WARN_ON()
> is not what should be used in here.
>
> Fix coming in a bit.

https://lore.kernel.org/linux-sgx/20220825080802.259528-1-jarkko@kernel.org/T/#u

BR, Jarkko

* Re: WARNING: CPU: 1 PID: 83 at arch/x86/kernel/cpu/sgx/main.c:446 ksgxd+0x1b7/0x1d0
From: Paul Menzel @ 2022-08-26 9:54 UTC
To: Haitao Huang, Dave Hansen, Jarkko Sakkinen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86, Reinette Chatre
Cc: linux-sgx, LKML

Dear Haitao,


Thank you for your reply. Just for the record:

On 25.08.22 at 04:12, Haitao Huang wrote:

> On Tue, 23 Aug 2022 08:48:52 -0500, Paul Menzel wrote:

> […]

>> Please find both outputs at the end of the file.
>
> Could you also check output of "sudo rdmsr -x 0x3a"?

40005

> Also was CONFIG_X86_SGX_KVM set?

No, it’s not set in Debian’s Linux kernel configuration.

> If CONFIG_X86_SGX_KVM is not set and bit 17 (SGX_LC) of the MSR 3A not set, > then I think following sequence during sgx_init is possible: 40005 = 0x09c45, so bit 17 (if starting from 0) is 0. […] Kind regards, Paul ^ permalink raw reply [flat|nested] 16+ messages in thread
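As a side note on the bit arithmetic in the reply above: `rdmsr -x` prints its result in hexadecimal, so the reported `40005` is most likely 0x40005 rather than decimal 40005 (which is 0x9c45). Bit 17 is clear under either reading, so the conclusion holds. The following minimal sketch checks this; the macro names and `msr_bit_set()` are local to this example and are not taken from any header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions in IA32_FEATURE_CONTROL (MSR 0x3a); names are local to this sketch. */
#define MODEL_FEAT_CTL_BIT_LOCKED      0   /* lock bit */
#define MODEL_FEAT_CTL_BIT_SGX_LC      17  /* SGX launch control enable */
#define MODEL_FEAT_CTL_BIT_SGX_ENABLED 18  /* SGX global enable */

/* Returns true if the given bit (counted from 0) is set in value. */
static bool msr_bit_set(uint64_t value, unsigned int bit)
{
	return ((value >> bit) & 1u) != 0;
}
```

With the hexadecimal reading, `msr_bit_set(0x40005, MODEL_FEAT_CTL_BIT_SGX_ENABLED)` is true and `msr_bit_set(0x40005, MODEL_FEAT_CTL_BIT_SGX_LC)` is false, which matches a system where SGX is enabled but launch control is not.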
* Re: WARNING: CPU: 1 PID: 83 at arch/x86/kernel/cpu/sgx/main.c:446 ksgxd+0x1b7/0x1d0
  2022-08-19 18:28 ` Dave Hansen
  2022-08-20  6:13   ` Paul Menzel
@ 2022-08-25  4:57   ` Jarkko Sakkinen
  2022-08-25  5:25     ` Jarkko Sakkinen
  1 sibling, 1 reply; 16+ messages in thread
From: Jarkko Sakkinen @ 2022-08-25 4:57 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Paul Menzel, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
      Chatre, Reinette, linux-sgx, LKML

On Fri, Aug 19, 2022 at 11:28:24AM -0700, Dave Hansen wrote:
> On 8/19/22 09:02, Paul Menzel wrote:
> > On the Dell XPS 13 9370, Linux 5.18.16 prints the warning below:
> >
> > ```
> > [ 0.000000] Linux version 5.18.0-4-amd64
> > (debian-kernel@lists.debian.org) (gcc-11 (Debian 11.3.0-5) 11.3.0, GNU
> > ld (GNU Binutils for Debian) 2.38.90.20220713) #1 SMP PREEMPT_DYNAMIC
> > Debian 5.18.16-1 (2022-08-10)
> > [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-5.18.0-4-amd64
> > root=UUID=56f398e0-1e25-4fda-aa9f-611dece4b333 ro quiet
> > […]
> > [ 0.000000] DMI: Dell Inc. XPS 13 9370/0RMYH9, BIOS 1.21.0 07/06/2022
> > […]
> > [ 0.235418] sgx: EPC section 0x40200000-0x45f7ffff
>
> Hi Paul,
>
> Would you be able to send the entire dmesg, along with:
>
>     cat /proc/iomem # (as root)
> and
>     cpuid -1 --raw
>
> I'm suspecting either a BIOS problem. Reinette (cc'd) also thought this
> might be a case of the SGX initialization getting a bit too far along
> when it should have been disabled.
>
> We had some bugs where we didn't stop fast enough after spitting out the
> "SGX Launch Control is locked..." errors.

For some reason the pages do not get properly sanitized:

    /* sanity check: */
    WARN_ON(!list_empty(&sgx_dirty_page_list));

EPC should be good, given that EREMOVE does not fail.
If SGX would be disabled, also EREMOVE should fail.

BR, Jarkko

^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: WARNING: CPU: 1 PID: 83 at arch/x86/kernel/cpu/sgx/main.c:446 ksgxd+0x1b7/0x1d0
  2022-08-25  4:57 ` Jarkko Sakkinen
@ 2022-08-25  5:25   ` Jarkko Sakkinen
  2022-08-25  6:46     ` Paul Menzel
  0 siblings, 1 reply; 16+ messages in thread
From: Jarkko Sakkinen @ 2022-08-25 5:25 UTC (permalink / raw)
  To: Dave Hansen, Paul Menzel
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86, Chatre,
      Reinette, linux-sgx, LKML

[-- Attachment #1: Type: text/plain, Size: 2051 bytes --]

On Thu, Aug 25, 2022 at 07:57:30AM +0300, Jarkko Sakkinen wrote:
> On Fri, Aug 19, 2022 at 11:28:24AM -0700, Dave Hansen wrote:
> > On 8/19/22 09:02, Paul Menzel wrote:
> > > On the Dell XPS 13 9370, Linux 5.18.16 prints the warning below:
> > >
> > > ```
> > > [ 0.000000] Linux version 5.18.0-4-amd64
> > > (debian-kernel@lists.debian.org) (gcc-11 (Debian 11.3.0-5) 11.3.0, GNU
> > > ld (GNU Binutils for Debian) 2.38.90.20220713) #1 SMP PREEMPT_DYNAMIC
> > > Debian 5.18.16-1 (2022-08-10)
> > > [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-5.18.0-4-amd64
> > > root=UUID=56f398e0-1e25-4fda-aa9f-611dece4b333 ro quiet
> > > […]
> > > [ 0.000000] DMI: Dell Inc. XPS 13 9370/0RMYH9, BIOS 1.21.0 07/06/2022
> > > […]
> > > [ 0.235418] sgx: EPC section 0x40200000-0x45f7ffff
> >
> > Hi Paul,
> >
> > Would you be able to send the entire dmesg, along with:
> >
> >     cat /proc/iomem # (as root)
> > and
> >     cpuid -1 --raw
> >
> > I'm suspecting either a BIOS problem. Reinette (cc'd) also thought this
> > might be a case of the SGX initialization getting a bit too far along
> > when it should have been disabled.
> >
> > We had some bugs where we didn't stop fast enough after spitting out the
> > "SGX Launch Control is locked..." errors.
>
> For some reason the pages do not get properly sanitized:
>
>     /* sanity check: */
>     WARN_ON(!list_empty(&sgx_dirty_page_list));
>
> EPC should be good, given that EREMOVE does not fail.
> If SGX would be disabled, also EREMOVE should fail.

Sorry forgot that in no circumstances we're printing the
error code inside __sgx_sanitize_pages(). I wrote a quick
patch to address this (attached) [*].

Paul,

Any chance to try the patch out?

It's pretty hard to attach e.g. kprobe to grab this info. Does it
reproduce every single time?

Alternatively: what kind of workload is triggering this?
I do own 2020 model XPS13, which might be able to
reproduce the same issue.

[*] Also: https://lore.kernel.org/linux-sgx/20220825051827.246698-1-jarkko@kernel.org/T/#u

BR, Jarkko

[-- Attachment #2: 0001-x86-sgx-Print-EREMOVE-return-value-in-__sgx_sanitize.patch --]
[-- Type: text/plain, Size: 2077 bytes --]

From ddccefc8e864bd9973a5445202922b59760d3460 Mon Sep 17 00:00:00 2001
From: Jarkko Sakkinen <jarkko@kernel.org>
Date: Thu, 25 Aug 2022 08:12:30 +0300
Subject: [PATCH] x86/sgx: Print EREMOVE return value in __sgx_sanitize_pages()

In the 2nd run of __sgx_sanitize_pages() print the error message. All
EREMOVE's should succeed. This will allow to provide some additional
clues, if not.

Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
---
 arch/x86/kernel/cpu/sgx/main.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 515e2a5f25bb..33354921c59f 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -50,7 +50,7 @@ static LIST_HEAD(sgx_dirty_page_list);
  * from the input list, and made available for the page allocator. SECS pages
  * prepending their children in the input list are left intact.
  */
-static void __sgx_sanitize_pages(struct list_head *dirty_page_list)
+static void __sgx_sanitize_pages(struct list_head *dirty_page_list, bool verbose)
 {
 	struct sgx_epc_page *page;
 	LIST_HEAD(dirty);
@@ -90,6 +90,9 @@ static void __sgx_sanitize_pages(struct list_head *dirty_page_list)
 			list_del(&page->list);
 			sgx_free_epc_page(page);
 		} else {
+			if (verbose)
+				pr_err_ratelimited(EREMOVE_ERROR_MESSAGE, ret, ret);
+
 			/* The page is not yet clean - move to the dirty list. */
 			list_move_tail(&page->list, &dirty);
 		}
@@ -394,8 +397,8 @@ static int ksgxd(void *p)
 	 * Sanitize pages in order to recover from kexec(). The 2nd pass is
 	 * required for SECS pages, whose child pages blocked EREMOVE.
 	 */
-	__sgx_sanitize_pages(&sgx_dirty_page_list);
-	__sgx_sanitize_pages(&sgx_dirty_page_list);
+	__sgx_sanitize_pages(&sgx_dirty_page_list, false);
+	__sgx_sanitize_pages(&sgx_dirty_page_list, true);
 
 	/* sanity check: */
 	WARN_ON(!list_empty(&sgx_dirty_page_list));
--
2.37.1

^ permalink raw reply related [flat|nested] 16+ messages in thread
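The comment in the patch context says that the second pass exists for SECS pages whose child pages blocked EREMOVE. The user-space sketch below illustrates that idea and where a `verbose` flag would report a failure. It rests on assumptions: `struct model_page`, `model_eremove()`, `model_sanitize_pass()` and `model_two_passes()` are hypothetical names, and the return value of `model_eremove()` is a placeholder rather than a real SGX error code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical page model: a SECS page cannot be removed while it has live children. */
struct model_page {
	bool is_secs;
	int children;             /* live child pages (SECS pages only) */
	struct model_page *secs;  /* owning SECS page (child pages only) */
	bool dirty;
};

/* Models EREMOVE: returns 0 on success, non-zero if child pages block the removal. */
static int model_eremove(struct model_page *p)
{
	if (p->is_secs && p->children > 0)
		return 1;  /* placeholder error value, not a real SGX code */
	if (!p->is_secs && p->secs)
		p->secs->children--;
	p->dirty = false;
	return 0;
}

/* One pass over all pages; returns how many are still dirty afterwards. */
static size_t model_sanitize_pass(struct model_page *pages, size_t n, bool verbose)
{
	size_t left = 0;

	for (size_t i = 0; i < n; i++) {
		int ret;

		if (!pages[i].dirty)
			continue;
		ret = model_eremove(&pages[i]);
		if (ret) {
			if (verbose)
				fprintf(stderr, "model EREMOVE failed: %d\n", ret);
			left++;
		}
	}
	return left;
}

/* Builds one SECS page that precedes its two children and runs both passes. */
static void model_two_passes(size_t *after_first, size_t *after_second)
{
	struct model_page pages[3] = {
		{ .is_secs = true, .children = 2, .dirty = true },
		{ .secs = &pages[0], .dirty = true },
		{ .secs = &pages[0], .dirty = true },
	};

	*after_first = model_sanitize_pass(pages, 3, false);
	*after_second = model_sanitize_pass(pages, 3, true);
}
```

In this model the SECS page is the only one left dirty after the first pass, and the second pass cleans it without printing anything, which is the situation in which the patch expects no error output.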
* Re: WARNING: CPU: 1 PID: 83 at arch/x86/kernel/cpu/sgx/main.c:446 ksgxd+0x1b7/0x1d0
  2022-08-25  5:25 ` Jarkko Sakkinen
@ 2022-08-25  6:46   ` Paul Menzel
  2022-08-25  8:39     ` Jarkko Sakkinen
  0 siblings, 1 reply; 16+ messages in thread
From: Paul Menzel @ 2022-08-25 6:46 UTC (permalink / raw)
  To: Jarkko Sakkinen, Dave Hansen
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86, Chatre,
      Reinette, linux-sgx, LKML

Dear Jarkko,


On 25.08.22 at 07:25, Jarkko Sakkinen wrote:
> On Thu, Aug 25, 2022 at 07:57:30AM +0300, Jarkko Sakkinen wrote:
>> On Fri, Aug 19, 2022 at 11:28:24AM -0700, Dave Hansen wrote:
>>> On 8/19/22 09:02, Paul Menzel wrote:
>>>> On the Dell XPS 13 9370, Linux 5.18.16 prints the warning below:
>>>>
>>>> ```
>>>> [ 0.000000] Linux version 5.18.0-4-amd64 (debian-kernel@lists.debian.org) (gcc-11 (Debian 11.3.0-5) 11.3.0, GNU ld (GNU Binutils for Debian) 2.38.90.20220713) #1 SMP PREEMPT_DYNAMIC Debian 5.18.16-1 (2022-08-10)
>>>> [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-5.18.0-4-amd64 root=UUID=56f398e0-1e25-4fda-aa9f-611dece4b333 ro quiet
>>>> […]
>>>> [ 0.000000] DMI: Dell Inc. XPS 13 9370/0RMYH9, BIOS 1.21.0 07/06/2022
>>>> […]
>>>> [ 0.235418] sgx: EPC section 0x40200000-0x45f7ffff
>>> Would you be able to send the entire dmesg, along with:
>>>
>>>     cat /proc/iomem # (as root)
>>> and
>>>     cpuid -1 --raw
>>>
>>> I'm suspecting either a BIOS problem. Reinette (cc'd) also thought this
>>> might be a case of the SGX initialization getting a bit too far along
>>> when it should have been disabled.
>>>
>>> We had some bugs where we didn't stop fast enough after spitting out the
>>> "SGX Launch Control is locked..." errors.
>>
>> For some reason the pages do not get properly sanitized:
>>
>>     /* sanity check: */
>>     WARN_ON(!list_empty(&sgx_dirty_page_list));
>>
>> EPC should be good, given that EREMOVE does not fail.
>> If SGX would be disabled, also EREMOVE should fail.
>
> Sorry forgot that in no circumstances we're printing the
> error code inside __sgx_sanitize_pages(). I wrote a quick
> patch to address this (attached) [*].
>
> Paul,
>
> Any chance to try the patch out?

Yes, I am going to try it in the next few days.

> It's pretty hard to attach e.g. kprobe to grab this info. Does it
> reproduce every single time?

Yes, on each boot up.

> Alternatively: what kind of workload is triggering this?
> I do own 2020 model XPS13, which might be able to
> reproduce the same issue.

The Dell XPS 13 9370 is from 2018 (Intel i5-8350U), so no idea if it
happens with later processors.


Kind regards,

Paul


> [*] Also: https://lore.kernel.org/linux-sgx/20220825051827.246698-1-jarkko@kernel.org/T/#u

^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: WARNING: CPU: 1 PID: 83 at arch/x86/kernel/cpu/sgx/main.c:446 ksgxd+0x1b7/0x1d0
  2022-08-25  6:46 ` Paul Menzel
@ 2022-08-25  8:39   ` Jarkko Sakkinen
  0 siblings, 0 replies; 16+ messages in thread
From: Jarkko Sakkinen @ 2022-08-25 8:39 UTC (permalink / raw)
  To: Paul Menzel
  Cc: Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
      Chatre, Reinette, linux-sgx, LKML

On Thu, Aug 25, 2022 at 08:46:19AM +0200, Paul Menzel wrote:
> Dear Jarkko,
>
>
> On 25.08.22 at 07:25, Jarkko Sakkinen wrote:
> > On Thu, Aug 25, 2022 at 07:57:30AM +0300, Jarkko Sakkinen wrote:
> > > On Fri, Aug 19, 2022 at 11:28:24AM -0700, Dave Hansen wrote:
> > > > On 8/19/22 09:02, Paul Menzel wrote:
> > > > > On the Dell XPS 13 9370, Linux 5.18.16 prints the warning below:
> > > > >
> > > > > ```
> > > > > [ 0.000000] Linux version 5.18.0-4-amd64 (debian-kernel@lists.debian.org) (gcc-11 (Debian 11.3.0-5) 11.3.0, GNU ld (GNU Binutils for Debian) 2.38.90.20220713) #1 SMP PREEMPT_DYNAMIC Debian 5.18.16-1 (2022-08-10)
> > > > > [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-5.18.0-4-amd64 root=UUID=56f398e0-1e25-4fda-aa9f-611dece4b333 ro quiet
> > > > > […]
> > > > > [ 0.000000] DMI: Dell Inc. XPS 13 9370/0RMYH9, BIOS 1.21.0 07/06/2022
> > > > > […]
> > > > > [ 0.235418] sgx: EPC section 0x40200000-0x45f7ffff
> > > > > Would you be able to send the entire dmesg, along with:
> > > >
> > > >     cat /proc/iomem # (as root)
> > > > and
> > > >     cpuid -1 --raw
> > > >
> > > > I'm suspecting either a BIOS problem. Reinette (cc'd) also thought this
> > > > might be a case of the SGX initialization getting a bit too far along
> > > > when it should have been disabled.
> > > >
> > > > We had some bugs where we didn't stop fast enough after spitting out the
> > > > "SGX Launch Control is locked..." errors.
> > >
> > > For some reason the pages do not get properly sanitized:
> > >
> > >     /* sanity check: */
> > >     WARN_ON(!list_empty(&sgx_dirty_page_list));
> > >
> > > EPC should be good, given that EREMOVE does not fail.
> > > If SGX would be disabled, also EREMOVE should fail.
> >
> > Sorry forgot that in no circumstances we're printing the
> > error code inside __sgx_sanitize_pages(). I wrote a quick
> > patch to address this (attached) [*].
> >
> > Paul,
> >
> > Any chance to try the patch out?
>
> Yes, I am going to try it in the next few days.
>
> > It's pretty hard to attach e.g. kprobe to grab this info. Does it
> > reproduce every single time?
> Yes, on each boot up.
>
> > Alternatively: what kind of workload is triggering this?
> > I do own 2020 model XPS13, which might be able to
> > reproduce the same issue.
>
> The Dell XPS 13 9370 is from 2018 (Intel i5-8350U), so no idea if it happens
> with later processors.

I think this should work out, and actually fix the issue:

https://lore.kernel.org/linux-sgx/20220825080802.259528-1-jarkko@kernel.org/T/#u

Just to add, perhaps for some future issue, I think my laptop and yours
are comparable because they have the SGX side pretty much the same. For
Icelake, things are not as comparable because it uses a different type of
encryption engine in the hardware.

BR, Jarkko

^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2022-08-26 9:54 UTC | newest]

Thread overview: 16+ messages
[not found] <ce0b4d26-3a6e-7c5a-5f66-44cba05f9f35@molgen.mpg.de>
2022-08-19 16:02 ` WARNING: CPU: 1 PID: 83 at arch/x86/kernel/cpu/sgx/main.c:446 ksgxd+0x1b7/0x1d0 Paul Menzel
2022-08-19 18:28 ` Dave Hansen
2022-08-20  6:13 ` Paul Menzel
2022-08-23 13:48 ` Paul Menzel
2022-08-23 16:32 ` Dave Hansen
2022-08-23 22:33 ` Paul Menzel
2022-08-24 18:39 ` Dave Hansen
2022-08-25  5:27 ` Jarkko Sakkinen
2022-08-25  2:12 ` Haitao Huang
2022-08-25  5:49 ` Jarkko Sakkinen
2022-08-25  8:34 ` Jarkko Sakkinen
2022-08-26  9:54 ` Paul Menzel
2022-08-25  4:57 ` Jarkko Sakkinen
2022-08-25  5:25 ` Jarkko Sakkinen
2022-08-25  6:46 ` Paul Menzel
2022-08-25  8:39 ` Jarkko Sakkinen