From: Greg KH <gregkh@linuxfoundation.org>
To: Shaoying Xu <shaoyi@amazon.com>
Cc: tglx@linutronix.de, linux-kernel@vger.kernel.org,
stable@vger.kernel.org, jgross@suse.com, sjpark@amazon.com,
hailmo@amazon.com, kuniyu@amazon.com
Subject: Re: Linux 5.4.252 FPU initialization warnings in stable kernels 5.4/5.10
Date: Tue, 15 Aug 2023 23:16:52 +0200
Message-ID: <2023081511-easing-exerciser-c356@gregkh>
In-Reply-To: <20230815201539.19015-1-shaoyi@amazon.com>
On Tue, Aug 15, 2023 at 08:15:39PM +0000, Shaoying Xu wrote:
> Hi Thomas/Greg
>
> We are seeing “get of unsupported state” warnings during FPU initialization in the v5.4.252 and v5.10.189
> kernels booted on AWS EC2 instances with Intel processors based on the Nitro system. These warnings are observed
> on an EC2 c5.18xlarge instance:
>
> [ 1.204495] ------------[ cut here ]------------
> [ 1.204495] get of unsupported state
> [ 1.204495] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/fpu/xstate.c:879 get_xsave_addr+0x81/0x90
> [ 1.204495] Modules linked in:
> [ 1.204495] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.252 #10
> [ 1.204495] Hardware name: Amazon EC2 c5.18xlarge/, BIOS 1.0 10/16/2017
> [ 1.204495] RIP: 0010:get_xsave_addr+0x81/0x90
> [ 1.204495] Code: 5b c3 48 83 c4 08 31 c0 5b c3 80 3d 7c f0 78 01 00 75 c1 48 c7 c7 34 be 03 b2 89 4c 24 04 c6 05 68 f0 78 01 01 e8 ef 41 05 00 <0f> 0b 48 63 4c 24 04 eb a1 31 c0 c3 0f 1f 00 0f 1f 44 00 00 41 54
> [ 1.204495] RSP: 0000:ffffffffb2603ed0 EFLAGS: 00010282
> [ 1.204495] RAX: 0000000000000000 RBX: ffffffffb27ebe80 RCX: 0000000047cb2486
> [ 1.204495] RDX: 0000000000000018 RSI: ffffffffb39e99a0 RDI: ffffffffb39e756c
> [ 1.204495] RBP: ffffffffb27ebd40 R08: 7520666f20746567 R09: 74726f707075736e
> [ 1.204495] R10: 00000000000962fc R11: 6574617473206465 R12: ffffffffb2d89b60
> [ 1.204495] R13: 0000000000000246 R14: 0000000000000000 R15: 0000000000000000
> [ 1.204495] FS: 0000000000000000(0000) GS:ffff96d031400000(0000) knlGS:0000000000000000
> [ 1.204495] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1.204495] CR2: ffff96e277fff000 CR3: 000000103060a001 CR4: 00000000007200b0
> [ 1.204495] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 1.204495] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 1.204495] Call Trace:
> [ 1.204495] ? __warn+0x85/0xd0
> [ 1.204495] ? get_xsave_addr+0x81/0x90
> [ 1.204495] ? report_bug+0xb6/0x130
> [ 1.204495] ? get_xsave_addr+0x81/0x90
> [ 1.204495] ? fixup_bug.part.12+0x18/0x30
> [ 1.204495] ? do_error_trap+0x95/0xb0
> [ 1.204495] ? do_invalid_op+0x36/0x40
> [ 1.204495] ? get_xsave_addr+0x81/0x90
> [ 1.204495] ? invalid_op+0x1e/0x30
> [ 1.204495] ? get_xsave_addr+0x81/0x90
> [ 1.204495] identify_cpu+0x422/0x510
> [ 1.204495] identify_boot_cpu+0xc/0x94
> [ 1.204495] arch_cpu_finalize_init+0x5/0x47
> [ 1.204495] start_kernel+0x468/0x511
> [ 1.204495] secondary_startup_64+0xa4/0xb0
> [ 1.204495] ---[ end trace dffac81ff531fcf2 ]---
>
> The issue can be easily reproduced on both virtualized and bare metal instances. Interestingly,
> it does not occur in the other latest stable kernels: v4.14, v4.19, v5.15 and newer. We bisected between v5.4.251 and v5.4.252 and
> identified the commit below as the culprit. Reverting it in v5.4.252 and v5.10.189 also resolved the above warnings completely.
>
> x86/fpu: Move FPU initialization into arch_cpu_finalize_init()
> commit b81fac906a8f9e682e513ddd95697ec7a20878d4 upstream
>
> We initially speculated that the fix might be similar to commit 3f8968f1f0ad (“x86/xen: Fix secondary processors' FPU initialization”) but
> since only the 5.4 and 5.10 kernels are affected, we’re not quite sure how this commit affects them in practice. Could you please take a look and share your insights?
>
> Here is also the stack trace from v5.10.189:
>
> [ 1.210910] ------------[ cut here ]------------
> [ 1.210910] get of unsupported state
> [ 1.210910] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/fpu/xstate.c:974 get_xsave_addr+0x89/0xa0
> [ 1.210910] Modules linked in:
> [ 1.210910] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.189 #4
> [ 1.210910] Hardware name: Amazon EC2 c5.18xlarge/, BIOS 1.0 10/16/2017
> [ 1.210910] RIP: 0010:get_xsave_addr+0x89/0xa0
> [ 1.210910] Code: c4 08 31 c0 5b e9 17 a4 bc 00 80 3d e7 75 eb 01 00 75 b9 48 c7 c7 b7 f4 09 ab 89 4c 24 04 c6 05 d3 75 eb 01 01 e8 17 98 05 00 <0f> 0b 48 63 4c 24 04 eb 99 31 c0 e9 e7 a3 bc 00 0f 1f 80 00 00 00
> [ 1.210910] RSP: 0000:ffffffffab603ec8 EFLAGS: 00010286
> [ 1.210910] RAX: 0000000000000000 RBX: ffffffffabf25bc0 RCX: 00000000fffeffff
> [ 1.210910] RDX: ffffffffab603cd0 RSI: 00000000fffeffff RDI: ffffffffad1a3dec
> [ 1.210910] RBP: ffffffffabf25a60 R08: 0000000000000000 R09: 0000000000000001
> [ 1.210910] R10: 0000000000000000 R11: ffffffffab603cc8 R12: ffffffffac539b40
> [ 1.210910] R13: 0000000000000246 R14: 0000000000000000 R15: 0000000000000000
> [ 1.210910] FS: 0000000000000000(0000) GS:ffff9150f1600000(0000) knlGS:0000000000000000
> [ 1.210910] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1.210910] CR2: ffff915702801000 CR3: 0000001780610001 CR4: 00000000007300b0
> [ 1.210910] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 1.210910] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 1.210910] Call Trace:
> [ 1.210910] ? __warn+0x7d/0xe0
> [ 1.210910] ? get_xsave_addr+0x89/0xa0
> [ 1.210910] ? report_bug+0xbb/0x140
> [ 1.210910] ? handle_bug+0x3f/0x70
> [ 1.210910] ? exc_invalid_op+0x13/0x60
> [ 1.210910] ? asm_exc_invalid_op+0x12/0x20
> [ 1.210910] ? get_xsave_addr+0x89/0xa0
> [ 1.210910] ? get_xsave_addr+0x89/0xa0
> [ 1.210910] identify_cpu+0x42a/0x550
> [ 1.210910] identify_boot_cpu+0xc/0x94
> [ 1.210910] arch_cpu_finalize_init+0x5/0x47
> [ 1.210910] start_kernel+0x4bc/0x56b
> [ 1.210910] secondary_startup_64_no_verify+0xb0/0xbb
> [ 1.210910] ---[ end trace 14850c6f8ee0875d ]---
I think this is fixed with commit b3607269ff57 ("x86/pkeys: Revert
a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate")"), which is
queued up for the next 5.4.y and 5.10.y releases to happen "soon". Can
you test the released -rc1 versions to verify whether it is resolved?
thanks,
greg k-h
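
A minimal userspace sketch of the kind of check that prints "get of
unsupported state". It is not the kernel's code. It assumes, as one
plausible reading of the traces above, that the lookup is reached from
identify_cpu() before FPU setup has filled in the mask of enabled XSAVE
features. All model_* names are hypothetical and exist only for this
illustration; the only hardware fact used is that PKRU is XSAVE state
component 9.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* XSAVE state component number for PKRU (bit 9 of XCR0). */
#define MODEL_XFEATURE_PKRU 9

/* Hypothetical stand-in for the mask of enabled XSAVE features.
 * It stays 0 until model_fpu_init() has run. */
static uint64_t model_xfeatures_mask;

/* Set once the warning has been printed, so it fires only one time. */
static int model_warned;

/* Hypothetical model of FPU setup: enable x87 (bit 0), SSE (bit 1),
 * AVX (bit 2) and PKRU (bit 9). */
static void model_fpu_init(void)
{
    model_xfeatures_mask = (1ULL << 0) | (1ULL << 1) | (1ULL << 2) |
                           (1ULL << MODEL_XFEATURE_PKRU);
}

/* Hypothetical model of the lookup: returns 1 if the requested state
 * component is enabled, otherwise warns once and returns 0. */
static int model_get_xsave_addr(unsigned int nr)
{
    assert(nr < 64);
    if (!(model_xfeatures_mask & (1ULL << nr))) {
        if (!model_warned)
            fprintf(stderr, "get of unsupported state\n");
        model_warned = 1;
        return 0;
    }
    return 1;
}
```

In this model, calling model_get_xsave_addr(MODEL_XFEATURE_PKRU) before
model_fpu_init() returns 0 and prints the message once; after
model_fpu_init() the same call returns 1. Whether the ordering in the
affected kernels matches this model is an assumption, not something the
thread confirms.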