qemu-devel.nongnu.org archive mirror
From: "Chen, Zide" <zide.chen@intel.com>
To: Igor Mammedov <imammedo@redhat.com>
Cc: qemu-devel@nongnu.org, pbonzini@redhat.com, mst@redhat.com,
	thuth@redhat.com, cfontana@suse.de, xiaoyao.li@intel.com,
	qemu-trivial@nongnu.org
Subject: Re: [PATCH V2 0/3] improve -overcommit cpu-pm=on|off
Date: Wed, 29 May 2024 10:31:21 -0700
Message-ID: <898effa1-1a5b-42c0-9305-8db8d5febbf5@intel.com>
In-Reply-To: <20240529144634.40aa597f@imammedo.users.ipa.redhat.com>



On 5/29/2024 5:46 AM, Igor Mammedov wrote:
> On Tue, 28 May 2024 11:16:59 -0700
> "Chen, Zide" <zide.chen@intel.com> wrote:
> 
>> On 5/28/2024 2:23 AM, Igor Mammedov wrote:
>>> On Fri, 24 May 2024 13:00:14 -0700
>>> Zide Chen <zide.chen@intel.com> wrote:
>>>   
>>>> Currently, if running "-overcommit cpu-pm=on" on hosts that don't
>>>> have MWAIT support, the MWAIT/MONITOR feature is advertised to the
>>>> guest and executing MWAIT/MONITOR on the guest triggers #UD.  
>>>
>>> this is missing proper description how do you trigger issue
>>> with reproducer and detailed description why guest sees MWAIT
>>> when it's not supported by host.  
>>
>> If "overcommit cpu-pm=on" and "-cpu host" are present, as shown in the
> it's better to provide full QEMU CLI and host/guest kernels used and what
> hardware was used if it's relevant so others can reproduce problem.

I have reproduced this on an older Intel Icelake machine, a Sapphire
Rapids and a Sierra Forest, but I believe this is a generic x86 issue,
not specific to particular models.

For the CLI, I think the only command line options that matter are
 -overcommit cpu-pm=on: to set enable_cpu_pm
 -cpu host: so that cpu->max_features is set

As for the QEMU version, any build that includes this commit should
reproduce it: 662175b91ff2 ("i386: reorder call to cpu_exec_realizefn")
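
To illustrate why the call order matters, here is a toy model of the
ordering quoted further down in this mail (x86_cpu_filter_features()
running before host_cpu_enable_cpu_pm()). This is not QEMU code: every
toy_* name is made up for illustration, and the only real-world fact it
relies on is that MONITOR/MWAIT is CPUID.01H:ECX bit 3.

```c
#include <stdint.h>

/* CPUID.01H:ECX bit 3 advertises MONITOR/MWAIT. */
#define TOY_EXT_MONITOR (1u << 3)

/* Hypothetical stand-in for the filtering step: drop guest bits the host lacks. */
static uint32_t toy_filter(uint32_t guest_ecx, uint32_t host_ecx)
{
    return guest_ecx & host_ecx;
}

/* Hypothetical stand-in for the cpu-pm step: unconditionally request MONITOR. */
static uint32_t toy_enable_cpu_pm(uint32_t guest_ecx)
{
    return guest_ecx | TOY_EXT_MONITOR;
}

/* Order described in this report: filter first, set the bit afterwards.
 * The MONITOR bit is never checked against the host. */
static uint32_t toy_filter_then_enable(uint32_t guest_ecx, uint32_t host_ecx)
{
    return toy_enable_cpu_pm(toy_filter(guest_ecx, host_ecx));
}

/* Opposite order: set the bit first, so that filtering can still drop it. */
static uint32_t toy_enable_then_filter(uint32_t guest_ecx, uint32_t host_ecx)
{
    return toy_filter(toy_enable_cpu_pm(guest_ecx), host_ecx);
}
```

With a hypothetical host whose ECX is 0x1 (no MONITOR bit), the first
order leaves bit 3 set in the guest value, while the second order
clears it again.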

The guest fails to boot:

[ 24.825568] smpboot: x86: Booting SMP configuration:
[ 24.826377] .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11 #12 #13 #14 #15 #17
[ 24.985799] .... node #1, CPUs: #128 #129 #130 #131 #132 #133 #134 #135 #136 #137 #138 #139 #140 #141 #142 #143 #145
[ 25.136955] invalid opcode: 0000 1 PREEMPT SMP NOPTI
[ 25.137790] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.8.0 #2
[ 25.137790] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/04
[ 25.137790] RIP: 0010:mwait_idle+0x35/0x80
[ 25.137790] Code: 6f f0 80 48 02 20 48 8b 10 83 e2 08 75 3e 65 48 8b 15 47 d6 56 6f 48 0f ba e2 27 72 41 31 d2 48 89 d8
[ 25.137790] RSP: 0000:ffffffff91403e70 EFLAGS: 00010046
[ 25.137790] RAX: ffffffff9140a980 RBX: ffffffff9140a980 RCX: 0000000000000000
[ 25.137790] RDX: 0000000000000000 RSI: ffff97f1ade21b20 RDI: 0000000000000004
[ 25.137790] RBP: 0000000000000000 R08: 00000005da4709cb R09: 0000000000000001
[ 25.137790] R10: 0000000000005da4 R11: 0000000000000009 R12: 0000000000000000
[ 25.137790] R13: ffff98573ff90fc0 R14: ffffffff9140a038 R15: 0000000000093ff0
[ 25.137790] FS: 0000000000000000(0000) GS:ffff97f1ade00000(0000) knlGS:0000000000000000
[ 25.137790] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 25.137790] CR2: ffff97d8aa801000 CR3: 00000049e9430001 CR4: 0000000000770ef0
[ 25.137790] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 25.137790] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[ 25.137790] PKRU: 55555554
[ 25.137790] Call Trace:
[ 25.137790] <TASK>
[ 25.137790] ? die+0x37/0x90
[ 25.137790] ? do_trap+0xe3/0x110
[ 25.137790] ? mwait_idle+0x35/0x80
[ 25.137790] ? do_error_trap+0x6a/0x90
[ 25.137790] ? mwait_idle+0x35/0x80
[ 25.137790] ? exc_invalid_op+0x52/0x70
[ 25.137790] ? mwait_idle+0x35/0x80
[ 25.137790] ? asm_exc_invalid_op+0x1a/0x20
[ 25.137790] ? mwait_idle+0x35/0x80
[ 25.137790] default_idle_call+0x30/0x100
[ 25.137790] cpuidle_idle_call+0x12c/0x170
[ 25.137790] ? tsc_verify_tsc_adjust+0x73/0xd0
[ 25.137790] do_idle+0x7f/0xd0
[ 25.137790] cpu_startup_entry+0x29/0x30
[ 25.137790] rest_init+0xcc/0xd0
[ 25.137790] start_kernel+0x396/0x5d0
[ 25.137790] x86_64_start_reservations+0x18/0x30
[ 25.137790] x86_64_start_kernel+0xe7/0xf0
[ 25.137790] common_startup_64+0x13e/0x148
[ 25.137790] </TASK>
[ 25.137790] Modules linked in:
[ 25.137790] --[ end trace 0000000000000000 ]--
[ 25.137790] invalid opcode: 0000 2 PREEMPT SMP NOPTI
[ 25.137790] RIP: 0010:mwait_idle+0x35/0x80
[ 25.137790] Code: 6f f0 80 48 02 20 48 8b 10 83 e2 08 75 3e 65 48 8b 15 47 d6 56 6f 48 0f ba e2 27 72 41 31 d2 48 89 d8

> 
>> following, CPUID_EXT_MONITOR is set after x86_cpu_filter_features(), so
>> that it doesn't have a chance to check MWAIT against host features and
>> will be advertised to the guest regardless of whether it's supported by
>> the host or not.
>>
>> x86_cpu_realizefn()
>>   x86_cpu_filter_features()
>>   cpu_exec_realizefn()
>>     kvm_cpu_realizefn
>>       host_cpu_realizefn
>>         host_cpu_enable_cpu_pm
>>           env->features[FEAT_1_ECX] |= CPUID_EXT_MONITOR;
>>
>>
>> If it's not supported by the host, executing MONITOR or MWAIT
>> instructions from the guest triggers #UD, no matter MWAIT_EXITING
>> control is set or not.
> 
> If I recall right, kvm was able to emulate mwait/monitor.
> So question is why it leads to exception instead?

KVM can only come into play if the instruction actually causes a
MWAIT/MONITOR VM exit. I didn't find an explicit statement in the Intel
SDM that #UD takes precedence over MWAIT/MONITOR VM exits, so this is my
speculation. For example, on older machines that don't support MWAIT at
all, presumably the only thing the CPU can do is raise #UD, not take a
MWAIT VM exit?
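
For anyone who wants to check whether a given host falls into the "no
MWAIT" category, a minimal sketch follows. It assumes GCC or Clang on
x86, where <cpuid.h> provides __get_cpuid(); the helper names
ecx_has_monitor() and host_has_monitor() are made up for this example.
It has to run on the host itself, since inside a guest it would only
report what QEMU/KVM advertises.

```c
#include <stdbool.h>
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper for the CPUID instruction */
#endif

/* CPUID.01H:ECX bit 3 advertises MONITOR/MWAIT. */
#define ECX_MONITOR_BIT (1u << 3)

/* Pure helper: does this CPUID.01H:ECX value advertise MONITOR/MWAIT? */
static bool ecx_has_monitor(uint32_t ecx)
{
    return (ecx & ECX_MONITOR_BIT) != 0;
}

/* Returns 1 if the CPU we run on advertises MONITOR/MWAIT, 0 if it does
 * not, and -1 if CPUID leaf 1 cannot be queried (for example when built
 * for a non-x86 target). */
static int host_has_monitor(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        return -1;
    }
    return ecx_has_monitor(ecx) ? 1 : 0;
#else
    return -1;
#endif
}
```

A result of 0 on the host would match the situation in this report: the
guest is told MONITOR/MWAIT exists although the host does not advertise
it.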




