From: Igor Mammedov <imammedo@redhat.com>
To: "Chen, Zide" <zide.chen@intel.com>
Cc: Zhao Liu <zhao1.liu@intel.com>,
qemu-devel@nongnu.org, pbonzini@redhat.com, mst@redhat.com,
thuth@redhat.com, cfontana@suse.de, xiaoyao.li@intel.com,
qemu-trivial@nongnu.org, Sean Christopherson <seanjc@google.com>
Subject: Re: [PATCH V2 2/3] target/i386: call cpu_exec_realizefn before x86_cpu_filter_features
Date: Wed, 5 Jun 2024 17:07:35 +0200 [thread overview]
Message-ID: <20240605170735.5a3924cd@imammedo.users.ipa.redhat.com> (raw)
In-Reply-To: <dcd913cd-10ce-45c9-9ed8-79b9c42c234b@intel.com>
On Mon, 3 Jun 2024 14:29:50 -0700
"Chen, Zide" <zide.chen@intel.com> wrote:
> On 6/3/2024 2:30 AM, Igor Mammedov wrote:
> > On Sat, 1 Jun 2024 23:26:55 +0800
> > Zhao Liu <zhao1.liu@intel.com> wrote:
> >
> >> On Fri, May 31, 2024 at 10:13:47AM -0700, Chen, Zide wrote:
> >>> Date: Fri, 31 May 2024 10:13:47 -0700
> >>> From: "Chen, Zide" <zide.chen@intel.com>
> >>> Subject: Re: [PATCH V2 2/3] target/i386: call cpu_exec_realizefn before
> >>> x86_cpu_filter_features
> >>>
> >>> On 5/30/2024 11:30 PM, Zhao Liu wrote:
> >>>> Hi Zide,
> >>>>
> >>>> On Fri, May 24, 2024 at 01:00:16PM -0700, Zide Chen wrote:
> >>>>> Date: Fri, 24 May 2024 13:00:16 -0700
> >>>>> From: Zide Chen <zide.chen@intel.com>
> >>>>> Subject: [PATCH V2 2/3] target/i386: call cpu_exec_realizefn before
> >>>>> x86_cpu_filter_features
> >>>>> X-Mailer: git-send-email 2.34.1
> >>>>>
> >>>>> cpu_exec_realizefn which calls the accel-specific realizefn may expand
> >>>>> features. e.g., some accel-specific options may require extra features
> >>>>> to be enabled, and it's appropriate to expand these features in accel-
> >>>>> specific realizefn.
> >>>>>
> >>>>> One such example is the cpu-pm option, which may add CPUID_EXT_MONITOR.
> >>>>>
> >>>>> Thus, call cpu_exec_realizefn before x86_cpu_filter_features to ensure
> >>>>> that it won't expose features not supported by the host.
> >>>>>
> >>>>> Fixes: 662175b91ff2 ("i386: reorder call to cpu_exec_realizefn")
> >>>>> Suggested-by: Xiaoyao Li <xiaoyao.li@intel.com>
> >>>>> Signed-off-by: Zide Chen <zide.chen@intel.com>
> >>>>> ---
> >>>>> target/i386/cpu.c | 24 ++++++++++++------------
> >>>>> target/i386/kvm/kvm-cpu.c | 1 -
> >>>>> 2 files changed, 12 insertions(+), 13 deletions(-)
> >>>>>
> >>>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> >>>>> index bc2dceb647fa..a1c1c785bd2f 100644
> >>>>> --- a/target/i386/cpu.c
> >>>>> +++ b/target/i386/cpu.c
> >>>>> @@ -7604,6 +7604,18 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
> >>>>> }
> >>>>> }
> >>>>>
> >>>>> + /*
> >>>>> + * note: the call to the framework needs to happen after feature expansion,
> >>>>> + * but before the checks/modifications to ucode_rev, mwait, phys_bits.
> >>>>> + * These may be set by the accel-specific code,
> >>>>> + * and the results are subsequently checked / assumed in this function.
> >>>>> + */
> >>>>> + cpu_exec_realizefn(cs, &local_err);
> >>>>> + if (local_err != NULL) {
> >>>>> + error_propagate(errp, local_err);
> >>>>> + return;
> >>>>> + }
> >>>>> +
> >>>>> x86_cpu_filter_features(cpu, cpu->check_cpuid || cpu->enforce_cpuid);
> >>>>
> >>>> For your case, which sets cpu-pm=on via overcommit, then
> >>>> x86_cpu_filter_features() will complain that mwait is not supported.
> >>>>
> >>>> Such warning is not necessary, because the purpose of overcommit (from
> >>>> code) is only to support mwait when possible, not to commit to support
> >>>> mwait in Guest.
> >>>>
> >>>> Additionally, I understand x86_cpu_filter_features() is primarily
> >>>> intended to filter features configured by the user,
> >>>
> >>> Yes, that's why this patches intends to let x86_cpu_filter_features()
> >>> filter out the MWAIT bit which is set from the overcommit option.
> >>
> >> HMM, but in fact x86_cpu_filter_features() has already checked the MWAIT
> >> bit set by "-overcommit cpu-pm=on". ;-)
> >>
> >> (Pls correct me if I'm wrong) Revisiting what cpu-pm did to MWAIT:
> >> * Firstly, it set MWAIT bit in x86_cpu_expand_features():
> >> x86_cpu_expand_features()
> >> -> x86_cpu_get_supported_feature_word()
> >> -> kvm_arch_get_supported_cpuid()
> >> This MWAIT is based on Host's MWAIT capability. This MWAIT enablement
> >> is fine for next x86_cpu_filter_features() and x86_cpu_filter_features()
> >> is working correctly here!
> >>
> >> * Then, MWAIT was secondly set in host_cpu_enable_cpu_pm() regardless
> >> neither Host's support or previous MWAIT enablement result. This is
> >> the root cause of your issue.
> >>
> >> Therefore, we should make cpu-pm honor his first MWAIT enablement result
> >> instead of repeatly and unconditionally setting the MWAIT bit again in
> >> host_cpu_enable_cpu_pm().
> >>
> >> Additionally, I think the code in x86_cpu_realizefn():
> >> cpu->mwait.ecx |= CPUID_MWAIT_EMX | CPUID_MWAIT_IBE;
> >> has the similar issue because it also should check MWAIT feature bit.
> >>
> >> Further, it may be possible to remove cpu->mwait: just check the MWAIT
> >> bit in leaf 5 of cpu_x86_cpuid(), and if MWAIT is present, use host's
> >> mwait info plus CPUID_MWAIT_EMX | CPUID_MWAIT_IBE.
> >
> > Agreed with above analysis,
> > we shouldn't have host_cpu_enable_cpu_pm() as kvm_arch_get_supported_cpuid
> > gets us MWAIT already.
>
> Yes, I agree don't need to set CPUID_EXT_MONITOR besides
> kvm_arch_get_supported_cpuid().
>
> >
> > filling in cpu->mwait.ecx is benign mistake which likely doesn't
> > trigger anything if CPUID_EXT_MONITOR is not present.
> > But for clarity it's better to add an explicit check there as well.
>
> Yes, I agree without MWAIT available and advertised, it's meaningless to
> set the EMX and IBE bits. Seems to me it's cleaner to remove cpu->mwait
> all together, and in cpu_x86_cpuid(), just read from host_cpuid(5) if
> MWAIT is available to the guest. But I don't understand the history of
> why QEMU unconditionally advertises these two bits, and don't know it it
> could break some thing if these two bits are removed.
>
> Even if we want to fix these two bits, we can do it in another separate
> patch.
>
> e737b32a36 (" Core 2 Duo specification (Alexander Graf).")
> unconditionally adds "CPUID_MWAIT_EMX | CPUID_MWAIT_IBE" to CPUID.5.ECX
> with further explanation.
>
> 2266d44311 ("i386/cpu: make -cpu host support monitor/mwait") adds
> comment "We always wake on interrupt even if host does not have the
> capability" to CPUID_MWAIT_IBE.
>
>
> >
> >>
> >>>> and the changes of
> >>>> CPUID after x86_cpu_filter_features() should by default be regarded like
> >>>> "QEMU knows what it is doing".
> >>>
> >>> Sure, we can add feature bits after x86_cpu_filter_features(), but I
> >>> think moving cpu_exec_realizefn() before x86_cpu_filter_features() is
> >>> more generic, and actually this is what QEMU did before commit 662175b91ff2.
> >>>
> >>> - Less redundant code. Specifically, no need to call
> >>> x86_cpu_get_supported_feature_word() again.
> >>> - Potentially there could be other features could be added from the
> >>> accel-specific realizefn, kvm_cpu_realizefn() for example. And these
> >>> features need to be checked against the host availability.
> >>
> >> Mainly I don't think this reorder is a direct fix for the problem (I
> >> just analyse it above), also in your case x86_cpu_filter_features() will
> >> print a WARNING when QEMU boots, which I don't think is cpu-pm's intention.
> >
> > There is no problem with warning, I'd even say it's a good thing.
>
> I agree it's good to have the warning as well.
>
> > But you are right reordering just masks the issue.
> >
> > As for expected behavior, if user asked for "-overcommit cpu-pm=on"
> > there are 2 options:
> > * it's working as expected (mwait exiting is enabled successfully with CPUID MONITOR bit set)
> > * QEMU shall fail to start.
>
> I like the idea that QEMU refuses to launch the guest if the asked CPU
> features are not available, which is more user friendly. But the
> problem is, "-overcommit cpu-pm=on" is an umbrella which intends to
> enable all the following CPUIDs and KVM features if it's possible. So,
> if QEMU fails the guest in this case, then it needs to fail the WAITPKG
> feature as well. Additionally, it may need to provide individual options
> to enable these individual features, which I doubt could be too complicated.
how about
kvm_arch_init()
...
if (enable_cpu_pm) {
int disable_exits = kvm_check_extension(s, KVM_CAP_X86_DISABLE_EXITS);
/* Work around for kernel header with a typo. TODO: fix header and drop. */
#if defined(KVM_X86_DISABLE_EXITS_HTL) && !defined(KVM_X86_DISABLE_EXITS_HLT)
#define KVM_X86_DISABLE_EXITS_HLT KVM_X86_DISABLE_EXITS_HTL
#endif
above comment probably needs to be cleaned up
if (disable_exits) {
disable_exits &= (KVM_X86_DISABLE_EXITS_MWAIT |
KVM_X86_DISABLE_EXITS_HLT |
KVM_X86_DISABLE_EXITS_PAUSE |
KVM_X86_DISABLE_EXITS_CSTATE);
}
fail here if filtered disable_exits is an empty set?
ret = kvm_vm_enable_cap(s, KVM_CAP_X86_DISABLE_EXITS, 0,
disable_exits);
if (ret < 0) {
error_report("kvm: guest stopping CPU not supported: %s",
strerror(-ret));
}
}
> KVM_X86_DISABLE_EXITS_MWAI
> KVM_X86_DISABLE_EXITS_HLTKVM_X86_DISABLE_EXITS_PAUSE
> KVM_X86_DISABLE_EXITS_CSTATE
> CPUID.7.0:ECX.WAITPKG
> CPUID.1.ECX.MWAIT
>
next prev parent reply other threads:[~2024-06-05 15:09 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-24 20:00 [PATCH V2 0/3] improve -overcommit cpu-pm=on|off Zide Chen
2024-05-24 20:00 ` [PATCH V2 1/3] vl: Allow multiple -overcommit commands Zide Chen
2024-05-27 5:19 ` Thomas Huth
2024-05-30 14:01 ` Zhao Liu
2024-05-31 4:57 ` Thomas Huth
2024-06-03 8:44 ` Markus Armbruster
2024-05-30 13:39 ` Zhao Liu
2024-05-24 20:00 ` [PATCH V2 2/3] target/i386: call cpu_exec_realizefn before x86_cpu_filter_features Zide Chen
2024-05-31 6:30 ` Zhao Liu
2024-05-31 17:13 ` Chen, Zide
2024-06-01 15:26 ` Zhao Liu
2024-06-03 9:30 ` Igor Mammedov
2024-06-03 21:29 ` Chen, Zide
2024-06-05 15:07 ` Igor Mammedov [this message]
2024-06-05 17:58 ` Chen, Zide
2024-06-03 21:29 ` Chen, Zide
2024-05-24 20:00 ` [PATCH V2 3/3] target/i386: Move host_cpu_enable_cpu_pm into kvm_cpu_realizefn() Zide Chen
2024-05-31 6:53 ` Zhao Liu
2024-05-31 17:13 ` Chen, Zide
2024-05-28 9:23 ` [PATCH V2 0/3] improve -overcommit cpu-pm=on|off Igor Mammedov
2024-05-28 18:16 ` Chen, Zide
2024-05-29 12:46 ` Igor Mammedov
2024-05-29 17:31 ` Chen, Zide
2024-05-30 13:54 ` Zhao Liu
2024-05-30 14:34 ` Igor Mammedov
2024-05-30 14:53 ` Sean Christopherson
2024-05-30 14:49 ` Igor Mammedov
2024-06-02 21:54 ` Michael S. Tsirkin
2024-05-30 16:15 ` Chen, Zide
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240605170735.5a3924cd@imammedo.users.ipa.redhat.com \
--to=imammedo@redhat.com \
--cc=cfontana@suse.de \
--cc=mst@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=qemu-trivial@nongnu.org \
--cc=seanjc@google.com \
--cc=thuth@redhat.com \
--cc=xiaoyao.li@intel.com \
--cc=zhao1.liu@intel.com \
--cc=zide.chen@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).