From: Xiaoyao Li <xiaoyao.li@intel.com>
To: Eduardo Habkost <ehabkost@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Marcelo Tosatti <mtosatti@redhat.com>,
qemu-devel@nongnu.org
Subject: Re: [RFC PATCH 3/5] target/i386: Enable host pass through of Intel PT
Date: Sun, 17 Oct 2021 18:37:08 +0800 [thread overview]
Message-ID: <01e4d8fe-5ac3-d855-447f-2f9f825dcde6@intel.com> (raw)
In-Reply-To: <20211015202220.ghdn6gsdfuh56xq7@habkost.net>
On 10/16/2021 4:22 AM, Eduardo Habkost wrote:
> On Thu, Sep 09, 2021 at 10:41:48PM +0800, Xiaoyao Li wrote:
>> commit e37a5c7fa459 ("i386: Add Intel Processor Trace feature support")
>> added the support of Intel PT by making CPUID[14] of PT as fixed feature
>> set (from ICX) for any CPU model on any host.
>>
>> This truly breaks the PT exposing on Intel SPR platform because SPR has
>> less supported bitmap CPUID(0x14,1):EBX[15:0] than ICX.
>>
>> This patch enables passing through host's PT capabilities for the case
>> "-cpu host/max" while ensure named CPU model still has the fixed
>> PT feature set to not break the live migration.
>>
>> It introduces @has_specific_intel_pt_feature_set field for name CPU
>> model. If a named CPU model has this field as false, it will use fixed
>> PT feature set of ICX. Besides same to previous behavior, if fixed PT
>> feature set of ICX cannot be satisfied/supported by host, it disables PT
>> instead of adjusting the feature set based on host's capabilities.
>>
>> In the future, new named CPU model, e.g., Sapphire Rapids, can define
>> its own PT feature set by setting @has_specific_intel_pt_feature_set to
>> true and defines it's own FEAT_14_0_EBX, FEAT_14_0_ECX, FEAT_14_1_EAX
>> and FEAT_14_1_EBX.
>>
>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>> ---
>> target/i386/cpu.c | 106 ++++++++++++++++++++--------------------------
>> target/i386/cpu.h | 1 +
>> 2 files changed, 47 insertions(+), 60 deletions(-)
>>
>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>> index 58e98210f219..00c4ad23110d 100644
>> --- a/target/i386/cpu.c
>> +++ b/target/i386/cpu.c
>> @@ -543,34 +543,24 @@ static CPUCacheInfo legacy_l3_cache = {
>> #define L2_ITLB_4K_ASSOC 4
>> #define L2_ITLB_4K_ENTRIES 512
>>
>> -/* CPUID Leaf 0x14 constants: */
>> -#define INTEL_PT_MAX_SUBLEAF 0x1
>> -/*
>> - * bit[00]: IA32_RTIT_CTL.CR3 filter can be set to 1 and IA32_RTIT_CR3_MATCH
>> - * MSR can be accessed;
>> - * bit[01]: Support Configurable PSB and Cycle-Accurate Mode;
>> - * bit[02]: Support IP Filtering, TraceStop filtering, and preservation
>> - * of Intel PT MSRs across warm reset;
>> - * bit[03]: Support MTC timing packet and suppression of COFI-based packets;
>> - */
>> -#define INTEL_PT_MINIMAL_EBX 0xf
>> -/*
>> - * bit[00]: Tracing can be enabled with IA32_RTIT_CTL.ToPA = 1 and
>> - * IA32_RTIT_OUTPUT_BASE and IA32_RTIT_OUTPUT_MASK_PTRS MSRs can be
>> - * accessed;
>> - * bit[01]: ToPA tables can hold any number of output entries, up to the
>> - * maximum allowed by the MaskOrTableOffset field of
>> - * IA32_RTIT_OUTPUT_MASK_PTRS;
>> - * bit[02]: Support Single-Range Output scheme;
>> - */
>> -#define INTEL_PT_MINIMAL_ECX 0x7
>> -/* generated packets which contain IP payloads have LIP values */
>> -#define INTEL_PT_IP_LIP (1 << 31)
>> -#define INTEL_PT_ADDR_RANGES_NUM 0x2 /* Number of configurable address ranges */
>> -#define INTEL_PT_ADDR_RANGES_NUM_MASK 0x7
>> -#define INTEL_PT_MTC_BITMAP (0x0249 << 16) /* Support ART(0,3,6,9) */
>> -#define INTEL_PT_CYCLE_BITMAP 0x1fff /* Support 0,2^(0~11) */
>> -#define INTEL_PT_PSB_BITMAP (0x003f << 16) /* Support 2K,4K,8K,16K,32K,64K */
>> +#define INTEL_PT_MAX_SUBLEAF 0x1
>> +#define INTEL_PT_DEFAULT_ADDR_RANGES_NUM 0x2
>> +#define INTEL_PT_ADDR_RANGES_NUM_MASK 0x7
>> +/* Support ART(0,3,6,9) */
>> +#define INTEL_PT_DEFAULT_MTC_BITMAP 0x0249
>> +/* Support 0,2^(0~11) */
>> +#define INTEL_PT_DEFAULT_CYCLE_BITMAP 0x1fff
>> +/* Support 2K,4K,8K,16K,32K,64K */
>> +#define INTEL_PT_DEFAULT_PSB_BITMAP 0x003f
>> +
>> +#define INTEL_PT_DEFAULT_0_EBX (CPUID_14_0_EBX_CR3_FILTER | \
>> + CPUID_14_0_EBX_PSB | CPUID_14_0_EBX_IP_FILTER | CPUID_14_0_EBX_MTC)
>> +#define INTEL_PT_DEFAULT_0_ECX (CPUID_14_0_ECX_TOPA | \
>> + CPUID_14_0_ECX_MULTI_ENTRIES | CPUID_14_0_ECX_SINGLE_RANGE)
>> +#define INTEL_PT_DEFAULT_1_EAX (INTEL_PT_DEFAULT_MTC_BITMAP << 16 | \
>> + INTEL_PT_DEFAULT_ADDR_RANGES_NUM)
>> +#define INTEL_PT_DEFAULT_1_EBX (INTEL_PT_DEFAULT_PSB_BITMAP << 16 | \
>> + INTEL_PT_DEFAULT_CYCLE_BITMAP)
>
> I like these new macros because they make the code at
> x86_cpu_filter_features() clearer. Can we do this in a separate
> patch, to be applied before "Introduce FeatureWordInfo for Intel
> PT CPUID leaf 0xD"?
>
>>
>> void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
>> uint32_t vendor2, uint32_t vendor3)
>> @@ -1517,6 +1507,7 @@ typedef struct X86CPUDefinition {
>> int family;
>> int model;
>> int stepping;
>> + bool has_specific_intel_pt_feature_set;
>> FeatureWordArray features;
>> const char *model_id;
>> const CPUCaches *const cache_info;
>> @@ -5061,6 +5052,14 @@ static void x86_cpu_load_model(X86CPU *cpu, X86CPUModel *model)
>> env->features[w] = def->features[w];
>> }
>>
>> + if (!def->has_specific_intel_pt_feature_set) {
>> + env->use_default_intel_pt = true;
>> + env->features[FEAT_14_0_EBX] = INTEL_PT_DEFAULT_0_EBX;
>> + env->features[FEAT_14_0_ECX] = INTEL_PT_DEFAULT_0_ECX;
>> + env->features[FEAT_14_1_EAX] = INTEL_PT_DEFAULT_1_EAX;
>> + env->features[FEAT_14_1_EBX] = INTEL_PT_DEFAULT_1_EBX;
>> + }
>
> There's nothing special about intel-pt, and other features might
> benefit from having default values set if omitted from the CPU
> model definition. I don't know yet what's the best solution
> here, but some possibilities are:
>
> A) The simpler and more obvious solution: just set
> features[FEAT_14_*] explicitly to INTEL_PT_DEFAULT_* on all CPU
> models in builtin_x86_defs.
>
> B) Treat (env->features[FEAT_14_0_EBX] == 0 &&
> env->features[FEAT_14_0_ECX] == 0 &&
> env->features[FEAT_14_1_EAX] == 0 &&
> env->features[FEAT_14_1_EBX] == 0) as a special case
> that indicates that INTEL_PT_DEFAULT_* should be used instead
> of 0.
>
> C) Rework X86CPUDefinition completely and replace
> FeatureWordArray features;
> with something like:
> struct {
> FeatureWord w;
> uint32_t value;
> } *features;
> or:
> struct {
> const char *property, *value;
> } *features;
> so we can have a concept of omitted/default flags in
> builtin_x86_defs.
>
> (A) and (C) are generic but require a lot more work, so we could
> keep your solution or just implement (B).
yew, I can implement B. Maybe for B), we can just use
(env->features[FEAT_14_0_EBX] == 0)
for simplicity?
> In either case, I suggest you add a comment explaining why the
> boolean flag or special case exists (I believe the answer is:
> "because all CPU models have the same default value for
> INTEL_PT_* and we don't want to manually copy/paste it to all
> entries in builtin_x86_defs").
yes, I can add the comment.
(It's just because we want to keep compatible with old QEMU, which has a
not-so-good design of INTEL_PT feature set)
>
>> +
>> /* legacy-cache defaults to 'off' if CPU model provides cache info */
>> cpu->legacy_cache = !def->cache_info;
>>
>> @@ -5465,14 +5464,11 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>>
>> if (count == 0) {
>> *eax = INTEL_PT_MAX_SUBLEAF;
>> - *ebx = INTEL_PT_MINIMAL_EBX;
>> - *ecx = INTEL_PT_MINIMAL_ECX;
>> - if (env->features[FEAT_14_0_ECX] & CPUID_14_0_ECX_LIP) {
>> - *ecx |= CPUID_14_0_ECX_LIP;
>> - }
>> + *ebx = env->features[FEAT_14_0_EBX];
>> + *ecx = env->features[FEAT_14_0_ECX];
>> } else if (count == 1) {
>> - *eax = INTEL_PT_MTC_BITMAP | INTEL_PT_ADDR_RANGES_NUM;
>> - *ebx = INTEL_PT_PSB_BITMAP | INTEL_PT_CYCLE_BITMAP;
>> + *eax = env->features[FEAT_14_1_EAX];
>> + *ebx = env->features[FEAT_14_1_EBX];
>> }
>> break;
>> }
>> @@ -6081,6 +6077,7 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool verbose)
>> CPUX86State *env = &cpu->env;
>> FeatureWord w;
>> const char *prefix = NULL;
>> + uint64_t host_feat;
>>
>> if (verbose) {
>> prefix = accel_uses_host_cpuid()
>> @@ -6089,8 +6086,7 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool verbose)
>> }
>>
>> for (w = 0; w < FEATURE_WORDS; w++) {
>> - uint64_t host_feat =
>> - x86_cpu_get_supported_feature_word(w, false);
>> + host_feat = x86_cpu_get_supported_feature_word(w, false);
>
> This doesn't look right. The value of host_feat set here is
> useless outside this for loop, and there's no reason to change
> the scope of this variable.
I just don't want to define host_feat twice in different sub-block.
I don't think there is any problem I move the define of host_feat out of
the loop. host_feat is just the variable to store the value when query
host/kvm supported value for specific feature word.
In the later code, I need to query the value for FEAT_14_0_ECX again.
>> uint64_t requested_features = env->features[w];
>> uint64_t unavailable_features;
>> if (w == FEAT_14_1_EAX) {
>> @@ -6107,32 +6103,22 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool verbose)
>> mark_unavailable_features(cpu, w, unavailable_features, prefix);
>> }
>>
>> - if ((env->features[FEAT_7_0_EBX] & CPUID_7_0_EBX_INTEL_PT) &&
>> - kvm_enabled()) {
>> - KVMState *s = CPU(cpu)->kvm_state;
>> - uint32_t eax_0 = kvm_arch_get_supported_cpuid(s, 0x14, 0, R_EAX);
>> - uint32_t ebx_0 = kvm_arch_get_supported_cpuid(s, 0x14, 0, R_EBX);
>> - uint32_t ecx_0 = kvm_arch_get_supported_cpuid(s, 0x14, 0, R_ECX);
>> - uint32_t eax_1 = kvm_arch_get_supported_cpuid(s, 0x14, 1, R_EAX);
>> - uint32_t ebx_1 = kvm_arch_get_supported_cpuid(s, 0x14, 1, R_EBX);
>> + if (env->features[FEAT_7_0_EBX] & CPUID_7_0_EBX_INTEL_PT) {
>> + host_feat = x86_cpu_get_supported_feature_word(FEAT_14_0_ECX, false);
>
> This doesn't look right. You seem to be using the same variable
> (host_feat) for two completely different purposes.
on the former, it's
host_feat = x86_cpu_get_supported_feature_word(w, false);
here, it's
host_feat = x86_cpu_get_supported_feature_word(FEAT_14_0_ECX, false);
why different?
>>
>> - if (!eax_0 ||
>> - ((ebx_0 & INTEL_PT_MINIMAL_EBX) != INTEL_PT_MINIMAL_EBX) ||
>> - ((ecx_0 & INTEL_PT_MINIMAL_ECX) != INTEL_PT_MINIMAL_ECX) ||
>> - ((eax_1 & INTEL_PT_MTC_BITMAP) != INTEL_PT_MTC_BITMAP) ||
>> - ((eax_1 & INTEL_PT_ADDR_RANGES_NUM_MASK) <
>> - INTEL_PT_ADDR_RANGES_NUM) ||
>> - ((ebx_1 & (INTEL_PT_PSB_BITMAP | INTEL_PT_CYCLE_BITMAP)) !=
>> - (INTEL_PT_PSB_BITMAP | INTEL_PT_CYCLE_BITMAP)) ||
>> - ((ecx_0 & CPUID_14_0_ECX_LIP) !=
>> - (env->features[FEAT_14_0_ECX] & CPUID_14_0_ECX_LIP))) {
>> - /*
>> - * Processor Trace capabilities aren't configurable, so if the
>> - * host can't emulate the capabilities we report on
>> - * cpu_x86_cpuid(), intel-pt can't be enabled on the current host.
>> - */
>> + if (env->use_default_intel_pt &&
>> + ((env->features[FEAT_14_0_EBX] != INTEL_PT_DEFAULT_0_EBX) ||
>> + ((env->features[FEAT_14_0_ECX] & ~CPUID_14_0_ECX_LIP) !=
>> + INTEL_PT_DEFAULT_0_ECX) ||
>> + (env->features[FEAT_14_1_EAX] != INTEL_PT_DEFAULT_1_EAX) ||
>> + (env->features[FEAT_14_1_EBX] != INTEL_PT_DEFAULT_1_EBX))) {
>> mark_unavailable_features(cpu, FEAT_7_0_EBX, CPUID_7_0_EBX_INTEL_PT, prefix);
>
> Why is use_default_intel_pt necessary? Why can't this function
> simply validate whatever is inside env->features[FEAT_14_*]?
Because this handling is only to keep it consistent with old QEMU. you
can see that it validate against INTEL_PT_DEFAULT_* , which is the PT
set of ICX. If remove use_default_intel_pt check, it will also apply the
check on other CPU models (e.g., in patch 4) that have different
INTEL_PT set.
>> }
>> +
>> + if ((env->features[FEAT_14_0_ECX] ^ host_feat) & CPUID_14_0_ECX_LIP) {
>
> What if CPUID_7_0_EBX_INTEL_PT is not set? What should be the
> value of host_feat?
Above all handling is executed under the condition of
if (env->features[FEAT_7_0_EBX] & CPUID_7_0_EBX_INTEL_PT) {
>> + warn_report("Cannot configure different Intel PT IP payload format than hardware");
>> + mark_unavailable_features(cpu, FEAT_7_0_EBX, CPUID_7_0_EBX_INTEL_PT, NULL);
>> + }
>> }
>> }
>>
>> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
>> index f5478a169f9a..e6236c371c4f 100644
>> --- a/target/i386/cpu.h
>> +++ b/target/i386/cpu.h
>> @@ -1639,6 +1639,7 @@ typedef struct CPUX86State {
>> uint32_t cpuid_vendor2;
>> uint32_t cpuid_vendor3;
>> uint32_t cpuid_version;
>> + bool use_default_intel_pt;
>> FeatureWordArray features;
>> /* Features that were explicitly enabled/disabled */
>> FeatureWordArray user_features;
>> --
>> 2.27.0
>>
>
next prev parent reply other threads:[~2021-10-17 10:38 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-09-09 14:41 [RFC PATCH 0/5] Make Intel PT configurable Xiaoyao Li
2021-09-09 14:41 ` [RFC PATCH 1/5] target/i386: Print CPUID subleaf info for unsupported feature Xiaoyao Li
2021-10-15 15:12 ` Eduardo Habkost
2021-09-09 14:41 ` [RFC PATCH 2/5] target/i386: Introduce FeatureWordInfo for Intel PT CPUID leaf 0xD Xiaoyao Li
2021-10-15 16:04 ` Eduardo Habkost
2021-10-17 7:53 ` Xiaoyao Li
2021-09-09 14:41 ` [RFC PATCH 3/5] target/i386: Enable host pass through of Intel PT Xiaoyao Li
2021-10-15 20:22 ` Eduardo Habkost
2021-10-17 10:37 ` Xiaoyao Li [this message]
2021-10-18 3:46 ` Xiaoyao Li
2021-10-18 5:37 ` Xiaoyao Li
2021-10-20 14:40 ` Eduardo Habkost
2021-09-09 14:41 ` [RFC PATCH 4/5] target/i386: Define specific PT feature set for IceLake-server and Snowridge Xiaoyao Li
2021-09-09 14:41 ` [RFC PATCH 5/5] target/i386: Access MSR_IA32_RTIT_ADDRn based on guest CPUID configuration Xiaoyao Li
2021-09-26 5:21 ` [RFC PATCH 0/5] Make Intel PT configurable Xiaoyao Li
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=01e4d8fe-5ac3-d855-447f-2f9f825dcde6@intel.com \
--to=xiaoyao.li@intel.com \
--cc=ehabkost@redhat.com \
--cc=mtosatti@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).