qemu-devel.nongnu.org archive mirror
From: Zhao Liu <zhao1.liu@linux.intel.com>
To: "Moger, Babu" <babu.moger@amd.com>
Cc: "Eduardo Habkost" <eduardo@habkost.net>,
	"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Yanan Wang" <wangyanan55@huawei.com>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	qemu-devel@nongnu.org, "Zhenyu Wang" <zhenyu.z.wang@intel.com>,
	"Xiaoyao Li" <xiaoyao.li@intel.com>,
	"Zhao Liu" <zhao1.liu@intel.com>
Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode CPUID[4]
Date: Fri, 4 Aug 2023 17:48:24 +0800
Message-ID: <ZMzJaElw/T5caQU+@liuzhao-OptiPlex-7080>
In-Reply-To: <3f7510f2-20f3-93df-72b3-01cfa687f554@amd.com>

Hi Babu,

On Thu, Aug 03, 2023 at 11:41:40AM -0500, Moger, Babu wrote:
> Date: Thu, 3 Aug 2023 11:41:40 -0500
> From: "Moger, Babu" <babu.moger@amd.com>
> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode
>  CPUID[4]
> 
> Hi Zhao,
> 
> On 8/2/23 18:49, Moger, Babu wrote:
> > Hi Zhao,
> > 
> > Hitting this error after this patch.
> > 
> > ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code should
> > not be reached
> > Bail out! ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code
> > should not be reached
> > Aborted (core dumped)
> > 
> > Looks like share_level for all the caches for AMD is not initialized.

I missed these changes when I rebased. Sorry about that.
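
To illustrate why a missing share_level ends in the abort above, here is
a minimal sketch. All names in it are hypothetical stand-ins, not the
QEMU definitions, and it assumes the "unknown" level is the zero value of
the enum. A member omitted from a designated initializer is
zero-initialized in C, so it falls through to the default branch:

```c
/* Hypothetical stand-ins for the QEMU types; only the shape matters. */
enum topo_level {
    TOPO_LEVEL_UNKNOWN = 0,   /* assumption: the zero value means "not set" */
    TOPO_LEVEL_CORE,
    TOPO_LEVEL_DIE,
    TOPO_LEVEL_PACKAGE,
};

struct cache_info {
    unsigned line_size;
    unsigned associativity;
    enum topo_level share_level;
};

/* Returns 1 if the level is one the encoder handles, 0 otherwise. */
static int share_level_is_handled(enum topo_level level)
{
    switch (level) {
    case TOPO_LEVEL_CORE:
    case TOPO_LEVEL_DIE:
    case TOPO_LEVEL_PACKAGE:
        return 1;
    default:
        /* The helper in the quoted patch calls g_assert_not_reached() here. */
        return 0;
    }
}

/* share_level is omitted, so C zero-initializes it. */
static const struct cache_info unset_cache = {
    .line_size = 64,
    .associativity = 8,
};

/* share_level is set explicitly, as in the fix quoted below. */
static const struct cache_info core_cache = {
    .line_size = 64,
    .associativity = 8,
    .share_level = TOPO_LEVEL_CORE,
};
```

In this sketch, share_level_is_handled(unset_cache.share_level) returns 0
and share_level_is_handled(core_cache.share_level) returns 1, which is
the difference between the abort and the fixed cache models.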

BTW, could I ask a question? From a previous discussion[1], I understand
that the cache info is used to show the correct cache information on
new machines. And from [2], wrong cache info may cause "compatibility
issues".

Are these "compatibility issues" AMD specific? I'm not sure whether Intel
should update the cache info in the same way. Thanks!

[1]: https://patchwork.kernel.org/project/kvm/patch/CY4PR12MB1768A3CBE42AAFB03CB1081E95AA0@CY4PR12MB1768.namprd12.prod.outlook.com/
[2]: https://lore.kernel.org/qemu-devel/20180510204148.11687-1-babu.moger@amd.com/

> 
> The following patch fixes the problem.
> 
> ======================================================
> 
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index f4c48e19fa..976a2755d8 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -528,6 +528,7 @@ static CPUCacheInfo legacy_l2_cache_cpuid2 = {
>      .size = 2 * MiB,
>      .line_size = 64,
>      .associativity = 8,
> +    .share_level = CPU_TOPO_LEVEL_CORE,

This "legacy_l2_cache_cpuid2" is not used to encode cache topology, so
I should explicitly set its default topo level to CPU_TOPO_LEVEL_UNKNOW.

>  };
> 
> 
> @@ -1904,6 +1905,7 @@ static CPUCaches epyc_v4_cache_info = {
>          .lines_per_tag = 1,
>          .self_init = 1,
>          .no_invd_sharing = true,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>      },
>      .l1i_cache = &(CPUCacheInfo) {
>          .type = INSTRUCTION_CACHE,
> @@ -1916,6 +1918,7 @@ static CPUCaches epyc_v4_cache_info = {
>          .lines_per_tag = 1,
>          .self_init = 1,
>          .no_invd_sharing = true,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>      },
>      .l2_cache = &(CPUCacheInfo) {
>          .type = UNIFIED_CACHE,
> @@ -1926,6 +1929,7 @@ static CPUCaches epyc_v4_cache_info = {
>          .partitions = 1,
>          .sets = 1024,
>          .lines_per_tag = 1,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>      },
>      .l3_cache = &(CPUCacheInfo) {
>          .type = UNIFIED_CACHE,
> @@ -1939,6 +1943,7 @@ static CPUCaches epyc_v4_cache_info = {
>          .self_init = true,
>          .inclusive = true,
>          .complex_indexing = false,
> +        .share_level = CPU_TOPO_LEVEL_DIE,
>      },
>  };
> 
> @@ -2008,6 +2013,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
>          .lines_per_tag = 1,
>          .self_init = 1,
>          .no_invd_sharing = true,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>      },
>      .l1i_cache = &(CPUCacheInfo) {
>          .type = INSTRUCTION_CACHE,
> @@ -2020,6 +2026,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
>          .lines_per_tag = 1,
>          .self_init = 1,
>          .no_invd_sharing = true,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>      },
>      .l2_cache = &(CPUCacheInfo) {
>          .type = UNIFIED_CACHE,
> @@ -2030,6 +2037,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
>          .partitions = 1,
>          .sets = 1024,
>          .lines_per_tag = 1,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>      },
>      .l3_cache = &(CPUCacheInfo) {
>          .type = UNIFIED_CACHE,
> @@ -2043,6 +2051,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
>          .self_init = true,
>          .inclusive = true,
>          .complex_indexing = false,
> +        .share_level = CPU_TOPO_LEVEL_DIE,
>      },
>  };
> 
> @@ -2112,6 +2121,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
>          .lines_per_tag = 1,
>          .self_init = 1,
>          .no_invd_sharing = true,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>      },
>      .l1i_cache = &(CPUCacheInfo) {
>          .type = INSTRUCTION_CACHE,
> @@ -2124,6 +2134,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
>          .lines_per_tag = 1,
>          .self_init = 1,
>          .no_invd_sharing = true,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>      },
>      .l2_cache = &(CPUCacheInfo) {
>          .type = UNIFIED_CACHE,
> @@ -2134,6 +2145,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
>          .partitions = 1,
>          .sets = 1024,
>          .lines_per_tag = 1,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>      },
>      .l3_cache = &(CPUCacheInfo) {
>          .type = UNIFIED_CACHE,
> @@ -2147,6 +2159,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
>          .self_init = true,
>          .inclusive = true,
>          .complex_indexing = false,
> +        .share_level = CPU_TOPO_LEVEL_DIE,
>      },
>  };
> 
> @@ -2162,6 +2175,7 @@ static const CPUCaches epyc_genoa_cache_info = {
>          .lines_per_tag = 1,
>          .self_init = 1,
>          .no_invd_sharing = true,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>      },
>      .l1i_cache = &(CPUCacheInfo) {
>          .type = INSTRUCTION_CACHE,
> @@ -2174,6 +2188,7 @@ static const CPUCaches epyc_genoa_cache_info = {
>          .lines_per_tag = 1,
>          .self_init = 1,
>          .no_invd_sharing = true,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>      },
>      .l2_cache = &(CPUCacheInfo) {
>          .type = UNIFIED_CACHE,
> @@ -2184,6 +2199,7 @@ static const CPUCaches epyc_genoa_cache_info = {
>          .partitions = 1,
>          .sets = 2048,
>          .lines_per_tag = 1,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>      },
>      .l3_cache = &(CPUCacheInfo) {
>          .type = UNIFIED_CACHE,
> @@ -2197,6 +2213,7 @@ static const CPUCaches epyc_genoa_cache_info = {
>          .self_init = true,
>          .inclusive = true,
>          .complex_indexing = false,
> +        .share_level = CPU_TOPO_LEVEL_DIE,
>      },
>  };
> 
> 
> =========================================================================


Looks good to me except for legacy_l2_cache_cpuid2, thanks very much!
I'll add this in the next version.
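
For reference, the CPUID[4].EAX topology fields that the quoted patch
encodes can be sketched as follows. This is a simplified illustration,
not the QEMU code: struct topo_offsets, enum share_level and the function
names are hypothetical, and the field masks are added here only to keep
the sketch self-contained.

```c
#include <stdint.h>

/*
 * Hypothetical, simplified topology description: the bit offset of each
 * level inside the APIC ID. This is not the X86CPUTopoInfo layout.
 */
struct topo_offsets {
    unsigned core_offset;  /* number of SMT bits */
    unsigned die_offset;   /* SMT bits + core bits */
    unsigned pkg_offset;   /* SMT bits + core bits + die bits */
};

enum share_level { SHARE_CORE, SHARE_DIE, SHARE_PACKAGE };

/* Value for EAX[25:14]: addressable IDs sharing the cache, minus one. */
static uint32_t max_ids_for_cache(const struct topo_offsets *t,
                                  enum share_level level)
{
    unsigned offset = t->pkg_offset;

    if (level == SHARE_CORE) {
        offset = t->core_offset;
    } else if (level == SHARE_DIE) {
        offset = t->die_offset;
    }
    return (UINT32_C(1) << offset) - 1;
}

/* Value for EAX[31:26]: addressable core IDs in the package, minus one. */
static uint32_t max_core_ids_in_pkg(const struct topo_offsets *t)
{
    return (UINT32_C(1) << (t->pkg_offset - t->core_offset)) - 1;
}

/* Combine both fields into their positions in EAX. */
static uint32_t encode_eax_topo_bits(const struct topo_offsets *t,
                                     enum share_level level)
{
    return ((max_core_ids_in_pkg(t) & 0x3F) << 26) |
           ((max_ids_for_cache(t, level) & 0xFFF) << 14);
}
```

As a worked case, assume 2 threads per core, 4 cores per die and 1 die
per package, i.e. core_offset = 1, die_offset = 3, pkg_offset = 3. A
core-level cache then gets EAX[25:14] = 1 and a die-level cache gets
EAX[25:14] = 7, while EAX[31:26] = 3 in both cases, so the combined bits
are 0x0C004000 and 0x0C01C000 respectively.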

-Zhao

> 
> Thanks
> Babu
> > 
> > On 8/1/23 05:35, Zhao Liu wrote:
> >> From: Zhao Liu <zhao1.liu@intel.com>
> >>
> >> CPUID[4].EAX[bits 25:14] is used to represent the cache topology for
> >> Intel CPUs.
> >>
> >> After cache models have topology information, we can use
> >> CPUCacheInfo.share_level to decide which topology level to be encoded
> >> into CPUID[4].EAX[bits 25:14].
> >>
> >> And since maximum_processor_id (originally "num_apic_ids") is derived
> >> from the CPU topology levels, which are verified when parsing smp, there's
> >> no need to check this value with "assert(num_apic_ids > 0)" again, so
> >> remove this assert.
> >>
> >> Additionally, wrap the encoding of CPUID[4].EAX[bits 31:26] into a
> >> helper to make the code cleaner.
> >>
> >> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> >> ---
> >> Changes since v1:
> >>  * Use "enum CPUTopoLevel share_level" as the parameter in
> >>    max_processor_ids_for_cache().
> >>  * Make cache_info_passthrough case also use
> >>    max_processor_ids_for_cache() and max_core_ids_in_package() to
> >>    encode CPUID[4]. (Yanan)
> >>  * Rename the title of this patch (the original is "i386: Use
> >>    CPUCacheInfo.share_level to encode CPUID[4].EAX[bits 25:14]").
> >> ---
> >>  target/i386/cpu.c | 70 +++++++++++++++++++++++++++++------------------
> >>  1 file changed, 43 insertions(+), 27 deletions(-)
> >>
> >> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> >> index 55aba4889628..c9897c0fe91a 100644
> >> --- a/target/i386/cpu.c
> >> +++ b/target/i386/cpu.c
> >> @@ -234,22 +234,53 @@ static uint8_t cpuid2_cache_descriptor(CPUCacheInfo *cache)
> >>                         ((t) == UNIFIED_CACHE) ? CACHE_TYPE_UNIFIED : \
> >>                         0 /* Invalid value */)
> >>  
> >> +static uint32_t max_processor_ids_for_cache(X86CPUTopoInfo *topo_info,
> >> +                                            enum CPUTopoLevel share_level)
> >> +{
> >> +    uint32_t num_ids = 0;
> >> +
> >> +    switch (share_level) {
> >> +    case CPU_TOPO_LEVEL_CORE:
> >> +        num_ids = 1 << apicid_core_offset(topo_info);
> >> +        break;
> >> +    case CPU_TOPO_LEVEL_DIE:
> >> +        num_ids = 1 << apicid_die_offset(topo_info);
> >> +        break;
> >> +    case CPU_TOPO_LEVEL_PACKAGE:
> >> +        num_ids = 1 << apicid_pkg_offset(topo_info);
> >> +        break;
> >> +    default:
> >> +        /*
> >> +         * Currently there is no use case for SMT and MODULE, so use
> >> +         * assert directly to facilitate debugging.
> >> +         */
> >> +        g_assert_not_reached();
> >> +    }
> >> +
> >> +    return num_ids - 1;
> >> +}
> >> +
> >> +static uint32_t max_core_ids_in_package(X86CPUTopoInfo *topo_info)
> >> +{
> >> +    uint32_t num_cores = 1 << (apicid_pkg_offset(topo_info) -
> >> +                               apicid_core_offset(topo_info));
> >> +    return num_cores - 1;
> >> +}
> >>  
> >>  /* Encode cache info for CPUID[4] */
> >>  static void encode_cache_cpuid4(CPUCacheInfo *cache,
> >> -                                int num_apic_ids, int num_cores,
> >> +                                X86CPUTopoInfo *topo_info,
> >>                                  uint32_t *eax, uint32_t *ebx,
> >>                                  uint32_t *ecx, uint32_t *edx)
> >>  {
> >>      assert(cache->size == cache->line_size * cache->associativity *
> >>                            cache->partitions * cache->sets);
> >>  
> >> -    assert(num_apic_ids > 0);
> >>      *eax = CACHE_TYPE(cache->type) |
> >>             CACHE_LEVEL(cache->level) |
> >>             (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0) |
> >> -           ((num_cores - 1) << 26) |
> >> -           ((num_apic_ids - 1) << 14);
> >> +           (max_core_ids_in_package(topo_info) << 26) |
> >> +           (max_processor_ids_for_cache(topo_info, cache->share_level) << 14);
> >>  
> >>      assert(cache->line_size > 0);
> >>      assert(cache->partitions > 0);
> >> @@ -6116,56 +6147,41 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >>                  int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
> >>  
> >>                  if (cores_per_pkg > 1) {
> >> -                    int addressable_cores_offset =
> >> -                                                apicid_pkg_offset(&topo_info) -
> >> -                                                apicid_core_offset(&topo_info);
> >> -
> >>                      *eax &= ~0xFC000000;
> >> -                    *eax |= (1 << addressable_cores_offset - 1) << 26;
> >> +                    *eax |= max_core_ids_in_package(&topo_info) << 26;
> >>                  }
> >>                  if (host_vcpus_per_cache > cpus_per_pkg) {
> >> -                    int pkg_offset = apicid_pkg_offset(&topo_info);
> >> -
> >>                      *eax &= ~0x3FFC000;
> >> -                    *eax |= (1 << pkg_offset - 1) << 14;
> >> +                    *eax |=
> >> +                        max_processor_ids_for_cache(&topo_info,
> >> +                                                CPU_TOPO_LEVEL_PACKAGE) << 14;
> >>                  }
> >>              }
> >>          } else if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
> >>              *eax = *ebx = *ecx = *edx = 0;
> >>          } else {
> >>              *eax = 0;
> >> -            int addressable_cores_offset = apicid_pkg_offset(&topo_info) -
> >> -                                           apicid_core_offset(&topo_info);
> >> -            int core_offset, die_offset;
> >>  
> >>              switch (count) {
> >>              case 0: /* L1 dcache info */
> >> -                core_offset = apicid_core_offset(&topo_info);
> >>                  encode_cache_cpuid4(env->cache_info_cpuid4.l1d_cache,
> >> -                                    (1 << core_offset),
> >> -                                    (1 << addressable_cores_offset),
> >> +                                    &topo_info,
> >>                                      eax, ebx, ecx, edx);
> >>                  break;
> >>              case 1: /* L1 icache info */
> >> -                core_offset = apicid_core_offset(&topo_info);
> >>                  encode_cache_cpuid4(env->cache_info_cpuid4.l1i_cache,
> >> -                                    (1 << core_offset),
> >> -                                    (1 << addressable_cores_offset),
> >> +                                    &topo_info,
> >>                                      eax, ebx, ecx, edx);
> >>                  break;
> >>              case 2: /* L2 cache info */
> >> -                core_offset = apicid_core_offset(&topo_info);
> >>                  encode_cache_cpuid4(env->cache_info_cpuid4.l2_cache,
> >> -                                    (1 << core_offset),
> >> -                                    (1 << addressable_cores_offset),
> >> +                                    &topo_info,
> >>                                      eax, ebx, ecx, edx);
> >>                  break;
> >>              case 3: /* L3 cache info */
> >> -                die_offset = apicid_die_offset(&topo_info);
> >>                  if (cpu->enable_l3_cache) {
> >>                      encode_cache_cpuid4(env->cache_info_cpuid4.l3_cache,
> >> -                                        (1 << die_offset),
> >> -                                        (1 << addressable_cores_offset),
> >> +                                        &topo_info,
> >>                                          eax, ebx, ecx, edx);
> >>                      break;
> >>                  }
> > 
> 
> -- 
> Thanks
> Babu Moger

