qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Igor Mammedov <imammedo@redhat.com>
To: Philipp Eppelt <1871842@bugs.launchpad.net>
Cc: qemu-devel@nongnu.org
Subject: Re: [Bug 1871842] [NEW] AMD CPUID leaf 0x8000'0008 reported number of cores inconsistent with ACPI.MADT
Date: Wed, 15 Apr 2020 11:25:46 +0200	[thread overview]
Message-ID: <20200415112546.68ca1f15@redhat.com> (raw)
In-Reply-To: <42f0624d-b0fb-5d96-2921-8994c28b9937@kernkonzept.com>

On Tue, 14 Apr 2020 13:27:34 -0000
Philipp Eppelt <1871842@bugs.launchpad.net> wrote:

> Hi,
> 
> I have to clarify some things mentioned in my last post:
> 
> I only tested the change with an emulated EPYC-v2 CPU, I cannot test on
> a physical EPYC CPU at the moment. However, I doubt that the results
> will be different regarding the 0x8000_0008.ECX result.
> 
> The topology information printed is from the EPYC-v2 CPU model. I try to
> get access to the machine and have a look if -cpu host affects this
> topology.
> 
> So there is still the open question for the -enable-kvm -cpu host -smp 4
> case. Shouldn't in this case the topology of the host CPU be reported?
topology was never affected by the choice of -cpu, it's up to users to
define it using -smp the way they prefer.
 

> In all emulated-CPU cases it's on the user to define the topology or to
> live with the generated one (although I think preferring multi-socket
> systems is outdated, but it's likely just the case in my 'world').
> 
> 
> Cheers,
> Philipp
> 
> 
> On 4/14/20 10:24 AM, Philipp Eppelt wrote:
> > Hi,
> > 
> > thanks for looking into this so quickly.
> > 
> > With this patch applied ontop of git commit
> > f3bac27cc1e303e1860cc55b9b6889ba39dee587 I still have the issue and it
> > reports the same numbers. I like the new usage of the ApicIdSize field.
> > 
> > 
> > I looked into the mentioned pc_smp_parse() and had it print the topology
> > for -smp 4:
> > 
> > qemu-system-x86_64: warning: cpu topology: sockets (4) , dies (1) ,
> > cores (1) , threads (1) , maxcpus (4), cpus (4)
> > 
> > and with -smp 4,cores=4:
> > 
> > qemu-system-x86_64: warning: cpu topology: sockets (1) , dies (1) ,
> > cores (4) , threads (1) , maxcpus (4), cpus (4)
> > 
> > As far as I understand it, these are the numbers the cpuid:8000'0008
> > code relies on:
> > `cs->nr_cores`, `cs->nr_threads` with `cs` being of type CPUState.
> > 
> > So I think the issue is rooted with the preferring sockets over cores
> > when the -smp cmdline option is parsed, as stated in hw/i386/pc.c:729.
> > 
> > I guess this is the same code for Intel and AMD CPUs alike and this
> > issue just didn't surface for us on Intel CPUs, as they don't have this
> > CPUID leaf and we don't look at the topology.
> > 
> > This seems to boil down to a more careful use of the -smp option on my end.
> > 
> > Thanks again for looking into this.
> > 
> > Cheers,
> > Philipp
> > 
> > 
> > 
> > On 4/10/20 2:12 AM, Babu Moger wrote:  
> >> Philipp,
> >>   Can you please check if this patch works for you.
> >>
> >> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> >> index 90ffc5f..e467fee 100644
> >> --- a/target/i386/cpu.c
> >> +++ b/target/i386/cpu.c
> >> @@ -5831,10 +5831,17 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t
> >> index, uint32_t count,
> >>          }
> >>          *ebx = env->features[FEAT_8000_0008_EBX];
> >>          *ecx = 0;
> >> -        *edx = 0;
> >>          if (cs->nr_cores * cs->nr_threads > 1) {
> >> -            *ecx |= (cs->nr_cores * cs->nr_threads) - 1;
> >> +            unsigned long max_apicids, bits_required;
> >> +
> >> +            max_apicids = (cs->nr_cores * cs->nr_threads) - 1;
> >> +            if (max_apicids) {
> >> +                /* Find out the number of bits to represent all the
> >> apicids */
> >> +                bits_required = find_last_bit(&max_apicids,
> >> BITS_PER_BYTE) + 1;
> >> +                *ecx |= bits_required << 12 | max_apicids;
> >> +            }
> >>          }
> >> +        *edx = 0;
> >>          break;
> >>      case 0x8000000A:
> >>          if (env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_SVM) {
> >>
> >>
> >> On 4/9/20 9:00 AM, Igor Mammedov wrote:  
> >>> On Thu, 09 Apr 2020 12:58:11 -0000
> >>> Philipp Eppelt <1871842@bugs.launchpad.net> wrote:
> >>>  
> >>>> Public bug reported:
> >>>>
> >>>> Setup:
> >>>> CPU: AMD EPYC-v2 or host's EPYC cpu
> >>>> Linux 64-bit fedora host; Kernel version 5.5.15-200.fc31
> >>>> qemu version: self build
> >>>> git-head: f3bac27cc1e303e1860cc55b9b6889ba39dee587
> >>>> config: Configured with: '../configure' '--target-list=x86_64-softmmu,mips64el-softmmu,mips64-softmmu,mipsel-softmmu,mips-softmmu,i386-softmmu,aarch64-softmmu,arm-softmmu' '--prefix=/opt/qemu-master'
> >>>>
> >>>> Cmdline: 
> >>>> qemu-system-x86_64 -kernel /home/peppelt/code/l4/internal/.build-x86_64/bin/amd64_gen/bootstrap -append "" -initrd "./fiasco/.build-x86_64/fiasco , ... " -serial stdio -nographic -monitor none -nographic -monitor none -cpu EPYC-v2 -m 4G -smp 4 
> >>>>
> >>>> Issue:
> >>>> We are developing an microkernel operating system called L4Re. We recently got an AMD EPYC server for testing and we couldn't execute SMP tests of our system when running Linux + qemu + VM w/ L4Re.
> >>>> In fact, the kernel did not recognize any APs at all. On AMD CPUs the kernel checks for the number of cores reported in CPUID leaf 0x8000_0008.ECX[NC] or [ApicIdSize].  [0][1]
> >>>>
> >>>> The physical machine reports for leaf 0x8000_0008:  EAX: 0x3030 EBX: 0x18cf757 ECX: 0x703f EDX: 0x1000
> >>>> The lower four bits of ECX are the [NC] field and all set.
> >>>>
> >>>> When querying inside qemu with -enable-kvm -cpu host -smp 4 (basically as replacement and addition to the above cmdline) the CPUID leaf shows: EAX: 0x3024, EBX: 0x1001000, ECX: 0x0, EDX: 0x0
> >>>> Note, ECX is zero. Indicating that this is no SMP capabale CPU.
> >>>>
> >>>> I'm debugging it using my local machine and the QEMU provided EPYC-v2
> >>>> CPU model and it is reproducible there as well and reports:  EAX:
> >>>> 0x3028, EBX: 0x0, ECX: 0x0, EDX: 0x0
> >>>>
> >>>> I checked other AMD based CPU models (phenom, opteron_g3/g5) and they behave the same. [2] shows the CPUID 0x8000'0008 handling in the QEMU source.
> >>>> I believe that behavior here is wrong as ECX[NC] should report the number of cores per processor, as stated in the AMD manual [2] p.584. In my understanding -smp 4 should then lead to ECX[NC] = 0x3.
> >>>>
> >>>> The following table shows my findings with the -smp option:
> >>>> Option | Qemu guest observed ECX value
> >>>> -smp 4 | 0x0
> >>>> -smp 4,cores=4  | 0x3
> >>>> -smp 4,cores=2,thread=2 | 0x3
> >>>> -smp 4,cores=4,threads=2 | QEMU boot error: topology false.
> >>>>
> >>>> Now, I'm asking myself how the terminology of the AMD manual maps to QEMU's -smp option.
> >>>> Obviously, nr_cores and nr_threads correspond to the cores and threads options on the cmdline and cores * threads <= 4 (in this example), but what corresponds the X in -smp X to?  
> >>> I'd say X corresponds to number of logical CPUs.
> >>> Depending on presence of other options these are distributed among them in magical manner
> >>> (see pc_smp_parse() for starters)
> >>>  
> >>>> Querying 0x8000'0008 on the physical processor results in different
> >>>> reports than quering QEMU's model as does it with -enable-kvm -cpu host.
> >>>>
> >>>> Furthermore, the ACPI.MADT shows 4 local APICs to be present while the
> >>>> CPU leave reports a single core processor.  
> >>> it matches -smp X as it should be.
> >>>  
> >>>>
> >>>> This leads me to the conclusion that CPUID 0x8000'0008.ECX reports the
> >>>> wrong number.  
> >>> CCed author of recent epyc patches who might know how AMD should work better than me.
> >>>  
> >>>>
> >>>> Please let me know, if you need more information from my side.
> >>>>
> >>>>
> >>>> [0] https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fkernkonzept%2Ffiasco%2Fblob%2F522ccc5f29ab120213cf02d71328e2b879cbbd19%2Fsrc%2Fkern%2Fia32%2Fkernel_thread-ia32.cpp%23L109&amp;data=02%7C01%7Cbabu.moger%40amd.com%7C57569f7959744399655b08d7dc8e6e24%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637220379083511672&amp;sdata=hcFJzLAVQoIh5IN9CP%2F9cUQNOZoBnpRA6FliJur1wzQ%3D&amp;reserved=0
> >>>> [1] https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fkernkonzept%2Ffiasco%2Fblob%2F522ccc5f29ab120213cf02d71328e2b879cbbd19%2Fsrc%2Fkern%2Fia32%2Fcpu-ia32.cpp%23L1120&amp;data=02%7C01%7Cbabu.moger%40amd.com%7C57569f7959744399655b08d7dc8e6e24%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637220379083511672&amp;sdata=ANJIbYKbwfq2bDelH%2FRLKnDPIUZc1BwxHspmgxLU7gs%3D&amp;reserved=0
> >>>> [2] https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fqemu%2Fqemu%2Fblob%2Ff2a8261110c32c4dccd84e774d8dd7a0524e00fb%2Ftarget%2Fi386%2Fcpu.c%23L5835&amp;data=02%7C01%7Cbabu.moger%40amd.com%7C57569f7959744399655b08d7dc8e6e24%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637220379083511672&amp;sdata=oj3mv9e5YOzUsfUjXK44gC8LybyWgMKo8JBIrRR%2BmDA%3D&amp;reserved=0
> >>>> [3] https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.amd.com%2Fsystem%2Ffiles%2FTechDocs%2F24594.pdf&amp;data=02%7C01%7Cbabu.moger%40amd.com%7C57569f7959744399655b08d7dc8e6e24%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637220379083511672&amp;sdata=7Yr3J9ihlqSqXCXKN5JJNTByO3NGI%2BGMz2EqBF2Y4hw%3D&amp;reserved=0
> >>>>
> >>>> ** Affects: qemu
> >>>>      Importance: Undecided
> >>>>          Status: New
> >>>>  
> >>>  
> >>  
> >   
> 



WARNING: multiple messages have this Message-ID (diff)
From: Igor <imammedo@redhat.com>
To: qemu-devel@nongnu.org
Subject: Re: [Bug 1871842] [NEW] AMD CPUID leaf 0x8000'0008 reported number of cores inconsistent with ACPI.MADT
Date: Wed, 15 Apr 2020 09:25:46 -0000	[thread overview]
Message-ID: <20200415112546.68ca1f15@redhat.com> (raw)
Message-ID: <20200415092546.qLo48ShNNVoTkN0Sc9AjAxxentMgVl371N8JftoTTq8@z> (raw)
In-Reply-To: 42f0624d-b0fb-5d96-2921-8994c28b9937@kernkonzept.com

On Tue, 14 Apr 2020 13:27:34 -0000
Philipp Eppelt <1871842@bugs.launchpad.net> wrote:

> Hi,
> 
> I have to clarify some things mentioned in my last post:
> 
> I only tested the change with an emulated EPYC-v2 CPU, I cannot test on
> a physical EPYC CPU at the moment. However, I doubt that the results
> will be different regarding the 0x8000_0008.ECX result.
> 
> The topology information printed is from the EPYC-v2 CPU model. I try to
> get access to the machine and have a look if -cpu host affects this
> topology.
> 
> So there is still the open question for the -enable-kvm -cpu host -smp 4
> case. Shouldn't in this case the topology of the host CPU be reported?
topology was never affected by the choice of -cpu, it's up to users to
define it using -smp the way they prefer.
 

> In all emulated-CPU cases it's on the user to define the topology or to
> live with the generated one (although I think preferring multi-socket
> systems is outdated, but it's likely just the case in my 'world').
> 
> 
> Cheers,
> Philipp
> 
> 
> On 4/14/20 10:24 AM, Philipp Eppelt wrote:
> > Hi,
> > 
> > thanks for looking into this so quickly.
> > 
> > With this patch applied ontop of git commit
> > f3bac27cc1e303e1860cc55b9b6889ba39dee587 I still have the issue and it
> > reports the same numbers. I like the new usage of the ApicIdSize field.
> > 
> > 
> > I looked into the mentioned pc_smp_parse() and had it print the topology
> > for -smp 4:
> > 
> > qemu-system-x86_64: warning: cpu topology: sockets (4) , dies (1) ,
> > cores (1) , threads (1) , maxcpus (4), cpus (4)
> > 
> > and with -smp 4,cores=4:
> > 
> > qemu-system-x86_64: warning: cpu topology: sockets (1) , dies (1) ,
> > cores (4) , threads (1) , maxcpus (4), cpus (4)
> > 
> > As far as I understand it, these are the numbers the cpuid:8000'0008
> > code relies on:
> > `cs->nr_cores`, `cs->nr_threads` with `cs` being of type CPUState.
> > 
> > So I think the issue is rooted with the preferring sockets over cores
> > when the -smp cmdline option is parsed, as stated in hw/i386/pc.c:729.
> > 
> > I guess this is the same code for Intel and AMD CPUs alike and this
> > issue just didn't surface for us on Intel CPUs, as they don't have this
> > CPUID leaf and we don't look at the topology.
> > 
> > This seems to boil down to a more careful use of the -smp option on my end.
> > 
> > Thanks again for looking into this.
> > 
> > Cheers,
> > Philipp
> > 
> > 
> > 
> > On 4/10/20 2:12 AM, Babu Moger wrote:  
> >> Philipp,
> >>   Can you please check if this patch works for you.
> >>
> >> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> >> index 90ffc5f..e467fee 100644
> >> --- a/target/i386/cpu.c
> >> +++ b/target/i386/cpu.c
> >> @@ -5831,10 +5831,17 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t
> >> index, uint32_t count,
> >>          }
> >>          *ebx = env->features[FEAT_8000_0008_EBX];
> >>          *ecx = 0;
> >> -        *edx = 0;
> >>          if (cs->nr_cores * cs->nr_threads > 1) {
> >> -            *ecx |= (cs->nr_cores * cs->nr_threads) - 1;
> >> +            unsigned long max_apicids, bits_required;
> >> +
> >> +            max_apicids = (cs->nr_cores * cs->nr_threads) - 1;
> >> +            if (max_apicids) {
> >> +                /* Find out the number of bits to represent all the
> >> apicids */
> >> +                bits_required = find_last_bit(&max_apicids,
> >> BITS_PER_BYTE) + 1;
> >> +                *ecx |= bits_required << 12 | max_apicids;
> >> +            }
> >>          }
> >> +        *edx = 0;
> >>          break;
> >>      case 0x8000000A:
> >>          if (env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_SVM) {
> >>
> >>
> >> On 4/9/20 9:00 AM, Igor Mammedov wrote:  
> >>> On Thu, 09 Apr 2020 12:58:11 -0000
> >>> Philipp Eppelt <1871842@bugs.launchpad.net> wrote:
> >>>  
> >>>> Public bug reported:
> >>>>
> >>>> Setup:
> >>>> CPU: AMD EPYC-v2 or host's EPYC cpu
> >>>> Linux 64-bit fedora host; Kernel version 5.5.15-200.fc31
> >>>> qemu version: self build
> >>>> git-head: f3bac27cc1e303e1860cc55b9b6889ba39dee587
> >>>> config: Configured with: '../configure' '--target-list=x86_64-softmmu,mips64el-softmmu,mips64-softmmu,mipsel-softmmu,mips-softmmu,i386-softmmu,aarch64-softmmu,arm-softmmu' '--prefix=/opt/qemu-master'
> >>>>
> >>>> Cmdline: 
> >>>> qemu-system-x86_64 -kernel /home/peppelt/code/l4/internal/.build-x86_64/bin/amd64_gen/bootstrap -append "" -initrd "./fiasco/.build-x86_64/fiasco , ... " -serial stdio -nographic -monitor none -nographic -monitor none -cpu EPYC-v2 -m 4G -smp 4 
> >>>>
> >>>> Issue:
> >>>> We are developing an microkernel operating system called L4Re. We recently got an AMD EPYC server for testing and we couldn't execute SMP tests of our system when running Linux + qemu + VM w/ L4Re.
> >>>> In fact, the kernel did not recognize any APs at all. On AMD CPUs the kernel checks for the number of cores reported in CPUID leaf 0x8000_0008.ECX[NC] or [ApicIdSize].  [0][1]
> >>>>
> >>>> The physical machine reports for leaf 0x8000_0008:  EAX: 0x3030 EBX: 0x18cf757 ECX: 0x703f EDX: 0x1000
> >>>> The lower four bits of ECX are the [NC] field and all set.
> >>>>
> >>>> When querying inside qemu with -enable-kvm -cpu host -smp 4 (basically as replacement and addition to the above cmdline) the CPUID leaf shows: EAX: 0x3024, EBX: 0x1001000, ECX: 0x0, EDX: 0x0
> >>>> Note, ECX is zero. Indicating that this is no SMP capabale CPU.
> >>>>
> >>>> I'm debugging it using my local machine and the QEMU provided EPYC-v2
> >>>> CPU model and it is reproducible there as well and reports:  EAX:
> >>>> 0x3028, EBX: 0x0, ECX: 0x0, EDX: 0x0
> >>>>
> >>>> I checked other AMD based CPU models (phenom, opteron_g3/g5) and they behave the same. [2] shows the CPUID 0x8000'0008 handling in the QEMU source.
> >>>> I believe that behavior here is wrong as ECX[NC] should report the number of cores per processor, as stated in the AMD manual [2] p.584. In my understanding -smp 4 should then lead to ECX[NC] = 0x3.
> >>>>
> >>>> The following table shows my findings with the -smp option:
> >>>> Option | Qemu guest observed ECX value
> >>>> -smp 4 | 0x0
> >>>> -smp 4,cores=4  | 0x3
> >>>> -smp 4,cores=2,thread=2 | 0x3
> >>>> -smp 4,cores=4,threads=2 | QEMU boot error: topology false.
> >>>>
> >>>> Now, I'm asking myself how the terminology of the AMD manual maps to QEMU's -smp option.
> >>>> Obviously, nr_cores and nr_threads correspond to the cores and threads options on the cmdline and cores * threads <= 4 (in this example), but what corresponds the X in -smp X to?  
> >>> I'd say X corresponds to number of logical CPUs.
> >>> Depending on presence of other options these are distributed among them in magical manner
> >>> (see pc_smp_parse() for starters)
> >>>  
> >>>> Querying 0x8000'0008 on the physical processor results in different
> >>>> reports than quering QEMU's model as does it with -enable-kvm -cpu host.
> >>>>
> >>>> Furthermore, the ACPI.MADT shows 4 local APICs to be present while the
> >>>> CPU leave reports a single core processor.  
> >>> it matches -smp X as it should be.
> >>>  
> >>>>
> >>>> This leads me to the conclusion that CPUID 0x8000'0008.ECX reports the
> >>>> wrong number.  
> >>> CCed author of recent epyc patches who might know how AMD should work better than me.
> >>>  
> >>>>
> >>>> Please let me know, if you need more information from my side.
> >>>>
> >>>>
> >>>> [0] https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fkernkonzept%2Ffiasco%2Fblob%2F522ccc5f29ab120213cf02d71328e2b879cbbd19%2Fsrc%2Fkern%2Fia32%2Fkernel_thread-ia32.cpp%23L109&amp;data=02%7C01%7Cbabu.moger%40amd.com%7C57569f7959744399655b08d7dc8e6e24%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637220379083511672&amp;sdata=hcFJzLAVQoIh5IN9CP%2F9cUQNOZoBnpRA6FliJur1wzQ%3D&amp;reserved=0
> >>>> [1] https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fkernkonzept%2Ffiasco%2Fblob%2F522ccc5f29ab120213cf02d71328e2b879cbbd19%2Fsrc%2Fkern%2Fia32%2Fcpu-ia32.cpp%23L1120&amp;data=02%7C01%7Cbabu.moger%40amd.com%7C57569f7959744399655b08d7dc8e6e24%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637220379083511672&amp;sdata=ANJIbYKbwfq2bDelH%2FRLKnDPIUZc1BwxHspmgxLU7gs%3D&amp;reserved=0
> >>>> [2] https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fqemu%2Fqemu%2Fblob%2Ff2a8261110c32c4dccd84e774d8dd7a0524e00fb%2Ftarget%2Fi386%2Fcpu.c%23L5835&amp;data=02%7C01%7Cbabu.moger%40amd.com%7C57569f7959744399655b08d7dc8e6e24%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637220379083511672&amp;sdata=oj3mv9e5YOzUsfUjXK44gC8LybyWgMKo8JBIrRR%2BmDA%3D&amp;reserved=0
> >>>> [3] https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.amd.com%2Fsystem%2Ffiles%2FTechDocs%2F24594.pdf&amp;data=02%7C01%7Cbabu.moger%40amd.com%7C57569f7959744399655b08d7dc8e6e24%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637220379083511672&amp;sdata=7Yr3J9ihlqSqXCXKN5JJNTByO3NGI%2BGMz2EqBF2Y4hw%3D&amp;reserved=0
> >>>>
> >>>> ** Affects: qemu
> >>>>      Importance: Undecided
> >>>>          Status: New
> >>>>  
> >>>  
> >>  
> >   
>

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1871842

Title:
  AMD CPUID leaf 0x8000'0008 reported number of cores  inconsistent with
  ACPI.MADT

Status in QEMU:
  New

Bug description:
  Setup:
  CPU: AMD EPYC-v2 or host's EPYC cpu
  Linux 64-bit fedora host; Kernel version 5.5.15-200.fc31
  qemu version: self build
  git-head: f3bac27cc1e303e1860cc55b9b6889ba39dee587
  config: Configured with: '../configure' '--target-list=x86_64-softmmu,mips64el-softmmu,mips64-softmmu,mipsel-softmmu,mips-softmmu,i386-softmmu,aarch64-softmmu,arm-softmmu' '--prefix=/opt/qemu-master'

  Cmdline: 
  qemu-system-x86_64 -kernel /home/peppelt/code/l4/internal/.build-x86_64/bin/amd64_gen/bootstrap -append "" -initrd "./fiasco/.build-x86_64/fiasco , ... " -serial stdio -nographic -monitor none -nographic -monitor none -cpu EPYC-v2 -m 4G -smp 4 

  Issue:
  We are developing an microkernel operating system called L4Re. We recently got an AMD EPYC server for testing and we couldn't execute SMP tests of our system when running Linux + qemu + VM w/ L4Re.
  In fact, the kernel did not recognize any APs at all. On AMD CPUs the kernel checks for the number of cores reported in CPUID leaf 0x8000_0008.ECX[NC] or [ApicIdSize].  [0][1]

  The physical machine reports for leaf 0x8000_0008:  EAX: 0x3030 EBX: 0x18cf757 ECX: 0x703f EDX: 0x1000
  The lower four bits of ECX are the [NC] field and all set.

  When querying inside qemu with -enable-kvm -cpu host -smp 4 (basically as replacement and addition to the above cmdline) the CPUID leaf shows: EAX: 0x3024, EBX: 0x1001000, ECX: 0x0, EDX: 0x0
  Note, ECX is zero. Indicating that this is no SMP capabale CPU.

  I'm debugging it using my local machine and the QEMU provided EPYC-v2
  CPU model and it is reproducible there as well and reports:  EAX:
  0x3028, EBX: 0x0, ECX: 0x0, EDX: 0x0

  I checked other AMD based CPU models (phenom, opteron_g3/g5) and they behave the same. [2] shows the CPUID 0x8000'0008 handling in the QEMU source.
  I believe that behavior here is wrong as ECX[NC] should report the number of cores per processor, as stated in the AMD manual [2] p.584. In my understanding -smp 4 should then lead to ECX[NC] = 0x3.

  The following table shows my findings with the -smp option:
  Option | Qemu guest observed ECX value
  -smp 4 | 0x0
  -smp 4,cores=4  | 0x3
  -smp 4,cores=2,thread=2 | 0x3
  -smp 4,cores=4,threads=2 | QEMU boot error: topology false.

  Now, I'm asking myself how the terminology of the AMD manual maps to QEMU's -smp option.
  Obviously, nr_cores and nr_threads correspond to the cores and threads options on the cmdline and cores * threads <= 4 (in this example), but what corresponds the X in -smp X to?

  Querying 0x8000'0008 on the physical processor results in different
  reports than quering QEMU's model as does it with -enable-kvm -cpu
  host.

  Furthermore, the ACPI.MADT shows 4 local APICs to be present while the
  CPU leave reports a single core processor.

  This leads me to the conclusion that CPUID 0x8000'0008.ECX reports the
  wrong number.

  
  Please let me know, if you need more information from my side.

  
  [0] https://github.com/kernkonzept/fiasco/blob/522ccc5f29ab120213cf02d71328e2b879cbbd19/src/kern/ia32/kernel_thread-ia32.cpp#L109
  [1] https://github.com/kernkonzept/fiasco/blob/522ccc5f29ab120213cf02d71328e2b879cbbd19/src/kern/ia32/cpu-ia32.cpp#L1120
  [2] https://github.com/qemu/qemu/blob/f2a8261110c32c4dccd84e774d8dd7a0524e00fb/target/i386/cpu.c#L5835
  [3] https://www.amd.com/system/files/TechDocs/24594.pdf

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1871842/+subscriptions


  reply	other threads:[~2020-04-15  9:26 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-09 12:58 [Bug 1871842] [NEW] AMD CPUID leaf 0x8000'0008 reported number of cores inconsistent with ACPI.MADT Philipp Eppelt
2020-04-09 14:00 ` Igor Mammedov
2020-04-09 14:00   ` Igor
2020-04-09 18:37   ` Babu Moger
2020-04-09 18:37     ` Babu Moger
2020-04-10  0:12   ` Babu Moger
2020-04-10  0:12     ` Babu Moger
2020-04-14  8:24     ` Philipp Eppelt
2020-04-14 13:27       ` Philipp Eppelt
2020-04-15  9:25         ` Igor Mammedov [this message]
2020-04-15  9:25           ` Igor
     [not found]   ` <c01f506c-5447-d048-15b2-3f113818844f@amd.com>
2020-04-15 18:08     ` Babu Moger
2020-04-15 18:08       ` Babu Moger
2020-04-17 15:14 ` [PATCH] target/i386: Fix the CPUID leaf CPUID_Fn80000008 Babu Moger
2020-04-17 15:14   ` [Bug 1871842] Re: AMD CPUID leaf 0x8000'0008 reported number of cores inconsistent with ACPI.MADT Babu Moger
2020-04-17 19:15   ` [PATCH] target/i386: Fix the CPUID leaf CPUID_Fn80000008 Eduardo Habkost
2020-04-17 19:15     ` [Bug 1871842] " Eduardo Habkost
2020-04-17 19:44     ` Babu Moger
2020-04-17 19:44       ` [Bug 1871842] " Babu Moger
2020-04-17 21:55 ` [v2 PATCH] " Babu Moger
2020-04-17 21:55   ` [Bug 1871842] " Babu Moger
2020-04-21 11:45   ` Philipp Eppelt
2020-05-21 16:04   ` Paolo Bonzini
2021-05-06  7:59 ` [Bug 1871842] Re: AMD CPUID leaf 0x8000'0008 reported number of cores inconsistent with ACPI.MADT Thomas Huth
2021-06-18 15:58 ` Thomas Huth

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200415112546.68ca1f15@redhat.com \
    --to=imammedo@redhat.com \
    --cc=1871842@bugs.launchpad.net \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).