From: Igor Mammedov <imammedo@redhat.com>
To: Laszlo Ersek <lersek@redhat.com>
Cc: qemu-devel@nongnu.org, Eduardo Habkost <ehabkost@redhat.com>,
mst@redhat.com, Stefan Hajnoczi <stefanha@gmail.com>,
Kevin O'Connor <kevin@koconnor.net>,
Gerd Hoffmann <kraxel@redhat.com>,
"Jordan Justen (Intel address)" <jordan.l.justen@intel.com>,
Jeff Fan <jeff.fan@intel.com>
Subject: Re: [Qemu-devel] FW_CFG_NB_CPUS vs fwcfg file 'etc/boot-cpus'
Date: Fri, 11 Nov 2016 15:36:19 +0100
Message-ID: <20161111153619.3ef19ccb@nial.brq.redhat.com>
In-Reply-To: <0404998c-3374-c2df-d960-596f71baac26@redhat.com>
On Fri, 11 Nov 2016 15:02:36 +0100
Laszlo Ersek <lersek@redhat.com> wrote:
> adding Jeff Fan and Jordan Justen
>
> On 11/11/16 13:57, Igor Mammedov wrote:
> > While looking at OVMF and how it handles CPUs (ACPI/AP wakeup),
> > I've noticed that it uses the legacy FW_CFG_NB_CPUS (0x05) key to get
> > the number of CPUs present at startup.
>
> Where exactly do you see this?
That's the only place in OVMF, but according to Google there are
other firmwares that use FW_CFG_NB_CPUS, so we are not free to remove
it and break guests.
So I'd just drop the not-yet-released 'etc/boot-cpus';
of course SeaBIOS should be fixed and its blob in QEMU updated.
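For reference, here is a minimal sketch of how an x86 guest reads the
legacy key through the fw_cfg I/O ports (selector at 0x510, data at
0x511); the macro and helper names are made up, this is only an
illustration and not SeaBIOS or OVMF code:

#include <stdint.h>

#define FW_CFG_PORT_SEL    0x510   /* 16-bit selector register */
#define FW_CFG_PORT_DATA   0x511   /* 8-bit data register */
#define FW_CFG_NB_CPUS     0x05    /* legacy "boot CPU count" key */

static inline void outw(uint16_t port, uint16_t val)
{
    __asm__ volatile ("outw %0, %1" : : "a" (val), "Nd" (port));
}

static inline uint8_t inb(uint16_t port)
{
    uint8_t val;
    __asm__ volatile ("inb %1, %0" : "=a" (val) : "Nd" (port));
    return val;
}

/* select key 0x05 and read its little-endian UINT16 payload */
static uint16_t fw_cfg_read_nb_cpus(void)
{
    outw(FW_CFG_PORT_SEL, FW_CFG_NB_CPUS);
    return (uint16_t)inb(FW_CFG_PORT_DATA) |
           ((uint16_t)inb(FW_CFG_PORT_DATA) << 8);
}

A fw_cfg file such as 'etc/boot-cpus' is read the same way, except that
its selector first has to be looked up in the fw_cfg file directory
(key 0x19).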
> The only place where I see this fw_cfg key being accessed is
> "OvmfPkg/AcpiPlatformDxe/Qemu.c", function QemuInstallAcpiMadtTable().
>
> That function is only called by OVMF if QEMU does not provide the
> "etc/table-loader" interface. Which means, in practical terms, that the
> QemuInstallAcpiMadtTable() function is historical / dead.
>
> IOW, if you run OVMF on a non-historical machine type, then fw_cfg key
> 0x05 is unused.
>
> > The variable was introduced back in 2008 by commit fbfcf955ba and is/was
> > used by ppc/sparc/arm/x86 and a bunch of firmwares (but not by SeaBIOS).
> >
> > However, in 2.8 I've just added a similar fw_cfg file, 'etc/boot-cpus',
> > which is used for the same purpose :-)
> >
> > Both variables are UINT16, and the only difference is that FW_CFG_NB_CPUS
> > doesn't account for CPUs added with -device, which might make firmware
> > wait indefinitely in a 'FW_CFG_NB_CPUS == woken-up APs' loop due to
> > extra, unexpected APs being woken up.
> >
> > FW_CFG_NB_CPUS should be fixed to mimic the 'etc/boot-cpus' behavior
> > anyway, so the question arises: why not reuse the fixed legacy
> > FW_CFG_NB_CPUS and drop the just-added but not yet released
> > 'etc/boot-cpus' from QEMU and SeaBIOS?
> >
> > Any opinions?
> >
>
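To make the failure mode quoted above concrete, here is a toy sketch
(not actual firmware code; the counter, helper name and BSP accounting
are made up for illustration) of an exact-match AP wait loop:

#include <stdint.h>

/* incremented by each AP as it comes up (atomically, in real code) */
volatile uint16_t woken_aps;

/*
 * BSP-side wait loop keyed to FW_CFG_NB_CPUS.  If CPUs added with
 * -device are not included in that value, more APs than expected bump
 * the counter; it can step past the target between two reads, the '=='
 * condition is then never observed, and the loop spins forever.
 */
static void wait_for_aps(uint16_t fw_cfg_nb_cpus)
{
    uint16_t expected_aps = fw_cfg_nb_cpus - 1;   /* BSP is not an AP */

    while (woken_aps != expected_aps)
        ;   /* may never terminate once woken_aps overshoots */
}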
> I'm fine either way -- I don't have a preference for either of key 0x05
> or file "etc/boot-cpus".
I'll post fixup patches (removing "etc/boot-cpus") today, after I've tested
them a little bit more.
> For starting up APs, OVMF includes UefiCpuPkg modules. Recently, a good
> chunk of the multiprocessing code has been extracted to
> "UefiCpuPkg/Library/MpInitLib". This directory has two INF files:
>
> - PeiMpInitLib.inf, which customizes the library for PEI phase modules
> (PEIMs),
>
> - DxeMpInitLib.inf, which customizes the library for DXE_DRIVER modules.
>
> The most important drivers that use this library are:
>
> - UefiCpuPkg/CpuDxe, which is a DXE_DRIVER and provides (among other
> things) the EFI_MP_SERVICES_PROTOCOL, for DXE- and later phase
> clients,
>
> - UefiCpuPkg/CpuMpPei, which is a PEIM and provides the
> EFI_PEI_MP_SERVICES_PPI for PEI-phase clients.
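For a rough idea of the consumer side (a sketch based on the PI spec
definitions as I recall them, not code taken from CpuDxe; the helper
name is made up), a DXE-phase client would locate the protocol and
query the processor count like this:

#include <Uefi.h>
#include <Library/UefiBootServicesTableLib.h>
#include <Library/DebugLib.h>
#include <Protocol/MpService.h>

/* Ask CpuDxe's EFI_MP_SERVICES_PROTOCOL how many logical CPUs it found. */
STATIC
EFI_STATUS
PrintProcessorCount (
  VOID
  )
{
  EFI_STATUS                Status;
  EFI_MP_SERVICES_PROTOCOL  *MpServices;
  UINTN                     NumProcessors;
  UINTN                     NumEnabled;

  Status = gBS->LocateProtocol (&gEfiMpServiceProtocolGuid, NULL,
                  (VOID **)&MpServices);
  if (EFI_ERROR (Status)) {
    return Status;
  }

  Status = MpServices->GetNumberOfProcessors (MpServices, &NumProcessors,
                         &NumEnabled);
  if (EFI_ERROR (Status)) {
    return Status;
  }

  DEBUG ((DEBUG_INFO, "%u logical processors, %u enabled\n",
    (UINT32)NumProcessors, (UINT32)NumEnabled));
  return EFI_SUCCESS;
}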
>
> For example, when OVMF programs the MSR_IA32_FEATURE_CONTROL MSR on each
> CPU during boot (and S3 resume too), it calls
> EFI_PEI_MP_SERVICES_PPI.StartupAllAPs() for this. Please see commit
> dbab994991c7 for details.
>
> (The parallel SeaBIOS commits are 20f83d5c7c0f and 54e3a88609da.)
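A much-simplified sketch of that pattern follows (PPI member signature as
I recall it from the PI spec; the real code in commit dbab994991c7 gets
the MSR value from fw_cfg rather than from the caller, so treat the
function and parameter names here as made up):

#include <PiPei.h>
#include <Library/BaseLib.h>
#include <Ppi/MpServices.h>

#define MSR_IA32_FEATURE_CONTROL  0x3A

/* runs on each AP; Buffer carries the 64-bit MSR value to program */
STATIC
VOID
EFIAPI
WriteFeatureControl (
  IN OUT VOID  *Buffer
  )
{
  AsmWriteMsr64 (MSR_IA32_FEATURE_CONTROL, *(UINT64 *)Buffer);
}

STATIC
EFI_STATUS
ProgramFeatureControlOnAllCpus (
  IN CONST EFI_PEI_SERVICES   **PeiServices,
  IN EFI_PEI_MP_SERVICES_PPI  *MpServices,
  IN UINT64                   FeatureControlValue
  )
{
  /* program the BSP directly ... */
  WriteFeatureControl (&FeatureControlValue);

  /* ... then have every AP run the same routine */
  return MpServices->StartupAllAPs (
                       PeiServices,
                       MpServices,
                       WriteFeatureControl,
                       FALSE,                  /* SingleThread */
                       0,                      /* no timeout */
                       &FeatureControlValue
                       );
}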
>
> The number of processors in the system is apparently determined in
> MpInitLibInitialize() --> CollectProcessorCount(). It does not use any
> fw_cfg hints from OVMF. What it uses is called
>
> PcdCpuMaxLogicalProcessorNumber
>
> which is an absolute upper bound on the number of logical processors in
> the system. (It is not required that this many CPUs be available: there
> may be fewer. However, there mustn't be more -- in that case, things
> will break.)
>
> This value is currently set statically, at build time, to 64. We can
> raise it statically. If you want to set it dynamically, that's possible
> as well.
>
> Normally, this would not be trivial, because the PCD is consumed early.
> Usually we set such dynamic PCDs in OVMF's PlatformPei, but that doesn't
> work -- generally -- when another PEIM would like to consume the PCD.
> The dispatch order between PEIMs is generally unspecified, so we
> couldn't be sure that PlatformPei would set the PCD before CpuMpPei
> consumed it at startup, via PeiMpInitLib.
>
> Normally, we solve this with a plugin library instance that we hook into
> all the PCD consumer modules (without their explicit knowledge), so the
> PCD gets set (unbeknownst to them) right when they start up, from
> fw_cfg. However, in this case we can simplify things a bit, because
> CpuMpPei is serialized strictly after PlatformPei through the
> installation of the permanent PEI RAM (the
> gEfiPeiMemoryDiscoveredPpiGuid depex). CpuMpPei cannot launch until
> PlatformPei exposes the permanent PEI RAM, hence PlatformPei also gets
> to set the PCD before CpuMpPei reads it.
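A rough sketch of what that could look like in PlatformPei (not the
actual patch; it assumes the PCD is made dynamic in the platform DSC,
and the function name is made up -- only the QemuFwCfgLib calls are
meant to reflect the existing OvmfPkg interface):

#include <PiPei.h>
#include <Library/PcdLib.h>
#include <Library/QemuFwCfgLib.h>

/*
 * Size PcdCpuMaxLogicalProcessorNumber from the boot CPU count that
 * QEMU reports via fw_cfg, before CpuMpPei consumes the PCD.  This
 * only works if the PCD is dynamic in the platform DSC.
 */
STATIC
VOID
MaxCpuCountInitialization (
  VOID
  )
{
  UINT16  BootCpuCount;

  if (!QemuFwCfgIsAvailable ()) {
    return;                        /* keep the build-time default */
  }

  QemuFwCfgSelectItem (QemuFwCfgItemSmpCpuCount);   /* legacy key 0x05 */
  BootCpuCount = QemuFwCfgRead16 ();
  if (BootCpuCount == 0) {
    return;
  }

  PcdSet32 (PcdCpuMaxLogicalProcessorNumber, BootCpuCount);
}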
>
> (If you are interested why PlatformPei can use CpuMpPei's services then:
> because we register a callback in PlatformPei so that CpuMpPei's PPI
> installation will inform us.)
>
> Okay, okay, this is likely too much information. Let me summarize:
>
> - The use of fw_cfg key 0x05 in OVMF is historical, and only
> related to OVMF's built-in ACPI tables, which are never used with
> halfway recent QEMU machine types.
>
> - OVMF -- via UefiCpuPkg's MpInitLib, CpuMpPei, CpuDxe and
> PiSmmCpuDxeSmm -- starts up all APs without looking for a dynamically
> set "present at boot" count, or absolute limit. The current static
> limit is 64 processors.
>
> - If you want to make the limit dynamic, from fw_cfg, we can do that.
> We just have to know where to read it from; key 0x05, or file
> "etc/boot-cpus".
>
> - I don't know how high we can go with the maximum. 255 should work. I
> seem to remember that we might want 288, and for that we need X2Apic.
> OVMF already uses the "UefiCpuPkg/Library/BaseXApicX2ApicLib"
> instance exclusively, for the LocalApicLib class, so if there's no
> other limitation, things should be fine.
I already have a patch that makes upstream OVMF work for more
than 64 CPUs, up to 288 (hopefully written the correct way).
So expect me to bother you about boring stuff like the rules for where/how
to post patches for edk2, and to ask for review.
> - Note that with the SMM driver stack built into OVMF, the maximum CPU
> count influences SMRAM consumption. 8MB -- the maximum that Q35
> provides -- might not be enough for a huge number of processors
> (I even suspect it's no longer enough for 255).
I haven't looked at SMM yet, maybe I shouldn't after all :)
> Thanks
> Laszlo
>