From: Igor Mammedov <imammedo@redhat.com>
To: Laszlo Ersek <lersek@redhat.com>
Cc: qemu-devel@nongnu.org, Eduardo Habkost <ehabkost@redhat.com>,
mst@redhat.com, Stefan Hajnoczi <stefanha@gmail.com>,
Kevin O'Connor <kevin@koconnor.net>,
Gerd Hoffmann <kraxel@redhat.com>,
"Jordan Justen (Intel address)" <jordan.l.justen@intel.com>,
Jeff Fan <jeff.fan@intel.com>
Subject: Re: [Qemu-devel] FW_CFG_NB_CPUS vs fwcfg file 'etc/boot-cpus'
Date: Fri, 11 Nov 2016 15:36:19 +0100
Message-ID: <20161111153619.3ef19ccb@nial.brq.redhat.com>
In-Reply-To: <0404998c-3374-c2df-d960-596f71baac26@redhat.com>
On Fri, 11 Nov 2016 15:02:36 +0100
Laszlo Ersek <lersek@redhat.com> wrote:
> adding Jeff Fan and Jordan Justen
>
> On 11/11/16 13:57, Igor Mammedov wrote:
> > While looking at OVMF and how it handles CPUs (ACPI/AP wakeup),
> > I've noticed that it uses the legacy FW_CFG_NB_CPUS (0x05) key to get
> > the number of CPUs present at startup.
>
> Where exactly do you see this?
That's the only place in OVMF, but according to Google there are
other firmwares that use FW_CFG_NB_CPUS, so we are not free to remove
it and break guests.
So I'd just drop the not-yet-released 'etc/boot-cpus';
of course SeaBIOS should be fixed and its blob in QEMU updated.
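For reference, here is a minimal sketch of how an x86 guest reads the
legacy key through the fw_cfg I/O ports (selector at 0x510, data at
0x511); the macro and helper names are made up, this is only an
illustration and not SeaBIOS or OVMF code:

#include <stdint.h>

#define FW_CFG_PORT_SEL    0x510   /* 16-bit selector register */
#define FW_CFG_PORT_DATA   0x511   /* 8-bit data register */
#define FW_CFG_NB_CPUS     0x05    /* legacy "boot CPU count" key */

static inline void outw(uint16_t port, uint16_t val)
{
    __asm__ volatile ("outw %0, %1" : : "a" (val), "Nd" (port));
}

static inline uint8_t inb(uint16_t port)
{
    uint8_t val;
    __asm__ volatile ("inb %1, %0" : "=a" (val) : "Nd" (port));
    return val;
}

/* select key 0x05 and read its little-endian UINT16 payload */
static uint16_t fw_cfg_read_nb_cpus(void)
{
    outw(FW_CFG_PORT_SEL, FW_CFG_NB_CPUS);
    return (uint16_t)inb(FW_CFG_PORT_DATA) |
           ((uint16_t)inb(FW_CFG_PORT_DATA) << 8);
}

A fw_cfg file such as 'etc/boot-cpus' is read the same way, except that
its selector first has to be looked up in the fw_cfg file directory
(key 0x19).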
> The only place where I see this fw_cfg key being accessed is
> "OvmfPkg/AcpiPlatformDxe/Qemu.c", function QemuInstallAcpiMadtTable().
>
> That function is only called by OVMF if QEMU does not provide the
> "etc/table-loader" interface. Which means, in practical terms, that the
> QemuInstallAcpiMadtTable() function is historical / dead.
>
> IOW, if you run OVMF on a non-historical machine type, then fw_cfg key
> 0x05 is unused.
>
> > The variable was introduced back in 2008 by commit fbfcf955ba and is/was
> > used by ppc/sparc/arm/x86 and a bunch of firmwares (but not by SeaBIOS).
> >
> > However, in 2.8 I've just added a similar fw_cfg file, 'etc/boot-cpus',
> > which is used for the same purpose :-)
> >
> > Both variables are UINT16, and the only difference is that FW_CFG_NB_CPUS
> > doesn't account for CPUs added with -device, which might make firmware
> > wait indefinitely in a 'FW_CFG_NB_CPUS == woken-up APs' loop due to
> > extra, unexpected APs being woken up.
> >
> > FW_CFG_NB_CPUS should be fixed to mimic the 'etc/boot-cpus' behavior
> > anyway, so the question arises: why not reuse the fixed legacy
> > FW_CFG_NB_CPUS and drop the just-added but not yet released
> > 'etc/boot-cpus' from QEMU and SeaBIOS?
> >
> > Any opinions?
> >
>
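To make the failure mode quoted above concrete, here is a toy sketch
(not actual firmware code; the counter, helper name and BSP accounting
are made up for illustration) of an exact-match AP wait loop:

#include <stdint.h>

/* incremented by each AP as it comes up (atomically, in real code) */
volatile uint16_t woken_aps;

/*
 * BSP-side wait loop keyed to FW_CFG_NB_CPUS.  If CPUs added with
 * -device are not included in that value, more APs than expected bump
 * the counter; it can step past the target between two reads, the '=='
 * condition is then never observed, and the loop spins forever.
 */
static void wait_for_aps(uint16_t fw_cfg_nb_cpus)
{
    uint16_t expected_aps = fw_cfg_nb_cpus - 1;   /* BSP is not an AP */

    while (woken_aps != expected_aps)
        ;   /* may never terminate once woken_aps overshoots */
}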
> I'm fine either way -- I don't have a preference for either of key 0x05
> or file "etc/boot-cpus".
I'll post fixup patches (removing "etc/boot-cpus") today, after I've tested
them a little bit more.
> For starting up APs, OVMF includes UefiCpuPkg modules. Recently, a good
> chunk of the multiprocessing code has been extracted to
> "UefiCpuPkg/Library/MpInitLib". This directory has two INF files:
>
> - PeiMpInitLib.inf, which customizes the library for PEI phase modules
> (PEIMs),
>
> - DxeMpInitLib.inf, which customizes the library for DXE_DRIVER modules.
>
> The most important drivers that use this library are:
>
> - UefiCpuPkg/CpuDxe, which is a DXE_DRIVER and provides (among other
> things) the EFI_MP_SERVICES_PROTOCOL, for DXE- and later phase
> clients,
>
> - UefiCpuPkg/CpuMpPei, which is a PEIM and provides the
> EFI_PEI_MP_SERVICES_PPI for PEI-phase clients.
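For a rough idea of the consumer side (a sketch based on the PI spec
definitions as I recall them, not code taken from CpuDxe; the helper
name is made up), a DXE-phase client would locate the protocol and
query the processor count like this:

#include <Uefi.h>
#include <Library/UefiBootServicesTableLib.h>
#include <Library/DebugLib.h>
#include <Protocol/MpService.h>

/* Ask CpuDxe's EFI_MP_SERVICES_PROTOCOL how many logical CPUs it found. */
STATIC
EFI_STATUS
PrintProcessorCount (
  VOID
  )
{
  EFI_STATUS                Status;
  EFI_MP_SERVICES_PROTOCOL  *MpServices;
  UINTN                     NumProcessors;
  UINTN                     NumEnabled;

  Status = gBS->LocateProtocol (&gEfiMpServiceProtocolGuid, NULL,
                  (VOID **)&MpServices);
  if (EFI_ERROR (Status)) {
    return Status;
  }

  Status = MpServices->GetNumberOfProcessors (MpServices, &NumProcessors,
                         &NumEnabled);
  if (EFI_ERROR (Status)) {
    return Status;
  }

  DEBUG ((DEBUG_INFO, "%u logical processors, %u enabled\n",
    (UINT32)NumProcessors, (UINT32)NumEnabled));
  return EFI_SUCCESS;
}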
>
> For example, when OVMF programs the MSR_IA32_FEATURE_CONTROL MSR on each
> CPU during boot (and S3 resume too), it calls
> EFI_PEI_MP_SERVICES_PPI.StartupAllAPs() for this. Please see commit
> dbab994991c7 for details.
>
> (The parallel SeaBIOS commits are 20f83d5c7c0f and 54e3a88609da.)
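A much-simplified sketch of that pattern follows (PPI member signature as
I recall it from the PI spec; the real code in commit dbab994991c7 gets
the MSR value from fw_cfg rather than from the caller, so treat the
function and parameter names here as made up):

#include <PiPei.h>
#include <Library/BaseLib.h>
#include <Ppi/MpServices.h>

#define MSR_IA32_FEATURE_CONTROL  0x3A

/* runs on each AP; Buffer carries the 64-bit MSR value to program */
STATIC
VOID
EFIAPI
WriteFeatureControl (
  IN OUT VOID  *Buffer
  )
{
  AsmWriteMsr64 (MSR_IA32_FEATURE_CONTROL, *(UINT64 *)Buffer);
}

STATIC
EFI_STATUS
ProgramFeatureControlOnAllCpus (
  IN CONST EFI_PEI_SERVICES   **PeiServices,
  IN EFI_PEI_MP_SERVICES_PPI  *MpServices,
  IN UINT64                   FeatureControlValue
  )
{
  /* program the BSP directly ... */
  WriteFeatureControl (&FeatureControlValue);

  /* ... then have every AP run the same routine */
  return MpServices->StartupAllAPs (
                       PeiServices,
                       MpServices,
                       WriteFeatureControl,
                       FALSE,                  /* SingleThread */
                       0,                      /* no timeout */
                       &FeatureControlValue
                       );
}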
>
> The number of processors in the system is apparently determined in
> MpInitLibInitialize() --> CollectProcessorCount(). It does not use any
> fw_cfg hints from OVMF. What it uses is called
>
> PcdCpuMaxLogicalProcessorNumber
>
> which is an absolute upper bound on the number of logical processors in
> the system. (It is not required that this many CPUs be available: there
> may be fewer. However, there mustn't be more -- in that case, things
> will break.)
>
> This value is currently set statically, at build time, to 64. We can
> raise it statically. If you want to set it dynamically, that's possible
> as well.
>
> Normally, this would not be trivial, because the PCD is consumed early.
> Usually we set such dynamic PCDs in OVMF's PlatformPei, but that doesn't
> work -- generally -- when another PEIM would like to consume the PCD.
> The dispatch order between PEIMs is generally unspecified, so we
> couldn't be sure that PlatformPei would set the PCD before CpuMpPei
> consumed it at startup, via PeiMpInitLib.
>
> Normally, we solve this with a plugin library instance that we hook into
> all the PCD consumer modules (without their explicit knowledge), so the
> PCD gets set (unbeknownst to them) right when they start up, from
> fw_cfg. However, in this case we can simplify things a bit, because
> CpuMpPei is serialized strictly after PlatformPei through the
> installation of the permanent PEI RAM (the
> gEfiPeiMemoryDiscoveredPpiGuid depex). CpuMpPei cannot launch until
> PlatformPei exposes the permanent PEI RAM, hence PlatformPei also gets
> to set the PCD before CpuMpPei reads it.
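A rough sketch of what that could look like in PlatformPei (not the
actual patch; it assumes the PCD is made dynamic in the platform DSC,
and the function name is made up -- only the QemuFwCfgLib calls are
meant to reflect the existing OvmfPkg interface):

#include <PiPei.h>
#include <Library/PcdLib.h>
#include <Library/QemuFwCfgLib.h>

/*
 * Size PcdCpuMaxLogicalProcessorNumber from the boot CPU count that
 * QEMU reports via fw_cfg, before CpuMpPei consumes the PCD.  This
 * only works if the PCD is dynamic in the platform DSC.
 */
STATIC
VOID
MaxCpuCountInitialization (
  VOID
  )
{
  UINT16  BootCpuCount;

  if (!QemuFwCfgIsAvailable ()) {
    return;                        /* keep the build-time default */
  }

  QemuFwCfgSelectItem (QemuFwCfgItemSmpCpuCount);   /* legacy key 0x05 */
  BootCpuCount = QemuFwCfgRead16 ();
  if (BootCpuCount == 0) {
    return;
  }

  PcdSet32 (PcdCpuMaxLogicalProcessorNumber, BootCpuCount);
}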
>
> (If you are interested why PlatformPei can use CpuMpPei's services then:
> because we register a callback in PlatformPei so that CpuMpPei's PPI
> installation will inform us.)
>
> Okay, okay, this is likely too much information. Let me summarize:
>
> - The use of fw_cfg key 0x05 in OVMF is historical, and only
> related to OVMF's built-in ACPI tables, which are never used with
> halfway recent QEMU machine types.
>
> - OVMF -- via UefiCpuPkg's MpInitLib, CpuMpPei, CpuDxe and
> PiSmmCpuDxeSmm -- starts up all APs without looking for a dynamically
> set "present at boot" count, or absolute limit. The current static
> limit is 64 processors.
>
> - If you want to make the limit dynamic, from fw_cfg, we can do that.
> We just have to know where to read it from; key 0x05, or file
> "etc/boot-cpus".
>
> - I don't know how high we can go with the maximum. 255 should work. I
> seem to remember that we might want 288, and for that we need X2Apic.
> OVMF already uses the "UefiCpuPkg/Library/BaseXApicX2ApicLib"
> instance exclusively, for the LocalApicLib class, so if there's no
> other limitation, things should be fine.
I already have a patch that makes upstream OVMF work for more
than 64 CPUs, up to 288 (hopefully written the correct way).
So expect me to bother you about boring stuff like the rules for where/how
to post patches for edk2, and to ask for review.
> - Note that with the SMM driver stack built into OVMF, the maximum CPU
> count influences SMRAM consumption. 8MB -- the maximum that Q35
> provides -- might not be enough for a huge number of processors
> (I even suspect it's no longer enough for 255).
I haven't looked at SMM yet, maybe I shouldn't after all :)
> Thanks
> Laszlo
>