Date: Fri, 11 Nov 2016 15:36:19 +0100
From: Igor Mammedov
Message-ID: <20161111153619.3ef19ccb@nial.brq.redhat.com>
In-Reply-To: <0404998c-3374-c2df-d960-596f71baac26@redhat.com>
References: <20161111135717.18eaa6dc@nial.brq.redhat.com> <0404998c-3374-c2df-d960-596f71baac26@redhat.com>
Subject: Re: [Qemu-devel] FW_CFG_NB_CPUS vs fwcfg file 'etc/boot-cpus'
To: Laszlo Ersek
Cc: qemu-devel@nongnu.org, Eduardo Habkost, mst@redhat.com, Stefan Hajnoczi, Kevin O'Connor, Gerd Hoffmann, "Jordan Justen (Intel address)", Jeff Fan

On Fri, 11 Nov 2016 15:02:36 +0100
Laszlo Ersek wrote:

> adding Jeff Fan and Jordan Justen
>
> On 11/11/16 13:57, Igor Mammedov wrote:
> > While looking at OVMF and how it handles CPUs (ACPI/AP wakeup),
> > I've noticed that it uses the legacy FW_CFG_NB_CPUS (0x05) to get
> > the number of CPUs present at start.
>
> Where exactly do you see this?

That's the only place in OVMF, but according to Google there are other
firmwares that use FW_CFG_NB_CPUS, so we are not free to remove it and
break guests.

So I'd just drop the not yet released 'etc/boot-cpus'; of course,
SeaBIOS should be fixed and its blob in QEMU updated.
> The only place where I see this fw_cfg key being accessed is
> "OvmfPkg/AcpiPlatformDxe/Qemu.c", function QemuInstallAcpiMadtTable().
>
> That function is only called by OVMF if QEMU does not provide the
> "etc/table-loader" interface. Which means, in practical terms, that the
> QemuInstallAcpiMadtTable() function is historical / dead.
>
> IOW, if you run OVMF on a non-historical machine type, then fw_cfg key
> 0x05 is unused.
>
> > The variable was introduced back in 2008 by fbfcf955ba and is/was used
> > by ppc/sparc/arm/x86 and a bunch of firmwares (but not by SeaBIOS).
> >
> > However, in 2.8 I've just added a similar fw_cfg file, 'etc/boot-cpus',
> > which is used for the same purpose :-)
> >
> > Both variables are UINT16, and the only difference is that
> > FW_CFG_NB_CPUS doesn't account for CPUs added with -device, which
> > might make firmware wait indefinitely in a FW_CFG_NB_CPUS ==
> > woken-up-APs loop due to extra, unexpected APs being woken up.
> >
> > FW_CFG_NB_CPUS should be fixed to mimic 'etc/boot-cpus' behavior
> > anyway, so the question arises: why not reuse the fixed legacy
> > FW_CFG_NB_CPUS and drop the just added but not yet released
> > 'etc/boot-cpus' from QEMU and SeaBIOS?
> >
> > Any opinions?
>
> I'm fine either way -- I don't have a preference for either of key 0x05
> or file "etc/boot-cpus".

I'll post fixup patches (removing "etc/boot-cpus") today, after I've
tested them a little bit more.

> For starting up APs, OVMF includes UefiCpuPkg modules. Recently, a good
> chunk of the multiprocessing code has been extracted to
> "UefiCpuPkg/Library/MpInitLib". This directory has two INF files:
>
> - PeiMpInitLib.inf, which customizes the library for PEI phase modules
>   (PEIMs),
>
> - DxeMpInitLib.inf, which customizes the library for DXE_DRIVER modules.
>
> The most important drivers that use this library are:
>
> - UefiCpuPkg/CpuDxe, which is a DXE_DRIVER and provides (among other
>   things) the EFI_MP_SERVICES_PROTOCOL, for DXE- and later-phase
>   clients,
>
> - UefiCpuPkg/CpuMpPei, which is a PEIM and provides the
>   EFI_PEI_MP_SERVICES_PPI for PEI-phase clients.
>
> For example, when OVMF programs the MSR_IA32_FEATURE_CONTROL MSR on each
> CPU during boot (and S3 resume too), it calls
> EFI_PEI_MP_SERVICES_PPI.StartupAllAPs() for this. Please see commit
> dbab994991c7 for details.
>
> (The parallel SeaBIOS commits are 20f83d5c7c0f and 54e3a88609da.)
>
> The number of processors in the system is apparently determined in
> MpInitLibInitialize() --> CollectProcessorCount(). It does not use any
> fw_cfg hints from OVMF. What it uses is called
>
>   PcdCpuMaxLogicalProcessorNumber
>
> which is an absolute upper bound on the number of logical processors in
> the system. (It is not required that this many CPUs be available: there
> may be fewer. However, there mustn't be more -- in that case, things
> will break.)
>
> This value is currently set statically, at build time, to 64. We can
> raise it statically. If you want to set it dynamically, that's possible
> as well.
>
> Normally, this would not be trivial, because the PCD is consumed early.
> Usually we set such dynamic PCDs in OVMF's PlatformPei, but that doesn't
> work -- generally -- when another PEIM would like to consume the PCD.
> The dispatch order between PEIMs is generally unspecified, so we
> couldn't be sure that PlatformPei would set the PCD before CpuMpPei
> consumed it at startup, via PeiMpInitLib.
>
> Normally, we solve this with a plugin library instance that we hook into
> all the PCD consumer modules (without their explicit knowledge), so the
> PCD gets set (unbeknownst to them) right when they start up, from
> fw_cfg.
> However, in this case we can simplify things a bit, because
> CpuMpPei is serialized strictly after PlatformPei through the
> installation of the permanent PEI RAM (the
> gEfiPeiMemoryDiscoveredPpiGuid depex). CpuMpPei cannot launch until
> PlatformPei exposes the permanent PEI RAM, hence PlatformPei can set
> the PCD first, too.
>
> (If you are wondering why PlatformPei can use CpuMpPei's services:
> because we register a callback in PlatformPei, so that CpuMpPei's PPI
> installation will inform us.)
>
> Okay, okay, this is likely too much information. Let me summarize:
>
> - The use of fw_cfg key 0x05 in OVMF is historical, and only
>   related to OVMF's built-in ACPI tables. Which are never used with
>   halfway recent QEMU machine types.
>
> - OVMF -- via UefiCpuPkg's MpInitLib, CpuMpPei, CpuDxe and
>   PiSmmCpuDxeSmm -- starts up all APs without looking for a dynamically
>   set "present at boot" count, or absolute limit. The current static
>   limit is 64 processors.
>
> - If you want to make the limit dynamic, from fw_cfg, we can do that.
>   We just have to know where to read it from; key 0x05, or file
>   "etc/boot-cpus".
>
> - I don't know how high we can go with the maximum. 255 should work. I
>   seem to remember that we might want 288, and for that we need X2Apic.
>   OVMF already uses the "UefiCpuPkg/Library/BaseXApicX2ApicLib"
>   instance exclusively, for the LocalApicLib class, so if there's no
>   other limitation, things should be fine.

I already have a patch that makes upstream OVMF work with more than 64
CPUs, up to 288 (hopefully written the correct way). So expect me to
bother you about boring stuff like where/how to post patches for edk2,
and to ask for review.

> - Note that with the SMM driver stack built into OVMF, the maximum CPU
>   count influences SMRAM consumption. 8MB -- the maximum that Q35
>   provides -- might not be enough for a huge number of processors
>   (I even suspect it's no longer enough for 255).
I haven't looked at SMM yet, maybe I shouldn't after all :)

> Thanks
> Laszlo
>