From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:52864) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1c9hxI-0000ZE-Gn for qemu-devel@nongnu.org; Wed, 23 Nov 2016 19:32:03 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1c9hxE-0005ov-2Q for qemu-devel@nongnu.org; Wed, 23 Nov 2016 19:32:00 -0500 Received: from mx1.redhat.com ([209.132.183.28]:59348) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1c9hxD-0005me-OD for qemu-devel@nongnu.org; Wed, 23 Nov 2016 19:31:56 -0500 References: <20161118103659.10448-1-lersek@redhat.com> From: Laszlo Ersek Message-ID: <03e8f12b-5fcf-9ed1-aa34-8e521657b50f@redhat.com> Date: Thu, 24 Nov 2016 01:31:50 +0100 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Paolo Bonzini , qemu devel list Cc: Kevin O'Connor , "Michael S. Tsirkin" , Gerd Hoffmann , Igor Mammedov , "Jordan Justen (Intel address)" , Michael Kinney On 11/24/16 01:01, Laszlo Ersek wrote: > CC Jordan & Mike >=20 > On 11/23/16 23:35, Paolo Bonzini wrote: >> >> >> On 18/11/2016 11:36, Laszlo Ersek wrote: >>> This is v3 of the series, with updates based on the v2 discussion: >>> . >>> >>> I've added feature negotiation via the APM_STS ("scratchpad") registe= r. >>> A new spec file called "docs/specs/q35-apm-sts.txt" is included. >>> >>> Tested with new OVMF patches (about to send out those as well). >>> Regression tested with SeaBIOS (beyond simple functional tests with >>> maximum SeaBIOS logging enabled, I used gdb to step through the new >>> ich9_apm_status_changed() callback to see if it was behaving compatib= ly >>> with SeaBIOS). >>> >>> The series was developed and tested on top of v2.7.0, because v2.8.0-= rc0 >>> crashes very quickly for me when running OVMF: >>> >>> kvm_io_ioeventfd_add: error adding ioeventfd: File exists >>> >>> It is my understanding that there are patches on the list for this: >>> >>> [Qemu-devel] [PATCH v2 for-2.8 0/3] virtio fixes >>> >>> Anyway, the series rebases to v2.8.0-rc0 without as much as context >>> differences. >> >> Hi Laszlo, >> >> sorry for the slightly delayed reply. >> >> First of all, I'm wondering if we would be better off adding a new por= t >> 0xB1 that is QEMU-specific, instead of reusing 0xB3. >=20 > Sure, I can look into that, if we agree that's the best way to proceed, > for now. (Although I'm not really happy about the new memory region > stuff it would require. :() >=20 > I CC'd Kevin to learn if he foresaw other uses for the APM_STS register > in SeaBIOS. >=20 > BTW, what happens in QEMU if the guest reads an unimplemented port? > Hm... unassigned_io_write() seems to be a no-op, and > unassigned_io_read() returns all-bits-one. This means that for a new > port, the negotiation protocol / values have to be reworked. >=20 > Port 0xB1 is occupied by ICH9 according to the spec: >=20 > Table 9-2. Fixed I/O Ranges Decoded by Intel =C2=AE ICH9 (Sheet 2 of = 2) >=20 > I/O > Address Read Target Write Target Internal Unit > ------- -------------------- -------------------- ------------- > B0h=E2=80=93B1h Interrupt Controller Interrupt Controller Interrup= t >=20 > I wonder if we care -- after all, APM_STS (0xB3) is documented not to > have any hardware effects ("scratchpad register"). >=20 >> Second, I now remembered the reason why I was against broadcast SMI. >> The reason is that it breaks hot-plug. >=20 > How does it break hot-plug? After reading your explanation below: is it > that the broadcast SMI (possibly raised by the OS directly) would get t= o > the new VCPU before the firmware relocated its SMBASE? >=20 >> >> On hot-plug, the firmware (if it wants to use SMI for anything secure) >> must relocate the SMBASE of the newly-hotplugged CPU before the OS has >> any chance to put its fangs on it. This is because the OS can send >> direct SMIs and is in control of the area at 0x30000. So OVMF is >> already broken with respect to hotplug, >=20 > Yes. We theorized that there could be further edk2 core modules that > hadn't been open sourced yet, necessary for handling VCPU hotplug. >=20 >> but I am not yet sure if these >> patches would break it further. >=20 > Hard to say without seeing those modules. >=20 > I will speculate: assuming that the non-public modules fit together wit= h > the public modules in some way, I expect the broadcast SMI shouldn't > break them. The reason is that the broadcast SMI / traditional delivery > are the default method in UefiCpuPkg, and in practice they work better > (more reliably) with the rest of the edk2 infrastructure, in my > experience, than the relaxed sync method. >=20 > In his review today, Jordan wrote > , >=20 >> I'm glad we'll be using a mechanism that broadcasts to all the >> processors like the real hardware. It is a bit unfortunate that it >> doesn't go through the b2 port for it. >=20 > If broadcast is how real hardware does it (even by default!, > apparently), I expect those as-yet unreleased, hotplug-handling modules > in edk2 should be able to cope with broadcast. >=20 >> The full solution is to follow a protocol similar to what real hardwar= e >> does. >=20 > Real hardware seems to use broadcast, according to the above... >=20 > On 11/23/16 23:35, Paolo Bonzini wrote: >=20 >> On hot-plug, before the new CPU starts running, the DSDT should >> generate an SMI (with a well-known value written to 0xB2). >=20 > I'm not sure I understand right. If it is the DSDT that writes to 0xB2, > that's just another way to say, "the firmware vendor asks the operating > system to write to 0xB2". If the malicious OS is smart enough, it can > realize (from the hardware signal to run the ACPI GPE handler, IIRC) > that a new VCPU is available, and simply not trigger the SMI. >=20 >> The handler >> then: >> >> 1) waits for SMM rendezvous >> >> 2) unparks the hotplugged VCPU. Until the VCPU is unparked, it doesn'= t >> react to either INITs or of course SIPIs (this would need to be >> implemented separately for both TCG and KVM! but only in QEMU, not in >> the kernel). >=20 > Okay, this does plug the hole I sketched out above. This logic (the > QEMU-specific unparking) can be done in another platform API in OVMF I > guess (like those in SmmCpuFeaturesLib), but I wonder if we have to > provide the infrastructure in platform code up to the separate SMI > command handler. I think it again depends on those unreleased modules. >=20 >> 3) relocates SMBASE >> >> 4) records the presence of the new VCPU's APIC id for subsequent SMIs. >> >> >> The other important things are: >> >> * new CPU-hotplug controller (docs/specs/acpi_cpu_hotplug.txt) >> interfaces. I think this would only need a new bit in the write >> register at 0x4 ("CPU device control fields"). >> >> The 0x0-0x3 write register, currently reserved for reading, might >> become read/write for easier communication with the SMI handler. The >> SMI handler would write 1 to the new bit in order to unpark the VCPU. >> >> * how to enable this. I think it would need a new SMM feature, just >> like those that you are adding here, in order to make it negotiable. = If >> it is not negotiated, but broadcast SMI is negotiated, you'd need to d= o >> something such as not allowing CPU-hotplug. (This is the only part th= at >> I think is required for 2.9). >=20 > My hope is that we can work on this incrementally (or at least that we > don't forfeit our chance at SMM + VCPU hotplug) by going down the > broadcast SMI route first. Based on what Jordan wrote, this hope should > not be unfounded. >=20 > Also, the SMM stack (across components: core edk2 code, OVMF platform > code, QEMU and KVM) is now more than a year old, and we still have > stability problems (hence this series). I'd like to stabilize the base > features before getting wild with hotplug. >=20 > (The fact that OVMF lags behind SeaBIOS in a number of features is just > business as usual.) >=20 > So, I'd vastly prefer if I needed to extend this series only with a way > to prevent VCPUs from being hotplugged. I think negotiating SMI > broadcast should be good enough for that, as a hook point, given that i= t > runs in the firmware before the OS gets control (on S3 resume too). >=20 > I guess a global variable (or a board-level query function) that would > cause pc_query_hotpluggable_cpus() to return an empty list would be > frowned upon; what's the recommended method? >=20 > There is DeviceClass.hotpluggable, but that's not a dynamic property. >=20 >> >> In order to trigger the SMI, the >> >> ifctx =3D aml_if(aml_equal(ins_evt, one)); >> { >> aml_append(ifctx, >> aml_call2(CPU_NOTIFY_METHOD, cpu_data, dev_ch= k)); >> aml_append(ifctx, aml_store(one, ins_evt)); >> aml_append(ifctx, aml_store(one, has_event)); >> } >> >> would be replaced by something like this pseudo-AML: >> >> Store(And(smm_features, 2), Local1) >> ... >> Store(next_cpu_cmd, cpu_cmd) >> If (Equal(ins_evt, One)) { >> If (Greater(Local1, Zero)) { >> Store(CPU_HP_APM_CMD, apm_cmd) >> } >> CPU_NOTIFY_METHOD(cpu_data, dev_chk) >> Store(One, ins_evt) >> Store(One, has_event) >> } >> >> >> Of course all this is for OVMF only. SeaBIOS doesn't need to do >> anything of this, because it actually likes to have its SMIs only on t= he >> current CPU (and it had better be the BSP, since SMBASE is not relocat= ed >> elsewhere!). >> >> Igor, any thoughts? >> >> I understand that this is quite huge in both OVMF and QEMU, but we've >> only been delaying it and we knew about it. :( >=20 > I agree about being aware of lack of VCPU hotplug support wrt. SMM. >=20 > I disagree about delaying it. I've been aware of problems with the base > SMM features (of differing importance), for example >=20 > - that S3 resume with the Ia32 build of OVMF was almost hit-or-miss, >=20 > - or that the heavy use of UEFI variable services during Windows 8 boot > caused user-visible choppiness in the rotating animation (because of ho= w > our SMIs worked) -- I think I even mentioned this to you. >=20 > Since the stalemate in >=20 > http://lists.nongnu.org/archive/html/qemu-devel/2015-11/msg00125.html >=20 > from last November, I've never stopped disliking the lack of broadcast > SMIs, but I couldn't do anything about it, despite it leaving the base > feature set unreliable. And let's not forget about "niceties" like - host kernel commit 879ae1880449 ("KVM: x86: obey KVM_X86_QUIRK_CD_NW_CLEARED in kvm_set_cr0()"), and - kernel bugzilla , which ate up incredible amounts of time, sending a whole lot of VFIO/OVMF/GPU-assign gamers from the vfio-users list complaining on every forum they could think of. (These issues weren't related to SMM directly, but PiSmmCpuDxeSmm has a depex on gEfiMpServiceProtocolGuid, so anything that interferes with multiprocessing in CpuDxe or MpInitLib is automatically an SMM bug too.) I can think of yet more: have you gotten around looking into , which is about SMM breaking on KVM hosts that lack EPT? I believe you haven't. Which is fine. I just take issue with the notion that we've been "too lazy" to enable SMM + VCPU hotplug. We got problems that come before. Thanks Laszlo > Recently, the open source edk2 SMM stack has acquired a bunch of new > code (you're aware), which exacerbate the instabilities (as I've > documented on the edk2-devel list in excruciating detail). I was > overjoyed when you finally suggested (!) to add broadcast SMI, getting > the discussion un-stuck, because (a) it miraculously fixes everything, > and (b) honestly, I see no other way from keeping the SMM stack from > falling apart otherwise, *without* VCPU hotplug in the picture. >=20 > For the last few weeks, I've been working day and night (I'm not > exaggerating) on testing & reviewing edk2 patches related to SMM or eve= n > just multiprocessing, catching errors, reporting them, and even > debugging and fixing them myself. (See for example commits >=20 > 00650c531a90 UefiCpuPkg/MpInitLib/X64/MpFuncs.nasm: fix fatal typo > 4af3ae146379 UefiCpuPkg/LocalApicLib: fix feature test for Extended > Topology CPUID leaf > 1cbd83308989 UefiCpuPkg/MpInitLib: fix feature test for Extended > Topology CPUID leaf >=20 > ). Putting out fires more or less as a norm. >=20 > Furthermore, I haven't been pretending that VCPU hotplug "doesn't > exist", see >=20 > https://lists.01.org/pipermail/edk2-devel/2016-November/005131.html >=20 > for example, which I posted independently, ~8 hours ago. >=20 > So, no. I reject the idea that we've been procrastinating SMM enabled > VCPU hotplug. We can't build the second floor while the foundation is > shaking. And then we need to ask Intel to release more code as open sou= rce. >=20 > Thanks > Laszlo >=20 >> >> Paolo >> >>> Cc: "Kevin O'Connor" >>> Cc: "Michael S. Tsirkin" >>> Cc: Gerd Hoffmann >>> Cc: Paolo Bonzini >>> >>> Thanks >>> Laszlo >>> >>> Laszlo Ersek (3): >>> hw/isa/apm: introduce callback for APM_STS_IOPORT writes >>> hw/isa/lpc_ich9: add SMI feature negotiation via APM_STS >>> hw/isa/lpc_ich9: ICH9_APM_STS_F_BROADCAST_SMI: inject SMI on all VC= PUs >>> >>> docs/specs/q35-apm-sts.txt | 80 ++++++++++++++++++++++++++++++++++++= ++++++++++ >>> include/hw/i386/ich9.h | 9 ++++++ >>> include/hw/isa/apm.h | 9 +++--- >>> hw/acpi/piix4.c | 2 +- >>> hw/isa/apm.c | 15 ++++++--- >>> hw/isa/lpc_ich9.c | 64 +++++++++++++++++++++++++++++++++++-= - >>> hw/isa/vt82c686.c | 2 +- >>> 7 files changed, 168 insertions(+), 13 deletions(-) >>> create mode 100644 docs/specs/q35-apm-sts.txt >>> >=20