Subject: Re: [Qemu-devel] [Qemu-ppc] [PATCH] hw/ppc: disable hotplug before CAS is completed
From: Daniel Henrique Barboza
Date: Thu, 17 Aug 2017 18:31:28 -0300
To: David Gibson
Cc: lvivier@redhat.com, qemu-ppc@nongnu.org, qemu-devel@nongnu.org, mdroth@linux.vnet.ibm.com
In-Reply-To: <20170817075236.GI5509@umbus.fritz.box>
References: <20170815202846.24749-1-danielhb@linux.vnet.ibm.com> <20170817075236.GI5509@umbus.fritz.box>

On 08/17/2017 04:52 AM, David Gibson wrote:
> On Tue, Aug 15, 2017 at 05:28:46PM -0300, Daniel Henrique Barboza wrote:
>> This patch is a follow-up on the discussions that started with
>> Laurent's patch series "spapr: disable hotplugging without OS" [1]
>> and the discussion at patch "spapr: reset DRCs on migration
>> pre_load" [2].
>>
>> At this moment, we do not support CPU/memory hotplug in early
>> boot stages, before CAS. The reason is that the hotplug event
>> can't be handled at SLOF level (or even in the PRELAUNCH runstate) and
>> at the same time can't be canceled. This leads to devices that
>> cannot be hot-unplugged and, in some cases, to a guest kernel Oops.
>> After CAS, with the FDT in place, the guest can handle the hotplug
>> events and everything works as usual.
>>
>> An attempt was made to support hotplug before CAS, but it was not
>> successful. The key difference in the current code flow between a
>> coldplugged and a hotplugged device, in the PRELAUNCH state, is that
>> the coldplugged device is registered in the base FDT, allowing its
>> DRC to go straight to the CONFIGURED state. In theory, this can also be
>> done with a hotplugged device if we can add it to the base of the
>> existing FDT. However, tampering with the FDT after it has been
>> written to guest memory, besides being a dubious idea, is also not
>> possible: the FDT is written in ppc_spapr_reset and there is no
>> way to retrieve it - we can calculate the fdt_address but the
>> fdt_size isn't stored. Storing the fdt_size to allow for
>> later retrieval is yet another piece of state that would need to be
>> migrated. In short, it is not worth the trouble.
>>
>> All this said, this patch opted to disable CPU/mem hotplug at early
>> boot stages. CAS detection is done by checking whether there are
>> any bits set in ov5_cas, to avoid adding extra state that
>> would need tracking/migration. The patch also makes sure that
>> it doesn't interfere with hotplug in the INMIGRATE state.
>>
>> [1] https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg05226.html
>> [2] https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg01989.html
>>
>> Signed-off-by: Daniel Henrique Barboza
> I don't think this is a good idea.
>
> 1) After my DRC cleanups, early hotplug works just fine for me. I'm
> not sure why it isn't for you: we need to understand that before
> proceeding.
>
> 2) libvirt actually uses early hotplug fairly often (before even
> starting the firmware). At the moment this works - at least in some
> cases (see above), though there are some wrinkles to work out. This
> will break it completely and require an entirely different approach to
> fix again.

Now that you mention it, I remember having this same discussion with you,
about the same topic. Back then we decided to leave it alone, since you
couldn't reproduce the behavior but I could.

I can still reproduce this bug and ended up investigating a bit more today:

- one difference in QEMU between hotplugging before and after CAS is here:

hw/ppc/spapr_events.c - rtas_event_log_to_source

    switch (log_type) {
    case RTAS_LOG_TYPE_HOTPLUG:
        source = spapr_event_sources_get_source(spapr->event_sources,
                                                EVENT_CLASS_HOT_PLUG);
        if (spapr_ovec_test(spapr->ov5_cas, OV5_HP_EVT)) {
            g_assert(source->enabled);
            break;
        }
        /* fall back to epow for legacy hotplug interrupt source */
    case RTAS_LOG_TYPE_EPOW:
        source = spapr_event_sources_get_source(spapr->event_sources,
                                                EVENT_CLASS_EPOW);
        break;
    default:

Note the ovec_test for OV5_HP_EVT. When hotplugging a CPU in early boot,
ov5_cas doesn't have anything set, which makes this check fail, and due to
the position of the 'break' there (which I believe is intended), the event
falls back to being logged as EPOW instead of HOT_PLUG.

I tried to hack this code by adding another break to ensure that the event
got logged as HOT_PLUG (as happens post-CAS), but then I got a kernel panic
at boot (a rough sketch of what I tried is below, after the SLOF output).
So I am not sure whether this code needs any change or further thought.

- hotplugging the CPU at an early stage gives me a warning message in SLOF:

--------------
Calling ibm,client-architecture-support...Node not supported
Node not supported
not implemented
memory layout at init:
---------------

The code that gives the 'Node not supported' message is related to the
fdt-create-cas-node function of board-qemu/slof/fdt.fs. That code looks
for either "memory@" or "ibm,dynamic-reconfiguration-memory" nodes, and
gives this error when it finds a CPU node.
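For reference, the hack I tried in rtas_event_log_to_source was roughly
the following (a sketch of the idea from memory, not the exact diff I
tested):

    switch (log_type) {
    case RTAS_LOG_TYPE_HOTPLUG:
        source = spapr_event_sources_get_source(spapr->event_sources,
                                                EVENT_CLASS_HOT_PLUG);
        /* hack: always log the event as HOT_PLUG, never fall back to
         * EPOW, and skip the g_assert(source->enabled), which doesn't
         * hold before CAS */
        break;
    case RTAS_LOG_TYPE_EPOW:
        source = spapr_event_sources_get_source(spapr->event_sources,
                                                EVENT_CLASS_EPOW);
        break;
    default:
        /* rest of the switch left as it is */
        break;
    }

With that in place the event does get logged as HOT_PLUG before CAS, but,
as said above, the guest then panics at boot.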
- if I hotplug another CPU after the guest completes the boot, the
previously added CPU suddenly comes online too:

[ started VM with -S ]

(qemu) device_add host-spapr-cpu-core,id=core1,core-id=1
(qemu) cont

[ guest finishes boot ]

danielhb@ubuntu1710:~$ lscpu
Architecture:          ppc64le
Byte Order:            Little Endian
CPU(s):                1
On-line CPU(s) list:   0
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             1
NUMA node(s):          1
Model:                 2.1 (pvr 004b 0201)
Model name:            POWER8E (raw), altivec supported
Hypervisor vendor:     KVM
Virtualization type:   para
L1d cache:             64K
L1i cache:             32K
NUMA node0 CPU(s):     0
danielhb@ubuntu1710:~$

(qemu) info cpus
* CPU #0: nip=0xc0000000000a464c thread_id=131946
  CPU #1: nip=0x0000000000000000 (halted) thread_id=131954
(qemu)
(qemu) device_add host-spapr-cpu-core,id=core2,core-id=2
(qemu)
(qemu) info cpus
* CPU #0: nip=0xc0000000000a464c thread_id=131946
  CPU #1: nip=0xc0000000000a464c thread_id=131954
  CPU #2: nip=0xc0000000000a464c thread_id=132144
(qemu)

danielhb@ubuntu1710:~$ lscpu
Architecture:          ppc64le
Byte Order:            Little Endian
CPU(s):                3
On-line CPU(s) list:   0-2
Thread(s) per core:    1
Core(s) per socket:    3
Socket(s):             1
NUMA node(s):          1
Model:                 2.1 (pvr 004b 0201)
Model name:            POWER8E (raw), altivec supported
Hypervisor vendor:     KVM
Virtualization type:   para
L1d cache:             64K
L1i cache:             32K
NUMA node0 CPU(s):     0-2
danielhb@ubuntu1710:~$

This makes me believe that the issue is that the guest isn't aware of the
CPU's presence, which makes me wonder whether this has something to do with
the qemu_irq_pulse at the end of spapr_hotplug_req_event being lost (see
the note at the end of this mail for what I mean by the pulse being lost).
On the second hotplug, we re-assert the IRQ at the end of check-exception
and the guest becomes aware of the queued hotplug event that was ignored
at first.


Daniel

>
> 3) There's no fundamental reason early hotplug shouldn't work - the
> event will just be queued until the OS boots and processes it.
>
> I know I suggested disabling early hotplug earlier, but that was
> before I'd dug into the DRC layer and properly understood what was
> going on here.
>
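About the "pulse being lost" theory above: if I'm reading include/hw/irq.h
right, qemu_irq_pulse() is just a raise immediately followed by a lower,
so if nothing latches the interrupt at that exact moment there is nothing
left to notice later:

    /* include/hw/irq.h: a pulse is raise + lower, back to back */
    static inline void qemu_irq_pulse(qemu_irq irq)
    {
        qemu_set_irq(irq, 1);
        qemu_set_irq(irq, 0);
    }

The hotplug event itself stays queued, which would explain why the
check-exception triggered by the second hotplug still finds it.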