From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([209.51.188.92]:42184) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hErKT-0008Su-K6 for qemu-devel@nongnu.org; Fri, 12 Apr 2019 04:14:34 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hErKS-0004zw-9q for qemu-devel@nongnu.org; Fri, 12 Apr 2019 04:14:33 -0400 Date: Fri, 12 Apr 2019 10:14:18 +0200 From: Igor Mammedov Message-ID: <20190412101418.616f1b30@redhat.com> In-Reply-To: <20190412054407.GA5035@richard> References: <20190313044253.31988-1-richardw.yang@linux.intel.com> <20190313044253.31988-4-richardw.yang@linux.intel.com> <20190313132300.3f56a5ca@redhat.com> <20190313133359.h2njohyijgvkcbtv@master> <20190313170943.5384f5cf@redhat.com> <20190402035343.GA6527@richard> <20190402081512.6194a58f@redhat.com> <20190405085529.GA24071@richard> <20190409165415.2e5df0c6@redhat.com> <20190412054407.GA5035@richard> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [RFC PATCH 3/3] hw/acpi: Extract build_mcfg List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Wei Yang Cc: yang.zhong@intel.com, peter.maydell@linaro.org, mst@redhat.com, qemu-devel@nongnu.org, Wei Yang , shannon.zhaosl@gmail.com, wei.w.wang@intel.com, qemu-arm@nongnu.org On Fri, 12 Apr 2019 13:44:07 +0800 Wei Yang wrote: > On Tue, Apr 09, 2019 at 04:54:15PM +0200, Igor Mammedov wrote: > >> > >> Let's see a normal hotplug case first. > >> > >> I did one test to see the how a guest with hot-plug cpu migrate to > >> destination. It looks the migration fails if I just do hot-plug at > >> source. So I have to do hot-plug both at source and at destination. This > >> will expand the table_mr both at source and destination. > >> > >> Then let's see the effect of hotplug more devices to exceed original padded > >> size. There are two cases, before re-sizable MemoryRegion and after. > >> > >> 1) Before re-sizable MemoryRegion introduced > >> > >> Before re-sizable MemoryRegion introduced, we just pad table_mr to 4K. And > >> this size never gets bigger, if I am right. To be accurate, the table_blob > >> would grow to next padded size if we hot-add more cpus/pci bridge, but we > >> just copy the original size of MemoryRegion. Even without migration, the > >> ACPI table is corrupted when we expand to next padded size. > >> > >> Is my understanding correct here? > >> > >> 2) After re-sizable MemoryRegion introduced > >> > >> This time both tabl_blob and MemoryRegion grows when it expand to next > >> padded size. Since we need to hot-add device at both side, ACPI table > >> grows at the same pace. > >> > >> Every thing looks good, until one of it exceed the resizable > >> MemoryRegion's max size. (Not sure this is possible in reality, while > >> possible in theory). Actually, this looks like case 1) when resizable > >> MemoryRegion is not introduced. The too big ACPI table get corrupted. > >> > >> So if my understanding is correct, the procedure you mentioned "expand from > >> initial padded size to next padded size" only applies to two different max > >> size resizable MemoryRegion. For other cases, the procedure corrupt the ACPI > >> table itself. > >> > >> Then when we look at > >> > >> commit 07fb61760cdea7c3f1b9c897513986945bca8e89 > >> Author: Paolo Bonzini > >> Date: Mon Jul 28 17:34:15 2014 +0200 > >> > >> pc: hack for migration compatibility from QEMU 2.0 > >> > >> This fix ACPI migration issue before resizable MemoryRegion is > >> introduced(introduced in 2015-01-08). This looks expand to next padded size > >> always corrupt ACPI table at that time. And it make me think expand to next > >> padded size is not the procedure we should do? > >> > >> And my colleague Wei Wang(in cc) mentioned, to make migration succeed, the > >> MemoryRegion has to be the same size at both side. So I guess the problem > >> doesn't lie in hotplug but in "main table" size difference. > > > >It's true only for pre-resizable MemoryRegion QEMU versions, > >after that size doesn't affect migration anymore. > > > > > >> For example, we have two version of Qemu: v1 and v2. Their "main table" size > >> is: > >> > >> v1: 3990 > >> v2: 4020 > >> > >> At this point, their ACPI table all padded to 4k, which is the same. > >> > >> Then we create a machine with 1 more vcpu by these two versions. This will > >> expand the table to: > >> > >> v1: 4095 > >> v2: 4125 > >> > >> After padding, v1's ACPI table size is still 4k but v2's is 8k. Now the > >> migration is broken. > >> > >> If this analysis is correct, the relationship between migration failure and > >> ACPI table is "the change of ACPI table size". Any size change of any > >you should make distinction between used_length and max_length here. > >Migration puts on wire used_length and that's what matter for keeping migration > >working. > > > >> ACPI table would break migration. While of course, since we pad the table, > >> only some combinations of tables would result in a visible real size change in > >> MemoryRegion. > >> > >> Then the principle for future ACPI development is to keep all ACPI table size > >> unchanged. > >once again it applies only for QEMU (versions < 2.1) and that was > >the problem (i.e. there always would be configurations that would create > >differently sized tables regardless of arbitrary size we would preallocate) > >resizable MemoryRegions solved. > > > >> Now let's back to mcfg table. As the comment mentioned, guest could > >> enable/disable MCFG, so the code here reserve table no matter it is enabled or > >> not. This behavior ensures ACPI table size not changed. So do we need to find > >> the machine type as you suggested before? > >We should be able to drop mcgf 'padding' hack since machine version > >which was introduced in the QEMU version that introduced resizable MemoryRegion > >as well. > > > >I'll send a patch to address that > > Hi, Igor, > > We have found the qemu version 2.1 which is with resizable MemoryRegion > enabled and q35 will stop support version before 2.3. The concern about ACPI > mcfg table breaking live migration is solved, right? > > If so, I would prepare mcfg refactor patch based on your cleanup. yes, just base your patches on top of "[PATCH for-4.1] q35: acpi: do not create dummy MCFG table" From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-0.9 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2F5C9C10F14 for ; Fri, 12 Apr 2019 08:15:43 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E63F22083E for ; Fri, 12 Apr 2019 08:15:42 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E63F22083E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([127.0.0.1]:60679 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hErLa-0000ht-29 for qemu-devel@archiver.kernel.org; Fri, 12 Apr 2019 04:15:42 -0400 Received: from eggs.gnu.org ([209.51.188.92]:42184) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hErKT-0008Su-K6 for qemu-devel@nongnu.org; Fri, 12 Apr 2019 04:14:34 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hErKS-0004zw-9q for qemu-devel@nongnu.org; Fri, 12 Apr 2019 04:14:33 -0400 Received: from mx1.redhat.com ([209.132.183.28]:48800) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hErKP-0004xU-3w; Fri, 12 Apr 2019 04:14:29 -0400 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id EFCB558569; Fri, 12 Apr 2019 08:14:26 +0000 (UTC) Received: from localhost (unknown [10.43.2.182]) by smtp.corp.redhat.com (Postfix) with ESMTP id 82156183DF; Fri, 12 Apr 2019 08:14:22 +0000 (UTC) Date: Fri, 12 Apr 2019 10:14:18 +0200 From: Igor Mammedov To: Wei Yang Message-ID: <20190412101418.616f1b30@redhat.com> In-Reply-To: <20190412054407.GA5035@richard> References: <20190313044253.31988-1-richardw.yang@linux.intel.com> <20190313044253.31988-4-richardw.yang@linux.intel.com> <20190313132300.3f56a5ca@redhat.com> <20190313133359.h2njohyijgvkcbtv@master> <20190313170943.5384f5cf@redhat.com> <20190402035343.GA6527@richard> <20190402081512.6194a58f@redhat.com> <20190405085529.GA24071@richard> <20190409165415.2e5df0c6@redhat.com> <20190412054407.GA5035@richard> MIME-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Fri, 12 Apr 2019 08:14:27 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: Re: [Qemu-devel] [RFC PATCH 3/3] hw/acpi: Extract build_mcfg X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: yang.zhong@intel.com, peter.maydell@linaro.org, mst@redhat.com, qemu-devel@nongnu.org, Wei Yang , shannon.zhaosl@gmail.com, wei.w.wang@intel.com, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Message-ID: <20190412081418.jhSVUoKyZwin057lxQnWomBpahYV20Vxnbv0taUijEo@z> On Fri, 12 Apr 2019 13:44:07 +0800 Wei Yang wrote: > On Tue, Apr 09, 2019 at 04:54:15PM +0200, Igor Mammedov wrote: > >> > >> Let's see a normal hotplug case first. > >> > >> I did one test to see the how a guest with hot-plug cpu migrate to > >> destination. It looks the migration fails if I just do hot-plug at > >> source. So I have to do hot-plug both at source and at destination. This > >> will expand the table_mr both at source and destination. > >> > >> Then let's see the effect of hotplug more devices to exceed original padded > >> size. There are two cases, before re-sizable MemoryRegion and after. > >> > >> 1) Before re-sizable MemoryRegion introduced > >> > >> Before re-sizable MemoryRegion introduced, we just pad table_mr to 4K. And > >> this size never gets bigger, if I am right. To be accurate, the table_blob > >> would grow to next padded size if we hot-add more cpus/pci bridge, but we > >> just copy the original size of MemoryRegion. Even without migration, the > >> ACPI table is corrupted when we expand to next padded size. > >> > >> Is my understanding correct here? > >> > >> 2) After re-sizable MemoryRegion introduced > >> > >> This time both tabl_blob and MemoryRegion grows when it expand to next > >> padded size. Since we need to hot-add device at both side, ACPI table > >> grows at the same pace. > >> > >> Every thing looks good, until one of it exceed the resizable > >> MemoryRegion's max size. (Not sure this is possible in reality, while > >> possible in theory). Actually, this looks like case 1) when resizable > >> MemoryRegion is not introduced. The too big ACPI table get corrupted. > >> > >> So if my understanding is correct, the procedure you mentioned "expand from > >> initial padded size to next padded size" only applies to two different max > >> size resizable MemoryRegion. For other cases, the procedure corrupt the ACPI > >> table itself. > >> > >> Then when we look at > >> > >> commit 07fb61760cdea7c3f1b9c897513986945bca8e89 > >> Author: Paolo Bonzini > >> Date: Mon Jul 28 17:34:15 2014 +0200 > >> > >> pc: hack for migration compatibility from QEMU 2.0 > >> > >> This fix ACPI migration issue before resizable MemoryRegion is > >> introduced(introduced in 2015-01-08). This looks expand to next padded size > >> always corrupt ACPI table at that time. And it make me think expand to next > >> padded size is not the procedure we should do? > >> > >> And my colleague Wei Wang(in cc) mentioned, to make migration succeed, the > >> MemoryRegion has to be the same size at both side. So I guess the problem > >> doesn't lie in hotplug but in "main table" size difference. > > > >It's true only for pre-resizable MemoryRegion QEMU versions, > >after that size doesn't affect migration anymore. > > > > > >> For example, we have two version of Qemu: v1 and v2. Their "main table" size > >> is: > >> > >> v1: 3990 > >> v2: 4020 > >> > >> At this point, their ACPI table all padded to 4k, which is the same. > >> > >> Then we create a machine with 1 more vcpu by these two versions. This will > >> expand the table to: > >> > >> v1: 4095 > >> v2: 4125 > >> > >> After padding, v1's ACPI table size is still 4k but v2's is 8k. Now the > >> migration is broken. > >> > >> If this analysis is correct, the relationship between migration failure and > >> ACPI table is "the change of ACPI table size". Any size change of any > >you should make distinction between used_length and max_length here. > >Migration puts on wire used_length and that's what matter for keeping migration > >working. > > > >> ACPI table would break migration. While of course, since we pad the table, > >> only some combinations of tables would result in a visible real size change in > >> MemoryRegion. > >> > >> Then the principle for future ACPI development is to keep all ACPI table size > >> unchanged. > >once again it applies only for QEMU (versions < 2.1) and that was > >the problem (i.e. there always would be configurations that would create > >differently sized tables regardless of arbitrary size we would preallocate) > >resizable MemoryRegions solved. > > > >> Now let's back to mcfg table. As the comment mentioned, guest could > >> enable/disable MCFG, so the code here reserve table no matter it is enabled or > >> not. This behavior ensures ACPI table size not changed. So do we need to find > >> the machine type as you suggested before? > >We should be able to drop mcgf 'padding' hack since machine version > >which was introduced in the QEMU version that introduced resizable MemoryRegion > >as well. > > > >I'll send a patch to address that > > Hi, Igor, > > We have found the qemu version 2.1 which is with resizable MemoryRegion > enabled and q35 will stop support version before 2.3. The concern about ACPI > mcfg table breaking live migration is solved, right? > > If so, I would prepare mcfg refactor patch based on your cleanup. yes, just base your patches on top of "[PATCH for-4.1] q35: acpi: do not create dummy MCFG table"