From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 1 Mar 2019 18:01:52 +0000
From: "Dr. David Alan Gilbert"
To: Igor Mammedov
Cc: peter.maydell@linaro.org, "Daniel P. Berrangé", ehabkost@redhat.com,
 libvir-list@redhat.com, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
 qemu-ppc@nongnu.org, pbonzini@redhat.com, david@gibson.dropbear.id.au
Subject: Re: [Qemu-arm] [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option
Message-ID: <20190301180151.GE2851@work-vm>
In-Reply-To: <20190301183328.20b63e23@redhat.com>
References: <1551454936-205218-1-git-send-email-imammedo@redhat.com>
 <1551454936-205218-2-git-send-email-imammedo@redhat.com>
 <20190301154947.GJ21251@redhat.com>
 <20190301183328.20b63e23@redhat.com>

* Igor Mammedov (imammedo@redhat.com) wrote:
> On Fri, 1 Mar 2019 15:49:47 +0000
> Daniel P. Berrangé wrote:
>
> > On Fri, Mar 01, 2019 at 04:42:15PM +0100, Igor Mammedov wrote:
> > > The parameter allows configuring a fake NUMA topology, where the guest
> > > VM simulates a NUMA topology but does not actually get any performance
> > > benefit from it. The same or better results could be achieved
> > > using the 'memdev' parameter.
> > > In light of that, any VM that uses NUMA
> > > to get its benefits should use 'memdev'. To allow the transition of
> > > initial RAM to a device-based model, deprecate the 'mem' parameter, as
> > > its ad-hoc partitioning of the initial RAM MemoryRegion can't be
> > > translated to a memdev-based backend transparently to users and in a
> > > compatible manner (migration-wise).
> > >
> > > That will also allow us to clean up our numa code a bit, leaving only
> > > the 'memdev' impl. in place and several boards that use node_mem
> > > to generate the FDT/ACPI description from it.
> >
> > Can you confirm that the 'mem' and 'memdev' parameters to -numa
> > are 100% live migration compatible in both directions? Libvirt
> > would need this to be the case in order to use the 'memdev' syntax
> > instead.
> Unfortunately they are not migration compatible in either direction;
> if it were possible to translate them to each other I'd alias 'mem'
> to 'memdev' without deprecation. The former sends over only one
> MemoryRegion to the target, while the latter sends over several (one per
> memdev).
>
> The mixed memory issue[1] first came from the libvirt side (RHBZ1624223);
> back then it was resolved on the libvirt side in favor of migration
> compatibility vs correctness (i.e. bind policy doesn't work as expected).
> What's worse is that it was made the default and affects all new machines,
> as I understood it.
>
> In the case of -mem-path + -mem-prealloc (with 1 numa node or numa-less)
> it's possible on the QEMU side to make the conversion to memdev in a
> migration-compatible way (that's what stopped Michal from the memdev
> approach). But it's hard to do so in the multi-node case, as the number
> of MemoryRegions is different.
>
> The point is to consider 'mem' a misconfiguration error, as the user
> is using a broken numa configuration in the first place
> (i.e. a fake numa configuration doesn't actually improve performance).
>
> CCed David, maybe he could offer a way to do 1:n migration and the other
> way around.

I can't see a trivial way.
About the easiest I can think of is if you had a way to create a
memdev that was an alias to pc.ram (of a particular size and offset).

Dave

>
> > > Signed-off-by: Igor Mammedov
> > > ---
> > >  numa.c               |  2 ++
> > >  qemu-deprecated.texi | 14 ++++++++++++++
> > >  2 files changed, 16 insertions(+)
> > >
> > > diff --git a/numa.c b/numa.c
> > > index 3875e1e..2205773 100644
> > > --- a/numa.c
> > > +++ b/numa.c
> > > @@ -121,6 +121,8 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
> > >
> > >      if (node->has_mem) {
> > >          numa_info[nodenr].node_mem = node->mem;
> > > +        warn_report("Parameter -numa node,mem is deprecated,"
> > > +                    " use -numa node,memdev instead");
> > >      }
> > >      if (node->has_memdev) {
> > >          Object *o;
> > > diff --git a/qemu-deprecated.texi b/qemu-deprecated.texi
> > > index 45c5795..73f99d4 100644
> > > --- a/qemu-deprecated.texi
> > > +++ b/qemu-deprecated.texi
> > > @@ -60,6 +60,20 @@ Support for invalid topologies will be removed, the user must ensure
> > >  topologies described with -smp include all possible cpus, i.e.
> > >    @math{@var{sockets} * @var{cores} * @var{threads} = @var{maxcpus}}.
> > >
> > > +@subsection -numa node,mem=@var{size} (since 4.0)
> > > +
> > > +The parameter @option{mem} of @option{-numa node} is used to assign a part of
> > > +guest RAM to a NUMA node. But when using it, it's impossible to manage specified
> > > +size on the host side (like bind it to a host node, setting bind policy, ...),
> > > +so guest end-ups with the fake NUMA configuration with suboptiomal performance.
> > > +However since 2014 there is an alternative way to assign RAM to a NUMA node
> > > +using parameter @option{memdev}, which does the same as @option{mem} and has
> > > +an ability to actualy manage node RAM on the host side. Use parameter
> > > +@option{memdev} with @var{memory-backend-ram} backend as an replacement for
> > > +parameter @option{mem} to achieve the same fake NUMA effect or a properly
> > > +configured @var{memory-backend-file} backend to actually benefit from NUMA
> > > +configuration.
> > > +
> > >  @section QEMU Machine Protocol (QMP) commands
> > >
> > >  @subsection block-dirty-bitmap-add "autoload" parameter (since 2.12.0)
> > > --
> > > 2.7.4
> > >
> > > --
> > > libvir-list mailing list
> > > libvir-list@redhat.com
> > > https://www.redhat.com/mailman/listinfo/libvir-list
> >
> > Regards,
> > Daniel
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
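
To make the incompatibility discussed in this thread concrete, here is a
minimal Python sketch. It is not taken from the QEMU tree and is only a
rough model: it builds the deprecated '-numa node,mem=' command line and
the '-object memory-backend-ram' plus '-numa node,memdev=' replacement for
the same node sizes, then lists the RAM regions each form would present to
migration. The function names, the "ram-node<N>" backend ids, and the use
of the backend id as the region name are assumptions made for illustration;
"pc.ram" is the initial RAM region name mentioned above.

```python
# Illustrative model only. Helper names and the "ram-node<N>" ids are
# hypothetical; real QEMU names its RAM regions differently in detail.

def legacy_numa_args(node_sizes_mb):
    """Deprecated form: each node takes a slice of the single initial RAM region."""
    args = ["-m", f"{sum(node_sizes_mb)}M"]
    for nodeid, size in enumerate(node_sizes_mb):
        args += ["-numa", f"node,nodeid={nodeid},mem={size}M"]
    return args


def memdev_numa_args(node_sizes_mb):
    """Replacement form: one memory-backend-ram object per node, bound via memdev=."""
    args = ["-m", f"{sum(node_sizes_mb)}M"]
    for nodeid, size in enumerate(node_sizes_mb):
        backend_id = f"ram-node{nodeid}"  # assumed id, any unique name would do
        args += ["-object", f"memory-backend-ram,id={backend_id},size={size}M"]
        args += ["-numa", f"node,nodeid={nodeid},memdev={backend_id}"]
    return args


def ram_blocks(args):
    """Rough model of the RAM regions a migration stream would carry."""
    # Pair every argument with its successor and keep the values of -object.
    objects = [value for flag, value in zip(args, args[1:]) if flag == "-object"]
    if not objects:
        # 'mem' case: a single region covering all of initial RAM.
        return {"pc.ram": args[args.index("-m") + 1]}
    blocks = {}
    for spec in objects:
        # Skip the backend type, then parse the key=value properties.
        fields = dict(item.split("=", 1) for item in spec.split(",")[1:])
        blocks[fields["id"]] = fields["size"]
    return blocks


if __name__ == "__main__":
    sizes = [2048, 2048]
    print(" ".join(legacy_numa_args(sizes)))
    print(" ".join(memdev_numa_args(sizes)))
    print(ram_blocks(legacy_numa_args(sizes)))
    print(ram_blocks(memdev_numa_args(sizes)))
```

In this model the 'mem' form yields one region and the 'memdev' form yields
one region per node, so the two sides of a migration would not agree on
what to transfer. That mismatch is why an alias of pc.ram at a given offset
and size is floated above as the least invasive bridge.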