From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:58461)
  by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Z4SRr-0005g4-1b
  for qemu-devel@nongnu.org; Mon, 15 Jun 2015 07:21:04 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
  (envelope-from ) id 1Z4SRm-0000GZ-Uu
  for qemu-devel@nongnu.org; Mon, 15 Jun 2015 07:21:02 -0400
Received: from e06smtp10.uk.ibm.com ([195.75.94.106]:58831)
  by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Z4SRm-00008Y-Ee
  for qemu-devel@nongnu.org; Mon, 15 Jun 2015 07:20:58 -0400
Received: from /spool/local by e06smtp10.uk.ibm.com with IBM ESMTP SMTP Gateway:
  Authorized Use Only! Violators will be prosecuted for from ;
  Mon, 15 Jun 2015 12:10:49 +0100
Received: from b06cxnps3075.portsmouth.uk.ibm.com
  (d06relay10.portsmouth.uk.ibm.com [9.149.109.195])
  by d06dlp03.portsmouth.uk.ibm.com (Postfix) with ESMTP id 410CB1B08067
  for ; Mon, 15 Jun 2015 12:11:46 +0100 (BST)
Received: from d06av01.portsmouth.uk.ibm.com
  (d06av01.portsmouth.uk.ibm.com [9.149.37.212])
  by b06cxnps3075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0)
  with ESMTP id t5FBAkon24903890
  for ; Mon, 15 Jun 2015 11:10:46 GMT
Received: from d06av01.portsmouth.uk.ibm.com (localhost [127.0.0.1])
  by d06av01.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout)
  with ESMTP id t5FBAj80020504
  for ; Mon, 15 Jun 2015 05:10:46 -0600
Message-ID: <557EB2B5.5020602@de.ibm.com>
Date: Mon, 15 Jun 2015 13:10:45 +0200
From: Christian Borntraeger
MIME-Version: 1.0
References: <1433845144-26889-1-git-send-email-den@openvz.org>
  <1433845144-26889-2-git-send-email-den@openvz.org>
  <5576C1CF.40305@de.ibm.com> <5578274D.6070900@openvz.org>
  <20150610151113-mutt-send-email-mst@redhat.com>
  <557AC8F5.6040105@de.ibm.com>
  <20150612185256-mutt-send-email-mst@redhat.com>
  <557E7861.7070207@de.ibm.com>
  <20150615104654-mutt-send-email-mst@redhat.com>
  <557EA1E9.3070000@de.ibm.com>
  <20150615120626-mutt-send-email-mst@redhat.com>
In-Reply-To: <20150615120626-mutt-send-email-mst@redhat.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 1/1] balloon: add a feature bit to let
  Guest OS deflate balloon on oom
To: "Michael S. Tsirkin"
Cc: James.Bottomley@HansenPartnership.com, "Denis V. Lunev",
  qemu-devel@nongnu.org, Raushaniya Maksudova, Anthony Liguori

On 15.06.2015 at 12:10, Michael S. Tsirkin wrote:
> On Mon, Jun 15, 2015 at 11:59:05AM +0200, Christian Borntraeger wrote:
>> On 15.06.2015 at 11:06, Michael S. Tsirkin wrote:
>>
>>>>> AFAIK management tools depend on balloon not deflating
>>>>> below host-specified threshold to avoid OOM on the host.
>>>>> So I don't think we can make this a default,
>>>>> management needs to enable this explicitly.
>>>>
>>>> If the ballooning is required to keep the host memory management
>>>> from OOM - iow abusing ballooning as memory hotplug between guests
>>>> then yes better let the guest oom - that makes sense.
>>>>
>>>> Now: I think that doing so (not having enough swap in the host if
>>>> all guests deflate) and relying on balloon semantics is fundamentally
>>>> broken. Let me explain this: The problem is that we rely on guest
>>>> cooperation for the host integrity. As I explained using madvise
>>>> MADV_DONTNEED will replace the current PTEs with invalid/empty PTEs.
>>>> As soon as the guest kernel re-touches the page (e.g. a malicious
>>>> kernel module - not the balloon driver) it will be backed by the VMA's
>>>> default method - so usually with a shared R/O copy of the empty
>>>> zero page. Write accesses will result in a copy-on-write and allocate
>>>> new memory in the host.
>>>> There is nothing we can do in the balloon protocol to protect the host
>>>> against malicious guests allocating all the maximum memory.
>>>
>>> If we want to try and harden host, we can unmap it so guest will crash
>>> if it touches pages without deflate.
>>>
>>>> If you need host integrity against guest memory usage, something like
>>>> cgroups_memory or so is probably the only reliable way.
>>>
>>> In the original design, protection against a malicious guest is not the
>>> point of the balloon, it's a technology that lets you overcommit
>>> cooperative guests.
>>
>> Sure. But then it's perfectly fine to let the guest reclaim by default,
>> because your statement
>> "AFAIK management tools depend on balloon not deflating
>> below host-specified threshold to avoid OOM on the host.
>> So I don't think we can make this a default,
>> management needs to enable this explicitly."
>>
>> is not true ;-)
>>
>> Christian
>
> I don't see the connection.
> Deflate on OOM means "it's OK to deflate if you like, this won't
> cause any harm". If you set this flag, you can't overcommit host too
> aggressively even with cooperative guests.

The connection is that no fundamental issue is being solved that
requires the setting to be either off or on (both are OK, each with
pros and cons).

Keeping reclaim on OOM off has, of course, the lowest impact, as it is
just today's behaviour - so we play it safe. I can personally live with
either default, and in the end it's up to you to decide as virtio
maintainer. (My personal opinion, to have deflate on OOM=yes and
MUST_TELL_HOST=off, still stands, though.)

Just keep in mind that you are adding an interface that we will drag
along forever, and that we might require all future users to add a
statement to libvirt to do the right thing. So we had better make up
our minds about which default value has the bigger downsides. If that
decision-making process concludes that we should do it like this
patch - fine with me.

Christian
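
For reference, the madvise behaviour described in the quoted text can be
observed from userspace. The following is only a minimal sketch, assuming
Linux and Python 3.8 or later; the function name dontneed_roundtrip is
hypothetical, and it models a single process with an anonymous private
mapping, not QEMU or the balloon driver. It shows that after the pages are
dropped, nothing stops the owner from touching them again: reads see
zero-filled memory and a write simply gets a fresh page.

```python
import mmap


def dontneed_roundtrip(npages=4):
    """Dirty anonymous memory, drop it with MADV_DONTNEED, then touch it again.

    Returns the first 8 bytes as seen before and after the madvise call.
    Assumes Linux semantics for MADV_DONTNEED on private anonymous mappings.
    """
    size = npages * mmap.PAGESIZE
    buf = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    try:
        # Dirty every page so it is backed by real memory.
        buf[:] = b"\xaa" * size
        before = buf[:8]

        # Comparable to giving the pages back: the kernel may free them
        # and invalidate the page table entries.
        buf.madvise(mmap.MADV_DONTNEED)

        # Re-touching needs no cooperation from anyone: the read is
        # served with zero-filled memory ...
        after = buf[:8]

        # ... and a write fault allocates a new page again.
        buf[0:1] = b"\x01"
        return before, after
    finally:
        buf.close()


if __name__ == "__main__":
    before, after = dontneed_roundtrip()
    print("before:", before.hex())
    print("after: ", after.hex())
```

The mapping stays valid after the madvise call, which is why a limit
enforced outside the guest (such as a memory cgroup, as mentioned in the
quoted text) is needed if the host has to be protected.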