From mboxrd@z Thu Jan 1 00:00:00 1970
From: Christian Borntraeger
Subject: Re: [PATCH] KVM: s390: remove delayed reallocation of page tables for KVM
Date: Mon, 27 Apr 2015 16:08:19 +0200
Message-ID: <553E42D3.9040508@de.ibm.com>
References: <1429787297-9292-1-git-send-email-borntraeger@de.ibm.com>
 <1429787297-9292-2-git-send-email-borntraeger@de.ibm.com>
 <2E97DEE6-47EA-484A-9F02-9F031DCA8F36@suse.de>
 <5538DAD4.4060505@de.ibm.com>
 <20150423141309.7c500236@mschwide>
 <553E3E3A.9010107@suse.de>
 <20150427155745.64729393@mschwide>
 <553E41AE.10604@suse.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Cc: Paolo Bonzini, KVM, Cornelia Huck, Jens Freimann
To: Alexander Graf, Martin Schwidefsky
Return-path:
Received: from e06smtp17.uk.ibm.com ([195.75.94.113]:49528 "EHLO
 e06smtp17.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S932828AbbD0OIZ (ORCPT ); Mon, 27 Apr 2015 10:08:25 -0400
Received: from /spool/local by e06smtp17.uk.ibm.com with IBM ESMTP SMTP
 Gateway: Authorized Use Only! Violators will be prosecuted for from ;
 Mon, 27 Apr 2015 15:08:22 +0100
Received: from b06cxnps4075.portsmouth.uk.ibm.com
 (d06relay12.portsmouth.uk.ibm.com [9.149.109.197]) by
 d06dlp01.portsmouth.uk.ibm.com (Postfix) with ESMTP id 7B66B17D805D
 for ; Mon, 27 Apr 2015 15:09:02 +0100 (BST)
Received: from d06av12.portsmouth.uk.ibm.com
 (d06av12.portsmouth.uk.ibm.com [9.149.37.247]) by
 b06cxnps4075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with
 ESMTP id t3RE8LXu50462942 for ; Mon, 27 Apr 2015 14:08:21 GMT
Received: from d06av12.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by
 d06av12.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with
 ESMTP id t3RE8JMh009562 for ; Mon, 27 Apr 2015 08:08:21 -0600
In-Reply-To: <553E41AE.10604@suse.de>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 27.04.2015 at 16:03, Alexander Graf wrote:
> On 04/27/2015 03:57 PM, Martin Schwidefsky wrote:
>> On Mon, 27 Apr 2015 15:48:42 +0200
>> Alexander Graf wrote:
>>
>>> On 04/23/2015 02:13 PM, Martin Schwidefsky wrote:
>>>> On Thu, 23 Apr 2015 14:01:23 +0200
>>>> Alexander Graf wrote:
>>>>
>>>>> As far as alternative approaches go, I don't have a great idea otoh.
>>>>> We could have an elf flag indicating that this process needs 4k page
>>>>> tables to limit the impact to a single process. In fact, could we
>>>>> maybe still limit the scope to non-global? A personality may work
>>>>> as well. Or ulimit?
>>>> I tried the ELF flag approach, does not work. The trouble is that
>>>> allocate_mm() has to create the page tables with 4K tables if you
>>>> want to change the page table layout later on. We have learned the
>>>> hard way that the direction 2K to 4K does not work due to races
>>>> in the mm.
>>>>
>>>> Now there are two major cases: 1) fork + execve and 2) fork only.
>>>> The ELF flag can be used to reduce from 4K to 2K for 1) but not 2).
>>>> 2) is required for apps that use lots of forking, e.g. database or
>>>> web servers. Same goes for the approach with a personality flag or
>>>> ulimit.
>>>>
>>>> We would have to distinguish the two cases for allocate_mm(),
>>>> if the new mm is allocated for a fork the current mm decides
>>>> 2K vs. 4K. If the new mm is allocated by binfmt_elf, then start
>>>> with 4K and do the downgrade after the ELF flag has been evaluated.
>>> Well, you could also make it a personality flag for example, no? Then
>>> every new process below a certain one always gets 4k page tables until
>>> they drop the personality, at which point each child would only get 2k
>>> page tables again.
>>>
>>> I'm mostly concerned that people will end up mixing VMs and other
>>> workloads on the same LPAR, so I don't think there's a one-shoe-fits-all
>>> solution.
>> If I add an argument to mm_init() to indicate if this context
>> is for fork() or execve() then the ELF header flag approach works.
>
> So you don't need the sysctl?

It would not be enough to enable old userspaces that do not have the ELF header flag.
So we need both to enable old userspace - new kernel.

Christian
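
For readers following the thread, the fork vs. execve distinction Martin
describes can be sketched roughly as below. This is only an illustration
in plain userspace C, not the s390 kernel code: every name in it
(mm_sketch, mm_init_sketch, elf_flag_evaluated_sketch, the enum) is
hypothetical, and how the sysctl combines with the ELF flag is an
assumption drawn from the remark above that both are needed for old
userspace on a new kernel.

```c
#include <stdbool.h>

/* All names here are hypothetical; this is not the kernel's code. */
enum mm_origin_sketch { MM_FOR_FORK, MM_FOR_EXECVE };

struct mm_sketch {
	bool pgtable_4k;	/* true: 4K page tables, false: 2K */
};

/*
 * Stand-in for mm_init() with the extra argument mentioned in the
 * thread: the caller says whether the new mm is for fork() or execve().
 */
static void mm_init_sketch(struct mm_sketch *mm,
			   const struct mm_sketch *current_mm,
			   enum mm_origin_sketch origin)
{
	if (origin == MM_FOR_FORK) {
		/* fork only: the current mm decides 2K vs. 4K */
		mm->pgtable_4k = current_mm->pgtable_4k;
	} else {
		/*
		 * execve: start with 4K, because only the direction
		 * 4K -> 2K is described as safe in the thread.
		 */
		mm->pgtable_4k = true;
	}
}

/*
 * Stand-in for the step after binfmt_elf has evaluated the ELF header.
 * Assumption: a global sysctl keeps 4K tables for binaries that lack
 * the flag (the "old userspace" case).
 */
static void elf_flag_evaluated_sketch(struct mm_sketch *mm,
				      bool elf_wants_4k,
				      bool sysctl_wants_4k)
{
	if (!elf_wants_4k && !sysctl_wants_4k)
		mm->pgtable_4k = false;	/* downgrade 4K -> 2K */
}
```

In this sketch an upgrade from 2K to 4K never happens: a forked child
inherits the parent's layout, and an exec'd image can only keep 4K or
drop to 2K.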