From: Christian Borntraeger
Date: Fri, 22 Mar 2024 11:22:13 +0100
Subject: Re: [PATCH v1 2/2] s390/mm: re-enable the shared zeropage for !PV and !skeys KVM guests
To: David Hildenbrand, linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, Janosch Frank, Claudio Imbrenda, Heiko Carstens, Vasily Gorbik, Andrew Morton, Peter Xu, Alexander Gordeev, Sven Schnelle, Gerald Schaefer, Andrea Arcangeli, kvm@vger.kernel.org, linux-s390@vger.kernel.org
References: <20240321215954.177730-1-david@redhat.com> <20240321215954.177730-3-david@redhat.com>
In-Reply-To: <20240321215954.177730-3-david@redhat.com>

On 21.03.24 at 22:59, David Hildenbrand wrote:
> commit fa41ba0d08de ("s390/mm: avoid empty zero pages for KVM guests to
> avoid postcopy hangs") introduced an undesired side effect when combined
> with memory ballooning and VM migration: memory part of the inflated
> memory balloon will consume memory.
>
> Assuming we have a 100GiB VM and inflated the balloon to 40GiB. Our VM
> will consume ~60GiB of memory. If we now trigger a VM migration,
> hypervisors like QEMU will read all VM memory. As s390x does not support
> the shared zeropage, we'll end up allocating for all previously-inflated
> memory part of the memory balloon: 40 GiB. So we might easily
> (unexpectedly) crash the VM on the migration source.
>
> Even worse, hypervisors like QEMU optimize for zeropage migration to not
> consume memory on the migration destination: when migrating a
> "page full of zeroes", on the migration destination they check whether the
> target memory is already zero (by reading the destination memory) and avoid
> writing to the memory to not allocate memory: however, s390x will also
> allocate memory here, implying that also on the migration destination, we
> will end up allocating all previously-inflated memory part of the memory
> balloon.
>
> This is especially bad if actual memory overcommit was not desired, when
> memory ballooning is used for dynamic VM memory resizing, setting aside
> some memory during boot that can be added later on demand. Alternatives
> like virtio-mem that would avoid this issue are not yet available on
> s390x.
>
> There could be ways to optimize some cases in user space: before reading
> memory in an anonymous private mapping on the migration source, check via
> /proc/self/pagemap if anything is already populated. Similarly check on
> the migration destination before reading. While that would avoid
> populating tables full of shared zeropages on all architectures, it's
> harder to get right and performant, and requires user space changes.
>
> Further, with postcopy live migration we must place a page, so there,
> "avoid touching memory to avoid allocating memory" is not really
> possible. (Note that previously we would have falsely inserted
> shared zeropages into processes using UFFDIO_ZEROPAGE where
> mm_forbids_zeropage() would have actually forbidden it)
>
> PV is currently incompatible with memory ballooning, and in the common
> case, KVM guests don't make use of storage keys. Instead of zapping
> zeropages when enabling storage keys / PV, that turned out to be
> problematic in the past, let's do exactly the same we do with KSM pages:
> trigger unsharing faults to replace the shared zeropages by proper
> anonymous folios.
>
> What about added latency when enabling storage keys? Having a lot of
> zeropages in applicable environments (PV, legacy guests, unittests) is
> unexpected. Further, KSM could today already unshare the zeropages
> and unmerging KSM pages when enabling storage keys would unshare the
> KSM-placed zeropages in the same way, resulting in the same latency.
>
> Signed-off-by: David Hildenbrand

Nice work. Looks good to me and indeed it fixes the memory over-consumption
that you mentioned.

Reviewed-by: Christian Borntraeger
Tested-by: Christian Borntraeger
(can also be seen with virsh managedsave; virsh start)

I guess it's too invasive for stable, but I would say it is a real fix.
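
For readers who want to see what the user-space alternative from the commit message ("check via /proc/self/pagemap if anything is already populated") might look like, here is a rough sketch. It is only an illustration, not what QEMU does and not part of the patch. The helper names (pagemap_offset, entry_is_populated, read_pagemap_entry) are made up for this example. The only things taken as given are the documented pagemap layout (one 64-bit entry per virtual page, bit 63 = present, bit 62 = swapped) and that entries are read in native byte order, which matters on big-endian s390x.

```python
import mmap
import os
import struct

PAGEMAP_ENTRY_SIZE = 8      # one 64-bit entry per virtual page
PM_PRESENT = 1 << 63        # bit 63: page is present in RAM
PM_SWAPPED = 1 << 62        # bit 62: page is in swap


def pagemap_offset(vaddr: int, page_size: int = mmap.PAGESIZE) -> int:
    """File offset of the pagemap entry that covers virtual address vaddr."""
    return (vaddr // page_size) * PAGEMAP_ENTRY_SIZE


def entry_is_populated(entry: int) -> bool:
    """True if the raw entry describes a present or swapped-out page."""
    return bool(entry & (PM_PRESENT | PM_SWAPPED))


def read_pagemap_entry(vaddr: int) -> int:
    """Read the raw 64-bit pagemap entry for an address in this process."""
    fd = os.open("/proc/self/pagemap", os.O_RDONLY)
    try:
        data = os.pread(fd, PAGEMAP_ENTRY_SIZE, pagemap_offset(vaddr))
    finally:
        os.close(fd)
    if len(data) != PAGEMAP_ENTRY_SIZE:
        raise OSError("short read from /proc/self/pagemap")
    # "=Q": native byte order, unsigned 64-bit
    return struct.unpack("=Q", data)[0]
```

A migration loop could call entry_is_populated(read_pagemap_entry(addr)) for each page and skip the read when it returns False. As the commit message says, doing this correctly and fast (batching reads, racing with concurrent faults) is the hard part, which is why fixing it in the kernel is preferable.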