Date: Tue, 19 Mar 2024 08:04:43 -0700
In-Reply-To: <40f82a61-39b0-4dda-ac32-a7b5da2a31e8@redhat.com>
References: <7470390a-5a97-475d-aaad-0f6dfb3d26ea@redhat.com> <40f82a61-39b0-4dda-ac32-a7b5da2a31e8@redhat.com>
Subject: Re: folio_mmapped
From: Sean Christopherson
To: David Hildenbrand
Cc: Vishal Annapurve, Quentin Perret, Matthew Wilcox, Fuad Tabba,
	kvm@vger.kernel.org, kvmarm@lists.linux.dev, pbonzini@redhat.com,
	chenhuacai@kernel.org, mpe@ellerman.id.au, anup@brainfault.org,
	paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu,
	viro@zeniv.linux.org.uk, brauner@kernel.org, akpm@linux-foundation.org,
	xiaoyao.li@intel.com, yilun.xu@intel.com, chao.p.peng@linux.intel.com,
	jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com,
	yu.c.zhang@linux.intel.com, isaku.yamahata@intel.com, mic@digikod.net,
	vbabka@suse.cz, ackerleytng@google.com, mail@maciej.szmigiero.name,
	michael.roth@amd.com, wei.w.wang@intel.com, liam.merwick@oracle.com,
	isaku.yamahata@gmail.com, kirill.shutemov@linux.intel.com,
	suzuki.poulose@arm.com, steven.price@arm.com, quic_mnalajal@quicinc.com,
	quic_tsoni@quicinc.com, quic_svaddagi@quicinc.com,
	quic_cvanscha@quicinc.com, quic_pderrin@quicinc.com,
	quic_pheragu@quicinc.com, catalin.marinas@arm.com, james.morse@arm.com,
	yuzenghui@huawei.com, oliver.upton@linux.dev, maz@kernel.org,
	will@kernel.org, keirf@google.com, linux-mm@kvack.org

On Tue, Mar 19, 2024, David Hildenbrand wrote:
> On 19.03.24 01:10, Sean Christopherson wrote:
> > Performance is a secondary concern. If this were _just_ about guest performance,
> > I would unequivocally side with David: the guest gets to keep the pieces if it
> > fragments a 1GiB page.
> >
> > The main problem we're trying to solve is that we want to provision a host such
> > that the host can serve 1GiB pages for non-CoCo VMs, and can also simultaneously
> > run CoCo VMs, with 100% fungibility. I.e. a host could run 100% non-CoCo VMs,
> > 100% CoCo VMs, or more likely, some sliding mix of the two. Ideally, CoCo VMs
> > would also get the benefits of 1GiB mappings, but that's not the driving
> > motivation for this discussion.
>
> Supporting 1 GiB mappings there sounds like unnecessary complexity and
> opening a big can of worms, especially if "it's not the driving motivation".
>
> If I understand you correctly, the scenario is
>
> (1) We have free 1 GiB hugetlb pages lying around
> (2) We want to start a CoCo VM
> (3) We don't care about 1 GiB mappings for that CoCo VM,

We care about 1GiB mappings for CoCo VMs. My comment about performance being a
secondary concern was specifically saying that it's the guest's responsibility to
play nice with huge mappings if the guest cares about its performance. For guests
that are well behaved, we most definitely want to provide a configuration that
performs as close to non-CoCo VMs as we can reasonably make it.

And we can do that today, but it requires some amount of host memory to NOT be in
the HugeTLB pool, and instead be kept in reserve so that it can be used for shared
memory for CoCo VMs. That approach has many downsides, as the extra memory
overhead affects CoCo VM shapes, our ability to use a common pool for non-CoCo and
CoCo VMs, and so on and so forth.

>     but hugetlb pages are all we have.
> (4) We want to be able to use the 1 GiB hugetlb page in the future.

...

> > The other big advantage that we should lean into is that we can make assumptions
> > about guest_memfd usage that would never fly for a general purpose backing store,
> > e.g. creating a dedicated memory pool for guest_memfd is acceptable, if not
> > desirable, for (almost?) all of the CoCo use cases.
> >
> > I don't have any concrete ideas at this time, but my gut feeling is that this
> > won't be _that_ crazy hard to solve if we commit hard to guest_memfd _not_ being
> > general purpose, and if we account for conversion scenarios when designing
> > hugepage support for guest_memfd.
>
> I'm hoping guest_memfd won't end up being the wild west of hacky MM ideas ;)

Quite the opposite, I'm saying we should be very deliberate in how we add hugepage
support and other features to guest_memfd, so that guest_memfd doesn't become a
hacky mess.

And I'm saying we should stand firm in what guest_memfd _won't_ support, e.g.
swap/reclaim and probably page migration should get a hard "no".

In other words, ditch the complexity for features that are well served by existing
general purpose solutions, so that guest_memfd can take on a bit of complexity to
serve use cases that are unique to KVM guests, without becoming an unmaintainable
mess due to cross-products.