From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sean Christopherson
Date: Tue, 31 Oct 2023 07:16:15 -0700
Subject: [PATCH v13 17/35] KVM: Add transparent hugepage support for dedicated guest memory
In-Reply-To: <7c0844d8-6f97-4904-a140-abeabeb552c1@intel.com>
References: <20231027182217.3615211-1-seanjc@google.com>
 <20231027182217.3615211-18-seanjc@google.com>
 <7c0844d8-6f97-4904-a140-abeabeb552c1@intel.com>
Message-ID:
List-Id:
To: kvm-riscv@lists.infradead.org
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

On Tue, Oct 31, 2023, Xiaoyao Li wrote:
> On 10/28/2023 2:21 AM, Sean Christopherson wrote:
> > Extend guest_memfd to allow backing guest memory with transparent
> > hugepages. Require userspace to opt-in via a flag even though there's no
> > known/anticipated use case for forcing small pages as THP is optional,
> > i.e. to avoid ending up in a situation where userspace is unaware that
> > KVM can't provide hugepages.
>
> Personally, it seems not so "transparent" if requiring userspace to opt-in.
>
> People need to 1) check if the kernel is built with TRANSPARENT_HUGEPAGE
> support, or check if the sysfs of transparent hugepage exists; 2) get the
> maximum supported hugepage size; 3) ensure the size satisfies the alignment;
> before opting in.
>
> Even simpler, userspace can blindly try to create guest memfd with the
> transparent hugepage flag. On error, fall back to creating it without the
> transparent hugepage flag.
>
> However, it doesn't look transparent to me.

The "transparent" part is referring to the underlying kernel mechanism; it's not
saying anything about the API. The "transparent" part of THP is that the kernel
doesn't guarantee hugepages, i.e. whether or not hugepages are actually used is
(mostly) transparent to userspace.

Paolo also isn't the biggest fan[*], but there are also downsides to always
allowing hugepages, e.g. silent failure due to lack of THP or unaligned size,
and there's precedent in the form of MADV_HUGEPAGE.

[*] https://lore.kernel.org/all/84a908ae-04c7-51c7-c9a8-119e1933a189@redhat.com
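
For illustration only, here is a rough userspace sketch of the MADV_HUGEPAGE
precedent mentioned above, combined with the "try to opt in, fall back on
error" pattern from the quoted text. It is not guest_memfd code and does not
use any KVM flag. The helper names (thp_sysfs_present, map_with_optional_thp)
are hypothetical, and it assumes Linux with Python 3.8+ for mmap.madvise().
Even when the opt-in succeeds, the kernel does not guarantee hugepages.

```python
import mmap
import os

# Standard sysfs location for THP controls; absent if the kernel lacks THP.
THP_SYSFS_DIR = "/sys/kernel/mm/transparent_hugepage"


def thp_sysfs_present():
    """Hypothetical helper: report whether the THP sysfs directory exists."""
    return os.path.isdir(THP_SYSFS_DIR)


def map_with_optional_thp(length):
    """Hypothetical helper: map anonymous memory, then try to opt in to THP.

    Returns (mapping, opted_in). opted_in only says the madvise() call was
    accepted; whether hugepages are actually used stays up to the kernel.
    """
    buf = mmap.mmap(
        -1,
        length,
        flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
        prot=mmap.PROT_READ | mmap.PROT_WRITE,
    )
    # MADV_HUGEPAGE is only exposed by the mmap module where the platform has it.
    advice = getattr(mmap, "MADV_HUGEPAGE", None)
    if advice is None:
        return buf, False
    try:
        buf.madvise(advice)
        return buf, True
    except OSError:
        # E.g. EINVAL on a kernel without THP: keep using small pages.
        return buf, False


if __name__ == "__main__":
    mapping, opted_in = map_with_optional_thp(4 * 1024 * 1024)
    print("THP sysfs present:", thp_sysfs_present())
    print("madvise(MADV_HUGEPAGE) accepted:", opted_in)
    mapping.close()
```

The point of the explicit opt-in is visible in the return value: the caller
learns whether the request was accepted instead of silently getting small pages.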