Date: Wed, 12 Jul 2023 16:47:24 -0700
In-Reply-To: <4b621470-8c58-264b-1e8b-75cec73cd7b0@gmail.com>
References: <20230602005859.784190-1-seanjc@google.com> <168667299355.1927151.1998349801097712999.b4-ty@google.com> <4b621470-8c58-264b-1e8b-75cec73cd7b0@gmail.com>
Subject: Re: [PATCH] KVM: x86/mmu: Add "never" option to allow sticky disabling of nx_huge_pages
From: Sean Christopherson
To: Like Xu
Cc: Luiz Capitulino, Paolo Bonzini, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Li RongQing, Yong He, Robert Hoo, Kai Huang

On Wed, Jul 12, 2023, Like Xu wrote:
> On 2023/6/15 03:07, Sean Christopherson wrote:
> > On Wed, Jun 14, 2023, Luiz Capitulino wrote:
> > > > Applied to kvm-x86 mmu. I kept the default as "auto" for now, as that can go on
> > > > top and I don't want to introduce that change this late in the cycle. If no one
> > > > beats me to the punch (hint, hint ;-) ), I'll post a patch to make "never" the
> > > > default for unaffected hosts so that we can discuss/consider that change for 6.6.
> > >
> > > Thanks Sean, I agree with the plan. I could give a try on the patch if you'd like.
> >
> > Yes please, thanks!
>
> As a KVM/x86 *feature*, playing with splitting and reconstructing large
> pages have other potential user scenarios, e.g. for performance test
> comparisons in a easier approach, not just for itlb_multihit mitigation.

Enabling and disabling dirty logging is a far better tool for that, as it gives
userspace much more explicit control over what pages are split/reconstituted,
and when.

> On unaffected machines (ICX and later), nx_huge_pages is already "N",
> and turning it into "never" doesn't help materially in the mitigation
> implementation, but loses flexibility.

I'm becoming more and more convinced that losing the flexibility is perfectly
acceptable. There's a very good argument to be made that mitigating DoS attacks
from the guest kernel should be done several levels up, e.g. by refusing to create
VMs for a customer that is bringing down hosts. As Jim has pointed out, plugging
the hole only works if you are 100% confident that there are no other holes, and
that there will never be other holes.

> IMO, the real issue here is that the kernel thread "kvm-nx-lpage-recovery"
> is created unconditionally. We also need to be aware of the
> existence of this commit 084cc29f8bbb ("KVM: x86/MMU: Allow NX huge
> pages to be disabled on a per-vm basis").
>
> One of the technical proposals is to defer kvm_vm_create_worker_thread()
> to kvm_mmu_create() or kvm_init_mmu(), based on
> kvm->arch.disable_nx_huge_pages, even until guest paging mode is enabled
> on the first vcpu.
>
> Is this step worth taking ?

IMO, no. In hindsight, adding KVM_CAP_VM_DISABLE_NX_HUGE_PAGES was likely a
mistake; requiring CAP_SYS_BOOT makes it annoyingly difficult to safely use the
capability. My preference at this point is to make changes to the NX hugepage
mitigation only when there is a substantial benefit to an already-deployed usecase.