From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F299FC433F5 for ; Wed, 20 Oct 2021 01:24:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CD603610D0 for ; Wed, 20 Oct 2021 01:24:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229711AbhJTB0P (ORCPT ); Tue, 19 Oct 2021 21:26:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44122 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229604AbhJTB0O (ORCPT ); Tue, 19 Oct 2021 21:26:14 -0400 Received: from mail-pl1-x634.google.com (mail-pl1-x634.google.com [IPv6:2607:f8b0:4864:20::634]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F316BC061746 for ; Tue, 19 Oct 2021 18:24:00 -0700 (PDT) Received: by mail-pl1-x634.google.com with SMTP id t11so14955170plq.11 for ; Tue, 19 Oct 2021 18:24:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=XBiL1+Vl4EIWz6otoAM83emtBnaUQE6re7Zy4F/Tl+c=; b=jzeiEr+S9aqIMWjCIxkL3Vz/2qUZyF/ErkeZvyr/HpMW/YUa4nzLSMVZLl4Uk7RM9J GFU9arnMgMlJUjmHO3BtrgTOlxkcsmXYfSw9e7G3jCYCxB5j9GiciLGHDPG2bfJIyLSF d/4PzOVWxoNxMAFMp1rqcbEThM3R0+iTH6SLk= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=XBiL1+Vl4EIWz6otoAM83emtBnaUQE6re7Zy4F/Tl+c=; b=Us9wPC+LFlA+mMggIT5aKx/DUPNwGe7WYKYjPrv3i6yA66dSbYFdereqQMwMeCGjIe qHgV+rGiBbW/WZZZjmtEGTbMulq65Y2N9NnCxRmsaQgT3iG9HFz5HcY5axgHtqC9E8Ab rORkiLWL2Im4fFTP2ucUUqoR9vDKSLYp65HvilnWTRoUMNQ0Yhmv3kcdpr+CXeXd+Ae3 D/nWwBLzefRqbX0c9TyYmYsT/+/0a/D0rle0IZysPf13QmqR2Xp3B5BPmbaUI2kOuSZB 04rLuzeyspHtC7izjjUyt7W7J9kKXMST4fdXlngONeo8Hfgnp4O7mFUAvQHbl0V3XrgS l7+g== X-Gm-Message-State: AOAM530WPfADmPIIHub4Ra2Yo+KDGRGpMvzOMwkLRlu+rM7fb4BiYWgo rgb/Xqdut59khU618Pd0tFLa5Q== X-Google-Smtp-Source: ABdhPJyHsTnt8CO2I4siTq9gz2mS3EIOy0HlQCeT4J7kNCw9lXcsEO8FgTyCvhnVA8pGiJR7+vBWjg== X-Received: by 2002:a17:90b:248:: with SMTP id fz8mr3702172pjb.157.1634693040458; Tue, 19 Oct 2021 18:24:00 -0700 (PDT) Received: from google.com ([2409:10:2e40:5100:490f:f89:7449:e615]) by smtp.gmail.com with ESMTPSA id y1sm428665pfo.104.2021.10.19.18.23.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 19 Oct 2021 18:23:59 -0700 (PDT) Date: Wed, 20 Oct 2021 10:23:54 +0900 From: Sergey Senozhatsky To: David Matlack Cc: Sergey Senozhatsky , Paolo Bonzini , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , Suleiman Souhlal , kvm list , LKML Subject: Re: [PATCHV2 1/3] KVM: x86: introduce kvm_mmu_pte_prefetch structure Message-ID: References: <20211019153214.109519-1-senozhatsky@chromium.org> <20211019153214.109519-2-senozhatsky@chromium.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org On (21/10/19 15:44), David Matlack wrote: > On Tue, Oct 19, 2021 at 8:32 AM Sergey Senozhatsky > wrote: > > > > kvm_mmu_pte_prefetch is a per-VCPU structure that holds a PTE > > prefetch pages array, lock and the number of PTE to prefetch. > > > > This is needed to turn PTE_PREFETCH_NUM into a tunable VM > > parameter. > > > > Signed-off-by: Sergey Senozhatsky > > --- > > arch/x86/include/asm/kvm_host.h | 12 +++++++ > > arch/x86/kvm/mmu.h | 4 +++ > > arch/x86/kvm/mmu/mmu.c | 57 ++++++++++++++++++++++++++++++--- > > arch/x86/kvm/x86.c | 9 +++++- > > 4 files changed, 77 insertions(+), 5 deletions(-) > > > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > > index 5271fce6cd65..11400bc3c70d 100644 > > --- a/arch/x86/include/asm/kvm_host.h > > +++ b/arch/x86/include/asm/kvm_host.h > > @@ -607,6 +607,16 @@ struct kvm_vcpu_xen { > > u64 runstate_times[4]; > > }; > > > > +struct kvm_mmu_pte_prefetch { > > + /* > > + * This will be cast either to array of pointers to struct page, > > + * or array of u64, or array of u32 > > + */ > > + void *ents; > > + unsigned int num_ents; > > + spinlock_t lock; > > The spinlock is overkill. I'd suggest something like this: > - When VM-ioctl is invoked to update prefetch count, store it in > kvm_arch. No synchronization with vCPUs needed. > - When a vCPU takes a fault: Read the prefetch count from kvm_arch. If > different than count at last fault, re-allocate vCPU prefetch array. > (So you'll need to add prefetch array and count to kvm_vcpu_arch as > well.) > > No extra locks are needed. vCPUs that fault after the VM-ioctl will > get the new prefetch count. We don't really care if a prefetch count > update races with a vCPU fault as long as vCPUs are careful to only > read the count once (i.e. use READ_ONCE(vcpu->kvm.prefetch_count)) and > use that. Assuming prefetch count ioctls are rare, the re-allocation > on the fault path will be rare as well. So reallocation from the faul-path should happen before vCPU takes the mmu_lock? And READ_ONCE(prefetch_count) should also happen before vCPU takes mmu_lock, I assume, so we need to pass it as a parameter to all the functions that will access prefetch array. > Note: You could apply this same approach to a module param, except > vCPUs would be reading the module param rather than vcpu->kvm during > each fault. > > And the other alternative, like you suggested in the other patch, is > to use a vCPU ioctl. That would side-step the synchronization issue > because vCPU ioctls require the vCPU mutex. So the reallocation could > be done in the ioctl and not at fault time. One more idea, wonder what do you think: There is an upper limit on the number of PTEs we prefault, which is 128 as of now, but I think 64 will be good enough, or maybe even 32. So we can always allocate MAX_PTE_PREFETCH_NUM arrays in vcpu->arch and ioctl() will change ->num_ents only, which is always in (0, MAX_PTE_PREFETCH_NUM - 1] range. This way we never have to reallocate anything, we just adjust the "maximum index" value. > Taking a step back, can you say a bit more about your usecase? We are looking at various ways of reducing the number of vm-exits. There is only one VM running on the device (a pretty low-end laptop).