From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mm01.cs.columbia.edu (mm01.cs.columbia.edu [128.59.11.253]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6680AECAAA1 for ; Fri, 28 Oct 2022 16:51:37 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by mm01.cs.columbia.edu (Postfix) with ESMTP id A46CB4B635; Fri, 28 Oct 2022 12:51:36 -0400 (EDT) X-Virus-Scanned: at lists.cs.columbia.edu Authentication-Results: mm01.cs.columbia.edu (amavisd-new); dkim=softfail (fail, message has been altered) header.i=@google.com Received: from mm01.cs.columbia.edu ([127.0.0.1]) by localhost (mm01.cs.columbia.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id gXYf5Odmm2Bs; Fri, 28 Oct 2022 12:51:35 -0400 (EDT) Received: from mm01.cs.columbia.edu (localhost [127.0.0.1]) by mm01.cs.columbia.edu (Postfix) with ESMTP id 463DA4B2AE; Fri, 28 Oct 2022 12:51:35 -0400 (EDT) Received: from localhost (localhost [127.0.0.1]) by mm01.cs.columbia.edu (Postfix) with ESMTP id E33B14B20F for ; Fri, 28 Oct 2022 12:51:33 -0400 (EDT) X-Virus-Scanned: at lists.cs.columbia.edu Received: from mm01.cs.columbia.edu ([127.0.0.1]) by localhost (mm01.cs.columbia.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id zvESx42H+LAo for ; Fri, 28 Oct 2022 12:51:32 -0400 (EDT) Received: from mail-pf1-f173.google.com (mail-pf1-f173.google.com [209.85.210.173]) by mm01.cs.columbia.edu (Postfix) with ESMTPS id 6E9FE4B0D7 for ; Fri, 28 Oct 2022 12:51:32 -0400 (EDT) Received: by mail-pf1-f173.google.com with SMTP id m6so5291472pfb.0 for ; Fri, 28 Oct 2022 09:51:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=sUnyfcJ4wqSO0xrTWs88o/JJTjjWDs2reti8Xe/IcQQ=; b=ldL5pcfBfn3omzF9slxHxojXuFGUDmy5v7rvR+xOn27cGZqDgXeDSAkaRfB8MfsoYW gWG2NjXidjE4A1C8/QkcjcLxYQHrJIQFPQ6be1l8ULN9NlSJBaPeR2ZjHlib71W0w+7N XpvVD0R6JlnDgZI9I1fAkPBHcE8TX2aPKtCMI+iPCYWy+13yw1tZQWGUEzkbfc0NqYS6 IdOE+74TYzdhdzcV+aH3pqQ+yXLh1c9Um5V35VZThCVo/nSL6uwaxVds6H4gOF3Ffj+u IEbNlEtPr61tAD0vuyICQBzWqlDqDpesXBXwCYcgGTjaMW6+GfwquKP++NG+nrUurMqc qWqA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=sUnyfcJ4wqSO0xrTWs88o/JJTjjWDs2reti8Xe/IcQQ=; b=bJa2Mf7JUukOqoDilXxycgYoqwLGPaaunzG0NSzXNIx920bjwDJJTHwN9o0PGkojmO SYFrBxJsy5b+JYvMCOVpyIRnaNPcFVF9bjq5cd8UPLrC67vbsdkGPNXUU3LFkMA0tIAu ygNUilaNEq7EY3ef0A89jYay0C56JR3sqegqOVPFYdUgucjJd4t91G085PMf+v11V2JP +KQXuX6ioxMkZKGZWxkIM6jvGXVZVcUVXAjS3Gck0Ruc6fAUth3SN+8ltBk0nDZvKy23 Tppnv+vRl5YgHkxGgJjhHvNBa0SonwvGsxPmcTz/bVoRguhoV0SYmn7OC/Wym1LUbNqQ roTg== X-Gm-Message-State: ACrzQf0/I+kPCu5mozbeCEpsUUbkDtiGqDTnJaKzZPPowtZo2clpZ9QE bcciShflsdVFg3pBltvEqkVtIA== X-Google-Smtp-Source: AMsMyM5Ryr2eWmEIIA6XfONaasJIcUlq05GHYZ5aaMlzGaWeANzSZA6GZ295dwKjBXLv7vFYiD1Rnw== X-Received: by 2002:a62:1a97:0:b0:562:5587:12d6 with SMTP id a145-20020a621a97000000b00562558712d6mr49177pfa.37.1666975891200; Fri, 28 Oct 2022 09:51:31 -0700 (PDT) Received: from google.com (7.104.168.34.bc.googleusercontent.com. [34.168.104.7]) by smtp.gmail.com with ESMTPSA id c6-20020a170902c1c600b00177f4ef7970sm3305243plc.11.2022.10.28.09.51.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 28 Oct 2022 09:51:30 -0700 (PDT) Date: Fri, 28 Oct 2022 16:51:27 +0000 From: Sean Christopherson To: Gavin Shan Subject: Re: [PATCH v6 3/8] KVM: Add support for using dirty ring in conjunction with bitmap Message-ID: References: <87edv0gnb3.wl-maz@kernel.org> <878rl4gxzx.wl-maz@kernel.org> <877d0lhdo9.wl-maz@kernel.org> <875yg5glvk.wl-maz@kernel.org> <36c97b96-1427-ce05-8fce-fd21c4711af9@redhat.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <36c97b96-1427-ce05-8fce-fd21c4711af9@redhat.com> Cc: shuah@kernel.org, kvm@vger.kernel.org, Marc Zyngier , bgardon@google.com, andrew.jones@linux.dev, dmatlack@google.com, shan.gavin@gmail.com, catalin.marinas@arm.com, kvmarm@lists.linux.dev, pbonzini@redhat.com, zhenyzha@redhat.com, will@kernel.org, kvmarm@lists.cs.columbia.edu X-BeenThere: kvmarm@lists.cs.columbia.edu X-Mailman-Version: 2.1.14 Precedence: list List-Id: Where KVM/ARM decisions are made List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: kvmarm-bounces@lists.cs.columbia.edu Sender: kvmarm-bounces@lists.cs.columbia.edu On Fri, Oct 28, 2022, Gavin Shan wrote: > Hi Sean and Marc, > > On 10/28/22 2:30 AM, Marc Zyngier wrote: > > On Thu, 27 Oct 2022 18:44:51 +0100, > > Sean Christopherson wrote: > > > > > > On Thu, Oct 27, 2022, Marc Zyngier wrote: > > > > On Tue, 25 Oct 2022 18:47:12 +0100, Sean Christopherson wrote: > > [...] > > > > > > > > And ideally such bugs would detected without relying on userspace to > > > > > enabling dirty logging, e.g. the Hyper-V bug lurked for quite some > > > > > time and was only found when mark_page_dirty_in_slot() started > > > > > WARNing. > > > > > > > > > > I'm ok if arm64 wants to let userspace shoot itself in the foot with > > > > > the ITS, but I'm not ok dropping the protections in the common > > > > > mark_page_dirty_in_slot(). > > > > > > > > > > One somewhat gross idea would be to let architectures override the > > > > > "there must be a running vCPU" rule, e.g. arm64 could toggle a flag > > > > > in kvm->arch in its kvm_write_guest_lock() to note that an expected > > > > > write without a vCPU is in-progress: > > > > > > > > > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > > > > > index 8c5c69ba47a7..d1da8914f749 100644 > > > > > --- a/virt/kvm/kvm_main.c > > > > > +++ b/virt/kvm/kvm_main.c > > > > > @@ -3297,7 +3297,10 @@ void mark_page_dirty_in_slot(struct kvm *kvm, > > > > > struct kvm_vcpu *vcpu = kvm_get_running_vcpu(); > > > > > #ifdef CONFIG_HAVE_KVM_DIRTY_RING > > > > > - if (WARN_ON_ONCE(!vcpu) || WARN_ON_ONCE(vcpu->kvm != kvm)) > > > > > + if (!kvm_arch_allow_write_without_running_vcpu(kvm) && WARN_ON_ONCE(!vcpu)) > > > > > + return; > > > > > + > > > > > + if (WARN_ON_ONCE(vcpu && vcpu->kvm != kvm)) > > > > > return; > > > > > #endif > > > > > @@ -3305,10 +3308,10 @@ void mark_page_dirty_in_slot(struct kvm *kvm, > > > > > unsigned long rel_gfn = gfn - memslot->base_gfn; > > > > > u32 slot = (memslot->as_id << 16) | memslot->id; > > > > > - if (kvm->dirty_ring_size) > > > > > + if (kvm->dirty_ring_size && vcpu) > > > > > kvm_dirty_ring_push(&vcpu->dirty_ring, > > > > > slot, rel_gfn); > > > > > - else > > > > > + else if (memslot->dirty_bitmap) > > > > > set_bit_le(rel_gfn, memslot->dirty_bitmap); > > > > > } > > > > > } ... > > > A slightly different alternative would be have a completely separate > > > API for writing guest memory without an associated vCPU. I.e. start > > > building up proper device emulation support. Then the vCPU-based > > > APIs could yell if a vCPU isn't provided (or there is no running > > > vCPU in the current mess). And the deviced-based API could be > > > provided if and only if the architecture actually supports emulating > > > writes from devices, i.e. x86 would not opt-in and so would even > > > have access to the API. > > > > Which is what I was putting under the "major surgery" label in my > > previous email. > > > > Anyhow, for the purpose of unblocking Gavin's series, I suggest to > > adopt your per-arch opt-out suggestion as a stop gap measure, and we > > will then be able to bike-shed for weeks on what the shape of the > > device-originated memory dirtying API should be. > > > > It's really a 'major surgery' and I would like to make sure I fully understand > 'a completely separate API for writing guest memory without an associated vCPU", > before I'm going to working on v7 for this. > > There are 7 functions and 2 macros involved as below. I assume Sean is suggesting > to add another argument, whose name can be 'has_vcpu', for these functions and macros? No. As March suggested, for your series just implement the hacky arch opt-out, don't try and do surgery at this time as that's likely going to be a months-long effort that touches a lot of cross-arch code. E.g. I believe the ARM opt-out (opt-in?) for the above hack would be bool kvm_arch_allow_write_without_running_vcpu(struct kvm *kvm) { return vgic_has_its(kvm); } _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-pf1-f177.google.com (mail-pf1-f177.google.com [209.85.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EA0D22916 for ; Fri, 28 Oct 2022 16:51:31 +0000 (UTC) Received: by mail-pf1-f177.google.com with SMTP id y13so5240405pfp.7 for ; Fri, 28 Oct 2022 09:51:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=sUnyfcJ4wqSO0xrTWs88o/JJTjjWDs2reti8Xe/IcQQ=; b=ldL5pcfBfn3omzF9slxHxojXuFGUDmy5v7rvR+xOn27cGZqDgXeDSAkaRfB8MfsoYW gWG2NjXidjE4A1C8/QkcjcLxYQHrJIQFPQ6be1l8ULN9NlSJBaPeR2ZjHlib71W0w+7N XpvVD0R6JlnDgZI9I1fAkPBHcE8TX2aPKtCMI+iPCYWy+13yw1tZQWGUEzkbfc0NqYS6 IdOE+74TYzdhdzcV+aH3pqQ+yXLh1c9Um5V35VZThCVo/nSL6uwaxVds6H4gOF3Ffj+u IEbNlEtPr61tAD0vuyICQBzWqlDqDpesXBXwCYcgGTjaMW6+GfwquKP++NG+nrUurMqc qWqA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=sUnyfcJ4wqSO0xrTWs88o/JJTjjWDs2reti8Xe/IcQQ=; b=E9WXud6zIjCAmhXEgAUPcgBRqkeDtDqZk3yM5SyGLAdQl0Nf2DMqMqatV+iILilngc lh29jPjW0cHgJKtOMXiFLpUbWNHgnrV03nYMr5KwPKUKxwlEvwKifW57RQKEEU6RML++ D+SU/OBuVs/rOEccGo4C+U1WZp50UPP8HBP0mhZsU6qOe7sjXWOxTMRSrBhoeSolUBOp ozo3HvwkIJUnZjY5oriKlSqtD3dKcuRsk2A9MqevRGmmEVaXBTE+dkfH8laMchsMC0Rt /sHs3f/vMRYw/cdQ8lKu7oHXssx0cis1srn9IIQOg+QI7SyBea4DUt6f61akrYAaVwtg N0xA== X-Gm-Message-State: ACrzQf0VQ3yAnQ85f5GfSEjbDZ/WhbHbbvIrTzYHbnjVW14+q8BUUaxB zs8Y6QrCiwFP8lUK+Ye/hkJK7g== X-Google-Smtp-Source: AMsMyM5Ryr2eWmEIIA6XfONaasJIcUlq05GHYZ5aaMlzGaWeANzSZA6GZ295dwKjBXLv7vFYiD1Rnw== X-Received: by 2002:a62:1a97:0:b0:562:5587:12d6 with SMTP id a145-20020a621a97000000b00562558712d6mr49177pfa.37.1666975891200; Fri, 28 Oct 2022 09:51:31 -0700 (PDT) Received: from google.com (7.104.168.34.bc.googleusercontent.com. [34.168.104.7]) by smtp.gmail.com with ESMTPSA id c6-20020a170902c1c600b00177f4ef7970sm3305243plc.11.2022.10.28.09.51.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 28 Oct 2022 09:51:30 -0700 (PDT) Date: Fri, 28 Oct 2022 16:51:27 +0000 From: Sean Christopherson To: Gavin Shan Cc: Marc Zyngier , Oliver Upton , kvmarm@lists.linux.dev, kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org, peterx@redhat.com, will@kernel.org, catalin.marinas@arm.com, bgardon@google.com, shuah@kernel.org, andrew.jones@linux.dev, dmatlack@google.com, pbonzini@redhat.com, zhenyzha@redhat.com, james.morse@arm.com, suzuki.poulose@arm.com, alexandru.elisei@arm.com, shan.gavin@gmail.com Subject: Re: [PATCH v6 3/8] KVM: Add support for using dirty ring in conjunction with bitmap Message-ID: References: <87edv0gnb3.wl-maz@kernel.org> <878rl4gxzx.wl-maz@kernel.org> <877d0lhdo9.wl-maz@kernel.org> <875yg5glvk.wl-maz@kernel.org> <36c97b96-1427-ce05-8fce-fd21c4711af9@redhat.com> Precedence: bulk X-Mailing-List: kvmarm@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <36c97b96-1427-ce05-8fce-fd21c4711af9@redhat.com> Message-ID: <20221028165127.XrDmMve-Q1WFXub6pW1886m1gbJFqurdLUBK_QrLAlw@z> On Fri, Oct 28, 2022, Gavin Shan wrote: > Hi Sean and Marc, > > On 10/28/22 2:30 AM, Marc Zyngier wrote: > > On Thu, 27 Oct 2022 18:44:51 +0100, > > Sean Christopherson wrote: > > > > > > On Thu, Oct 27, 2022, Marc Zyngier wrote: > > > > On Tue, 25 Oct 2022 18:47:12 +0100, Sean Christopherson wrote: > > [...] > > > > > > > > And ideally such bugs would detected without relying on userspace to > > > > > enabling dirty logging, e.g. the Hyper-V bug lurked for quite some > > > > > time and was only found when mark_page_dirty_in_slot() started > > > > > WARNing. > > > > > > > > > > I'm ok if arm64 wants to let userspace shoot itself in the foot with > > > > > the ITS, but I'm not ok dropping the protections in the common > > > > > mark_page_dirty_in_slot(). > > > > > > > > > > One somewhat gross idea would be to let architectures override the > > > > > "there must be a running vCPU" rule, e.g. arm64 could toggle a flag > > > > > in kvm->arch in its kvm_write_guest_lock() to note that an expected > > > > > write without a vCPU is in-progress: > > > > > > > > > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > > > > > index 8c5c69ba47a7..d1da8914f749 100644 > > > > > --- a/virt/kvm/kvm_main.c > > > > > +++ b/virt/kvm/kvm_main.c > > > > > @@ -3297,7 +3297,10 @@ void mark_page_dirty_in_slot(struct kvm *kvm, > > > > > struct kvm_vcpu *vcpu = kvm_get_running_vcpu(); > > > > > #ifdef CONFIG_HAVE_KVM_DIRTY_RING > > > > > - if (WARN_ON_ONCE(!vcpu) || WARN_ON_ONCE(vcpu->kvm != kvm)) > > > > > + if (!kvm_arch_allow_write_without_running_vcpu(kvm) && WARN_ON_ONCE(!vcpu)) > > > > > + return; > > > > > + > > > > > + if (WARN_ON_ONCE(vcpu && vcpu->kvm != kvm)) > > > > > return; > > > > > #endif > > > > > @@ -3305,10 +3308,10 @@ void mark_page_dirty_in_slot(struct kvm *kvm, > > > > > unsigned long rel_gfn = gfn - memslot->base_gfn; > > > > > u32 slot = (memslot->as_id << 16) | memslot->id; > > > > > - if (kvm->dirty_ring_size) > > > > > + if (kvm->dirty_ring_size && vcpu) > > > > > kvm_dirty_ring_push(&vcpu->dirty_ring, > > > > > slot, rel_gfn); > > > > > - else > > > > > + else if (memslot->dirty_bitmap) > > > > > set_bit_le(rel_gfn, memslot->dirty_bitmap); > > > > > } > > > > > } ... > > > A slightly different alternative would be have a completely separate > > > API for writing guest memory without an associated vCPU. I.e. start > > > building up proper device emulation support. Then the vCPU-based > > > APIs could yell if a vCPU isn't provided (or there is no running > > > vCPU in the current mess). And the deviced-based API could be > > > provided if and only if the architecture actually supports emulating > > > writes from devices, i.e. x86 would not opt-in and so would even > > > have access to the API. > > > > Which is what I was putting under the "major surgery" label in my > > previous email. > > > > Anyhow, for the purpose of unblocking Gavin's series, I suggest to > > adopt your per-arch opt-out suggestion as a stop gap measure, and we > > will then be able to bike-shed for weeks on what the shape of the > > device-originated memory dirtying API should be. > > > > It's really a 'major surgery' and I would like to make sure I fully understand > 'a completely separate API for writing guest memory without an associated vCPU", > before I'm going to working on v7 for this. > > There are 7 functions and 2 macros involved as below. I assume Sean is suggesting > to add another argument, whose name can be 'has_vcpu', for these functions and macros? No. As March suggested, for your series just implement the hacky arch opt-out, don't try and do surgery at this time as that's likely going to be a months-long effort that touches a lot of cross-arch code. E.g. I believe the ARM opt-out (opt-in?) for the above hack would be bool kvm_arch_allow_write_without_running_vcpu(struct kvm *kvm) { return vgic_has_its(kvm); }