From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C6F54FA3740 for ; Thu, 27 Oct 2022 17:22:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235396AbiJ0RW7 (ORCPT ); Thu, 27 Oct 2022 13:22:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49494 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234802AbiJ0RW5 (ORCPT ); Thu, 27 Oct 2022 13:22:57 -0400 Received: from mail-pg1-x531.google.com (mail-pg1-x531.google.com [IPv6:2607:f8b0:4864:20::531]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 49BDA18C42F for ; Thu, 27 Oct 2022 10:22:56 -0700 (PDT) Received: by mail-pg1-x531.google.com with SMTP id q1so2127143pgl.11 for ; Thu, 27 Oct 2022 10:22:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=xdt63SxEY4K5w2+Q3gX7B2Gg/JGFNAEvy1g4Fz6Bigs=; b=mwkBETp7Xvfd3Q3OLg0X9IIOB8LNnUSkzSnhm0OTfyWbWfN3oK5uf763RGar2BdjU5 arLT/eHJrhVQNpkaStEFQ8+BhVfxgEUuRLlZNdORyEC8cwD+Q+4aJFnISlD+1pjt8xTW 1WtykhcYzWwr06eGi4SxRq8RPtQ7eXJB/ATOYTAhhlDOm472lps2aPwfx62D4B3bTbUm LLHQYK4KQ0xgFz+1KIYEUdzCUCnk5AAkQyX7nqE14H28sG8mvHjo1kXzAb/QB3DsP5rJ i/iYDj4qmdat4AeBPb/AT1PY2E8R7snjgQsaRSIrl0hMaMzHrj1tD2/9JLCoJsOHfCEE m4Qg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=xdt63SxEY4K5w2+Q3gX7B2Gg/JGFNAEvy1g4Fz6Bigs=; b=KE1RrAVLJMJftnb0ouA6JNT5cEr5MOsW98gWqaN2mIUnUtWa4R3LyDjQFDEg+ib9r9 LOd7Vko62+T4S6b3L3SjKkp9abRfzi0Flak/cwFSDLIy/doUwfyE8OMCpAEtpewGql1h mKgmBTb0rdpGfl75/dWR3MnI45zWqk9uWn3f1IrTAPyrr420nvXosVpoTIR+y8YdFk7i Phd4dNAZuKieWxExWqvsjLjljCp6HUVRAu4MZYXK1bxaQiQwk8QgOfepAcN9kKI6isyg WGVU29UMCbrhLIuVTDGa+Rep8R5S38IFu4867N+LMPjS+TnTtUlhxpTc7G3HE33WmZhb 821A== X-Gm-Message-State: ACrzQf1EnEubyEMpoyV0GJf1zz98V+vZbd3WbL/iwIfHydUU5v4ak6kk yF6SOE4U869mm2O0qbcGiziTxuWUoE7C8w== X-Google-Smtp-Source: AMsMyM4Va5X7EVlPPmVuN2nFXHsVirgEyHnheYF9OPRkjs0ttBzH9CwemB+VfbK3vYt9XDKlEtPGUA== X-Received: by 2002:a63:1a07:0:b0:46b:2825:f9cf with SMTP id a7-20020a631a07000000b0046b2825f9cfmr44154524pga.370.1666891375310; Thu, 27 Oct 2022 10:22:55 -0700 (PDT) Received: from google.com (7.104.168.34.bc.googleusercontent.com. [34.168.104.7]) by smtp.gmail.com with ESMTPSA id i26-20020aa796fa000000b0056b8e788acesm1412983pfq.82.2022.10.27.10.22.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 27 Oct 2022 10:22:54 -0700 (PDT) Date: Thu, 27 Oct 2022 17:22:51 +0000 From: Sean Christopherson To: Paolo Bonzini Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, mhal@rbox.co Subject: Re: [PATCH 03/16] KVM: x86: set gfn-to-pfn cache length consistently with VM word size Message-ID: References: <20221027161849.2989332-1-pbonzini@redhat.com> <20221027161849.2989332-4-pbonzini@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20221027161849.2989332-4-pbonzini@redhat.com> Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org On Thu, Oct 27, 2022, Paolo Bonzini wrote: > So, use the short size at activation time as well. This means > re-activating the cache if the guest requests the hypercall page > multiple times with different word sizes (this can happen when > kexec-ing, for example). I don't understand the motivation for allowing a conditionally valid GPA. I see a lot of complexity and sub-optimal error handling for a use case that no one cares about. Realistically, userspace is never going to provide a GPA that only works some of the time, because doing otherwise is just asking for a dead guest. > +static int kvm_xen_reactivate_runstate_gpcs(struct kvm *kvm) > +{ > + struct kvm_vcpu *vcpu; > + unsigned long i; > + > + kvm_for_each_vcpu(i, vcpu, kvm) { > + if (vcpu->arch.xen.runstate_cache.active) { This is not safe when called from kvm_xen_write_hypercall_page(), which doesn't acquire kvm->lock and thus doesn't guard against a concurrent call via kvm_xen_vcpu_set_attr(). That's likely a bug in itself, but even if that issue is fixed, I don't see how this is yields a better uAPI than forcing userspace to provide an address that is valid for all modes. If the address becomes bad when the guest changes the hypercall page, the guest is completely hosed through no fault of its own, whereas limiting the misaligned detection to KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADDR means that any "bad" address will result in immediate failure, i.e. makes it so that errors are 100% userspace misconfiguration bugs. > + int r = kvm_xen_activate_runstate_gpc(vcpu, > + vcpu->arch.xen.runstate_cache.gpa); > + if (r < 0) Returning immediately is wrong, as later vCPUs will have a valid, active cache that hasn't been verified for 64-bit mode. > + return r; > + } > + } > + return 0; > +} > + > void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, int state) > { > struct kvm_vcpu_xen *vx = &v->arch.xen; > @@ -212,11 +243,7 @@ void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, int state) > if (!vx->runstate_cache.active) > return; > > - if (IS_ENABLED(CONFIG_64BIT) && v->kvm->arch.xen.long_mode) > - user_len = sizeof(struct vcpu_runstate_info); > - else > - user_len = sizeof(struct compat_vcpu_runstate_info); > - > + user_len = kvm_xen_runstate_info_size(v->kvm); > read_lock_irqsave(&gpc->lock, flags); > while (!kvm_gfn_to_pfn_cache_check(v->kvm, gpc, gpc->gpa, > user_len)) { > @@ -461,7 +488,7 @@ int kvm_xen_hvm_set_attr(struct kvm *kvm, struct kvm_xen_hvm_attr *data) > mutex_lock(&kvm->lock); > kvm->arch.xen.long_mode = !!data->u.long_mode; > mutex_unlock(&kvm->lock); > - r = 0; > + r = kvm_xen_reactivate_runstate_gpcs(kvm); Needs to be called under kvm->lock. This path also needs to acquire kvm->srcu. > } > break; > > @@ -596,9 +623,7 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data) > break; > } > > - r = kvm_gpc_activate(vcpu->kvm, &vcpu->arch.xen.runstate_cache, > - NULL, KVM_HOST_USES_PFN, data->u.gpa, > - sizeof(struct vcpu_runstate_info)); > + r = kvm_xen_activate_runstate_gpc(vcpu, data->u.gpa); > break; > > case KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_CURRENT: > @@ -843,9 +868,13 @@ int kvm_xen_write_hypercall_page(struct kvm_vcpu *vcpu, u64 data) > u32 page_num = data & ~PAGE_MASK; > u64 page_addr = data & PAGE_MASK; > bool lm = is_long_mode(vcpu); > + int r; > > /* Latch long_mode for shared_info pages etc. */ > vcpu->kvm->arch.xen.long_mode = lm; > + r = kvm_xen_reactivate_runstate_gpcs(kvm); > + if (r < 0) > + return 1; Aren't we just making up behavior at this point? Injecting a #GP into the guest for what is a completely legal operation from the guest's perspective seems all kinds of wrong.