From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Tue, 23 Sep 2025 16:14:23 -0700
In-Reply-To: <23f11dc1-4fd1-4286-a69a-3892a869ed33@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20250813192313.132431-1-mlevitsk@redhat.com> <20250813192313.132431-3-mlevitsk@redhat.com> <7c7a5a75-a786-4a05-a836-4368582ca4c2@redhat.com> <23f11dc1-4fd1-4286-a69a-3892a869ed33@redhat.com>
Subject: Re: [PATCH 2/3] KVM: x86: Fix a semi theoretical bug in kvm_arch_async_page_present_queued
From: Sean Christopherson
To: Paolo Bonzini
Cc: Maxim Levitsky, kvm@vger.kernel.org, Dave Hansen, "H. Peter Anvin",
	Ingo Molnar, Thomas Gleixner, x86@kernel.org, Borislav Petkov,
	linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="us-ascii"

On Tue, Sep 23, 2025, Paolo Bonzini wrote:
> On 9/23/25 20:55, Sean Christopherson wrote:
> > On Tue, Sep 23, 2025, Paolo Bonzini wrote:
> > > On 8/13/25 21:23, Maxim Levitsky wrote:
> > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > > index 9018d56b4b0a..3d45a4cd08a4 100644
> > > > --- a/arch/x86/kvm/x86.c
> > > > +++ b/arch/x86/kvm/x86.c
> > > > @@ -13459,9 +13459,14 @@ void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
> > > >  void kvm_arch_async_page_present_queued(struct kvm_vcpu *vcpu)
> > > >  {
> > > > -	kvm_make_request(KVM_REQ_APF_READY, vcpu);
> > > > -	if (!vcpu->arch.apf.pageready_pending)
> > > > +	/* Pairs with smp_store_release in vcpu_enter_guest. */
> > > > +	bool in_guest_mode = (smp_load_acquire(&vcpu->mode) == IN_GUEST_MODE);
> > > > +	bool page_ready_pending = READ_ONCE(vcpu->arch.apf.pageready_pending);
> > > > +
> > > > +	if (!in_guest_mode || !page_ready_pending) {
> > > > +		kvm_make_request(KVM_REQ_APF_READY, vcpu);
> > > >  		kvm_vcpu_kick(vcpu);
> > > > +	}
> > >
> > > Unlike Sean, I think the race exists in abstract and is not benign
> >
> > How is it not benign?  I never said the race doesn't exist, I said that consuming
> > a stale vcpu->arch.apf.pageready_pending in kvm_arch_async_page_present_queued()
> > is benign.
>
> In principle there is a possibility that a KVM_REQ_APF_READY is missed.

I think you mean a kick (wakeup or IPI) is missed, not that the APF_READY
itself is missed.  I.e. KVM_REQ_APF_READY will never be lost, KVM just might
enter the guest or schedule out the vCPU with the flag set.

All in all, I think we're in violent agreement.  I agree that kvm_vcpu_kick()
could be missed (theoretically), but I'm saying that missing the kick would be
benign due to a myriad of other barriers and checks, i.e. that the vCPU is
guaranteed to see KVM_REQ_APF_READY anyways.

E.g. my suggestion earlier regarding OUTSIDE_GUEST_MODE was to rely on the
smp_mb__after_srcu_read_{,un}lock() barriers in vcpu_enter_guest() to ensure
KVM_REQ_APF_READY would be observed before trying VM-Enter, and that if KVM
might be in the process of emulating HLT (blocking), that either
KVM_REQ_APF_READY is visible to the vCPU or that kvm_arch_async_page_present()
wakes the vCPU.

Oh, hilarious, async_pf_execute() also does an unconditional
__kvm_vcpu_wake_up().

Huh.  But isn't that a real bug?  KVM doesn't consider KVM_REQ_APF_READY to be
a wake event, so isn't this an actual race?

  vCPU                                  async #PF
  kvm_check_async_pf_completion()
    pageready_pending = false
  VM-Enter
  HLT
  VM-Exit
                                        kvm_make_request(KVM_REQ_APF_READY, vcpu)
                                        kvm_vcpu_kick(vcpu)  // nop as the vCPU isn't blocking, yet
                                        __kvm_vcpu_wake_up() // nop for the same reason
  vcpu_block()

On x86, the "page ready" IRQ is only injected from vCPU context, so AFAICT
nothing is guaranteed to wake the vCPU in the above sequence.

> broken:
>
>   kvm_make_request(KVM_REQ_APF_READY, vcpu);
>   if (!vcpu->arch.apf.pageready_pending)
>     kvm_vcpu_kick(vcpu);
>
> It won't happen because set_bit() is written with asm("memory"), because x86
> set_bit() does prevent reordering at the processor level, etc.
>
> In other words the race is only avoided by the fact that compiler
> reorderings are prevented even in cases that memory-barriers.txt does not
> promise.