Re: [PATCH 4/6] Revert "KVM: Fix vcpu_array[0] races"

public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed

From: Paolo Bonzini <pbonzini@redhat.com>
To: Sean Christopherson <seanjc@google.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Will Deacon <will@kernel.org>, Michal Luczaj <mhal@rbox.co>,
	Alexander Potapenko <glider@google.com>,
	Marc Zyngier <maz@kernel.org>,
	Oliver Upton <oliver.upton@linux.dev>
Subject: Re: [PATCH 4/6] Revert "KVM: Fix vcpu_array[0] races"
Date: Sun, 20 Oct 2024 13:23:46 +0200	[thread overview]
Message-ID: <8889dc3b-d672-41c3-8d11-e88861b7b38e@redhat.com> (raw)
In-Reply-To: <ZwgTUNCOIh2xwU6e@google.com>

On 10/10/24 19:48, Sean Christopherson wrote:
> On Thu, Oct 10, 2024, Paolo Bonzini wrote:
>> On 10/9/24 17:04, Sean Christopherson wrote:
>>> Now that KVM loads from vcpu_array if and only if the target index is
>>> valid with respect to online_vcpus, i.e. now that it is safe to erase a
>>> not-fully-onlined vCPU entry, revert to storing into vcpu_array before
>>> success is guaranteed.
>>>
>>> If xa_store() fails, which _should_ be impossible, then putting the vCPU's
>>> reference to 'struct kvm' results in a refcounting bug as the vCPU fd has
>>> been installed and owns the vCPU's reference.
>>>
>>> This was found by inspection, but forcing the xa_store() to fail
>>> confirms the problem:
>>>
>>>    | Unable to handle kernel paging request at virtual address ffff800080ecd960
>>>    | Call trace:
>>>    |  _raw_spin_lock_irq+0x2c/0x70
>>>    |  kvm_irqfd_release+0x24/0xa0
>>>    |  kvm_vm_release+0x1c/0x38
>>>    |  __fput+0x88/0x2ec
>>>    |  ____fput+0x10/0x1c
>>>    |  task_work_run+0xb0/0xd4
>>>    |  do_exit+0x210/0x854
>>>    |  do_group_exit+0x70/0x98
>>>    |  get_signal+0x6b0/0x73c
>>>    |  do_signal+0xa4/0x11e8
>>>    |  do_notify_resume+0x60/0x12c
>>>    |  el0_svc+0x64/0x68
>>>    |  el0t_64_sync_handler+0x84/0xfc
>>>    |  el0t_64_sync+0x190/0x194
>>>    | Code: b9000909 d503201f 2a1f03e1 52800028 (88e17c08)
>>>
>>> Practically speaking, this is a non-issue as xa_store() can't fail, absent
>>> a nasty kernel bug.  But the code is visually jarring and technically
>>> broken.
>>>
>>> This reverts commit afb2acb2e3a32e4d56f7fbd819769b98ed1b7520.
>>>
>>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>>> Cc: Michal Luczaj <mhal@rbox.co>
>>> Cc: Alexander Potapenko <glider@google.com>
>>> Cc: Marc Zyngier <maz@kernel.org>
>>> Reported-by: Will Deacon <will@kernel.org>
>>> Signed-off-by: Sean Christopherson <seanjc@google.com>
>>> ---
>>>    virt/kvm/kvm_main.c | 14 +++++---------
>>>    1 file changed, 5 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>>> index fca9f74e9544..f081839521ef 100644
>>> --- a/virt/kvm/kvm_main.c
>>> +++ b/virt/kvm/kvm_main.c
>>> @@ -4283,7 +4283,8 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, unsigned long id)
>>>    	}
>>>    	vcpu->vcpu_idx = atomic_read(&kvm->online_vcpus);
>>> -	r = xa_reserve(&kvm->vcpu_array, vcpu->vcpu_idx, GFP_KERNEL_ACCOUNT);
>>> +	r = xa_insert(&kvm->vcpu_array, vcpu->vcpu_idx, vcpu, GFP_KERNEL_ACCOUNT);
>>> +	BUG_ON(r == -EBUSY);
>>>    	if (r)
>>>    		goto unlock_vcpu_destroy;
>>> @@ -4298,12 +4299,7 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, unsigned long id)
>>>    	kvm_get_kvm(kvm);
>>>    	r = create_vcpu_fd(vcpu);
>>>    	if (r < 0)
>>> -		goto kvm_put_xa_release;
>>> -
>>> -	if (KVM_BUG_ON(xa_store(&kvm->vcpu_array, vcpu->vcpu_idx, vcpu, 0), kvm)) {
>>> -		r = -EINVAL;
>>> -		goto kvm_put_xa_release;
>>> -	}
>>> +		goto kvm_put_xa_erase;
>>
>> I also find it a bit jarring though that we have to undo the insertion. This
>> is a chicken-and-egg situation where you are pick one operation B that will
>> have to undo operation A if it fails.  But what xa_store is doing, is
>> breaking this deadlock.
>>
>> The code is a bit longer, sure, but I don't see the point in complicating
>> the vcpu_array invariants and letting an entry disappear.
> 
> But we only need one rule: vcpu_array[x] is valid if and only if 'x' is less than
> online_vcpus.  And that rule is necessary regardless of whether or not vcpu_array[x]
> is filled before success is guaranteed.

Even if the invariant is explainable I still find xa_erase to be uglier 
than xa_release, but maybe it's just me.

The reason I'm not fully convinced by the explanation is that...

> I'm not concerned about the code length, it's that we need to do _something_ if
> xa_store() fails.  Yeah, it should never happen, but knowingly doing nothing feels
> all kinds of wrong.

... it seems to me that this is not just an issue in KVM code; it should 
apply to other uses of xa_reserve()/xa_store() as well.  If xa_store() 
fails after xa_reserve(), you're pretty much using the xarray API 
incorrectly... and then, just make it a BUG()?  I know that BUG() is 
frowned upon, but if the API causes invalid memory accesses when used 
incorrectly, one might as well fail as early as possible and before the 
invalid memory access becomes exploitable.

> I don't like BUG(), because it's obviously very doable to
> gracefully handle failure.

Yes, you can by using a different API.  But the point is that in the 
reserve/store case the insert failure becomes a reserve failure, never a 
store failure.

Maybe there should be an xa_store_reserved() that BUGs on failure, I 
don't know.

Paolo

next prev parent reply	other threads:[~2024-10-20 11:23 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-09 15:04 [PATCH 0/6] KVM: Fix bugs in vCPUs xarray usage Sean Christopherson
2024-10-09 15:04 ` [PATCH 1/6] KVM: Explicitly verify target vCPU is online in kvm_get_vcpu() Sean Christopherson
2024-10-10  5:31   ` Gupta, Pankaj
2024-10-10 15:54     ` Sean Christopherson
2024-10-10 16:33       ` Gupta, Pankaj
2024-10-09 15:04 ` [PATCH 2/6] KVM: Verify there's at least one online vCPU when iterating over all vCPUs Sean Christopherson
2024-10-09 15:04 ` [PATCH 3/6] KVM: Grab vcpu->mutex across installing the vCPU's fd and bumping online_vcpus Sean Christopherson
2024-10-09 15:04 ` [PATCH 4/6] Revert "KVM: Fix vcpu_array[0] races" Sean Christopherson
2024-10-10 12:46   ` Paolo Bonzini
2024-10-10 17:48     ` Sean Christopherson
2024-10-20 11:23       ` Paolo Bonzini [this message]
2024-12-16 23:05         ` Sean Christopherson
2024-10-09 15:04 ` [PATCH 5/6] KVM: Don't BUG() the kernel if xa_insert() fails with -EBUSY Sean Christopherson
2024-10-10  5:33   ` Gupta, Pankaj
2024-10-09 15:04 ` [PATCH 6/6] KVM: Drop hack that "manually" informs lockdep of kvm->lock vs. vcpu->mutex Sean Christopherson
2024-10-29 14:18 ` [PATCH 0/6] KVM: Fix bugs in vCPUs xarray usage Will Deacon
2024-12-19  2:40 ` Sean Christopherson
2024-12-19 14:18   ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8889dc3b-d672-41c3-8d11-e88861b7b38e@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=glider@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=mhal@rbox.co \
    --cc=oliver.upton@linux.dev \
    --cc=seanjc@google.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox