Re: [PATCH 5/7] KVM: x86: Implement KVM_HYPERV_SET_TLB_FLUSH_INHIBIT

From: Nikolas Wipper <nik.wipper@gmx.de>
To: Vitaly Kuznetsov <vkuznets@redhat.com>,
	Nikolas Wipper <nikwip@amazon.de>
Cc: Nicolas Saenz Julienne <nsaenz@amazon.com>,
	Alexander Graf <graf@amazon.de>,
	James Gowans <jgowans@amazon.com>,
	nh-open-source@amazon.com,
	Sean Christopherson <seanjc@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	x86@kernel.org, linux-doc@vger.kernel.org,
	linux-kselftest@vger.kernel.org
Subject: Re: [PATCH 5/7] KVM: x86: Implement KVM_HYPERV_SET_TLB_FLUSH_INHIBIT
Date: Mon, 14 Oct 2024 20:02:19 +0200
Message-ID: <a5ed763f-beba-43e9-8846-0d140f030b94@gmx.de>
In-Reply-To: <878quwgwsh.fsf@redhat.com>

On 10.10.24 10:57, Vitaly Kuznetsov wrote:
> Nikolas Wipper <nikwip@amazon.de> writes:
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 7571ac578884..ab3a9beb61a2 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -698,6 +698,8 @@ struct kvm_vcpu_hv {
>>
>>  	bool suspended;
>>  	int waiting_on;
>> +
>> +	int tlb_flush_inhibit;
>
> This is basically boolean, right? And we only make it 'int' to be able
> to store 'u8' from the ioctl? This doesn't look very clean. Do you
> envision anything but '1'/'0' in 'inhibit'? If not, maybe we can just
> make it a flag (and e.g. extend 'flags' to be u32/u64)? This way we can
> convert 'tlb_flush_inhibit' to a normal bool.
>

Yes, inhibit would only ever be 0 or 1, so folding it into the flags
sounds reasonable. Even with the current API, the field could just be a
bool (tlb_flush_inhibit = inhibit == 1;).
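
A minimal userspace sketch of both variants, for illustration only. The
struct names, the toy_ prefix and TOY_FLAG_INHIBIT are assumptions, not
part of the patch or of the KVM UAPI; only the 'inhibit' and
'tlb_flush_inhibit' field names come from the series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit; the series does not define one. */
#define TOY_FLAG_INHIBIT (1u << 0)

/* Hypothetical stand-in for the ioctl argument. */
struct toy_inhibit_arg {
	uint32_t flags;
	uint8_t inhibit;
};

/* Hypothetical stand-in for the per-vCPU Hyper-V state. */
struct toy_vcpu_hv {
	bool tlb_flush_inhibit;
};

/* Current API shape: normalise the u8 into a bool. */
static void toy_set_from_u8(struct toy_vcpu_hv *hv,
			    const struct toy_inhibit_arg *arg)
{
	hv->tlb_flush_inhibit = arg->inhibit == 1;
}

/* Flag-based shape: the bool is derived from a single flag bit. */
static void toy_set_from_flags(struct toy_vcpu_hv *hv,
			       const struct toy_inhibit_arg *arg)
{
	hv->tlb_flush_inhibit = (arg->flags & TOY_FLAG_INHIBIT) != 0;
}
```

Either way the stored state is a plain bool, and any value other than the
expected one reads as "not inhibited".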

>>  };
>>
>>  struct kvm_hypervisor_cpuid {
>> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
>> index e68fbc0c7fc1..40ea8340838f 100644
>> --- a/arch/x86/kvm/hyperv.c
>> +++ b/arch/x86/kvm/hyperv.c
>> @@ -2137,6 +2137,9 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
>>  		bitmap_zero(vcpu_mask, KVM_MAX_VCPUS);
>>
>>  		kvm_for_each_vcpu(i, v, kvm) {
>> +			if (READ_ONCE(v->arch.hyperv->tlb_flush_inhibit))
>> +				goto ret_suspend;
>> +
>>  			__set_bit(i, vcpu_mask);
>>  		}
>>  	} else if (!is_guest_mode(vcpu)) {
>> @@ -2148,6 +2151,9 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
>>  				__clear_bit(i, vcpu_mask);
>>  				continue;
>>  			}
>> +
>> +			if (READ_ONCE(v->arch.hyperv->tlb_flush_inhibit))
>> +				goto ret_suspend;
>>  		}
>>  	} else {
>>  		struct kvm_vcpu_hv *hv_v;
>> @@ -2175,6 +2181,9 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
>>  						    sparse_banks))
>>  				continue;
>>
>> +			if (READ_ONCE(v->arch.hyperv->tlb_flush_inhibit))
>> +				goto ret_suspend;
>> +
>
> These READ_ONCEs make me think I misunderstand something here, please
> bear with me :-).
>
> Like we're trying to protect against 'tlb_flush_inhibit' being read
> somewhere in the beginning of the function and want to generate real
> memory accesses. But what happens if tlb_flush_inhibit changes right
> _after_ we checked it here and _before_ we actually do
> kvm_make_vcpus_request_mask()? Wouldn't it be a problem? In case it
> would, I think we need to reverse the order: do
> kvm_make_vcpus_request_mask() anyway and after it go through vcpu_mask
> checking whether any of the affected vCPUs has 'tlb_flush_inhibit' and
> if it does, suspend the caller.
>

The case you're describing is prevented by SRCU synchronisation in the
ioctl. The hypercall holds an SRCU read-side critical section for its
entire execution, so if tlb_flush_inhibit changes after we read it, the
ioctl waits in synchronize_srcu() until the flush requests have been sent:

vCPU 0                   | vCPU 1
-------------------------+------------------------
                         | hypercall enter
                         | srcu_read_lock()
ioctl enter              |
                         | tlb_flush_inhibit read
tlb_flush_inhibit write  |
synchronize_srcu() start |
                         | TLB flush reqs sent
                         | srcu_read_unlock()
synchronize_srcu() end   |
ioctl exit               |
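
The ordering above can be replayed in a single-threaded toy model. This
is not kernel code and does not use real SRCU: the reader counter and
all toy_ names are assumptions made for illustration. Real SRCU only
waits for readers that began before synchronize_srcu() was called; the
counter here is a simplification of that.

```c
#include <stdbool.h>

/* Toy model: counts active readers instead of using real SRCU. */
struct toy_srcu {
	int active_readers;
};

struct toy_state {
	struct toy_srcu srcu;
	bool tlb_flush_inhibit;   /* field name from the patch */
	bool flush_requests_sent; /* hypothetical: vCPU 1 queued its flushes */
};

static void toy_read_lock(struct toy_srcu *s)   { s->active_readers++; }
static void toy_read_unlock(struct toy_srcu *s) { s->active_readers--; }

/* Stand-in for synchronize_srcu(): may only finish with no active reader. */
static bool toy_synchronize_can_finish(const struct toy_srcu *s)
{
	return s->active_readers == 0;
}

/*
 * Replays the timeline step by step. Returns true if the ioctl could
 * only exit after the in-flight flush requests had been sent.
 */
static bool toy_replay_timeline(void)
{
	struct toy_state st = { { 0 }, false, false };
	bool saw_inhibit, blocked_mid_hypercall;

	toy_read_lock(&st.srcu);            /* vCPU 1: hypercall enter */
	saw_inhibit = st.tlb_flush_inhibit; /* vCPU 1: reads "not inhibited" */
	st.tlb_flush_inhibit = true;        /* vCPU 0: ioctl writes */

	/* vCPU 0: grace period cannot end while vCPU 1 is still reading. */
	blocked_mid_hypercall = !toy_synchronize_can_finish(&st.srcu);

	if (!saw_inhibit)
		st.flush_requests_sent = true; /* vCPU 1: sends flush requests */
	toy_read_unlock(&st.srcu);             /* vCPU 1: hypercall done */

	/* vCPU 0: grace period can end now; the flushes are already queued. */
	return blocked_mid_hypercall &&
	       toy_synchronize_can_finish(&st.srcu) &&
	       st.flush_requests_sent;
}
```

The point of the model is only the ordering: the write can race with the
read, but the ioctl cannot return before the racing hypercall has left
its read-side critical section.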

>>  			__set_bit(i, vcpu_mask);
>>  		}
>>  	}
>> @@ -2193,6 +2202,9 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
>>  	/* We always do full TLB flush, set 'Reps completed' = 'Rep Count' */
>>  	return (u64)HV_STATUS_SUCCESS |
>>  		((u64)hc->rep_cnt << HV_HYPERCALL_REP_COMP_OFFSET);
>> +ret_suspend:
>> +	kvm_hv_vcpu_suspend_tlb_flush(vcpu, v->vcpu_id);
>> +	return -EBUSY;
>>  }
>>
>>  static void kvm_hv_send_ipi_to_many(struct kvm *kvm, u32 vector,
>> @@ -2380,6 +2392,13 @@ static int kvm_hv_hypercall_complete(struct kvm_vcpu *vcpu, u64 result)
>>  	u32 tlb_lock_count = 0;
>>  	int ret;
>>
>> +	/*
>> +	 * Reached when the hyper-call resulted in a suspension of the vCPU.
>> +	 * The instruction will be re-tried once the vCPU is unsuspended.
>> +	 */
>> +	if (kvm_hv_vcpu_suspended(vcpu))
>> +		return 1;
>> +
>>  	if (hv_result_success(result) && is_guest_mode(vcpu) &&
>>  	    kvm_hv_is_tlb_flush_hcall(vcpu) &&
>>  	    kvm_read_guest(vcpu->kvm, to_hv_vcpu(vcpu)->nested.pa_page_gpa,
>> @@ -2919,6 +2938,9 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
>>
>>  void kvm_hv_vcpu_suspend_tlb_flush(struct kvm_vcpu *vcpu, int vcpu_id)
>>  {
>> +	RCU_LOCKDEP_WARN(!srcu_read_lock_held(&vcpu->kvm->srcu),
>> +			 "Suspicious Hyper-V TLB flush inhibit usage\n");
>> +
>>  	/* waiting_on's store should happen before suspended's */
>>  	WRITE_ONCE(vcpu->arch.hyperv->waiting_on, vcpu_id);
>>  	WRITE_ONCE(vcpu->arch.hyperv->suspended, true);
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 18d0a300e79a..1f925e32a927 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -4642,6 +4642,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>>  	case KVM_CAP_HYPERV_CPUID:
>>  	case KVM_CAP_HYPERV_ENFORCE_CPUID:
>>  	case KVM_CAP_SYS_HYPERV_CPUID:
>> +	case KVM_CAP_HYPERV_TLB_FLUSH_INHIBIT:
>>  #endif
>>  	case KVM_CAP_PCI_SEGMENT:
>>  	case KVM_CAP_DEBUGREGS:
>> @@ -5853,6 +5854,31 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
>>  	}
>>  }
>>
>> +static int kvm_vcpu_ioctl_set_tlb_flush_inhibit(struct kvm_vcpu *vcpu,
>> +						struct kvm_hyperv_tlb_flush_inhibit *set)
>> +{
>> +	if (set->inhibit == READ_ONCE(vcpu->arch.hyperv->tlb_flush_inhibit))
>> +		return 0;
>> +
>> +	WRITE_ONCE(vcpu->arch.hyperv->tlb_flush_inhibit, set->inhibit);
>
> As you said before, vCPU ioctls are serialized and no one else sets
> tlb_flush_inhibit, do I understand correctly that
> READ_ONCE()/WRITE_ONCE() are redundant here?
>

As mentioned before, tlb_flush_inhibit is shared: other vCPUs read it in
kvm_hv_flush_tlb() while this ioctl writes it, so it needs these calls.
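
For readers less familiar with these helpers, a simplified userspace
approximation follows. The TOY_ macros are assumptions that only mimic
the volatile-access core of the kernel's READ_ONCE()/WRITE_ONCE(); the
real macros do additional checking. The return value of
toy_set_inhibit() is likewise hypothetical.

```c
/* Simplified approximations: force one real load or store via volatile. */
#define TOY_READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define TOY_WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical stand-in for the per-vCPU Hyper-V state. */
struct toy_vcpu_hv {
	int tlb_flush_inhibit; /* read by other vCPUs, written by the ioctl */
};

/*
 * Same shape as the quoted ioctl handler: skip the store when nothing
 * changes. Returns 1 if the state was updated, 0 otherwise.
 */
static int toy_set_inhibit(struct toy_vcpu_hv *hv, int inhibit)
{
	if (inhibit == TOY_READ_ONCE(hv->tlb_flush_inhibit))
		return 0;

	TOY_WRITE_ONCE(hv->tlb_flush_inhibit, inhibit);
	return 1;
}
```

Serialising the ioctl callers does not cover the readers on other vCPUs,
which is why the accesses are marked on both sides.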

Nikolas
