From: Avi Kivity <avi@redhat.com>
To: Sheng Yang <sheng@linux.intel.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Joerg Roedel <joerg.roedel@amd.com>,
	kvm@vger.kernel.org, "Yaozu (Eddie) Dong" <eddie.dong@intel.com>
Subject: Re: [PATCH v3] KVM: VMX: Execute WBINVD to keep data consistency with assigned devices
Date: Mon, 28 Jun 2010 06:56:08 +0300
Message-ID: <4C281D58.9090202@redhat.com>
In-Reply-To: <1277696187-3571-1-git-send-email-sheng@linux.intel.com>

On 06/28/2010 06:36 AM, Sheng Yang wrote:
> Some guest device driver may leverage the "Non-Snoop" I/O, and explicitly
> WBINVD or CLFLUSH to a RAM space. Since migration may occur before WBINVD or
> CLFLUSH, we need to maintain data consistency either by:
>    

Don't we always force enable snooping?  Or is that only for the 
processor, and you're worried about devices?

> 1: flushing cache (wbinvd) when the guest is scheduled out if there is no
> wbinvd exit, or
> 2: execute wbinvd on all dirty physical CPUs when guest wbinvd exits.
>
>
>    


>   	/* fields used by HYPER-V emulation */
>   	u64 hv_vapic;
> +
> +	cpumask_t wbinvd_dirty_mask;
>   };
>
>    

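cpumask_t is sized by NR_CPUS, so embedding it bloats the structure on 
very large hosts.  This needs cpumask_var_t with 
alloc_cpumask_var()/free_cpumask_var() instead.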
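
To illustrate the idea behind alloc_cpumask_var()/free_cpumask_var(), 
here is a minimal userspace sketch of a CPU mask that is sized and 
allocated at run time rather than embedded at a fixed size.  It is only 
a model: struct dirty_mask and the dirty_mask_*() helpers are 
hypothetical names, not kernel APIs, and the kernel version would use 
cpumask_var_t instead.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical model of a dynamically sized per-vcpu dirty mask. */
struct dirty_mask {
	unsigned long *bits;	/* heap-allocated bitmap, one bit per CPU */
	unsigned int nr_cpus;	/* number of valid bits */
};

/* Allocate a zeroed mask for nr_cpus CPUs; returns false on failure. */
static bool dirty_mask_alloc(struct dirty_mask *m, unsigned int nr_cpus)
{
	size_t words = (nr_cpus + BITS_PER_WORD - 1) / BITS_PER_WORD;

	m->bits = calloc(words ? words : 1, sizeof(unsigned long));
	m->nr_cpus = m->bits ? nr_cpus : 0;
	return m->bits != NULL;
}

/* Mark a CPU as possibly holding dirty cache lines; ignores bad ids. */
static void dirty_mask_set(struct dirty_mask *m, unsigned int cpu)
{
	if (cpu < m->nr_cpus)
		m->bits[cpu / BITS_PER_WORD] |= 1UL << (cpu % BITS_PER_WORD);
}

/* Report whether a CPU is marked in the mask. */
static bool dirty_mask_test(const struct dirty_mask *m, unsigned int cpu)
{
	if (cpu >= m->nr_cpus)
		return false;
	return ((m->bits[cpu / BITS_PER_WORD] >> (cpu % BITS_PER_WORD)) & 1UL) != 0;
}

/* Release the bitmap; the mask must be re-allocated before reuse. */
static void dirty_mask_free(struct dirty_mask *m)
{
	free(m->bits);
	m->bits = NULL;
	m->nr_cpus = 0;
}
```

The allocation can fail, so whatever creates the vcpu has to handle 
that error path and free the mask again on teardown.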

>
> +static void wbinvd_ipi(void *garbage)
> +{
> +	wbinvd();
> +}
>    

As Jan mentioned, this is quite heavy.  What about a clflush() loop 
instead?  That may take more time, but at least it's preemptible.  Of 
course, it isn't preemptible when run from an IPI.
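
A rough userspace sketch of what such a clflush loop looks like, 
assuming a 64-byte cache line (real code would query CPUID) and the 
SSE2 intrinsics from <emmintrin.h>.  The name clflush_range() and the 
CACHE_LINE_SIZE constant are made up for this example; on non-x86 
builds the flush itself compiles away and only the line walk remains.

```c
#include <stddef.h>
#include <stdint.h>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__SSE2__)
#include <emmintrin.h>	/* _mm_clflush(), _mm_mfence() */
#define HAVE_CLFLUSH 1
#else
#define HAVE_CLFLUSH 0
#endif

/* Assumption: 64-byte lines; not discovered from the CPU here. */
#define CACHE_LINE_SIZE 64

/*
 * Flush every cache line overlapping [addr, addr + len).
 * Returns the number of lines visited.
 */
static size_t clflush_range(const void *addr, size_t len)
{
	uintptr_t p = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE_SIZE - 1);
	uintptr_t end = (uintptr_t)addr + len;
	size_t lines = 0;

	if (len == 0)
		return 0;

#if HAVE_CLFLUSH
	_mm_mfence();		/* order earlier stores before the flushes */
#endif
	for (; p < end; p += CACHE_LINE_SIZE) {
#if HAVE_CLFLUSH
		_mm_clflush((const void *)p);
#endif
		lines++;	/* the loop can be interrupted between lines */
	}
#if HAVE_CLFLUSH
	_mm_mfence();		/* wait for the flushes to complete */
#endif
	return lines;
}
```

Unlike a single WBINVD, the work here is split into one step per line, 
so it can be interrupted between steps, at the cost of walking the 
whole range.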

> +
>   void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>   {
> +	/* Address WBINVD may be executed by guest */
> +	if (vcpu->kvm->arch.iommu_domain) {
> +		if (kvm_x86_ops->has_wbinvd_exit())
> +			cpu_set(cpu, vcpu->arch.wbinvd_dirty_mask);
> +		else if (vcpu->cpu != -1)
> +			smp_call_function_single(vcpu->cpu,
> +					wbinvd_ipi, NULL, 1);
>    

Is there any point to doing this if !has_wbinvd_exit()?  The vcpu might 
not have migrated in time, so the cache is flushed too late.

> +	}
> +
>   	kvm_x86_ops->vcpu_load(vcpu, cpu);
>   	if (unlikely(per_cpu(cpu_tsc_khz, cpu) == 0)) {
>   		unsigned long khz = cpufreq_quick_get(cpu);
> @@ -3650,6 +3664,21 @@ int emulate_invlpg(struct kvm_vcpu *vcpu, gva_t address)
>   	return X86EMUL_CONTINUE;
>   }
>
> +int kvm_emulate_wbinvd(struct kvm_vcpu *vcpu)
> +{
> +	if (!vcpu->kvm->arch.iommu_domain)
> +		return X86EMUL_CONTINUE;
> +
> +	if (kvm_x86_ops->has_wbinvd_exit()) {
> +		smp_call_function_many(&vcpu->arch.wbinvd_dirty_mask,
> +				wbinvd_ipi, NULL, 1);
> +		cpus_clear(vcpu->arch.wbinvd_dirty_mask);
>    

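Race: a migration may set a new bit in wbinvd_dirty_mask after the 
smp_call_function_many(), and the cpus_clear() that follows would then 
drop that bit without the cpu having been flushed.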

However, it's probably benign, since we won't be entering the guest in 
that period.
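
If the window ever did matter, one way to close it would be to take 
and clear the mask in a single atomic step, so that a bit set 
afterwards survives until the next flush.  The following is a userspace 
sketch using C11 atomics on a 64-bit mask, not the kernel cpumask API; 
wbinvd_dirty_model, mark_cpu_dirty() and take_dirty_cpus() are 
hypothetical names.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical 64-CPU model: bit n set means CPU n may hold dirty lines. */
static _Atomic uint64_t wbinvd_dirty_model;

/* Called on migration: record that the vcpu has run on this CPU. */
static void mark_cpu_dirty(unsigned int cpu)
{
	atomic_fetch_or(&wbinvd_dirty_model, UINT64_C(1) << cpu);
}

/*
 * Called on a guest WBINVD exit: atomically fetch the set of CPUs to
 * flush and reset the mask.  A bit set after this point is not lost;
 * it stays in the mask for the next call.
 */
static uint64_t take_dirty_cpus(void)
{
	return atomic_exchange(&wbinvd_dirty_model, 0);
}
```

The caller would then flush exactly the CPUs in the returned snapshot.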

> +	} else
> +		wbinvd();
> +	return X86EMUL_CONTINUE;
> +}
> +EXPORT_SYMBOL_GPL(kvm_emulate_wbinvd);
> +
>   int emulate_clts(struct kvm_vcpu *vcpu)
>   {
>   	kvm_x86_ops->set_cr0(vcpu, kvm_read_cr0_bits(vcpu, ~X86_CR0_TS));
>    


-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

