From: Sheng Yang <sheng@linux.intel.com>
To: Avi Kivity <avi@redhat.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Joerg Roedel <joerg.roedel@amd.com>,
	kvm@vger.kernel.org, "Yaozu (Eddie) Dong" <eddie.dong@intel.com>
Subject: Re: [PATCH v3] KVM: VMX: Execute WBINVD to keep data consistency with assigned devices
Date: Mon, 28 Jun 2010 14:42:34 +0800
Message-ID: <201006281442.35437.sheng@linux.intel.com>
In-Reply-To: <4C281D58.9090202@redhat.com>

On Monday 28 June 2010 11:56:08 Avi Kivity wrote:
> On 06/28/2010 06:36 AM, Sheng Yang wrote:
> > Some guest device driver may leverage the "Non-Snoop" I/O, and explicitly
> > WBINVD or CLFLUSH to a RAM space. Since migration may occur before WBINVD
> > or CLFLUSH, we need to maintain data consistency either by:
> 
> Don't we always force enable snooping?  Or is that only for the
> processor, and you're worried about devices?
 
We only force snooping on for a capable VT-d engine (indicated by the 
KVM_IOMMU_CACHE_COHERENCY flag, found on most recent server boards). And you're 
right, with a snooping-capable VT-d engine we don't need to do any of this. I'll 
address it in the next version.
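
A minimal sketch of the check this implies, written as a userspace C model
rather than kernel code. The struct, the function name, and the flag value are
hypothetical stand-ins; only KVM_IOMMU_CACHE_COHERENCY and iommu_domain come
from the discussion above, and the next version of the patch may well do this
differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for KVM_IOMMU_CACHE_COHERENCY; the value is made up. */
#define MODEL_IOMMU_CACHE_COHERENCY 0x1u

/* Hypothetical model of the per-VM state that matters for this decision. */
struct model_vm {
	bool has_assigned_device;  /* models kvm->arch.iommu_domain != NULL */
	unsigned int iommu_flags;  /* models the flags of the IOMMU domain */
};

/*
 * WBINVD handling is only needed when a device is assigned and the
 * VT-d engine cannot force snooping for its DMA.
 */
static bool model_need_wbinvd_handling(const struct model_vm *vm)
{
	assert(vm != NULL);
	if (!vm->has_assigned_device)
		return false;
	return !(vm->iommu_flags & MODEL_IOMMU_CACHE_COHERENCY);
}
```

With such a helper, both the vcpu load path and the WBINVD emulation path
could bail out early on snooping-capable hardware.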

> > 1: flushing cache (wbinvd) when the guest is scheduled out if there is no
> > wbinvd exit, or
> > 2: execute wbinvd on all dirty physical CPUs when guest wbinvd exits.
> > 
> >   	/* fields used by HYPER-V emulation */
> >   	u64 hv_vapic;
> > 
> > +
> > +	cpumask_t wbinvd_dirty_mask;
> > 
> >   };
> 
> Need alloc_cpumask_var()/free_cpumask_var() for very large hosts.

OK.
> 
> > +static void wbinvd_ipi(void *garbage)
> > +{
> > +	wbinvd();
> > +}
> 
> Like Jan mentioned, this is quite heavy.  What about a clflush() loop
> instead?  That may take more time, but at least it's preemptible.  Of
> course, it isn't preemptible in an IPI.
 
I think this kind of behavior happens rarely, and most recent processors should 
have WBINVD exit, which means it's an IPI... So I think it may be acceptable here.

> > +
> > 
> >   void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> >   {
> > 
> > +	/* Address WBINVD may be executed by guest */
> > +	if (vcpu->kvm->arch.iommu_domain) {
> > +		if (kvm_x86_ops->has_wbinvd_exit())
> > +			cpu_set(cpu, vcpu->arch.wbinvd_dirty_mask);
> > +		else if (vcpu->cpu != -1)
> > +			smp_call_function_single(vcpu->cpu,
> > +					wbinvd_ipi, NULL, 1);
> 
> Is there any point to doing this if !has_wbinvd_exit()?  The vcpu might
> not have migrated in time, so the cache is flushed too late.

In the !has_wbinvd_exit() case, the instruction is executed by the guest and 
flushes the current processor immediately. And we can ensure that the previous 
CPU is clean, so we're fine.
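
To make that argument concrete, here is a small userspace C model of the
!has_wbinvd_exit() path only. It is a sketch, not the patch: the struct and
function names are hypothetical, and a per-CPU counter stands in for actually
executing WBINVD on that CPU.

```c
#include <assert.h>

#define NOEXIT_NR_CPUS 8  /* arbitrary size for the model */

/* Hypothetical model of a vcpu on hardware without a WBINVD exit. */
struct noexit_vcpu {
	int cpu;                               /* last physical CPU, -1 if never loaded */
	unsigned int flushes[NOEXIT_NR_CPUS];  /* WBINVD count per physical CPU */
};

static void noexit_vcpu_init(struct noexit_vcpu *v)
{
	/* The compound literal zeroes the counters and marks "never loaded". */
	*v = (struct noexit_vcpu){ .cpu = -1 };
}

/* Models the load path: flush the CPU the vcpu ran on before switching. */
static void noexit_vcpu_load(struct noexit_vcpu *v, int cpu)
{
	assert(cpu >= 0 && cpu < NOEXIT_NR_CPUS);
	if (v->cpu != -1)
		v->flushes[v->cpu]++;  /* stands in for the wbinvd IPI */
	v->cpu = cpu;
}

/* Models the guest running WBINVD natively: only the current CPU is flushed. */
static void noexit_guest_wbinvd(struct noexit_vcpu *v)
{
	assert(v->cpu != -1);
	v->flushes[v->cpu]++;
}
```

In this model, loading the vcpu on CPU 0 and then on CPU 2 flushes CPU 0 at
the second load, and a later guest WBINVD flushes CPU 2, so every CPU the
vcpu has touched gets written back.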
 
> > +	}
> > +
> > 
> >   	kvm_x86_ops->vcpu_load(vcpu, cpu);
> >   	if (unlikely(per_cpu(cpu_tsc_khz, cpu) == 0)) {
> >   	
> >   		unsigned long khz = cpufreq_quick_get(cpu);
> > 
> > @@ -3650,6 +3664,21 @@ int emulate_invlpg(struct kvm_vcpu *vcpu, gva_t address)
> > 
> >   	return X86EMUL_CONTINUE;
> >   
> >   }
> > 
> > +int kvm_emulate_wbinvd(struct kvm_vcpu *vcpu)
> > +{
> > +	if (!vcpu->kvm->arch.iommu_domain)
> > +		return X86EMUL_CONTINUE;
> > +
> > +	if (kvm_x86_ops->has_wbinvd_exit()) {
> > +		smp_call_function_many(&vcpu->arch.wbinvd_dirty_mask,
> > +				wbinvd_ipi, NULL, 1);
> > +		cpus_clear(vcpu->arch.wbinvd_dirty_mask);
> 
> Race - a migration may set a new bit in wbinvd_dirty_mask after the
> s_c_f_m().
> 
> However, it's probably benign, since we won't be entering the guest in
> that period.

Yes. :)
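
For reference, the has_wbinvd_exit() path can be modelled in userspace C as
below. This is only a sketch: the names are hypothetical, a 64-bit word stands
in for the cpumask, and a counter stands in for the wbinvd IPI. The model is
single-threaded, so it cannot show the race itself; it only shows the
flush-then-clear ordering that the race window sits in.

```c
#include <assert.h>
#include <stdint.h>

#define EXIT_NR_CPUS 64  /* one bit per CPU in the model's mask */

/* Hypothetical model of a vcpu on hardware with a WBINVD exit. */
struct exit_vcpu {
	uint64_t wbinvd_dirty_mask;          /* CPUs the vcpu has run on since the last flush */
	unsigned int flushes[EXIT_NR_CPUS];  /* WBINVD count per physical CPU */
};

/* Models the load path: remember that this CPU may hold dirty lines. */
static void exit_vcpu_load(struct exit_vcpu *v, int cpu)
{
	assert(cpu >= 0 && cpu < EXIT_NR_CPUS);
	v->wbinvd_dirty_mask |= UINT64_C(1) << cpu;
}

/* Models the WBINVD exit handler: flush every dirty CPU, then clear the mask. */
static void exit_emulate_wbinvd(struct exit_vcpu *v)
{
	for (int cpu = 0; cpu < EXIT_NR_CPUS; cpu++) {
		if (v->wbinvd_dirty_mask & (UINT64_C(1) << cpu))
			v->flushes[cpu]++;  /* stands in for the wbinvd IPI */
	}
	/* A bit set between the loop and this clear would be lost. */
	v->wbinvd_dirty_mask = 0;
}
```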

--
regards
Yang, Sheng

> 
> > +	} else
> > +		wbinvd();
> > +	return X86EMUL_CONTINUE;
> > +}
> > +EXPORT_SYMBOL_GPL(kvm_emulate_wbinvd);
> > +
> > 
> >   int emulate_clts(struct kvm_vcpu *vcpu)
> >   {
> >   
> >   	kvm_x86_ops->set_cr0(vcpu, kvm_read_cr0_bits(vcpu, ~X86_CR0_TS));
