From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: [PATCH v3] KVM: VMX: Execute WBINVD to keep data consistency with assigned devices
Date: Mon, 28 Jun 2010 11:07:30 +0300
Message-ID: <4C285842.3060406@redhat.com>
References: <1277696187-3571-1-git-send-email-sheng@linux.intel.com>
 <201006281456.05750.sheng@linux.intel.com>
 <4C284A88.9000303@redhat.com>
 <201006281541.25302.sheng@linux.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Jan Kiszka, Marcelo Tosatti, Joerg Roedel, kvm@vger.kernel.org,
 "Yaozu (Eddie) Dong"
To: Sheng Yang
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:40032 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1754602Ab0F1IHf (ORCPT ); Mon, 28 Jun 2010 04:07:35 -0400
In-Reply-To: <201006281541.25302.sheng@linux.intel.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 06/28/2010 10:41 AM, Sheng Yang wrote:
>
>> Hm, the manual says (regarding clflush):
>>
>>> Invalidates the cache line that contains the linear address specified
>>> with the source operand from all levels of the processor cache
>>> hierarchy (data and instruction). The invalidation is broadcast
>>> throughout the cache coherence domain. If, at any level of the cache
>>> hierarchy, the line is inconsistent with memory (dirty) it is written
>>> to memory before invalidation.
>>>
>> So I don't think you need to queue_work_on(), instead you can work in
>> vcpu thread context. But better check with someone that really knows.
>>
> Yeah, I've just checked the instruction as well. Since it would be
> broadcast, it seems we don't even need (and can't have) a dirty bitmap.
> So the overhead on a large machine should be big.
>
> And I've calculated the number of times we need to execute clflush for
> the whole guest memory. If I calculate it right, for a 64-bit guest,
> clflush can only cover 64 bytes at a time, so for a typical 4G guest we
> would need to execute the instruction 4G / 64 = 64M times. The cycles
> used by clflush can vary; suppose it used 10 cycles each (which sounds
> impossible, since it involves broadcast and writeback, and this does not
> include cache refill time for all processors), it would cost more than
> 0.2 seconds each time on a 3.2GHz machine...
>

Right, so clflush can't be made to work.

-- 
error compiling committee.c: too many arguments to function
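
As a quick check of the estimate quoted above, here is a minimal Python sketch of the arithmetic. The figures (4G guest, 64-byte clflush granularity, 10 cycles per flush, 3.2GHz clock) are the ones from the quoted message; the function name and its parameters are hypothetical, introduced only for illustration. The 10-cycle cost is the optimistic assumption from the thread, not a measurement.

```python
def clflush_sweep_cost(mem_bytes, line_bytes=64, cycles_per_flush=10, clock_hz=3.2e9):
    """Estimate the cost of flushing every cache line of a guest's memory.

    Hypothetical helper: all defaults are the assumptions stated in the
    thread, not measured values.
    """
    # One clflush covers one cache line, so count the lines in guest memory.
    flushes = mem_bytes // line_bytes
    # Total cycles divided by clock frequency gives seconds per full sweep.
    seconds = flushes * cycles_per_flush / clock_hz
    return flushes, seconds


# 4G guest, taking "4G" as 4 * 2**30 bytes.
flushes, seconds = clflush_sweep_cost(4 * 2**30)
print(f"{flushes} flushes, about {seconds:.2f} s per sweep")
```

Under these assumptions the sweep needs 64M flushes and a little over 0.2 seconds, which matches the figure in the quoted message. Any realistic per-flush cost would only make it worse.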