From mboxrd@z Thu Jan  1 00:00:00 1970
From: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Subject: Re: [PATCH 14/15] KVM: MTRR: do not map huage page for non-consistent
 range
Date: Fri, 05 Jun 2015 14:33:36 +0800
Message-ID: <557142C0.4070103@linux.intel.com>
References: <1432983566-15773-1-git-send-email-guangrong.xiao@linux.intel.com> <1432983566-15773-15-git-send-email-guangrong.xiao@linux.intel.com> <556C27A5.1040908@redhat.com> <556E6CF8.9070602@linux.intel.com> <556EB30F.8030100@redhat.com> <55700B0D.8080808@linux.intel.com> <55700E1F.9090803@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8;
	format=flowed
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: gleb@kernel.org, mtosatti@redhat.com, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	"Zhang, Yang Z" <yang.z.zhang@intel.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Return-path: <linux-kernel-owner@vger.kernel.org>
In-Reply-To: <55700E1F.9090803@redhat.com>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: kvm.vger.kernel.org


[ CCed Zhang Yang ]

On 06/04/2015 04:36 PM, Paolo Bonzini wrote:
>
>
> On 04/06/2015 10:23, Xiao Guangrong wrote:
>>>
>>> So, why do you need to always use IPAT=0?  Can patch 15 keep the current
>>> logic for RAM, like this:
>>>
>>>      if (is_mmio || kvm_arch_has_noncoherent_dma(vcpu->kvm))
>>>          ret = kvm_mtrr_get_guest_memory_type(vcpu, gfn) <<
>>>                VMX_EPT_MT_EPTE_SHIFT;
>>>      else
>>>          ret = (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT)
>>>              | VMX_EPT_IPAT_BIT;
>>
>> Yeah, it's okay, actually we considered this way, however
>> - it's light enough, it did not hurt guest performance based on our
>>    benchmark.
>> - the logic has always been used for the noncoherent_dma case; extending it
>>    to the normal case should have low risk and also help us to check the
>>    logic.
>
> But noncoherent_dma is not the common case, so it's not necessarily true
> that the risk is low.

I thought noncoherent_dma exists on first-generation IOMMUs, so it should
have been fully tested at that time.

>
>> - completely following the MTRR spec would be better than having the host
>>    hide it.
>
> We are a virtualization platform, we know well when MTRRs are necessary.
>
> There is a risk from blindly obeying the guest MTRRs: userspace can see stale
> data if the guest's accesses bypass the cache.  AMD bypasses this by
> enabling snooping even in cases that ordinarily wouldn't snoop; for
> Intel the solution is that RAM-backed areas should always use IPAT.

I am not sure whether combining UC with other, cacheable types on guest and
host will cause problems. The SDM says that snooping is not required only when
"The UC attribute comes from the MTRRs and the processors are not required
  to snoop their caches since the data could never have been cached."
(Vol 3. 11.5.2.2)
VMX does not touch the hardware MTRR MSRs, so I guess snooping still works in
this case.

I also noticed that if SS (self-snooping) is supported, we need not invalidate
the cache when programming the memory type (Vol 3. 11.11.8), so I guess that
means the CPU works well on a page that is mapped with different cache types.
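
For reference, a minimal sketch of how the SS capability could be tested.
It assumes the caller has already fetched EDX from CPUID leaf 1;
cpu_has_self_snoop() and CPUID_1_EDX_SS_BIT are hypothetical names used
only for illustration, not existing kernel helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.01H:EDX bit 27 is the SS (self snoop) feature flag. */
#define CPUID_1_EDX_SS_BIT 27

/*
 * Hypothetical helper: edx is the EDX value returned by CPUID leaf 1,
 * obtained elsewhere by the caller.
 */
static bool cpu_has_self_snoop(uint32_t edx)
{
	return (edx >> CPUID_1_EDX_SS_BIT) & 1u;
}
```

Taking EDX as a parameter keeps the check independent of how CPUID is
actually executed.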

After thinking about it carefully, we (Zhang Yang and I) doubt whether always
setting WB for DMA memory is really a good idea, because we cannot assume that
WB DMA works well for all devices. One example is audio DMA (not an MMIO
region), which requires WC to improve its performance.

However, we think the SDM is not clear enough, so let's do full vMTRR on MMIO
and noncoherent_dma first. :)
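
To make the plan concrete, here is a rough standalone sketch of that
selection, modelled on the snippet quoted above. It is not the actual patch:
ept_memtype_bits() is a hypothetical name, the guest MTRR type is passed in
directly instead of calling kvm_mtrr_get_guest_memory_type(), and the
constant values are my assumption of what the kernel headers define.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match the kernel's asm/mtrr.h and asm/vmx.h definitions. */
#define MTRR_TYPE_UNCACHABLE   0
#define MTRR_TYPE_WRCOMB       1
#define MTRR_TYPE_WRBACK       6
#define VMX_EPT_MT_EPTE_SHIFT  3
#define VMX_EPT_IPAT_BIT       (1ull << 6)

/*
 * Hypothetical helper: guest_mtrr_type stands in for the result of
 * kvm_mtrr_get_guest_memory_type(vcpu, gfn).
 */
static uint64_t ept_memtype_bits(bool is_mmio, bool noncoherent_dma,
				 uint8_t guest_mtrr_type)
{
	if (is_mmio || noncoherent_dma)
		/* Full vMTRR: honor the guest type and leave IPAT clear. */
		return (uint64_t)guest_mtrr_type << VMX_EPT_MT_EPTE_SHIFT;

	/* Ordinary RAM: force WB and set IPAT so guest PAT is ignored. */
	return ((uint64_t)MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) |
	       VMX_EPT_IPAT_BIT;
}
```

With this, RAM without noncoherent DMA always ends up as WB with IPAT set,
while MMIO and noncoherent_dma ranges follow whatever the guest MTRRs say.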