From mboxrd@z Thu Jan 1 00:00:00 1970 From: =?UTF-8?B?IuKAnHRpZWp1bi5jaGVu4oCdIg==?= Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages Date: Thu, 18 Jul 2013 18:14:46 +0800 Message-ID: <51E7C016.1070808@windriver.com> References: <1374127456-9614-1-git-send-email-Bharat.Bhushan@freescale.com> <1374127456-9614-2-git-send-email-Bharat.Bhushan@freescale.com> <51E78A88.4000904@windriver.com> <6A3DF150A5B70D4F9B66A25E3F7C888D070D6AFF@039-SN2MPN1-013.039d.mgd.msft.net> <51E799D0.1030406@windriver.com> <6A3DF150A5B70D4F9B66A25E3F7C888D070D6DAB@039-SN2MPN1-013.039d.mgd.msft.net> <51E7A5A2.5040107@windriver.com> <6A3DF150A5B70D4F9B66A25E3F7C888D070D6E79@039-SN2MPN1-013.039d.mgd.msft.net> <51E7AD7F.5080004@windriver.com> <42B0AAD2-F16A-4931-9B84-284260E4B436@suse.de> <51E7BBCA.6020908@windriver.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: Bhushan Bharat-R65777 , "kvm-ppc@vger.kernel.org" , "kvm@vger.kernel.org" , Wood Scott-B07421 To: Alexander Graf Return-path: In-Reply-To: Sender: kvm-ppc-owner@vger.kernel.org List-Id: kvm.vger.kernel.org On 07/18/2013 06:00 PM, Alexander Graf wrote: > > On 18.07.2013, at 11:56, =E2=80=9Ctiejun.chen=E2=80=9D wrote: > >> On 07/18/2013 05:44 PM, Alexander Graf wrote: >>> >>> On 18.07.2013, at 10:55, =EF=BF=BDtiejun.chen=EF=BF=BD wrote: >>> >>>> On 07/18/2013 04:25 PM, Bhushan Bharat-R65777 wrote: >>>>> >>>>> >>>>>> -----Original Message----- >>>>>> From: Bhushan Bharat-R65777 >>>>>> Sent: Thursday, July 18, 2013 1:53 PM >>>>>> To: '"=EF=BF=BDtiejun.chen=EF=BF=BD"' >>>>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; agraf@suse.de;= Wood Scott- >>>>>> B07421 >>>>>> Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only = for kernel >>>>>> managed pages >>>>>> >>>>>> >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: "=EF=BF=BDtiejun.chen=EF=BF=BD" [mailto:tiejun.chen@windr= iver.com] >>>>>>> Sent: Thursday, July 18, 2013 1:52 PM >>>>>>> To: Bhushan Bharat-R65777 >>>>>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; agraf@suse.de= ; Wood >>>>>>> Scott- >>>>>>> B07421 >>>>>>> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only= for >>>>>>> kernel managed pages >>>>>>> >>>>>>> On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: >>>>>>>> >>>>>>>> >>>>>>>>> -----Original Message----- >>>>>>>>> From: kvm-ppc-owner@vger.kernel.org >>>>>>>>> [mailto:kvm-ppc-owner@vger.kernel.org] On Behalf Of "=EF=BF=BD= tiejun.chen=EF=BF=BD" >>>>>>>>> Sent: Thursday, July 18, 2013 1:01 PM >>>>>>>>> To: Bhushan Bharat-R65777 >>>>>>>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; agraf@suse.= de; >>>>>>>>> Wood >>>>>>>>> Scott- >>>>>>>>> B07421 >>>>>>>>> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency on= ly for >>>>>>>>> kernel managed pages >>>>>>>>> >>>>>>>>> On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> -----Original Message----- >>>>>>>>>>> From: "=EF=BF=BDtiejun.chen=EF=BF=BD" [mailto:tiejun.chen@w= indriver.com] >>>>>>>>>>> Sent: Thursday, July 18, 2013 11:56 AM >>>>>>>>>>> To: Bhushan Bharat-R65777 >>>>>>>>>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; agraf@sus= e.de; >>>>>>>>>>> Wood >>>>>>>>>>> Scott- B07421; Bhushan Bharat-R65777 >>>>>>>>>>> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency = only >>>>>>>>>>> for kernel managed pages >>>>>>>>>>> >>>>>>>>>>> On 07/18/2013 02:04 PM, Bharat Bhushan wrote: >>>>>>>>>>>> If there is a struct page for the requested mapping then i= t's >>>>>>>>>>>> normal DDR and the mapping sets "M" bit (coherent, cacheab= le) >>>>>>>>>>>> else this is treated as I/O and we set "I + G" (cache >>>>>>>>>>>> inhibited, >>>>>>>>>>>> guarded) >>>>>>>>>>>> >>>>>>>>>>>> This helps setting proper TLB mapping for direct assigned = device >>>>>>>>>>>> >>>>>>>>>>>> Signed-off-by: Bharat Bhushan >>>>>>>>>>>> --- >>>>>>>>>>>> arch/powerpc/kvm/e500_mmu_host.c | 17 ++++++++++++-= ---- >>>>>>>>>>>> 1 files changed, 12 insertions(+), 5 deletions(-) >>>>>>>>>>>> >>>>>>>>>>>> diff --git a/arch/powerpc/kvm/e500_mmu_host.c >>>>>>>>>>>> b/arch/powerpc/kvm/e500_mmu_host.c >>>>>>>>>>>> index 1c6a9d7..089c227 100644 >>>>>>>>>>>> --- a/arch/powerpc/kvm/e500_mmu_host.c >>>>>>>>>>>> +++ b/arch/powerpc/kvm/e500_mmu_host.c >>>>>>>>>>>> @@ -64,13 +64,20 @@ static inline u32 >>>>>>>>>>>> e500_shadow_mas3_attrib(u32 mas3, int >>>>>>>>>>> usermode) >>>>>>>>>>>> return mas3; >>>>>>>>>>>> } >>>>>>>>>>>> >>>>>>>>>>>> -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int >>>>>>>>>>>> usermode) >>>>>>>>>>>> +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t= pfn) >>>>>>>>>>>> { >>>>>>>>>>>> + u32 mas2_attr; >>>>>>>>>>>> + >>>>>>>>>>>> + mas2_attr =3D mas2 & MAS2_ATTRIB_MASK; >>>>>>>>>>>> + >>>>>>>>>>>> + if (!pfn_valid(pfn)) { >>>>>>>>>>> >>>>>>>>>>> Why not directly use kvm_is_mmio_pfn()? >>>>>>>>>> >>>>>>>>>> What I understand from this function (someone can correct me= ) is >>>>>>>>>> that it >>>>>>>>> returns "false" when the page is managed by kernel and is not >>>>>>>>> marked as RESERVED (for some reason). For us it does not matt= er >>>>>>>>> whether the page is reserved or not, if it is kernel visible = page then it >>>>>> is DDR. >>>>>>>>>> >>>>>>>>> >>>>>>>>> I think you are setting I|G by addressing all mmio pages, rig= ht? If >>>>>>>>> so, >>>>>>>>> >>>>>>>>> KVM: direct mmio pfn check >>>>>>>>> >>>>>>>>> Userspace may specify memory slots that are backed by = mmio >>>>>>>>> pages rather than >>>>>>>>> normal RAM. In some cases it is not enough to identif= y these >>>>>>>>> mmio >>>>>>> pages >>>>>>>>> by pfn_valid(). This patch adds checking the PageRese= rved as well. >>>>>>>> >>>>>>>> Do you know what are those "some cases" and how checking >>>>>>>> PageReserved helps in >>>>>>> those cases? >>>>>>> >>>>>>> No, myself didn't see these actual cases in qemu,too. But this = should >>>>>>> be chronically persistent as I understand ;-) >>>>>> >>>>>> Then I will wait till someone educate me :) >>>>> >>>>> The reason is , kvm_is_mmio_pfn() function looks pretty heavy and= I do not want to call this for all tlbwe operation unless it is necess= ary. >>>> >>>> Furthermore, how to distinguish we're creating TLB entry for the d= evice assigned directly to the GS? >>> >>> Because other devices wouldn't be available to the guest through me= mory slots. >> >> Yes. >> >>> >>>> I think its unnecessary to always check if that is mmio's pfn sinc= e we have more non direct assigned devices. >>> >>> I'm not sure I understand. The shadow TLB code only knows "here is = a host virtual address". It needs to figure out whether the host physic= al address behind that is RAM (can access with cache enabled) or not (h= as to disable cache) >>> >> >> Sorry, looks I'm misleading you :-P >> >>>> So maybe we can introduce another helper to fixup that TLB entry i= n instead of this path. >>> >>> This path does fix up the shadow (host) TLB entry :). >>> >> >> I just mean whether we can have a separate path dedicated to those d= irect assigned devices, not go this common path :) > > I don't think it's possible to have a separate path without a certain= level of trust. In the current flow we don't trust anyone. We just che= ck every translated page whether we should enable caching or not. > > We could take that information from 2 other side though: > > 1) Memory Slot > 2) Guest TLB Flags > > If we take it from the memory slot we would have to trust QEMU (or an= y other user space) to give us the right hints. Malicious user space co= uld set invalid flags. Also we'd have to add logic to track this - whic= h doesn't exist today. > > If we take it from the guest we have to trust the guest. Malicious gu= ests could set invalid flags. Understood. > > Now why is setting invalid flags a problem? If I understand Scott cor= rectly, it can break the host if you access certain host devices with c= aching enabled. But to be sure I'd say we ask him directly :). Yes, we should certainly set I|G for that TLB entry mapping to device. > > Either way, not trusting anyone is definitely the safer choice. Definitely :) Tiejun