From mboxrd@z Thu Jan 1 00:00:00 1970 From: =?UTF-8?B?IuKAnHRpZWp1bi5jaGVu4oCdIg==?= Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages Date: Thu, 18 Jul 2013 18:19:56 +0800 Message-ID: <51E7C14C.6060600@windriver.com> References: <1374127456-9614-1-git-send-email-Bharat.Bhushan@freescale.com> <1374127456-9614-2-git-send-email-Bharat.Bhushan@freescale.com> <51E78A88.4000904@windriver.com> <6A3DF150A5B70D4F9B66A25E3F7C888D070D6AFF@039-SN2MPN1-013.039d.mgd.msft.net> <51E799D0.1030406@windriver.com> <6A3DF150A5B70D4F9B66A25E3F7C888D070D6DAB@039-SN2MPN1-013.039d.mgd.msft.net> <51E7A5A2.5040107@windriver.com> <6A3DF150A5B70D4F9B66A25E3F7C888D070D6E79@039-SN2MPN1-013.039d.mgd.msft.net> <51E7BE90.60300@windriver.com> <50683BE9-8BDD-4730-B866-F972BCF1EDD6@suse.de> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: Bhushan Bharat-R65777 , "kvm-ppc@vger.kernel.org" , "kvm@vger.kernel.org" , Wood Scott-B07421 To: Alexander Graf Return-path: In-Reply-To: <50683BE9-8BDD-4730-B866-F972BCF1EDD6@suse.de> Sender: kvm-ppc-owner@vger.kernel.org List-Id: kvm.vger.kernel.org On 07/18/2013 06:12 PM, Alexander Graf wrote: > > On 18.07.2013, at 12:08, =E2=80=9Ctiejun.chen=E2=80=9D wrote: > >> On 07/18/2013 05:48 PM, Alexander Graf wrote: >>> >>> On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote: >>> >>>> >>>> >>>>> -----Original Message----- >>>>> From: Bhushan Bharat-R65777 >>>>> Sent: Thursday, July 18, 2013 1:53 PM >>>>> To: '"=EF=BF=BDtiejun.chen=EF=BF=BD"' >>>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; agraf@suse.de; = Wood Scott- >>>>> B07421 >>>>> Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only f= or kernel >>>>> managed pages >>>>> >>>>> >>>>> >>>>>> -----Original Message----- >>>>>> From: "=EF=BF=BDtiejun.chen=EF=BF=BD" [mailto:tiejun.chen@windri= ver.com] >>>>>> Sent: Thursday, July 18, 2013 1:52 PM >>>>>> To: Bhushan Bharat-R65777 >>>>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; agraf@suse.de;= Wood >>>>>> Scott- >>>>>> B07421 >>>>>> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only = for >>>>>> kernel managed pages >>>>>> >>>>>> On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: >>>>>>> >>>>>>> >>>>>>>> -----Original Message----- >>>>>>>> From: kvm-ppc-owner@vger.kernel.org >>>>>>>> [mailto:kvm-ppc-owner@vger.kernel.org] On Behalf Of "=EF=BF=BD= tiejun.chen=EF=BF=BD" >>>>>>>> Sent: Thursday, July 18, 2013 1:01 PM >>>>>>>> To: Bhushan Bharat-R65777 >>>>>>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; agraf@suse.d= e; >>>>>>>> Wood >>>>>>>> Scott- >>>>>>>> B07421 >>>>>>>> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency onl= y for >>>>>>>> kernel managed pages >>>>>>>> >>>>>>>> On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>>> -----Original Message----- >>>>>>>>>> From: "=EF=BF=BDtiejun.chen=EF=BF=BD" [mailto:tiejun.chen@wi= ndriver.com] >>>>>>>>>> Sent: Thursday, July 18, 2013 11:56 AM >>>>>>>>>> To: Bhushan Bharat-R65777 >>>>>>>>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; agraf@suse= =2Ede; >>>>>>>>>> Wood >>>>>>>>>> Scott- B07421; Bhushan Bharat-R65777 >>>>>>>>>> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency o= nly >>>>>>>>>> for kernel managed pages >>>>>>>>>> >>>>>>>>>> On 07/18/2013 02:04 PM, Bharat Bhushan wrote: >>>>>>>>>>> If there is a struct page for the requested mapping then it= 's >>>>>>>>>>> normal DDR and the mapping sets "M" bit (coherent, cacheabl= e) >>>>>>>>>>> else this is treated as I/O and we set "I + G" (cache >>>>>>>>>>> inhibited, >>>>>>>>>>> guarded) >>>>>>>>>>> >>>>>>>>>>> This helps setting proper TLB mapping for direct assigned d= evice >>>>>>>>>>> >>>>>>>>>>> Signed-off-by: Bharat Bhushan >>>>>>>>>>> --- >>>>>>>>>>> arch/powerpc/kvm/e500_mmu_host.c | 17 ++++++++++++---= -- >>>>>>>>>>> 1 files changed, 12 insertions(+), 5 deletions(-) >>>>>>>>>>> >>>>>>>>>>> diff --git a/arch/powerpc/kvm/e500_mmu_host.c >>>>>>>>>>> b/arch/powerpc/kvm/e500_mmu_host.c >>>>>>>>>>> index 1c6a9d7..089c227 100644 >>>>>>>>>>> --- a/arch/powerpc/kvm/e500_mmu_host.c >>>>>>>>>>> +++ b/arch/powerpc/kvm/e500_mmu_host.c >>>>>>>>>>> @@ -64,13 +64,20 @@ static inline u32 >>>>>>>>>>> e500_shadow_mas3_attrib(u32 mas3, int >>>>>>>>>> usermode) >>>>>>>>>>> return mas3; >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int >>>>>>>>>>> usermode) >>>>>>>>>>> +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t = pfn) >>>>>>>>>>> { >>>>>>>>>>> + u32 mas2_attr; >>>>>>>>>>> + >>>>>>>>>>> + mas2_attr =3D mas2 & MAS2_ATTRIB_MASK; >>>>>>>>>>> + >>>>>>>>>>> + if (!pfn_valid(pfn)) { >>>>>>>>>> >>>>>>>>>> Why not directly use kvm_is_mmio_pfn()? >>>>>>>>> >>>>>>>>> What I understand from this function (someone can correct me)= is >>>>>>>>> that it >>>>>>>> returns "false" when the page is managed by kernel and is not >>>>>>>> marked as RESERVED (for some reason). For us it does not matte= r >>>>>>>> whether the page is reserved or not, if it is kernel visible p= age then it >>>>> is DDR. >>>>>>>>> >>>>>>>> >>>>>>>> I think you are setting I|G by addressing all mmio pages, righ= t? If >>>>>>>> so, >>>>>>>> >>>>>>>> KVM: direct mmio pfn check >>>>>>>> >>>>>>>> Userspace may specify memory slots that are backed by mm= io >>>>>>>> pages rather than >>>>>>>> normal RAM. In some cases it is not enough to identify = these >>>>>>>> mmio >>>>>> pages >>>>>>>> by pfn_valid(). This patch adds checking the PageReserv= ed as well. >>>>>>> >>>>>>> Do you know what are those "some cases" and how checking >>>>>>> PageReserved helps in >>>>>> those cases? >>>>>> >>>>>> No, myself didn't see these actual cases in qemu,too. But this s= hould >>>>>> be chronically persistent as I understand ;-) >>>>> >>>>> Then I will wait till someone educate me :) >>>> >>>> The reason is , kvm_is_mmio_pfn() function looks pretty heavy and = I do not want to call this for all tlbwe operation unless it is necessa= ry. >>> >>> It certainly does more than we need and potentially slows down the = fast path (RAM mapping). The only thing it does on top of "if (pfn_vali= d())" is to check for pages that are declared reserved on the host. Thi= s happens in 2 cases: >>> >>> 1) Non cache coherent DMA >>> 2) Memory hot remove >>> >>> The non coherent DMA case would be interesting, as with the mechani= sm as it is in place in Linux today, we could potentially break normal = guest operation if we don't take it into account. However, it's Kconfig= guarded by: >>> >>> depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_C= OMMON >>> default n if PPC_47x >>> default y >>> >>> so we never hit it with any core we care about ;). >>> >>> Memory hot remove does not exist on e500 FWIW, so we don't have to = worry about that one either. >> >> Thanks for this good information :) >> >> So why not limit those codes with CONFIG_MEMORY_HOTPLUG inside kvm_i= s_mmio_pfn() to make sure that check is only valid when that is really = needed? This can decrease those unnecessary performance loss. >> >> If I'm wrong please correct me :) > > You're perfectly right, but this is generic KVM code. So it gets run = across all architectures. What if someone has the great idea to add a n= ew case here for x86, but doesn't tell us? In that case we potentially = break x86. > > I'd rather not like to break x86 :). > > However, it'd be very interesting to see a benchmark with this change= =2E Do you think you could just rip out the whole reserved check and ru= n a few benchmarks and show us the results? > Often what case should be adopted to validate this scenario? Tiejun