From mboxrd@z Thu Jan 1 00:00:00 1970
From: George Dunlap
Subject: Re: [PATCH 1 of 4] p2m: Keep statistics on order of p2m entries
Date: Fri, 6 May 2011 16:34:46 +0100
Message-ID:
References: <4DC40841.9020108@amd.com> <20110506150028.GF24068@whitby.uk.xensource.com> <4DC40B7A.4010605@amd.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Return-path:
In-Reply-To: <4DC40B7A.4010605@amd.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Christoph Egger
Cc: "xen-devel@lists.xensource.com" , Tim Deegan
List-Id: xen-devel@lists.xenproject.org

On Fri, May 6, 2011 at 3:53 PM, Christoph Egger wrote:
> What about this:
>
> #define PAGE_ORDER_4K  0
> #define PAGE_ORDER_2M  9
> #define PAGE_ORDER_1G  18

That would be 0, 1, and 2, respectively.  I had thought about
something like this, but the common usage seems to be to use L1-3
rather than 4k, 2M, or 1G; and
#define PAGE_ORDER_L1 0
#define PAGE_ORDER_L2 1
#define PAGE_ORDER_L3 2
seemed a bit redundant.

This patch is actually not necessary for the series -- just for the
verification that it worked.  I could drop this patch so we can
discuss it, and send the other three by themselves (since they seem
pretty uncontroversial).

 -George

>
>>
>> On the other hand, maybe the array itself could have a more descriptive
>> name than "stats.entries".
>>
>> Tim.
>>
>>> On 05/06/11 16:01, George Dunlap wrote:
>>>>
>>>> Count the number of 4kiB, 2MiB, and 1GiB p2m entries.
>>>>
>>>> Signed-off-by: George Dunlap
>>>>
>>>> diff -r 4b0692880dfa -r be5d93d38f28 xen/arch/x86/mm/hap/p2m-ept.c
>>>> --- a/xen/arch/x86/mm/hap/p2m-ept.c     Thu May 05 17:40:34 2011 +0100
>>>> +++ b/xen/arch/x86/mm/hap/p2m-ept.c     Fri May 06 15:01:08 2011 +0100
>>>> @@ -39,6 +39,8 @@
>>>>
>>>>  #define is_epte_present(ept_entry)      ((ept_entry)->epte & 0x7)
>>>>  #define is_epte_superpage(ept_entry)    ((ept_entry)->sp)
>>>> +#define is_epte_countable(ept_entry)    (is_epte_present(ept_entry) \
>>>> +                                         || ((ept_entry)->sa_p2mt == p2m_populate_on_demand))
>>>>
>>>>  /* Non-ept "lock-and-check" wrapper */
>>>>  static int ept_pod_check_and_populate(struct p2m_domain *p2m, unsigned long gfn,
>>>> @@ -167,11 +169,14 @@
>>>>  void ept_free_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry, int level)
>>>>  {
>>>>      /* End if the entry is a leaf entry. */
>>>> -    if ( level == 0 || !is_epte_present(ept_entry) ||
>>>> -         is_epte_superpage(ept_entry) )
>>>> +    if ( level == 0 || !is_epte_present(ept_entry) || is_epte_superpage(ept_entry) )
>>>> +    {
>>>> +        if ( is_epte_countable(ept_entry) )
>>>> +            p2m->stats.entries[level]--;
>>>>          return;
>>>> +    }
>>>>
>>>> -    if ( level > 1 )
>>>> +    if ( level > 0 )
>>>>      {
>>>>          ept_entry_t *epte = map_domain_page(ept_entry->mfn);
>>>>          for ( int i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
>>>> @@ -217,7 +222,10 @@
>>>>          ept_p2m_type_to_flags(epte, epte->sa_p2mt, epte->access);
>>>>
>>>>          if ( (level - 1) == target )
>>>> +        {
>>>> +            p2m->stats.entries[target]++;
>>>>              continue;
>>>> +        }
>>>>
>>>>          ASSERT(is_epte_superpage(epte));
>>>>
>>>> @@ -400,6 +408,10 @@
>>>>              ept_p2m_type_to_flags(&new_entry, p2mt, p2ma);
>>>>          }
>>>>
>>>> +        /* old_entry will be handled by ept_free_entry below */
>>>> +        if ( is_epte_countable(&new_entry) )
>>>> +            p2m->stats.entries[i]++;
>>>> +
>>>>          atomic_write_ept_entry(ept_entry, new_entry);
>>>>      }
>>>>      else
>>>> @@ -412,12 +424,16 @@
>>>>
>>>>          split_ept_entry = atomic_read_ept_entry(ept_entry);
>>>>
>>>> +        /* Accounting should be OK here; split_ept_entry bump the counts,
>>>> +         * free_entry will reduce them. */
>>>>          if ( !ept_split_super_page(p2m, &split_ept_entry, i, target) )
>>>>          {
>>>>              ept_free_entry(p2m, &split_ept_entry, i);
>>>>              goto out;
>>>>          }
>>>>
>>>> +        /* We know this was countable or we wouldn't be here.*/
>>>> +        p2m->stats.entries[i]--;
>>>>          /* now install the newly split ept sub-tree */
>>>>          /* NB: please make sure domian is paused and no in-fly VT-d DMA. */
>>>>          atomic_write_ept_entry(ept_entry, split_ept_entry);
>>>> @@ -449,9 +465,13 @@
>>>>
>>>>          ept_p2m_type_to_flags(&new_entry, p2mt, p2ma);
>>>>
>>>> +        /* old_entry will be handled by ept_free_entry below */
>>>> +        if ( is_epte_countable(&new_entry) )
>>>> +            p2m->stats.entries[i]++;
>>>> +
>>>>          atomic_write_ept_entry(ept_entry, new_entry);
>>>>      }
>>>> -
>>>> +
>>>>      /* Track the highest gfn for which we have ever had a valid mapping */
>>>>      if ( mfn_valid(mfn_x(mfn)) &&
>>>>           (gfn + (1UL << order) - 1 > p2m->max_mapped_pfn) )
>>>> diff -r 4b0692880dfa -r be5d93d38f28 xen/arch/x86/mm/p2m.c
>>>> --- a/xen/arch/x86/mm/p2m.c     Thu May 05 17:40:34 2011 +0100
>>>> +++ b/xen/arch/x86/mm/p2m.c     Fri May 06 15:01:08 2011 +0100
>>>> @@ -184,11 +184,15 @@
>>>>  {
>>>>      /* End if the entry is a leaf entry. */
>>>>      if ( page_order == 0
>>>> -         || !(l1e_get_flags(*p2m_entry) & _PAGE_PRESENT)
>>>> +         || !(l1e_get_flags(*p2m_entry) & _PAGE_PRESENT)
>>>>           || (l1e_get_flags(*p2m_entry) & _PAGE_PSE) )
>>>> +    {
>>>> +        if ( l1e_get_flags(*p2m_entry) )
>>>> +            p2m->stats.entries[page_order/9]--;
>>>>          return;
>>>> -
>>>> -    if ( page_order > 9 )
>>>> +    }
>>>> +
>>>> +    if ( page_order )
>>>>      {
>>>>          l1_pgentry_t *l3_table = map_domain_page(l1e_get_pfn(*p2m_entry));
>>>>          for ( int i = 0; i < L3_PAGETABLE_ENTRIES; i++ )
>>>> @@ -242,6 +246,7 @@
>>>>          new_entry = l1e_from_pfn(mfn_x(page_to_mfn(pg)),
>>>>                                   __PAGE_HYPERVISOR | _PAGE_USER);
>>>>
>>>> +        /* Stats: Empty entry, no mods needed */
>>>>          switch ( type ) {
>>>>          case PGT_l3_page_table:
>>>>              p2m_add_iommu_flags(&new_entry, 3, IOMMUF_readable|IOMMUF_writable);
>>>> @@ -285,10 +290,12 @@
>>>>          {
>>>>              new_entry = l1e_from_pfn(pfn + (i * L1_PAGETABLE_ENTRIES), flags);
>>>>              p2m_add_iommu_flags(&new_entry, 1, IOMMUF_readable|IOMMUF_writable);
>>>> +            p2m->stats.entries[1]++;
>>>>              p2m->write_p2m_entry(p2m, gfn,
>>>>                  l1_entry+i, *table_mfn, new_entry, 2);
>>>>          }
>>>>          unmap_domain_page(l1_entry);
>>>> +        p2m->stats.entries[2]--;
>>>>          new_entry = l1e_from_pfn(mfn_x(page_to_mfn(pg)),
>>>>                                   __PAGE_HYPERVISOR|_PAGE_USER); //disable PSE
>>>>          p2m_add_iommu_flags(&new_entry, 2, IOMMUF_readable|IOMMUF_writable);
>>>> @@ -320,6 +327,7 @@
>>>>          {
>>>>              new_entry = l1e_from_pfn(pfn + i, flags);
>>>>              p2m_add_iommu_flags(&new_entry, 0, 0);
>>>> +            p2m->stats.entries[0]++;
>>>>              p2m->write_p2m_entry(p2m, gfn,
>>>>                  l1_entry+i, *table_mfn, new_entry, 1);
>>>>          }
>>>> @@ -328,6 +336,7 @@
>>>>          new_entry = l1e_from_pfn(mfn_x(page_to_mfn(pg)),
>>>>                                   __PAGE_HYPERVISOR|_PAGE_USER);
>>>>          p2m_add_iommu_flags(&new_entry, 1, IOMMUF_readable|IOMMUF_writable);
>>>> +        p2m->stats.entries[1]--;
>>>>          p2m->write_p2m_entry(p2m, gfn,
>>>>              p2m_entry, *table_mfn, new_entry, 2);
>>>>      }
>>>> @@ -908,6 +917,15 @@
>>>>  void
>>>>  p2m_pod_dump_data(struct p2m_domain *p2m)
>>>>  {
>>>> +    int i;
>>>> +    long entries;
>>>> +    printk("    P2M entry stats:\n");
>>>> +    for ( i=0; i<3; i++)
>>>> +        if ( (entries=p2m->stats.entries[i]) )
>>>> +            printk("     L%d: %8ld entries, %ld bytes\n",
>>>> +                   i+1,
>>>> +                   entries,
>>>> +                   entries<<(i*9+12));
>>>>      printk("    PoD entries=%d cachesize=%d\n",
>>>>             p2m->pod.entry_count, p2m->pod.count);
>>>>  }
>>>> @@ -1475,6 +1493,12 @@
>>>>              old_mfn = l1e_get_pfn(*p2m_entry);
>>>>          }
>>>>
>>>> +        /* Adjust count for present/not-present entries added */
>>>> +        if ( l1e_get_flags(*p2m_entry) )
>>>> +            p2m->stats.entries[page_order/9]--;
>>>> +        if ( l1e_get_flags(entry_content) )
>>>> +            p2m->stats.entries[page_order/9]++;
>>>> +
>>>>          p2m->write_p2m_entry(p2m, gfn, p2m_entry, table_mfn, entry_content, 3);
>>>>          /* NB: paging_write_p2m_entry() handles tlb flushes properly */
>>>>
>>>> @@ -1519,6 +1543,13 @@
>>>>              p2m_add_iommu_flags(&entry_content, 0, iommu_pte_flags);
>>>>              old_mfn = l1e_get_pfn(*p2m_entry);
>>>>          }
>>>> +
>>>> +        /* Adjust count for present/not-present entries added */
>>>> +        if ( l1e_get_flags(*p2m_entry) )
>>>> +            p2m->stats.entries[page_order/9]--;
>>>> +        if ( l1e_get_flags(entry_content) )
>>>> +            p2m->stats.entries[page_order/9]++;
>>>> +
>>>>          /* level 1 entry */
>>>>          p2m->write_p2m_entry(p2m, gfn, p2m_entry, table_mfn, entry_content, 1);
>>>>          /* NB: paging_write_p2m_entry() handles tlb flushes properly */
>>>> @@ -1556,6 +1587,12 @@
>>>>              old_mfn = l1e_get_pfn(*p2m_entry);
>>>>          }
>>>>
>>>> +        /* Adjust count for present/not-present entries added */
>>>> +        if ( l1e_get_flags(*p2m_entry) )
>>>> +            p2m->stats.entries[page_order/9]--;
>>>> +        if ( l1e_get_flags(entry_content) )
>>>> +            p2m->stats.entries[page_order/9]++;
>>>> +
>>>>          p2m->write_p2m_entry(p2m, gfn, p2m_entry, table_mfn, entry_content, 2);
>>>>          /* NB: paging_write_p2m_entry() handles tlb flushes properly */
>>>>
>>>> @@ -2750,6 +2787,8 @@
>>>>                  continue;
>>>>              }
>>>>
>>>> +            /* STATS: Should change only type; no stats should need adjustment */
>>>> +
>>>>              l2mfn = _mfn(l3e_get_pfn(l3e[i3]));
>>>>              l2e = map_domain_page(l3e_get_pfn(l3e[i3]));
>>>>              for ( i2 = 0; i2 < L2_PAGETABLE_ENTRIES; i2++ )
>>>> diff -r 4b0692880dfa -r be5d93d38f28 xen/include/asm-x86/p2m.h
>>>> --- a/xen/include/asm-x86/p2m.h Thu May 05 17:40:34 2011 +0100
>>>> +++ b/xen/include/asm-x86/p2m.h Fri May 06 15:01:08 2011 +0100
>>>> @@ -278,6 +278,10 @@
>>>>          unsigned         reclaim_single; /* Last gpfn of a scan */
>>>>          unsigned         max_guest;    /* gpfn of max guest demand-populate */
>>>>      } pod;
>>>> +
>>>> +    struct {
>>>> +        long entries[3];
>>>> +    } stats;
>>>>  };
>>>>
>>>>  /* get host p2m table */
>
>
> --
> ---to satisfy European Law for business letters:
> Advanced Micro Devices GmbH
> Einsteinring 24, 85689 Dornach b. Muenchen
> Geschaeftsfuehrer: Alberto Bozzo, Andrew Bowd
> Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
> Registergericht Muenchen, HRB Nr. 43632
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
>
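
A side note on the index arithmetic discussed in this thread. The patch
uses page_order/9 to pick a slot in stats.entries[] and i*9+12 to turn a
count into bytes. Below is a minimal sketch of that mapping, assuming
hypothetical helper and macro names that do not appear in the patch; it
only illustrates why page orders 0, 9, and 18 become indices 0, 1, and 2.

```c
#include <assert.h>

/* Hypothetical names for illustration; the patch uses the bare
 * constants 9 and 12 inline. */
#define PT_ORDER_PER_LEVEL 9   /* each x86-64 table level covers 9 address bits */
#define PAGE_SHIFT_4K      12  /* log2 of a 4KiB page */
#define P2M_STATS_LEVELS   3   /* 4KiB, 2MiB, 1GiB; matches entries[3] in the patch */

/* Map a p2m page order (0, 9 or 18) to its stats.entries[] index. */
static unsigned int order_to_stats_index(unsigned int page_order)
{
    unsigned int index = page_order / PT_ORDER_PER_LEVEL;

    assert(index < P2M_STATS_LEVELS);
    return index;
}

/* Bytes of guest memory covered by `entries` mappings at a given index,
 * mirroring the entries<<(i*9+12) expression in p2m_pod_dump_data(). */
static unsigned long long stats_bytes(unsigned long long entries,
                                      unsigned int index)
{
    assert(index < P2M_STATS_LEVELS);
    return entries << (index * PT_ORDER_PER_LEVEL + PAGE_SHIFT_4K);
}
```

With these helpers, order 9 selects index 1, which the dump code prints
as "L2", and one entry at index 2 accounts for 1GiB.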