From mboxrd@z Thu Jan 1 00:00:00 1970
From: Wei Wang2
Subject: Re: [PATCH 3/4] amd iommu: Large io page support - enablement
Date: Mon, 6 Dec 2010 10:47:52 +0100
Message-ID: <201012061047.52379.wei.wang2@amd.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Keir Fraser
Cc: "xen-devel@lists.xensource.com"
List-Id: xen-devel@lists.xenproject.org

On Friday 03 December 2010 19:28:17 Keir Fraser wrote:
> On 03/12/2010 08:45, "Wei Wang2" wrote:
> > On Friday 03 December 2010 17:24:53 Keir Fraser wrote:
> >> Well, let's see. The change to p2m_set_entry() now allows (superpage)
> >> calls to the iommu mapping functions even if !need_iommu(). That seems a
> >> semantic change.
> >
> > That is because we have iommu_populate_page_table() which will delay io
> > page table construction until device assignment. But this function can
> > only updates io page table with 4k entries. I didn't find a better way to
> > tracking page orders after page allocation (Q: could we extend struct
> > page_info to cache page orders?). So my thought is to update IO page
> > table earlier. And therefore, enabling super io page will also disable
> > lazy io page table construction.
>
> How about hiding the superpage mapping stuff entirely within the existing
> iommu_[un]map_page() hooks? If you have 9 spare bits per iommu pde (seems
> very likely), you could cache in the page-directory entry how many entries
> one level down currently are suitable for coalescing into a superpage
> mapping. When a new iommu pte/pde is written, if it is a candidate for
> coalescing, increment the parent pde's count. If the count ==
> 2^superpage_order, then coalesce.
> You can maintain such counts in every pde up the hierarchy, for 2MB,
> 1GB, ... superpages.

This looks good to me. According to the iommu specification, bit 63 + bits 1-8
in the pde entry should be free to use. I will implement this algorithm for
the next version.

Thanks,
Wei

> Personally I think we could do similar for ordinary host p2m maintenance as
> well, if the bits are available. With 64-bit entries, we probably have
> sufficient bits (we only need 9 spare bits). What we have now for host p2m
> maintenance I can't say I love very much, and I don't think we need follow
> that as a model for how we introduce superpage mappings to iommu
> pagetables.
>
> Anyway, this would make your patch only touch AMD code. Similar could be
> done on the Intel side later, and for bonus points at that point perhaps
> this coalescing/uncoalescing logic could be pulled out to some degree into
> shared code.
>
>  -- Keir
>
> > Also, without need_iommu() checking both passthru and non-passthru guests
> > will get io page table allocation. Since super paging will highly reduce
> > io page table size, we might not waste too much memories here...
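
For reference, the counting scheme discussed above could be sketched roughly
as follows. This is only an illustration under assumptions, not the code
planned for the next version: struct io_table, map_page(), the present bit,
the 4k address shift and the base_mfn field are all hypothetical names and
layouts invented for the sketch. Only the choice of bit 63 plus bits 1-8 as
the 9 spare bits comes from this thread. Since 9 bits hold at most 511, the
sketch coalesces when the 512th candidate arrives instead of storing 512.

```c
#include <stdbool.h>
#include <stdint.h>

#define ENTRIES     512u             /* entries per table: 2^9 */
#define PRESENT     (1ull << 0)      /* assumed present bit */
#define ADDR_SHIFT  12               /* assumed 4k frame shift */
#define CNT_LO      (0xffull << 1)   /* spare bits 1-8: low 8 bits of the count */
#define CNT_HI      (1ull << 63)     /* spare bit 63: ninth bit of the count */

/* Hypothetical model of one page directory entry and the table below it. */
struct io_table {
    uint64_t pde;              /* parent entry; spare bits cache the count */
    uint64_t pte[ENTRIES];     /* next-level 4k entries */
    uint64_t base_mfn;         /* assumption: frame a superpage would start at */
    bool     coalesced;        /* true while one superpage mapping stands in */
};

/* Read the 9-bit candidate count out of the spare bits. */
static unsigned pde_count(uint64_t pde)
{
    return (unsigned)((pde & CNT_LO) >> 1) | ((pde & CNT_HI) ? 256u : 0u);
}

/* Return pde with its spare bits set to n (0..511), other bits untouched. */
static uint64_t pde_with_count(uint64_t pde, unsigned n)
{
    pde &= ~(CNT_LO | CNT_HI);
    pde |= (uint64_t)(n & 0xffu) << 1;
    if (n & 256u)
        pde |= CNT_HI;
    return pde;
}

/* An entry is a candidate if it maps the frame the superpage would put there. */
static bool is_candidate(const struct io_table *t, unsigned idx)
{
    uint64_t pte = t->pte[idx];
    return (pte & PRESENT) && (pte >> ADDR_SHIFT) == t->base_mfn + idx;
}

/* Write one 4k mapping; returns true if the table is coalesced afterwards. */
static bool map_page(struct io_table *t, unsigned idx, uint64_t mfn)
{
    /* A coalesced table implicitly has all 512 candidates. */
    unsigned count = t->coalesced ? ENTRIES : pde_count(t->pde);

    t->coalesced = false;
    if (is_candidate(t, idx))
        count--;                         /* overwriting a counted entry */
    t->pte[idx] = (mfn << ADDR_SHIFT) | PRESENT;

    if (is_candidate(t, idx)) {
        if (count == ENTRIES - 1) {      /* this is the 512th candidate */
            t->coalesced = true;
            return true;
        }
        count++;
    }
    t->pde = pde_with_count(t->pde, count);
    return false;
}
```

The same counter could be kept in each pde further up the hierarchy for 1GB
and larger mappings. A real implementation would also have to derive the
candidate test from the entries themselves rather than from a stored
base_mfn, and decrement the count on unmap.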