From mboxrd@z Thu Jan  1 00:00:00 1970
From: Wei Wang2 <wei.wang2@amd.com>
Subject: Re: [PATCH 3/4] amd iommu: Large io page support -
	enablement
Date: Mon, 6 Dec 2010 10:47:52 +0100
Message-ID: <201012061047.52379.wei.wang2@amd.com>
References: <C91E76C1.14D5C%keir@xen.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Return-path: <xen-devel-bounces@lists.xensource.com>
In-Reply-To: <C91E76C1.14D5C%keir@xen.org>
Content-Disposition: inline
List-Unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xensource.com>
List-Help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-Subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Keir Fraser <keir@xen.org>
Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>
List-Id: xen-devel@lists.xenproject.org

On Friday 03 December 2010 19:28:17 Keir Fraser wrote:
> On 03/12/2010 08:45, "Wei Wang2" <wei.wang2@amd.com> wrote:
> > On Friday 03 December 2010 17:24:53 Keir Fraser wrote:
> >> Well, let's see. The change to p2m_set_entry() now allows (superpage)
> >> calls to the iommu mapping functions even if !need_iommu(). That seems=
 a
> >> semantic change.
> >
> > That is because we have iommu_populate_page_table() which will delay io
> > page table construction until device assignment. But this function can
> > only updates io page table with 4k entries. I didn't find a better way =
to
> > tracking page orders after page allocation (Q: could we extend struct
> > page_info to cache page orders?). So my thought is to update IO page
> > table earlier. And therefore, enabling super io page will also disable
> > lazy io page table construction.
>
> How about hiding the superpage mapping stuff entirely within the existing
> iommu_[un]map_page() hooks? If you have 9 spare bits per iommu pde (seems
> very likely), you could cache in the page-directory entry how many entries
> one level down currently are suitable for coalescing into a superpage
> mapping. When a new iommu pte/pde is written, if it is a candidate for
> coalescing, increment the parent pde's count. If the count =3D=3D
> 2^superpage_order, then coalesce. You can maintain such counts in every p=
de
> up the hierarchy, for 2MB, 1GB, ... superpages.

This looks good to me. According to iommu specification, bit 63 =EF=BC=8B b=
it 1-8 in=20
pde entry should be free to use. I will implement this algorithm for the ne=
xt=20
version.
Thanks,
Wei

> Personally I think we could do similar for ordinary host p2m maintenance =
as
> well, if the bits are available. With 64-bit entries, we probably have
> sufficient bits (we only need 9 spare bits). What we have now for host p2m
> maintenance I can't say I love very much, and I don't think we need follow
> that as a model for how we introduce superpage mappings to iommu
> pagetables.
>
> Anyway, this would make your patch only touch AMD code. Similar could be
> done on the Intel side later, and for bonus points at that point perhaps
> this coalescing/uncoalescing logic could be pulled out to some degree into
> shared code.
>
>  -- Keir
>
> > Also, without need_iommu() checking both passthru and non-passthru gues=
ts
> > will get io page table allocation. Since super paging will highly reduce
> > io page table size, we might not waste too much memories here...