From: Konrad Rzeszutek Wilk
Subject: Re: [RFC PATCH v2] Utilize the PCI API in the TTM framework.
Date: Tue, 11 Jan 2011 13:28:57 -0500
Message-ID: <20110111182857.GC29223@dumpdata.com>
References: <1294420304-24811-1-git-send-email-konrad.wilk@oracle.com>
 <4D2B16F3.1070105@shipmail.org>
 <20110110152135.GA9732@dumpdata.com>
 <4D2B2CC1.2050203@shipmail.org>
 <20110110164519.GA27066@dumpdata.com>
 <4D2B70FB.3000504@shipmail.org>
 <20110111155545.GD10897@dumpdata.com>
 <20110111165953.GI10897@dumpdata.com>
To: Alex Deucher
Cc: Thomas Hellstrom, konrad@darnok.org, linux-kernel@vger.kernel.org,
 dri-devel@lists.freedesktop.org
List-Id: dri-devel@lists.freedesktop.org

On Tue, Jan 11, 2011 at 01:12:57PM -0500, Alex Deucher wrote:
> On Tue, Jan 11, 2011 at 11:59 AM, Konrad Rzeszutek Wilk wrote:
> >> >> Another thing that I was thinking of is what happens if you have a
> >> >> huge gart and allocate a lot of coherent memory. Could that
> >> >> potentially exhaust IOMMU resources?
> >> >
> >> > So the GART is in the PCI space in one of the BARs of the device, right?
> >> > (We are talking about the discrete card GART, not the poor man's AMD IOMMU?)
> >> > The PCI space is under 4GB, so it would be considered coherent by
> >> > definition.
> >>
> >> GART is not a PCI BAR; it's just a remapper for system pages.  On
> >> radeon GPUs at least there is a memory controller with 3 programmable
> >> apertures: vram, internal gart, and agp gart.  You can map these
> >
> > To access it, i.e. to program it, you would need to access the PCIe card's
> > MMIO regions, right? So that would be considered in PCI BAR space?
>
> Yes, you need access to the MMIO aperture to configure the GPU. I was
> thinking you meant something akin to the framebuffer BAR, only for gart
> space, which is not the case.

Aaah, gotcha.

> >> resources wherever you want in the GPU's address space and then the
> >> memory controller takes care of the translation to off-board resources
> >> like gart pages.  On-chip memory clients (display controllers, texture
> >> blocks, render blocks, etc.) write to internal GPU addresses.  The GPU
> >> has its own direct connection to vram, so that's not an issue.  For
> >> AGP, the GPU specifies aperture base and size, and you point it to the
> >> bus address of the gart aperture provided by the northbridge's AGP
> >> controller.  For internal gart, the GPU has a page table stored in
> >
> > I think we are just talking about the GART on the GPU, not the old AGP
> > GART.
>
> Ok.  I just mentioned it for completeness.
>
> >> either vram or uncached system memory depending on the asic.  It
> >> provides a contiguous linear aperture to GPU clients and the memory
> >> controller translates the transactions to the backing pages via the
> >> pagetable.
> >
> > So I think I misunderstood what is meant by 'huge gart'. That sounds
> > like the linear address space provided by the GPU. And hooking up a lot
> > of coherent memory (so System RAM) to that linear address space would
> > be no different than what is currently being done.
> > When you allocate memory using alloc_page(GFP_DMA32) and hook up that
> > memory to the linear space, you exhaust the same amount of ZONE_DMA32
> > memory as if you were to use the PCI API. It comes from the same pool,
> > except that doing it from the PCI API gets you the bus address right
> > away.
>
> In this case 'GPU clients' refers to the hw blocks on the GPU; they are
> the ones that see the contiguous linear aperture.  From the
> application's perspective, gart memory looks like any other pages.

Those 'hw blocks' or 'gart memory' are in reality just pages obtained via
'alloc_page()' (both before and after this patchset).

Oh wait, 'hw blocks' or 'gart memory' can also refer to the VRAM memory,
right? In that case it is not memory allocated via 'alloc_page' but through
a different mechanism. Is TTM used then? If so, how do you stick those VRAM
pages under its accounting rules? Or do the drivers use some other mechanism
for that, one that is dependent on each driver?
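
Just to make sure we are comparing apples to apples, here is a minimal
sketch of the two allocation paths I had in mind for gart-backed pages.
This is not code from the patchset; 'pdev' and the helper names are made
up for illustration.

	/*
	 * Sketch only: contrast the two ways of getting a DMA-able page
	 * for the gart.  Assumes 'pdev' is the GPU's struct pci_dev.
	 */
	#include <linux/gfp.h>
	#include <linux/pci.h>
	#include <linux/dma-mapping.h>

	/* Path 1: grab a page below 4GB, then map it in a separate step. */
	static struct page *gart_page_alloc_map(struct pci_dev *pdev,
						dma_addr_t *bus_addr)
	{
		struct page *page = alloc_page(GFP_DMA32); /* ZONE_DMA32 */

		if (!page)
			return NULL;

		/* The bus address only shows up after the mapping call. */
		*bus_addr = pci_map_page(pdev, page, 0, PAGE_SIZE,
					 PCI_DMA_BIDIRECTIONAL);
		if (pci_dma_mapping_error(pdev, *bus_addr)) {
			__free_page(page);
			return NULL;
		}
		return page;
	}

	/* Path 2: the PCI/DMA API - the bus address comes back together
	 * with the memory. */
	static void *gart_page_dma_alloc(struct pci_dev *pdev,
					 dma_addr_t *bus_addr)
	{
		return dma_alloc_coherent(&pdev->dev, PAGE_SIZE, bus_addr,
					  GFP_KERNEL);
	}

Both paths draw on the same ZONE_DMA32 pool; the practical difference is
only where the bus address comes from.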